1. Introduction

Toolbox is a platform to boost your Proxmox VE infrastructure.

One of the main design goals was to make Toolbox as simple as possible to use. You can use it with a single node or with a cluster of many nodes. All management tasks can be performed through the web-based management interface, and even a novice user can set up and install Toolbox in minutes.

1.1. Why?

The reasons that led us to develop this environment for Proxmox VE are:

  • the lack of functionality we need

  • the IT Manager's need for simplified, complete management of the infrastructure

Examples to better understand:

How do we know, in a backup task, which backups succeeded or failed, how much space they use, how long they took, and so on?

Everything is written in the log, but with more than 5 backups it becomes very difficult to extract this information.

Do the backups in my task overlap with others, and what is the load on the nodes?

The log contains the dates, the durations, and the host each backup ran on.

How can I see the storage occupancy of a VM?

I go to the VM, check its storage, and check its backups.

What if I want to take snapshots every 15 minutes?

It is not possible.

We have implemented simple solutions to get information immediately and in a simple way.

1.2. Web-based Management Interface

Toolbox is simple to use. Management tasks can be performed through the included web-based management interface: there is no need to install a separate management tool. The tool lets you manage the entire cluster. Centralized web-based management lets you control all features from the GUI.

1.3. Your benefits with Toolbox

  • Fast installation and easy-to-use

  • Web-based management interface

  • Low administration costs and simple deployment

1.4. Project History

The project started in 2016 with the command line cv4pve-tools. The tools are Open Source on GitHub and are still maintained today. Initially we used shell commands inside Proxmox VE, and we immediately understood the fragility of that approach.

We developed APIs for various languages and then re-implemented all the tools. This made us independent of the internal modifications of Proxmox VE, and made it possible to run outside of Proxmox VE.

The GUI version of cv4pve-tools started in 2019.

2. Installing Toolbox

Toolbox is an application developed in .NET. It is deployed as a Docker image.

2.1. Install

Toolbox runs as a lightweight container on a Docker engine or within a Swarm cluster. Deploy Toolbox on a standalone Linux Docker host or a single-node Swarm cluster (or a Windows 10 Docker host running in “Linux containers” mode). Use the following Docker commands to deploy Toolbox:

Volume creation for data
# docker volume create cv4pve_toolbox_data
Creation of container image

Replace the xx.xx.x part with the latest version.

# docker run -d -p 8001:443 -e TZ=Europe/Rome --restart=always -v cv4pve_toolbox_data:/app/storage --name cv4pve-toolbox corsinvest/cv4pve-toolbox:xx.xx.x

The TZ=Europe/Rome setting synchronizes the container's time zone; see the list of tz database time zones.

Use a web browser to access the IP address of the Docker machine on port 8001 (example: https://IP address:8001/). The default credentials are: user admin, password admin.

2.2. Update

During the update phase, the configuration data will not be lost because it is saved in the volume created in the first installation phase.

New updates, if available, are reported on the portal through a notification. Updates are provided according to your maintenance plan; if you perform an update with expired maintenance, the system will not work.

Stop the container
# docker stop cv4pve-toolbox
Delete the container
# docker rm cv4pve-toolbox
Launch the new version

Replace the xx.xx.x part with the latest version.

# docker run -d -p 8001:443 -e TZ=Europe/Rome --restart=always -v cv4pve_toolbox_data:/app/storage --name cv4pve-toolbox corsinvest/cv4pve-toolbox:xx.xx.x
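The three steps above can be combined into a small helper. The script below is a sketch of our own (the function names are hypothetical); the docker commands are exactly the ones shown above, so the configuration data in the volume survives the recreation.

```shell
# Hypothetical helper wrapping the update steps above.

# Build the docker run command for a given Toolbox version, so that
# install and update use exactly the same invocation.
build_run_cmd() {
    printf 'docker run -d -p 8001:443 -e TZ=Europe/Rome --restart=always -v cv4pve_toolbox_data:/app/storage --name cv4pve-toolbox corsinvest/cv4pve-toolbox:%s' "$1"
}

# Stop and remove the old container, then launch the requested version.
# Configuration data survives in the cv4pve_toolbox_data volume.
update_toolbox() {
    docker stop cv4pve-toolbox
    docker rm cv4pve-toolbox
    sh -c "$(build_run_cmd "$1")"
}
```

Calling, for example, update_toolbox 1.2.3 would recreate the container with the 1.2.3 tag.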

3. Graphical User Interface

Toolbox is simple. There is no need to install a separate management tool; everything can be done through your web browser (the latest Firefox or Google Chrome is preferred).

You can use the web-based administration interface with any modern browser.

The web interface can be reached via https://youripaddress:8001 (default login is: admin, and the password is admin).

3.1. Features

  • AJAX technologies for dynamic updates of resources

  • All grids can export their data

  • Secure access via SSL encryption (HTTPS)

3.2. Login

login

When you connect to the Toolbox, you will first see the login window.

3.3. GUI Overview

summary

The Toolbox user interface consists of three regions.

Header

On top. Shows status information and contains buttons for the most important actions.

Modules tree

At the left side. A navigation tree where you can select a specific module.

Content Panel

Center region. Displays the content of the selected module.

You can reduce and expand the size of the tree. This can be useful when you work on small screens and want more space to view other content.

3.3.1. Header

At the top left, next to the module name, is the button to expand and collapse the tree (or to view the module list).

The rightmost part of the header contains four buttons:

Notify

Shows notification service activity.

Help

Opens a new browser window showing the reference documentation.

Support

Opens a new browser window showing the support portal.

User menu

Actions for the current user.

3.3.2. Modules tree

This is the main navigation tree. It is divided into contexts and contains all the available modules.

3.3.3. Content Panels

When you select a module in the tree, the corresponding module shows its status information in the content panel. Refer to the individual module chapters in the reference documentation for more detailed information.

4. Modules

4.1. Home

Shows the status of the Proxmox VE cluster and of Toolbox.

4.1.1. Proxmox VE

Green means the state is OK; orange means there are problems.

pve

4.1.2. Toolbox

Shows the status of the modules with details.

toolbox

4.2. Status

This module shows the status of the Proxmox VE cluster and of its objects.

4.2.1. Summary

Status of the Proxmox VE cluster.

summary

4.2.2. Objects

Grid with the status of Proxmox VE objects and their resource usage.

objects

4.3. Autosnap

4.3.1. Need

Take a snapshot of a VM/CT at a fixed interval, with a retention policy.
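As an illustration of the retention idea (keep the newest N automatic snapshots, delete the rest), here is a minimal sketch; it is not the cv4pve-autosnap implementation, and the function and snapshot names are invented:

```shell
# Given a newest-first list of snapshot names, print the ones that a
# keep-N retention policy would delete.
snapshots_to_prune() {
    keep="$1"; shift
    i=0
    for snap in "$@"; do
        i=$((i + 1))
        # Everything past the first $keep entries is expired.
        [ "$i" -gt "$keep" ] && echo "$snap"
    done
    return 0
}
```

For example, snapshots_to_prune 2 auto2301 auto2246 auto2231 prints only auto2231, the oldest snapshot beyond the two kept.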

4.3.2. Implementation

We started in 2016 with command line development (GitHub) and then implemented the web portal.

4.3.3. Note

Do not schedule at intervals shorter than 15 minutes; there may be performance problems.
If you include memory, the space and time required will increase considerably. Test it before deploying to a production environment.
Pay attention to the timeout.
The autosnap storage space is located on the same storage as the VM disk. This reduces the total storage capacity.

4.3.4. General

all

Tasks

Grid with the autosnap tasks and their status.

task

New and edit Task

task new edit

Hook

Create a hook using a web request.

task hook

New and edit Hook

task hook new edit

The environment variables are replaced with their values.

Example: an InfluxDB hook for cv4pve-autosnap

Url

http://INFLUXDB_HOST:8086/write?db=INFLUXDB_NAME

Data

cv4pve-autosnap,vmid=%CV4PVE_AUTOSNAP_VMID%,type=%CV4PVE_AUTOSNAP_VMTYPE%,label=%CV4PVE_AUTOSNAP_LABEL%,vmname=%CV4PVE_AUTOSNAP_VMNAME% success=%CV4PVE_AUTOSNAP_STATE%,duration=%CV4PVE_AUTOSNAP_DURATION%
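The hook works by plain string substitution: each %NAME% placeholder in the URL and data is replaced with the corresponding environment variable exported by the task, and the result is sent as the request body. A minimal sketch of that substitution (the function name is our own; the placeholder names are the ones above):

```shell
# Replace every %NAME% placeholder in a template with the value of the
# environment variable NAME, as the hook does before sending the request.
expand_placeholders() {
    out="$1"
    # Collect the %NAME% tokens, then substitute each one from the environment.
    for name in $(printf '%s' "$1" | grep -o '%[A-Z0-9_]*%' | tr -d '%'); do
        eval "value=\${$name}"
        out=$(printf '%s' "$out" | sed "s|%$name%|$value|g")
    done
    printf '%s' "$out"
}
```

The expanded string would then be POSTed to the configured URL, e.g. the InfluxDB /write endpoint shown above.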

History

Shows the history of autosnap jobs.

task history

Status

Shows the autosnap scheduling status.

task status

In Error

Grid with snapshots in error.

in error

All Auto Snapshot

All auto snapshots in Proxmox VE.

all auto snapshots

Some words about Snapshot consistency and what qemu-guest-agent can do for you

Bear in mind that taking a snapshot of a running VM is basically like pulling the power from a server. Often this is not catastrophic, as the next fsck will try to fix filesystem issues, but in the worst case it can leave you with a severely damaged filesystem or, even worse, half-written inodes that were in flight when the power failed, leading to silent data corruption. To overcome this, the qemu-guest-agent improves the consistency of the filesystem while taking a snapshot. It won't leave you a clean filesystem, but it sync()'s outstanding writes and halts all I/O until the snapshot is complete. Still, there might be issues at the application layer. Database processes might have unwritten data in memory, which is the most common case. Here you have the opportunity to do additional tuning and use hooks to tell your vital processes what to do before and after the freeze.

First, make sure that your guest has the qemu-guest-agent running and working properly. Then use custom hooks to tell your services with volatile data to flush all unwritten data to disk. On Debian-based Linux systems the hook file can be set in ```/etc/default/qemu-guest-agent``` and could simply contain this line:

DAEMON_ARGS="-F/etc/qemu/fsfreeze-hook"

Create "/etc/qemu/fsfreeze-hook" and make it look like:

#!/bin/sh

# This script is executed when a guest agent receives fsfreeze-freeze and
# fsfreeze-thaw command, if it is specified in --fsfreeze-hook (-F)
# option of qemu-ga or placed in default path (/etc/qemu/fsfreeze-hook).
# When the agent receives fsfreeze-freeze request, this script is issued with
# "freeze" argument before the filesystem is frozen. And for fsfreeze-thaw
# request, it is issued with "thaw" argument after filesystem is thawed.

LOGFILE=/var/log/qga-fsfreeze-hook.log
FSFREEZE_D=$(dirname -- "$0")/fsfreeze-hook.d

# Check whether file $1 is a backup or rpm-generated file and should be ignored
is_ignored_file() {
    case "$1" in
        *~ | *.bak | *.orig | *.rpmnew | *.rpmorig | *.rpmsave | *.sample | *.dpkg-old | *.dpkg-new | *.dpkg-tmp | *.dpkg-dist |
*.dpkg-bak | *.dpkg-backup | *.dpkg-remove)
            return 0 ;;
    esac
    return 1
}

# Iterate executables in directory "fsfreeze-hook.d" with the specified args
[ ! -d "$FSFREEZE_D" ] && exit 0
for file in "$FSFREEZE_D"/* ; do
    is_ignored_file "$file" && continue
    [ -x "$file" ] || continue
    printf "$(date): execute $file $@\n" >>$LOGFILE
    "$file" "$@" >>$LOGFILE 2>&1
    STATUS=$?
    printf "$(date): $file finished with status=$STATUS\n" >>$LOGFILE
done

exit 0

For testing purposes place this into ```/etc/qemu/fsfreeze-hook.d/10-info```:

#!/bin/bash
dt=$(date +%s)

case "$1" in
    freeze)
        echo "frozen on $dt" | tee >(cat >/tmp/fsfreeze)
    ;;
    thaw)
        echo "thawed on $dt" | tee >(cat >>/tmp/fsfreeze)
    ;;
esac

Now you can place files for different services in ```/etc/qemu/fsfreeze-hook.d/``` that tell those services what to do before and after snapshots. A very common example is MySQL. Create a file ```/etc/qemu/fsfreeze-hook.d/20-mysql``` containing:

#!/bin/sh

# Flush MySQL tables to the disk before the filesystem is frozen.
# At the same time, this keeps a read lock in order to avoid write accesses
# from the other clients until the filesystem is thawed.

MYSQL="/usr/bin/mysql"
#MYSQL_OPTS="-uroot" #"-prootpassword"
MYSQL_OPTS="--defaults-extra-file=/etc/mysql/debian.cnf"
FIFO=/var/run/mysql-flush.fifo

# Check mysql is installed and the server running
[ -x "$MYSQL" ] && "$MYSQL" $MYSQL_OPTS < /dev/null || exit 0

flush_and_wait() {
    printf "FLUSH TABLES WITH READ LOCK \\G\n"
    trap 'printf "$(date): $0 is killed\n">&2' HUP INT QUIT ALRM TERM
    read < $FIFO
    printf "UNLOCK TABLES \\G\n"
    rm -f $FIFO
}

case "$1" in
    freeze)
        mkfifo $FIFO || exit 1
        flush_and_wait | "$MYSQL" $MYSQL_OPTS &
        # wait until every block is flushed
        while [ "$(echo 'SHOW STATUS LIKE "Key_blocks_not_flushed"' |\
                 "$MYSQL" $MYSQL_OPTS | tail -1 | cut -f 2)" -gt 0 ]; do
            sleep 1
        done
        # for InnoDB, wait until every log is flushed
        INNODB_STATUS=$(mktemp /tmp/mysql-flush.XXXXXX)
        [ $? -ne 0 ] && exit 2
        trap "rm -f $INNODB_STATUS; exit 1" HUP INT QUIT ALRM TERM
        while :; do
            printf "SHOW ENGINE INNODB STATUS \\G" |\
                "$MYSQL" $MYSQL_OPTS > $INNODB_STATUS
            LOG_CURRENT=$(grep 'Log sequence number' $INNODB_STATUS |\
                          tr -s ' ' | cut -d' ' -f4)
            LOG_FLUSHED=$(grep 'Log flushed up to' $INNODB_STATUS |\
                          tr -s ' ' | cut -d' ' -f5)
            [ "$LOG_CURRENT" = "$LOG_FLUSHED" ] && break
            sleep 1
        done
        rm -f $INNODB_STATUS
        ;;

    thaw)
        [ ! -p $FIFO ] && exit 1
        echo > $FIFO
        ;;

    *)
        exit 1
        ;;
esac

4.4. Node Protect

4.4.1. Need

Proxmox VE performs VM/CT backups perfectly, but the rest of the system configuration is not saved.

4.4.2. Implementation

We started in 2019 with command line development (GitHub) and then implemented the web portal.

4.4.3. General

grid

4.5. Diagnostic

4.5.1. Need

Have a diagnostic that detects the errors commonly made in infrastructure management. Examples:

  • backups not configured

  • mounted CD-ROMs

  • unused and orphaned disks

  • and more

4.5.2. Implementation

We started in 2019 with command line development (GitHub) and then implemented the web portal.

4.5.3. General

History

history

Ignore Issues definitions

Definitions of diagnostic issues to ignore.

ignore issues definitions

4.6. Disks

4.6.1. Need

Locate a disk in an array controller and blink its LED.

Grid

Shows all physical disks for every node.

grid

Expanding a row shows the S.M.A.R.T. information for that disk.

4.7. Qemu Monitor

4.7.1. Need

Proxmox VE does not show the IOPS of a VM at the operating system level. With this module it is easy to identify the virtual machines causing load and solve the problem.

Grid

grid

4.8. Inventory

4.8.1. Need

Takes a snapshot of the entire cluster (resources, information, status and hardware) for later comparison. This allows you to monitor changes.

Grid

grid

Cluster Info

cluster info

4.9. Storage Usage

4.9.1. Need

See at a glance the storage occupancy per VM/CT, and vice versa.

4.9.2. By Storage

grid by storage

Expanded

grid by storage expand

4.9.3. By VM/CT

grid by vm

Expanded

grid by vm expand

4.10. Replication Trend

4.10.1. Need

Have a history of replications for VM/CT with information on status, size and duration.

4.10.2. General

Job

job

Scheduled

scheduled

Simulation

simulation

4.10.3. Show data

Range

range

VM/CT

vm ct

Size

size

Duration

duration

4.11. VzDump Trend

4.11.1. Need

Have a history of backups for VM/CT with information on status, size, duration and speed. A recurring problem is having to read the logs to extract this information.

4.11.2. General

Job

job

Scheduled

scheduled

Not Scheduled

not scheduled

Simulation

simulation

Backup inline

backups inline

4.11.3. Show data

Range

range

Storage

storage

VM/CT

vm ct

Size

size

Speed

speed

Duration

duration

4.12. File Manager

4.12.1. Need

Manage host files and folders graphically without using a command line SSH shell.

4.12.2. General

Manager

manager

Favorites

favorites

4.13. Misc Utility

4.13.1. VM Locked

Need

Unlock VM/CT that are in a locked state without using the shell.

General

vm unlock

4.13.2. Free Memory Nodes

Need

Free the memory of the nodes to start a VM without using a shell.

The message displayed is "kvm: failed to initialize KVM: Cannot allocate memory".
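A common low-level technique for freeing node memory (an assumption about the general approach, not a description of what the module actually does) is to flush dirty pages and then drop the kernel's clean caches:

```shell
# Free reclaimable node memory by dropping kernel caches (requires root).
# Only clean caches are discarded; no data is lost.
free_node_caches() {
    sync                               # flush dirty pages to disk first
    echo 3 > /proc/sys/vm/drop_caches  # 3 = page cache + dentries/inodes
}
```

After this, the node may have enough free memory for kvm to allocate the VM's RAM.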

General

free memory nodes

5. Support

5.1. Subscription

5.1.1. Commercial Support

Corsinvest Srl also offers enterprise support available as Subscription Service Plans. All users with a subscription get access to the Toolbox Customer Portal. The customer portal provides help and support with guaranteed response times from the Toolbox developers.

For volume discounts, or more information in general, please contact support@corsinvest.it.

6. Frequently Asked Questions

New FAQs are appended to the bottom of this section.

7. Roadmap