Hello again libvirt community!
This summer, I once again worked on a GSoC project with libvirt. This
time, the goal of my project was to improve the libvirt integration with
SaltStack and, in particular, the support for VM migration.
SaltStack is an incredibly valuable tool that helps with automation and
infrastructure management. This is accomplished with a master/minion
topology, where a master acts as a central control bus for the clients
(minions), and the minions connect back to the master. The Salt virt
module provides support for core cloud operations such as VM lifecycle
management, including VM migration, by allowing minions to connect to a
libvirt virtualization host.
The Salt virt.migrate interface allows users to migrate a VM from one
host to another. However, the implementation of this function used to be
based on the virsh command-line tool (instead of the libvirt Python API
bindings), which resulted in increased code complexity (e.g., string
concatenation, invoking subprocess.Popen, etc.) and exposed only a
limited number of migration options to users.
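For illustration only (this is a simplified sketch, not the actual Salt
code; the function, flags, and URIs are examples), the difference amounts
to replacing a virsh command line assembled by string manipulation with a
direct call into the libvirt Python API:

import libvirt

# Previously (simplified): assemble a virsh command and run it, e.g.
#   subprocess.Popen(["virsh", "migrate", "--live", name, dst_uri])
# With the Python bindings the same operation becomes a direct API call.
def migrate(name, src_uri, dst_uri, live=True):
    src_conn = libvirt.open(src_uri)   # connection to the source host
    dst_conn = libvirt.open(dst_uri)   # connection to the destination host
    try:
        dom = src_conn.lookupByName(name)
        flags = (libvirt.VIR_MIGRATE_PERSIST_DEST
                 | libvirt.VIR_MIGRATE_UNDEFINE_SOURCE)
        if live:
            flags |= libvirt.VIR_MIGRATE_LIVE
        # migrate3() transfers the domain and returns the new domain
        # object on the destination host.
        dom.migrate3(dst_conn, {}, flags)
    finally:
        dst_conn.close()
        src_conn.close()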
My project was roughly separated into three phases. The aim of the first
phase was to improve the test coverage of the virt module by increasing
the number of integration tests (i.e., before applying any major code
changes). However, SaltStack is designed to run on different platforms,
and it has been extended over the years to support many applications and
tools. Consequently, the SaltStack test suite has grown in complexity
and in its number of dependencies, and not all of those dependencies
(e.g., libvirt and qemu) are available in the Jenkins CI environment. In
addition, our goal was not only to be able to effectively test VM lifecycle
management features (start, destroy, list, etc.) but also to test VM
migration. To achieve this we packaged libvirt, qemu, and all required
SaltStack minion dependencies in a container image that allows us to
spawn multiple virtualization host instances on a single machine (with
low virtualization overhead).
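To give a rough idea of what this enables (the helper and minion names
below are hypothetical, not the actual test code from the pull request),
an integration test can drive two containerized virtualization hosts
through the salt CLI and verify that a guest ends up on the destination:

import json
import subprocess

def salt(minion, func, *args):
    # Hypothetical helper: run a salt command against one containerized
    # minion and return its parsed JSON result.
    out = subprocess.run(
        ["salt", "--out=json", minion, func, *args],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)[minion]

def test_live_migration():
    # "src" and "dst" are two minions running inside the virtualization
    # host containers; "guest" is assumed to be already defined on "src".
    assert salt("src", "virt.start", "guest")
    assert salt("src", "virt.migrate", "guest", "qemu+ssh://dst/system")
    assert "guest" in salt("dst", "virt.list_domains")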
In the second phase of the project, now with a test environment in
place, we were able to refactor the virt.migrate implementation to
utilize the Python bindings of libvirt and add support for additional
migration options (described in the “What's new?” section below).
The third phase was primarily focused on refining the set of patches and
resolving merge conflicts with upstream changes or other related PRs.
One of the major challenges was to revisit the set of tests to make them
compatible with pytest, as well as to collaborate with the SaltStack
community to ensure that all proposed changes will be easy to maintain
in the future.
Contributions
=============
Although the development work was done in several iterations, with code
reviews on a regular basis, the final result was consolidated into a
single GitHub pull request: https://github.com/saltstack/salt/pull/57947
What’s new?
===========
The updated documentation for the virt.migrate function is available at:
https://docs.saltstack.com/en/master/ref/modules/all/salt.modules.virt.html#salt.modules.virt.migrate
The virt.migrate interface of SaltStack has been extended to support the
libvirt connection URI format for the migration target:
$ salt src virt.migrate guest qemu+ssh://dst/system
$ salt src virt.migrate guest qemu+tcp://dst/system
$ salt src virt.migrate guest qemu+tls://dst/system
while preserving backward compatibility with the previous syntax:
$ salt src virt.migrate guest dst
$ salt src virt.migrate guest dst True
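In practice, backward compatibility means that a bare hostname
(optionally followed by the old ssh flag) is still accepted and converted
into a full connection URI internally. The helper below is only a
hypothetical illustration of that idea, not the actual Salt
implementation:

def _dest_uri(target, ssh=False):
    # Hypothetical sketch: accept either a full libvirt URI (new syntax)
    # or a bare hostname with an optional ssh flag (old syntax).
    if "://" in target:
        return target
    transport = "qemu+ssh" if ssh else "qemu+tcp"
    return "{}://{}/system".format(transport, target)

# _dest_uri("qemu+tls://dst/system")  ->  "qemu+tls://dst/system"
# _dest_uri("dst", ssh=True)          ->  "qemu+ssh://dst/system"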
Support for the following migration options was introduced; a sketch of
how they map onto the libvirt API follows the examples:
Disable live migration
$ salt src virt.migrate guest qemu+ssh://dst/system live=False
Leave the migrated VM transient on destination host
$ salt src virt.migrate guest qemu+ssh://dst/system persistent=False
Leave the domain defined on the source host
$ salt src virt.migrate guest qemu+ssh://dst/system undefinesource=False
Offline migration
$ salt src virt.migrate guest qemu+ssh://dst/system offline=True
Set maximum bandwidth (in MiB/s)
$ salt src virt.migrate guest qemu+tls://dst/system max_bandwidth=10
Set the maximum tolerable downtime for live migration (i.e., the number
of milliseconds the VM is allowed to be down at the end of a live
migration)
$ salt src virt.migrate guest qemu+tls://dst/system max_downtime=100
Set the number of parallel network connections for live migration
$ salt src virt.migrate guest qemu+tls://dst/system parallel_connections=10
Live migration with compression enabled
$ salt src virt.migrate guest qemu+tls://dst/system \
compressed=True \
comp_methods=mt \
comp_mt_level=5 \
comp_mt_threads=4 \
comp_mt_dthreads=4
$ salt src virt.migrate guest qemu+tls://dst/system \
compressed=True \
comp_methods=xbzrle \
comp_xbzrle_cache=1024
Migrate non-shared storage
(Full disk copy)
$ salt src virt.migrate guest qemu+tls://dst/system copy_storage=all
(Incremental copy)
$ salt src virt.migrate guest qemu+tls://dst/system copy_storage=inc
Using post-copy migration
$ salt src virt.migrate guest qemu+tls://dst/system postcopy=True
$ salt src virt.migrate_start_postcopy guest
Using post-copy migration with bandwidth limit (MiB/s)
$ salt src virt.migrate guest qemu+tls://dst/system \
postcopy=True \
postcopy_bandwidth=1
$ salt src virt.migrate_start_postcopy guest
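Under the hood, options like these typically translate into libvirt
migration flags and typed parameters. The function below is a simplified
sketch of that mapping (not the exact Salt implementation; only a subset
of the options is shown):

import libvirt

def build_migration_args(live=True, persistent=True, undefinesource=True,
                         postcopy=False, copy_storage=None, compressed=False,
                         comp_methods=None, max_bandwidth=None,
                         parallel_connections=None):
    # Peer-to-peer migration is needed when passing the destination URI
    # directly to migrateToURI3().
    flags = libvirt.VIR_MIGRATE_PEER2PEER
    if live:
        flags |= libvirt.VIR_MIGRATE_LIVE
    if persistent:
        flags |= libvirt.VIR_MIGRATE_PERSIST_DEST
    if undefinesource:
        flags |= libvirt.VIR_MIGRATE_UNDEFINE_SOURCE
    if postcopy:
        flags |= libvirt.VIR_MIGRATE_POSTCOPY
    if compressed:
        flags |= libvirt.VIR_MIGRATE_COMPRESSED
    if copy_storage == "all":
        flags |= libvirt.VIR_MIGRATE_NON_SHARED_DISK
    elif copy_storage == "inc":
        flags |= libvirt.VIR_MIGRATE_NON_SHARED_INC

    params = {}
    if comp_methods:
        # comma-separated list of compression methods, e.g. "mt" or "xbzrle"
        params[libvirt.VIR_MIGRATE_PARAM_COMPRESSION] = comp_methods.split(",")
    if max_bandwidth is not None:
        params[libvirt.VIR_MIGRATE_PARAM_BANDWIDTH] = int(max_bandwidth)  # MiB/s
    if parallel_connections:
        flags |= libvirt.VIR_MIGRATE_PARALLEL
        params[libvirt.VIR_MIGRATE_PARAM_PARALLEL_CONNECTIONS] = int(parallel_connections)
    return flags, params

# The migration itself is then a single call on the source domain, e.g.
#   dom.migrateToURI3("qemu+tls://dst/system", params, flags)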
Acknowledgments
===============
I would like to sincerely thank all those who provided me with their
time, assistance, and guidance during the course of this project, for
which I am enormously grateful.
I would like to thank Cedric Bosdonnat (cbosdo), Pedro Algarvio
(s0undt3ch), Wayne Werner (waynew), Charles McMarrow (cmcmarrow) and
Daniel Wozniak (dwoz) who offered a huge amount of support, expertise,
guidance and code reviews.
I would also like to thank Michal Privoznik (mprivozn), Martin
Kletzander (mkletzan) and the rest of the libvirt and SaltStack
communities who have been extremely supportive.
Best wishes,
Radostin