Greetings! We've been deploying with Kolla on CentOS 7 for a while now, and we've recently noticed a rather troubling behavior when we shut down hypervisors.
Somewhere between systemd and libvirt's systemd-machined integration, we see that guests get killed aggressively: all of the qemu-kvm processes are sent SIGTERM. This seems to happen because they are scoped into machine.slice, and when systemd-machined is killed those scopes are dropped, which kills off the machines.

In the past, we've used the libvirt-guests service when our libvirt was running outside of containers. This worked splendidly, as we could have it wait 5 minutes for VMs to attempt a graceful shutdown, avoiding interrupting any running processes. But this service isn't available on the host OS, as it won't be able to talk to libvirt inside the container.

The solution I've come up with for now is this:

    [Unit]
    Description=Manage libvirt guests in kolla safely
    After=docker.service systemd-machined.service
    Requires=docker.service

    [Install]
    WantedBy=sysinit.target

    [Service]
    Type=oneshot
    RemainAfterExit=yes
    TimeoutStopSec=400
    ExecStart=/usr/bin/docker exec nova_libvirt /usr/libexec/libvirt-guests.sh start
    ExecStart=/usr/bin/docker start nova_compute
    ExecStop=/usr/bin/docker stop nova_compute
    ExecStop=/usr/bin/docker exec nova_libvirt /usr/libexec/libvirt-guests.sh shutdown

This doesn't seem to work, though I'm still trying to work out the ordering and such. It should ensure that before systemd-machined is stopped and all of its scopes are destroyed (thus killing all the VMs), we run the libvirt-guests.sh script to try to shut them down. The TimeoutStopSec=400 is there because the script itself waits 300 seconds for any VM that refuses to shut down cleanly, so this gives it a chance to wait for at least one of those. This is an imperfect solution, but it allows us to move forward after having made a reasonable attempt at clean shutdowns.

Anyway, just wondering if anybody else using kolla-ansible or kolla containers in general has run into this problem, and whether or not there are better/known solutions.

Thanks!
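P.S. On the TimeoutStopSec=400 budget above: a possible refinement, sketched under assumptions. The stock libvirt-guests.sh reads its settings from a sysconfig file, and with parallel shutdown enabled its SHUTDOWN_TIMEOUT applies to all guests on a URI together rather than to each guest in turn. That would keep the whole stop inside the 400-second window even if several VMs refuse to shut down. The path /etc/sysconfig/libvirt-guests inside the nova_libvirt container and the value 10 are assumptions for illustration, not something confirmed against the kolla images.

```shell
# Sketch of /etc/sysconfig/libvirt-guests inside the nova_libvirt container.
# The path and the values are assumptions; adjust them to your deployment.

# Seconds to wait for guests to shut down. With PARALLEL_SHUTDOWN > 0 this
# covers all guests on one URI together, not each guest separately.
SHUTDOWN_TIMEOUT=300

# Number of guests to shut down concurrently (0 means one after another).
# The value 10 is purely illustrative.
PARALLEL_SHUTDOWN=10
```

With these values the script should return within roughly 300 seconds, leaving the rest of the 400-second TimeoutStopSec for stopping nova_compute and other overhead.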