Steps to reproduce on Focal (shutdown-on-init)
---

LXD virtual machine

        lxc exec lp2059272-focal -- su - ubuntu

Latest Packages and Debug Symbols:

        cat <<EOF | sudo tee /etc/apt/sources.list.d/proposed.list
        deb http://archive.ubuntu.com/ubuntu focal-proposed main universe
        deb http://ddebs.ubuntu.com focal-proposed main restricted
        EOF

        cat <<EOF | sudo tee /etc/apt/preferences.d/proposed
        Package: *
        Pin: release a=focal-proposed
        Pin-Priority: 400
        EOF

        sudo apt install --yes --no-install-recommends gdb qemu-system-x86 ubuntu-dbgsym-keyring
        sudo apt install --yes --no-install-recommends -t focal-proposed libvirt{0,-daemon{,-driver-qemu,-system}}{,-dbgsym} libvirt-clients

        $ dpkg -l | awk '$2 ~ /^libvirt/ { print $2, $3 }'
        libvirt-clients 6.0.0-0ubuntu8.17
        libvirt-daemon 6.0.0-0ubuntu8.17
        libvirt-daemon-dbgsym 6.0.0-0ubuntu8.17
        libvirt-daemon-driver-qemu 6.0.0-0ubuntu8.17
        libvirt-daemon-driver-qemu-dbgsym 6.0.0-0ubuntu8.17
        libvirt-daemon-system 6.0.0-0ubuntu8.17
        libvirt-daemon-system-systemd 6.0.0-0ubuntu8.17
        libvirt0:amd64 6.0.0-0ubuntu8.17
        libvirt0-dbgsym:amd64 6.0.0-0ubuntu8.17

Start test VM

        cat <<-EOF >test-vm.xml
        <domain type='qemu'>
          <name>test-vm</name>
          <os>
            <type>hvm</type>
          </os>
          <memory unit='MiB'>32</memory>
          <vcpu>1</vcpu>
        </domain>
        EOF

        virsh define test-vm.xml
        virsh start test-vm

        $ virsh list
         Id   Name      State
        -------------------------
         1    test-vm   running

Stop libvirt systemd units

        sudo systemctl stop 'libvirtd*'

Start libvirt in GDB

        sudo gdb \
          -iex 'set confirm off' \
          -iex 'set pagination off' \
          -ex 'set non-stop on' \
          -ex 'handle SIGTERM nostop noprint pass' \
          -ex 'add-symbol-file /usr/sbin/libvirtd' \
          -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt.so.0' \
          -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0' \
          -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt/connection-driver/libvirt_driver_qemu.so' \
          /usr/sbin/libvirtd

Add breakpoints for qemu driver cleanup and domain status XML save

        b qemuStateCleanup
        b virDomainObjSave
        run

        Thread 20 "libvirtd" hit Breakpoint 2, virDomainObjSave (obj=0x555cf5d83480, xmlopt=0x555cf5d7e6a0, statusDir=0x555cf5d26e70 "/run/libvirt/qemu") at ../../../src/conf/domain_conf.c:29157

Check the backtrace of the domain status XML save function, coming from
QEMU process reconnect:

        t 20

        (gdb) bt
        #0  virDomainObjSave (obj=0x555cf5d83480, xmlopt=0x555cf5d7e6a0, statusDir=0x555cf5d26e70 "/run/libvirt/qemu") at ../../../src/conf/domain_conf.c:29157
        #1  0x00007f743b666268 in qemuProcessReconnect (opaque=<optimized out>) at ../../../src/qemu/qemu_process.c:8122
        #2  0x00007f7460b9054a in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:196
        #3  0x00007f7460851609 in start_thread (arg=<optimized out>) at pthread_create.c:477
        #4  0x00007f7460776353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Leave the thread at this point

Let's trigger the shutdown path

        $ sudo kill $(pidof libvirtd)

        Thread 1 "libvirtd" hit Breakpoint 1, qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1118

Check there are 2 threads: cleanup and domain status XML save

        (gdb) i th

          Id   Target Id                                   Frame
          1    Thread 0x7f745cd37b40 (LWP 8029) "libvirtd" qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1118
          2    Thread 0x7f745c8a9700 (LWP 8034) "libvirtd" (running)
        ...
          18   Thread 0x7f7417fff700 (LWP 8100) "libvirtd" (running)
        * 20   Thread 0x7f7416ffd700 (LWP 8105) "libvirtd" virDomainObjSave (obj=0x555cf5d83480, xmlopt=0x555cf5d7e6a0, statusDir=0x555cf5d26e70 "/run/libvirt/qemu") at ../../../src/conf/domain_conf.c:29157

Confirm that the QEMU driver's domain XML private-data formatter is
set and that the options object is still referenced:

        t 20

        (gdb) p xmlopt.privateData.format
        $1 = (virDomainXMLPrivateDataFormatFunc) 0x7f743b628810 <qemuDomainObjPrivateXMLFormat>

        (gdb) p xmlopt.parent.u.s.refs
        $2 = 1

        (gdb) p/x xmlopt.parent
        $3 = {u = {dummy_align1 = 0x1cafe0027, dummy_align2 = 0x1cafe0027, s = {magic = 0xcafe0027, refs = 0x1}}, klass = 0x555cf5dbbbb0}

Let the cleanup function finish

        t 1
        finish
        
Check the formatter/options again; the formatter is now zeroed, and the object is being used after it was freed:

        t 20

        (gdb) p xmlopt.privateData.format
        $5 = (virDomainXMLPrivateDataFormatFunc) 0x0

        (gdb) p xmlopt.parent.u.s.refs
        $6 = 21852

        (gdb) p/x xmlopt.parent
        $7 = {u = {dummy_align1 = 0x555cf5d39800, dummy_align2 = 0x555cf5d39800, s = {magic = 0xf5d39800, refs = 0x555c}}, klass = 0x555cf5d09010}

In Focal, the object's data is zeroed on the last unreference, and
its contents have since changed: this is a genuine use-after-free
(another thread may have been given that memory and be using it).
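
The odd reference count of 21852 can be explained from the GDB output itself: `magic` and `refs` overlay the 64-bit `dummy_align1` word, with `magic` in the low 32 bits and `refs` in the high 32 bits (as the earlier value `0x1cafe0027` shows). After the free, that word holds what looks like a heap pointer. The sketch below only redoes this arithmetic in the shell; the function name `split_object_header` is hypothetical.

```shell
#!/bin/bash
# Split a 64-bit word into the two 32-bit fields seen in the GDB output:
# low half -> magic, high half -> refs. Hypothetical helper for illustration.
split_object_header() {
    local word=$(( $1 ))
    printf 'magic=0x%x refs=%d\n' $(( word & 0xffffffff )) $(( word >> 32 ))
}

split_object_header 0x1cafe0027      # before cleanup: magic=0xcafe0027 refs=1
split_object_header 0x555cf5d39800   # after cleanup:  magic=0xf5d39800 refs=21852
```

So `refs = 21852` is simply `0x555c`, the upper half of an address in the same range as the other heap pointers in the session (for example `0x555cf5d83480`).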

Check the VM status XML *before* the save function finishes:

        $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
        <domstatus state='running' reason='booted' pid='7932'>
          <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
          <domain type='qemu' id='1'>

Let the save function continue

        (gdb) c &

Check the VM status XML *after*:

        $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
        <domstatus state='running' reason='booted' pid='7932'>
          <domain type='qemu' id='1'>

It no longer has the 'monitor path' tag/field.
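
To spot this condition across all domains before libvirtd is restarted, the same check can be looped over the status directory. This is a rough sketch, not part of the original report: the function names are hypothetical, it relies only on the `<monitor path=` text shown in the output above, and reading `/run/libvirt/qemu` is assumed to need root.

```shell
#!/bin/sh
# Succeeds if the given status XML file contains a <monitor path=...> element.
status_has_monitor_path() {
    grep -q "<monitor path=" "$1"
}

# Print every status XML in the directory (default: /run/libvirt/qemu)
# that lacks the monitor path element.
report_damaged_status() {
    dir=${1:-/run/libvirt/qemu}
    for f in "$dir"/*.xml; do
        [ -e "$f" ] || continue   # no XML files: the glob stays unexpanded
        status_has_monitor_path "$f" || echo "missing monitor path: $f"
    done
}
```

Running `report_damaged_status` as root prints nothing when every status file still carries its monitor path.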

Let libvirt finish shutting down:

        (gdb) t 1
        (gdb) c
        Continuing.
        ...
        [Inferior 1 (process 8029) exited normally]
        (gdb) quit
        
Now, the next time libvirtd starts, it fails to parse that XML:

        $ sudo systemctl start libvirtd.service

        $ journalctl -b -u libvirtd.service | tail
        ...
        ... libvirtd[8297]: 8313: error : qemuDomainObjPrivateXMLParse:3678 : internal error: no monitor path
        ... libvirtd[8297]: 8313: error : virDomainObjListLoadAllConfigs:632 : Failed to load config for domain 'test-vm'

As a result, libvirt is not aware of the running domain and cannot manage it:

        $ virsh list
         Id   Name   State
        --------------------

        $ virsh list --all
         Id   Name      State
        --------------------------
         -    test-vm   shut off

Even though it is still running:

        $ pgrep -af qemu-system-x86_64 | cut -d, -f1
        7932 /usr/bin/qemu-system-x86_64 -name guest=test-vm
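
The mismatch between what libvirt lists and what is actually running can be detected mechanically. The sketch below is an illustration under stated assumptions, not part of the original report: the function name is hypothetical, it assumes each `/run/libvirt/qemu/<name>.pid` file holds just the QEMU process ID, and it must run as root so that `kill -0` can probe processes owned by another user.

```shell
#!/bin/sh
# Print domains that have a live process recorded in DIR/<name>.pid but are
# absent from the newline-separated list of running domain names on stdin.
# Hypothetical helper for illustration.
find_unmanaged_domains() {
    dir=${1:-/run/libvirt/qemu}
    running=$(cat)
    for pidfile in "$dir"/*.pid; do
        [ -e "$pidfile" ] || continue
        name=$(basename "$pidfile" .pid)
        pid=$(cat "$pidfile")
        # kill -0 only tests whether the process exists; it sends no signal.
        kill -0 "$pid" 2>/dev/null || continue
        printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
    done
}
```

For example, `virsh list --name | find_unmanaged_domains` would be expected to print `test-vm` in the state shown above.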

Stop it manually

        $ sudo kill $(pgrep -f qemu-system-x86_64)
        $ sudo rm /run/libvirt/qemu/test-vm.{xml,pid}

Start libvirt:

        sudo systemctl start libvirtd.service

https://bugs.launchpad.net/bugs/2059272

Title:
  libvirt domain is not listed/managed after libvirt restart with
  messages "internal error: no monitor path" and "Failed to load config
  for domain"
