Steps with test packages on Focal (shutdown-on-runtime)
---

Stop libvirtd systemd units

         sudo systemctl stop 'libvirtd*'

Start libvirt in GDB

  sudo gdb \
    -iex 'set confirm off' \
    -iex 'set pagination off' \
    -ex 'set non-stop on' \
    -ex 'handle SIGTERM nostop noprint pass' \
    -ex 'add-symbol-file /usr/sbin/libvirtd' \
    -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt.so.0' \
    -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0' \
    -ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt/connection-driver/libvirt_driver_qemu.so' \
    /usr/sbin/libvirtd

Add breakpoints for qemu driver cleanup and device deleted event

         b qemuStateCleanup
         b processDeviceDeletedEvent
         run

Start test VM with a USB mouse device

         cat <<-EOF >test-vm.xml
         <domain type='qemu'>
           <name>test-vm</name>
           <os>
             <type>hvm</type>
           </os>
           <memory unit='MiB'>32</memory>
           <vcpu>1</vcpu>
           <devices>
             <input type='mouse' bus='usb'/>
           </devices>
         </domain>
         EOF

         virsh define test-vm.xml
         virsh start test-vm

         $ virsh list
         Id Name State
         -------------------------
         1 test-vm running
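
To script the "is the domain running?" check instead of reading the table by eye, the `virsh list` output can be filtered with a small helper. This is only a sketch: the function name `domain_state_from_list` is made up for illustration, and it assumes the three-column `Id Name State` layout shown above.

```shell
# Hypothetical helper (not part of libvirt): read a 'virsh list' table on
# stdin and print the State column for the named domain. The state is printed
# from the third field to the end of the line, since it may contain spaces.
domain_state_from_list() {
    awk -v name="$1" '
        $2 == name {
            for (i = 3; i <= NF; i++)
                printf "%s%s", $i, (i < NF ? " " : "\n")
        }'
}

# Intended use with the command from the step above:
#   virsh list | domain_state_from_list test-vm
```

If the domain is not in the table, the helper prints nothing.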

Delete the USB mouse device

         DEVICE_ID=$(virsh qemu-monitor-command test-vm --hmp 'info qtree' | grep 'dev: usb-mouse' | cut -d'"' -f2)
         virsh qemu-monitor-command test-vm --hmp "device_del $DEVICE_ID"
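
The `grep`/`cut` pipeline above can be wrapped in a small helper, which makes it possible to try the parsing against sample `info qtree` text without a running VM. This is only a sketch: the function name `qdev_id_from_qtree` is made up for illustration, and it assumes the device line has the form `dev: usb-mouse, id "input0"`.

```shell
# Hypothetical helper (not part of libvirt or QEMU): read 'info qtree' text on
# stdin and print the id of the first device of the given type.
# Assumes device lines of the form:  dev: usb-mouse, id "input0"
qdev_id_from_qtree() {
    dev_type=$1
    grep -m1 "dev: ${dev_type}," | cut -d'"' -f2
}

# Intended use with the commands from the step above:
#   DEVICE_ID=$(virsh qemu-monitor-command test-vm --hmp 'info qtree' | qdev_id_from_qtree usb-mouse)
```

An empty result means no device of that type was found, in which case `device_del` should not be issued.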

Back to GDB

         Thread 20 "libvirtd" hit Breakpoint 2, 0x00007ffba902204e in processDeviceDeletedEvent (devAlias=<optimized out>, vm=0x7ffbac00de90, driver=0x7ffbac021380) at ../../../src/qemu/qemu_driver.c:4888

Add breakpoint to domain status XML save, and continue the thread above

         b virDomainObjSave
         t 20
         c

        Thread 20 "libvirtd" hit Breakpoint 3, virDomainObjSave (obj=0x7ffbac00de90, xmlopt=0x7ffbac044130, statusDir=0x7ffbac01f530 "/run/libvirt/qemu") at ../../../src/conf/domain_conf.c:29157

Check the backtrace of the domain status XML save function, coming from
device deleted event

         (gdb) bt
        #0  virDomainObjSave (obj=0x7ffbac00de90, xmlopt=0x7ffbac044130, statusDir=0x7ffbac01f530 "/run/libvirt/qemu") at ../../../src/conf/domain_conf.c:29157
        #1  0x00007ffba9022127 in processDeviceDeletedEvent (devAlias=0x556074b5e3f0 "input0", vm=0x7ffbac00de90, driver=0x7ffbac021380) at ../../../src/qemu/qemu_driver.c:4312
        #2  qemuProcessEventHandler (data=0x556074b63a10, opaque=0x7ffbac021380) at ../../../src/qemu/qemu_driver.c:4888
        #3  0x00007ffbbee8f1af in virThreadPoolWorker (opaque=opaque@entry=0x556074c047a0) at ../../../src/util/virthreadpool.c:163
        #4  0x00007ffbbee8e51c in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:196
        #5  0x00007ffbbeb4f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
        #6  0x00007ffbbea74353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Leave the thread at this point

Let's trigger the shutdown path

         $ sudo kill $(pidof libvirtd)

        Thread 1 "libvirtd" hit Breakpoint 1, qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1127

Check the function pointer is non-NULL _before_ cleanup

        (gdb) p xmlopt.privateData.format
        $1 = (virDomainXMLPrivateDataFormatFunc) 0x7ffba8f7c7c0 <qemuDomainObjPrivateXMLFormat>

        (gdb) p/x xmlopt.parent
        $2 = {u = {dummy_align1 = 0x1cafe0027, dummy_align2 = 0x1cafe0027, s = {magic = 0xcafe0027, refs = 0x1}}, klass = 0x7ffbac044100}

Let cleanup run:

        t 1
        c &

Check the formatter/options again; it is *STILL* referenced (no longer
0x0):

        (gdb) p xmlopt.privateData.format
        $3 = (virDomainXMLPrivateDataFormatFunc) 0x7ffba8f7c7c0 <qemuDomainObjPrivateXMLFormat>

        (gdb) p/x xmlopt.parent
        $4 = {u = {dummy_align1 = 0x1cafe0027, dummy_align2 = 0x1cafe0027, s = {magic = 0xcafe0027, refs = 0x1}}, klass = 0x7ffbac044100}

Check the shutdown/cleanup thread is waiting for it,
in the path to free the worker thread pool:

        (gdb) i th 1
          Id   Target Id                                   Frame
          1    Thread 0x7ffbbb035b40 (LWP 5887) "libvirtd" (running)
        (gdb) t 1
        (gdb) interrupt
        (gdb) bt
        #0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffbac05fd60) at ../sysdeps/nptl/futex-internal.h:183
        #1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7ffbac05fce0, cond=0x7ffbac05fd38) at pthread_cond_wait.c:508
        #2  __pthread_cond_wait (cond=0x7ffbac05fd38, mutex=0x7ffbac05fce0) at pthread_cond_wait.c:647
        #3  0x00007ffbbee8e79b in virCondWait (c=<optimized out>, m=<optimized out>) at ../../../src/util/virthread.c:144
        #4  0x00007ffbbee8f438 in virThreadPoolFree (pool=<optimized out>) at ../../../src/util/virthreadpool.c:286
        #5  0x00007ffba8fed5d1 in qemuStateCleanup () at ../../../src/qemu/qemu_driver.c:1131
        #6  0x00007ffbbf02c47f in virStateCleanup () at ../../../src/libvirt.c:669
        #7  0x0000556072acebc8 in main (argc=<optimized out>, argv=<optimized out>) at ../../../src/remote/remote_daemon.c:1447

Let the save function continue, and libvirt finishes shutting down:

         (gdb) c &
         Continuing.
         (gdb) t 20
         (gdb) c
        [Inferior 1 (process 5887) exited normally]
         (gdb) q

Check the VM status XML *after*:

        $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
        <domstatus state='running' reason='booted' pid='5996'>
          <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
          <domain type='qemu' id='1'>
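
The check above can also be expressed as a pass/fail test on the status XML, which is convenient when repeating the reproducer several times. This is a minimal sketch: the function name `status_xml_has_monitor_path` is made up for illustration, and it only looks for the literal `<monitor path=` text rather than parsing the XML.

```shell
# Hypothetical helper (not part of libvirt): succeed if the given status XML
# file contains a <monitor path=...> element, fail otherwise.
# This is a plain text match, not a real XML parse.
status_xml_has_monitor_path() {
    grep -q "<monitor path=" "$1"
}

# Intended use (the step above reads this file with sudo, so run it as root):
#   status_xml_has_monitor_path /run/libvirt/qemu/test-vm.xml && echo "monitor path present"
```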

Now, the next time libvirtd starts, it correctly parses that XML:

         $ sudo systemctl start libvirtd.service

         $ journalctl -b -u libvirtd.service | grep -A1 error
         $
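
Rather than grepping for the generic word "error", the journal can be checked for the two specific messages quoted in the bug title. This is a sketch only: the function name `log_is_free_of_bug_errors` is made up for illustration, and the message strings are copied from the bug title rather than from an actual log.

```shell
# Hypothetical helper: read log text on stdin; succeed only if neither of the
# two messages from the bug title appears in it.
log_is_free_of_bug_errors() {
    ! grep -q \
        -e 'internal error: no monitor path' \
        -e 'Failed to load config for domain'
}

# Intended use with the command from the step above:
#   journalctl -b -u libvirtd.service | log_is_free_of_bug_errors && echo "no bug messages"
```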

And libvirt is aware of the domain, and can manage it:

         $ virsh list
          Id Name State
         -------------------------
          1 test-vm running

         $ virsh destroy test-vm
         Domain test-vm destroyed

         $ virsh undefine test-vm
        Domain test-vm has been undefined

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2059272

Title:
  libvirt domain is not listed/managed after libvirt restart with
  messages "internal error: no monitor path" and "Failed to load config
  for domain"
