Hi all,

We came across a problem: qemu was accidentally killed by libvirtd after we 
restarted libvirtd with the service command.
Steps to reproduce:
1) Attach a net device to the VM with the 'virsh attach-device' command
2) Detach the net device added before with the 'virsh detach-device' command
3) Repeat the above two steps, and at the same time restart the libvirtd 
daemon.
Then, with some chance, we find that after one of the libvirtd restarts, 
qemu has been killed accidentally.
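
For reference, the attach/detach loop in steps 1)-3) can also be driven by a 
small C program using the public libvirt API (virDomainAttachDevice / 
virDomainDetachDevice, the same entry points that show up in the log below). 
This is only a rough sketch: the domain name 'VMName', the bridge 'br-1' and 
the MAC address are taken from our logs and need adjusting, and 
'service libvirtd restart' still has to be run in another terminal while the 
loop is running.

/* repro.c - rough sketch; build with something like: gcc repro.c -lvirt */
#include <stdio.h>
#include <libvirt/libvirt.h>

static const char *ifxml =
    "<interface type='bridge'>"
    "  <source bridge='br-1'/>"
    "  <mac address='00:16:3e:76:85:17'/>"
    "  <virtualport type='openvswitch'/>"
    "  <model type='virtio'/>"
    "</interface>";

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    virDomainPtr dom = NULL;
    int i;

    if (!conn)
        return 1;
    if (!(dom = virDomainLookupByName(conn, "VMName"))) {
        virConnectClose(conn);
        return 1;
    }
    for (i = 0; i < 1000; i++) {
        /* same effect as 'virsh attach-device' / 'virsh detach-device' */
        if (virDomainAttachDevice(dom, ifxml) < 0)
            fprintf(stderr, "attach failed\n");
        if (virDomainDetachDevice(dom, ifxml) < 0)
            fprintf(stderr, "detach failed\n");
    }
    virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}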

The libvirtd log messages are:
[2015-12-17 11:34:36]: libvirtd : 216046: info : libvirt version: 1.2.17
[2015-12-17 11:34:36]: libvirtd : 216046: warning : virDriverLoadModule:65 : 
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nwfilter.so not 
accessible
[2015-12-17 11:34:36]: libvirtd : 216046: info : 
virNetlinkEventServiceStart:662 : starting netlink event service with protocol 0
[2015-12-17 11:34:36]: libvirtd : 216046: info : 
virNetlinkEventServiceStart:712 : set netlink socket big buffer size
[2015-12-17 11:34:36]: libvirtd : 216046: info : 
virNetlinkEventServiceStart:662 : starting netlink event service with protocol 
15
[2015-12-17 11:34:36]: libvirtd : 216046: info : 
virNetlinkEventServiceStart:712 : set netlink socket big buffer size
[2015-12-17 11:34:36]: libvirtd : 216079: error : 
dnsmasqCapsRefreshInternal:741 : Cannot check dnsmasq binary /usr/sbin/dnsmasq: 
No such file or directory
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
networkReloadFirewallRules:1861 : Reloading iptables rules
[2015-12-17 11:34:36]: libvirtd : 216079: info : networkRefreshDaemons:1830 : 
Refreshing network daemons
[2015-12-17 11:34:36]: libvirtd : 216079: error : 
virNodeSuspendSupportsTarget:332 : internal error: Cannot probe for supported 
suspend types
[2015-12-17 11:34:36]: libvirtd : 216079: warning : virQEMUCapsInit:1037 : 
Failed to get host power management capabilities
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
virDomainObjListLoadAllConfigs:23039 : Scanning for configs in 
/var/run/libvirt/qemu
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName2.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName3.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
virDomainObjListLoadAllConfigs:23039 : Scanning for configs in /etc/libvirt/qemu
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName2.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : 
virDomainObjListLoadAllConfigs:23063 : Loading config file 'VMName3.xml'
[2015-12-17 11:34:36]: libvirtd : 216079: info : qemuDomainSnapshotLoad:490 : 
Scanning for snapshots for domain VMName in 
/var/lib/libvirt/qemu/snapshot/VMName
[2015-12-17 11:34:36]: libvirtd : 216079: info : qemuDomainSnapshotLoad:490 : 
Scanning for snapshots for domain VMName2 in 
/var/lib/libvirt/qemu/snapshot/VMName2
[2015-12-17 11:34:36]: libvirtd : 216079: info : qemuDomainSnapshotLoad:490 : 
Scanning for snapshots for domain VMName3 in 
/var/lib/libvirt/qemu/snapshot/VMName3
[2015-12-17 11:34:36]: libvirtd : 216047: info : 
remoteDispatchDomainDetachDevice:4105 : Domain is detached a device by client: 
127.0.0.1;0
[2015-12-17 11:34:36]: libvirtd : 216050: info : 
remoteDispatchDomainDetachDevice:4105 : Domain is detached a device by client: 
127.0.0.1;0
[2015-12-17 11:34:36]: libvirtd : 216047: info : virDomainDetachDevice:8508 : enter 
virDomainDetachDevice domainname=VMName2, xml= <interface type='bridge'>^M
  <source bridge='br-1'/>^M
          <mac address='00:16:3e:76:85:17'/>^M
<virtualport type='openvswitch' />^M
      <model type='virtio'/>^M
</interface>^M
^M

[2015-12-17 11:34:36]: libvirtd : 216050: info : virDomainDetachDevice:8508 : enter 
virDomainDetachDevice domainname=VMName, xml= <interface type='bridge'>^M
  <source bridge='br-1'/>^M
          <mac address='00:16:3e:76:85:15'/>^M
[2015-12-17 11:34:36]: libvirtd : 216095: error : qemuMonitorJSONCheckError:380 
: internal error: unable to execute QEMU command 'netdev_del': Device 
'hostnet5' not found
[2015-12-17 11:34:36]: libvirtd : 216096: error : qemuMonitorJSONCheckError:380 
: internal error: unable to execute QEMU command 'netdev_del': Device 
'hostnet2' not found
[2015-12-17 11:34:36]: libvirtd : 216097: error : qemuMonitorJSONCheckError:380 
: internal error: unable to execute QEMU command 'netdev_del': Device 
'hostnet1' not found
[2015-12-17 11:34:37]: libvirtd : 216095: info : 
virSecurityDACRestoreSecurityFileLabelInternal:325 : Restoring DAC user and 
group on '/data/suse11_sp3_64_1'
[2015-12-17 11:34:37]: libvirtd : 216095: info : 
virSecurityDACSetOwnershipInternal:248 : Setting DAC user and group on 
'/data/suse11_sp3_64_1' to '0:0'
[2015-12-17 11:34:37]: libvirtd : 216096: info : 
virSecurityDACRestoreSecurityFileLabelInternal:325 : Restoring DAC user and 
group on '/data/suse11_sp3_64_2'
[2015-12-17 11:34:37]: libvirtd : 216096: info : 
virSecurityDACSetOwnershipInternal:248 : Setting DAC user and group on 
'/data/suse11_sp3_64_2' to '0:0'
[2015-12-17 11:34:37]: libvirtd : 216095: info : qemuProcessStop:5720 : VM 
'VMName' is dying,remove vBMC
[2015-12-17 11:34:37]: libvirtd : 216095: error : rtev_vBMC_send:209 : Failed 
to remove domain vBMC bind path for 33ae1cb0-c451-4620-8cee-1bbddd124b82, errno 
is 2
[2015-12-17 11:34:37]: libvirtd : 216050: info : 
remoteDispatchDomainAttachDevice:2855 : Domain is attached a device by client: 
127.0.0.1;0
[2015-12-17 11:34:37]: libvirtd : 216050: info : virDomainAttachDevice:8390 : enter 
virDomainAttachDevice domainname=VMName, xml=<interface type='bridge'>^M

The VM's qemu log is:
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: device_del, qmp_cmd_arguments: 
{"id": "net4"}
[2015-12-17 11:34:30] virtio_set_status:524 virtio-net device status is 1 that 
means ACKNOWLEDGE
[2015-12-17 11:34:30] virtio_set_status:524 virtio-net device status is 1 that 
means ACKNOWLEDGE
[2015-12-17 11:34:30] virtio_pci_device_unplugged:966 virtio-net
[2015-12-17 11:34:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352070, "microseconds": 402473}, 
"event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net4/virtio-backend"}}
[2015-12-17 11:34:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352070, "microseconds": 402586}, "event": 
"DEVICE_DELETED", "data": {"device": "net4", "path": "/machine/peripheral/net4"}}
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: netdev_del, qmp_cmd_arguments: 
{"id": "hostnet4"}
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: qom-list, qmp_cmd_arguments: 
{"path": "/machine/peripheral"}
---------------------------------------------- Right at this point, we 
restarted libvirtd ------------------------------------------------------------
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: device_del, qmp_cmd_arguments: 
{"id": "net5"}
[2015-12-17 11:34:30] virtio_set_status:524 virtio-net device status is 1 that 
means ACKNOWLEDGE
[2015-12-17 11:34:30] virtio_set_status:524 virtio-net device status is 1 that 
means ACKNOWLEDGE
[2015-12-17 11:34:30] virtio_pci_device_unplugged:966 virtio-net
[2015-12-17 11:34:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352070, "microseconds": 810030}, 
"event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net5/virtio-backend"}}
[2015-12-17 11:34:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352070, "microseconds": 810150}, "event": 
"DEVICE_DELETED", "data": {"device": "net5", "path": "/machine/peripheral/net5"}}
[2015-12-17 11:34:30] handle_qmp_command:5115 qmp_cmd_name: netdev_del, qmp_cmd_arguments: 
{"id": "hostnet5"}
[2015-12-17 11:34:35] send_control_event:225 virtio serial port 2 send control 
message event = 6, value = 0
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: qmp_capabilities
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-commands
[2015-12-17 11:34:36] handle_qmp_command:5115 qmp_cmd_name: add-fd, qmp_cmd_arguments: {"fdset-id": 
0, "opaque": "/dev/null"}
[2015-12-17 11:34:36] handle_qmp_command:5115 qmp_cmd_name: block-commit, qmp_cmd_arguments: 
{"device": "bogus"}
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-events
[2015-12-17 11:34:36] send_control_event:225 virtio serial port 2 send control 
message event = 6, value = 1
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-status
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-block
[2015-12-17 11:34:36] handle_qmp_command:5105 qmp_cmd_name: query-chardev
[2015-12-17 11:34:36] handle_qmp_command:5115 qmp_cmd_name: qom-list, qmp_cmd_arguments: 
{"path": "/machine/peripheral"}
[2015-12-17 11:34:36] handle_qmp_command:5115 qmp_cmd_name: netdev_del, qmp_cmd_arguments: 
{"id": "hostnet5"}
[2015-12-17 11:34:36]: shutting down
[2015-12-17 11:34:36] send_control_event:225 virtio serial port 2 send control 
message event = 6, value = 0
qemu: terminating on signal 15 from pid 216046(libvirtd)
[2015-12-17 11:34:36] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352076, 
"microseconds": 267427}, "event": "SHUTDOWN"}
[2015-12-17 11:34:36] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1450352076, "microseconds": 267773}, "event": 
"DEVICE_TRAY_MOVED", "data": {"device": "drive-ide0-0-1", "tray-open": true}}

We have analyzed this problem and believe it is triggered by an inconsistent 
view of the net device between qemu and the newly started libvirtd.
After libvirtd starts, it reconnects to qemu and updates the device state with 
the 'qom-list' command, and it also reads the domain configuration from the 
stored XML.
The call path is:
qemuProcessReconnect
  -> qemuProcessUpdateDevices
       -> if (qemuDomainUpdateDeviceList(driver, vm, QEMU_ASYNC_JOB_NONE) < 0)
       -> if ((tmp = old)) {
              while (*tmp) {
                  if (!virStringArrayHasString(priv->qemuDevices, *tmp) &&
                      virDomainDefFindDevice(vm->def, *tmp, &dev, false) == 0 &&
                      qemuDomainRemoveDevice(driver, vm, &dev) < 0) {
                      goto cleanup;  /* If we fail here, qemuProcessUpdateDevices()
                                      * returns -1, and qemuProcessReconnect() then
                                      * calls qemuProcessStop() to kill the VM. */
                  }
                  tmp++;
              }
          }
This problem is a little hard to reproduce with the steps above, but it can be 
reproduced reliably in the following way:
1) Attach a net device with the virsh command
2) Stop libvirtd
3) Detach the net device by sending the QMP commands directly to QEMU through 
the monitor, like this:
# ./qmp-shell /var/lib/libvirt/qemu/redhat.monitor
Welcome to the QMP low-level shell!
Connected to QEMU 2.3.0

(QEMU) qom-list path=/machine/peripheral
{u'return': [{u'type': u'child<virtio-net-pci>', u'name': u'net1'}, {u'type': u'child<cirrus-vga>', u'name': 
u'video0'}, {u'type': u'child<usb-tablet>', u'name': u'input0'}, {u'type': u'child<isa-serial>', u'name': 
u'serial0'}, {u'type': u'child<virtio-net-pci>', u'name': u'net0'}, {u'type': u'child<scsi-hd>', u'name': 
u'scsi0-0-0-0'}, {u'type': u'child<virtio-scsi-pci>', u'name': u'scsi0'}, {u'type': u'child<piix3-usb-uhci>', 
u'name': u'usb'}, {u'type': u'string', u'name': u'type'}]}
(QEMU) device_del id=net1
{u'return': {}}
(QEMU) netdev_del id=hostnet1
{u'return': {}}
4) Restart the libvirtd daemon; you will then find that qemu is killed by 
libvirtd.

The root cause of the inconsistent state is that libvirtd was restarted before 
it had finished processing the 'detach device' request.
Killing such a VM outright seems rather violent.
Could libvirtd instead just finish the unfinished cleanup work for the VM after 
it starts up? A rough sketch of what we have in mind follows below.
(IMHO, we cannot guarantee that libvirtd is never restarted while it is 
processing a detach-device request.)
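
To make the idea more concrete, here is an untested sketch (not a patch) of the 
loop in qemuProcessUpdateDevices() quoted above, changed so that a failure to 
remove a stale device is logged and skipped instead of aborting the reconnect. 
Whether simply warning and continuing is safe in all cases is exactly what we 
would like to discuss:

    if ((tmp = old)) {
        while (*tmp) {
            if (!virStringArrayHasString(priv->qemuDevices, *tmp) &&
                virDomainDefFindDevice(vm->def, *tmp, &dev, false) == 0 &&
                qemuDomainRemoveDevice(driver, vm, &dev) < 0) {
                /* Instead of 'goto cleanup' (which makes qemuProcessReconnect()
                 * kill the VM), warn and carry on with the remaining devices. */
                VIR_WARN("Failed to finish removal of device %s from domain %s",
                         *tmp, vm->def->name);
            }
            tmp++;
        }
    }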


Thanks.
zhanghailiang



