Since the qemu "lives" in that time I can try to debug what happens. With strace to sniff where things could be I see right before the end: 0.000203 recvmsg(27, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="", iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 0 <0.000014> 0.000049 futex(0xca65dacf4, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0xca4785a80, 20) = 1 <0.000016> 0.000038 getpid() = 29750 <0.000023> 0.000011 tgkill(29750, 29760, SIGUSR1) = 0 <0.000030> 0.000012 futex(0xca4785a80, FUTEX_WAKE_PRIVATE, 1) = 1 <0.000048> 0.000010 futex(0xca47b46e4, FUTEX_WAIT_PRIVATE, 19, NULL) = 0 <0.002215> 0.000032 sendmsg(21, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="{\"timestamp\": {\"seconds\": 1503322067, \"microseconds\": 613178}, \"event\": \"MIGRATION\", \"data\": {\"status\": \"failed\"}}\r\n", iov_len=116}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 116 <0.000024> 0.000074 write(2, "2017-08-21T13:27:47.613276Z qemu-system-x86_64: load of migration failed: Input/output error\n", 93) = 93 <0.000022> 0.000055 close(27) = 0 <0.000090>
Now 29750 is the main process/tgid and 29760 is the third process started on the migration. It is the one that does the vcpu ioctl's so I assume this is just the one representing the vpu. Well gdb should be more useful so looking with that. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1711602 Title: --copy-storage-all failing with qemu 2.10 Status in QEMU: New Status in libvirt package in Ubuntu: Confirmed Status in qemu package in Ubuntu: Confirmed Bug description: We fixed an issue around disk locking already in regard to qemu-nbd [1], but there still seem to be issues. $ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0) 2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error Source libvirt log for the guest: 2017-08-18 12:09:08.251+0000: initiating migration 2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer 2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer Target libvirt log for the guest: 2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error 2017-08-18 12:09:09.010+0000: shutting down, reason=crashed Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy). Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails. Dmesg has no related denials as often apparmor is in the mix. Need to check libvirt logs of source [2] and target [3] in Detail. [1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html [2]: http://paste.ubuntu.com/25339356/ [3]: http://paste.ubuntu.com/25339358/ To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions