Since the qemu "lives" in that time I can try to debug what happens.
With strace to sniff where things could be I see right before the end:
     0.000203 recvmsg(27, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="", 
iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, 
MSG_CMSG_CLOEXEC) = 0 <0.000014>
     0.000049 futex(0xca65dacf4, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 
0xca4785a80, 20) = 1 <0.000016>
     0.000038 getpid()                  = 29750 <0.000023>
     0.000011 tgkill(29750, 29760, SIGUSR1) = 0 <0.000030>
     0.000012 futex(0xca4785a80, FUTEX_WAKE_PRIVATE, 1) = 1 <0.000048>
     0.000010 futex(0xca47b46e4, FUTEX_WAIT_PRIVATE, 19, NULL) = 0 <0.002215>
     0.000032 sendmsg(21, {msg_name=NULL, msg_namelen=0, 
msg_iov=[{iov_base="{\"timestamp\": {\"seconds\": 1503322067, \"microseconds\": 
613178}, \"event\": \"MIGRATION\", \"data\": {\"status\": \"failed\"}}\r\n", 
iov_len=116}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 116 <0.000024>
     0.000074 write(2, "2017-08-21T13:27:47.613276Z qemu-system-x86_64: load of 
migration failed: Input/output error\n", 93) = 93 <0.000022>
     0.000055 close(27)                 = 0 <0.000090>

Now 29750 is the main process/tgid and 29760 is the third process started on 
the migration.
It is the one that does the vcpu ioctl's so I assume this is just the one 
representing the vpu.
Well gdb should be more useful so looking with that.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602

Title:
  --copy-storage-all failing with qemu 2.10

Status in QEMU:
  New
Status in libvirt package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  We fixed an issue around disk locking already in regard to qemu-nbd
  [1], but there still seem to be issues.

  $ virsh migrate --live --copy-storage-all kvmguest-artful-normal 
qemu+ssh://10.22.69.196/system
  error: internal error: qemu unexpectedly closed the monitor: 
2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: 
char device redirected to /dev/pts/0 (label charserial0)
  2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: 
Input/output error

  Source libvirt log for the guest:
  2017-08-18 12:09:08.251+0000: initiating migration
  2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: 
Connection reset by peer
  2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: 
Connection reset by peer

  Target libvirt log for the guest:
  2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: 
Input/output error
  2017-08-18 12:09:09.010+0000: shutting down, reason=crashed

  Given the timing it seems that the actual copy now works (it is busy ~10 
seconds on my environment which would be the copy).
  Also we don't see the old errors we saw before, but afterwards on the actual 
take-over it fails.

  Dmesg has no related denials as often apparmor is in the mix.

  Need to check libvirt logs of source [2] and target [3] in Detail.

  [1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
  [2]: http://paste.ubuntu.com/25339356/
  [3]: http://paste.ubuntu.com/25339358/

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions

Reply via email to