So this is failing I/O that iterates over a channel. I was tracking down the len, pending and pos used.
I found that this is not completely broken (like no access or generla I/O error) It starts at pos 0 and iterated with varying offsets, but works for quite some time. Example: [...] Thread 1 "qemu-system-x86" hit Breakpoint 2, qemu_fill_buffer (f=f@entry=0xd3b66f3c00) at ./migration/qemu-file.c:295 295 if (len > 0) { $11183 = 28728 $11184 = 4040 $11185 = {ops = 0xd3b3d740a0 <channel_input_ops>, hooks = 0x0, opaque = 0xd3b75ee490, bytes_xfer = 0, xfer_limit = 0, pos = 107130146, buf_index = 0, buf_size = 4040, buf = "\v\327\a\000\021\000\[...]\000"..., may_free = {0}, iov = {{iov_base = 0x0, iov_len = 0} <repeats 64 times>}, iovcnt = 0, last_error = 0} [...] Well you could see the whole file read passing by one by one buffer Yet this isn't particularly fast, so track the one that has len==0 (gdb) b ./migration/qemu-file.c:295 if len == 0 And I got it as: (gdb) p *f $11195 = {ops = 0xd3b3d740a0 <channel_input_ops>, hooks = 0x0, opaque = 0xd3b75ee490, bytes_xfer = 0, xfer_limit = 0, pos = 319638837, buf_index = 0, buf_size = 0, buf = '\000' <repeats 5504 times>..., may_free = {0}, iov = {{iov_base = 0x0, iov_len = 0} <repeats 64 times>}, iovcnt = 0, last_error = 0} Here pending == 0 so buf_size = 0 as well also pos is further down incremented to 319638837. Checking in detail I found that I had pending=0 and buf_size=0 as well as non aligned pos entried, but they worked. So I excluded the buf_size=0/pending=0 as well as the alignment as reasons. Maybe it just iterates pos out of the range that is working? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1711602 Title: --copy-storage-all failing with qemu 2.10 Status in QEMU: New Status in libvirt package in Ubuntu: Confirmed Status in qemu package in Ubuntu: Confirmed Bug description: We fixed an issue around disk locking already in regard to qemu-nbd [1], but there still seem to be issues. $ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0) 2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error Source libvirt log for the guest: 2017-08-18 12:09:08.251+0000: initiating migration 2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer 2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer Target libvirt log for the guest: 2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error 2017-08-18 12:09:09.010+0000: shutting down, reason=crashed Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy). Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails. Dmesg has no related denials as often apparmor is in the mix. Need to check libvirt logs of source [2] and target [3] in Detail. [1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html [2]: http://paste.ubuntu.com/25339356/ [3]: http://paste.ubuntu.com/25339358/ To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions