[Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10

ChristianEhrhardt Tue, 22 Aug 2017 02:11:56 -0700

So the failure is in I/O that iterates over a channel.
I tracked the len, pending and pos values used.


I found that this is not completely broken (as it would be with no access or a
general I/O error). It starts at pos 0 and iterates with varying offsets, and
works for quite some time.
Example:

[...]
Thread 1 "qemu-system-x86" hit Breakpoint 2, qemu_fill_buffer 
(f=f@entry=0xd3b66f3c00) at ./migration/qemu-file.c:295
295         if (len > 0) {
$11183 = 28728
$11184 = 4040
$11185 = {ops = 0xd3b3d740a0 <channel_input_ops>, hooks = 0x0, opaque = 
0xd3b75ee490, bytes_xfer = 0, xfer_limit = 0, pos = 107130146, 
  buf_index = 0, buf_size = 4040, 
  buf = "\v\327\a\000\021\000\[...]\000"..., 
  may_free = {0}, iov = {{iov_base = 0x0, iov_len = 0} <repeats 64 times>}, 
iovcnt = 0, last_error = 0}
[...]

You can watch the whole file being read this way, one buffer at a time.
That isn't particularly fast, so break only on the read that has len == 0:
 (gdb) b ./migration/qemu-file.c:295 if len == 0

And I got it as:
(gdb) p *f
$11195 = {ops = 0xd3b3d740a0 <channel_input_ops>, hooks = 0x0, opaque = 
0xd3b75ee490, bytes_xfer = 0, xfer_limit = 0, pos = 319638837, 
  buf_index = 0, buf_size = 0, buf = '\000' <repeats 5504 times>..., may_free = 
{0}, iov = {{iov_base = 0x0, iov_len = 0} <repeats 64 times>}, 
  iovcnt = 0, last_error = 0}

Here pending == 0, so buf_size = 0 as well; pos has meanwhile been
incremented to 319638837.

Checking in detail, I found earlier reads that also had pending=0 and
buf_size=0, as well as non-aligned pos entries, and those worked.
So I excluded buf_size=0/pending=0 and the alignment as reasons.
Maybe it just iterates pos out of the range that is working?
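
To make the len == 0 case concrete, here is a minimal, self-contained C model
of a refill step like the one being traced. It is a sketch, not the code from
migration/qemu-file.c: model_file, fake_read, model_fill_buffer and model_drain
are made-up names, the buffer size 32768 is inferred from 4040 + 28728 in the
first trace (assuming those two values are pending and len), and mapping a
zero-length read to -EIO is an assumption chosen to match the "Input/output
error" in the logs.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define IO_BUF_SIZE 32768 /* inferred from 4040 + 28728 in the trace; an assumption */

/* Hypothetical stand-in for the structure printed by "p *f". */
struct model_file {
    long pos;            /* total bytes read from the channel so far */
    int buf_index;       /* next unread byte in buf */
    int buf_size;        /* number of valid bytes in buf */
    int last_error;      /* 0, or a negative errno value */
    long remaining;      /* hypothetical: bytes the fake channel still holds */
    unsigned char buf[IO_BUF_SIZE];
};

/* Hypothetical channel read: returns 0 once the source is exhausted. */
static long fake_read(struct model_file *f, unsigned char *dst, long size)
{
    long n = f->remaining < size ? f->remaining : size;
    memset(dst, 0xAB, (size_t)n);
    f->remaining -= n;
    return n;
}

/* One refill: keep unread bytes, then top the buffer up from the channel. */
static long model_fill_buffer(struct model_file *f)
{
    int pending;
    long len;

    assert(f->buf_index <= f->buf_size);
    pending = f->buf_size - f->buf_index;
    if (pending > 0) {
        memmove(f->buf, f->buf + f->buf_index, (size_t)pending);
    }
    f->buf_index = 0;
    f->buf_size = pending;

    len = fake_read(f, f->buf + pending, IO_BUF_SIZE - pending);
    if (len > 0) {
        f->buf_size += (int)len;
        f->pos += len;
    } else {
        /* Assumption: a zero-length read is reported as an I/O error. */
        f->last_error = -EIO;
    }
    return len;
}

/* Refill until the channel returns nothing; returns the count of good refills. */
static int model_drain(struct model_file *f, long total_bytes)
{
    int refills = 0;

    memset(f, 0, sizeof(*f));
    f->remaining = total_bytes;
    while (model_fill_buffer(f) > 0) {
        f->buf_index = f->buf_size; /* pretend the consumer read everything */
        refills++;
    }
    return refills;
}
```

Calling model_drain(&f, 100000) does four refills that deliver data and then
stops with f.pos == 100000, f.buf_size == 0 and f.last_error == -EIO. In this
model pending=0/buf_size=0 is only a symptom of the source having nothing more
to send, which is one thing to compare against the real trace.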

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602

Title:
  --copy-storage-all failing with qemu 2.10

Status in QEMU:
  New
Status in libvirt package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  We fixed an issue around disk locking already in regard to qemu-nbd
  [1], but there still seem to be issues.

  $ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
  error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
  2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error

  Source libvirt log for the guest:
  2017-08-18 12:09:08.251+0000: initiating migration
  2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
  2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer

  Target libvirt log for the guest:
  2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
  2017-08-18 12:09:09.010+0000: shutting down, reason=crashed

  Given the timing, it seems that the actual copy now works (it is busy for
  ~10 seconds in my environment, which would be the copy).
  We also no longer see the old errors we saw before, but afterwards, on the
  actual take-over, it fails.

  Dmesg shows no related denials (checked because apparmor is often in the mix).

  Need to check the libvirt logs of source [2] and target [3] in detail.

  [1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
  [2]: http://paste.ubuntu.com/25339356/
  [3]: http://paste.ubuntu.com/25339358/

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
