On Thu, Jul 17, 2014 at 3:54 PM, Marcin Gibuła <m.gib...@beyond.pl> wrote:
>>> Yes, exactly. ISCSI-based setup can take some minutes to deploy, given
>>> prepared image, and I have one hundred percent hit rate for the
>>> original issue with it.
>>
>> I've reproduced your IO hang with 2.0 and both
>> 9b1786829aefb83f37a8f3135e3ea91c56001b56 and
>> a096b3a6732f846ec57dc28b47ee9435aa0609bf applied.
>>
>> Reverting 9b1786829aefb83f37a8f3135e3ea91c56001b56 indeed fixes the
>> problem (but reintroduces block-migration hang). It seems like a qemu
>> bug rather than a guest problem, as the no-kvmclock parameter makes no
>> difference. IO just stops, all qemu IO threads die off. Almost like it
>> forgets to migrate them :-)
>>
>> I'm attaching backtrace from guest kernel and qemu and qemu command line.
>>
>> Going to compile 2.1-rc.
>
> 2.1-rc2 behaves exactly the same.
>
> Interestingly enough, resetting the guest system causes I/O to work again. So
> it's not qemu that hangs on IO, rather it fails to notify the guest about
> completed operations that were issued during migration.
>
> And it's somehow caused by calling cpu_synchronize_all_states() inside
> kvmclock_vm_state_change().
>
> As for testing with cache=writeback, I'll try to set up some iscsi to test
> it.

Awesome, thanks! AFAIK you won't be able to use the write cache with iSCSI
for migration. A VM that always hangs when freshly launched has a chance of
being migrated successfully if it was reset beforehand. And yes, at first
glance it looks like a lower layer forgets to notify the driver about some
operations.

>
> --
> mg