On Wed, Sep 28, 2016 at 11:34 AM, Fam Zheng <f...@redhat.com> wrote:
> On Wed, 09/28 11:14, Roman Penyaev wrote:
>> On Wed, Sep 28, 2016 at 5:01 AM, Fam Zheng <f...@redhat.com> wrote:
>> > On Tue, 09/27 19:55, Roman Penyaev wrote:
>> >> > The bug is 100% deterministic. Just boot up a guest with -drive
>> >> > format=qcow2,aio=native.
>> >>
>> >> It turns out to be that everything is broken. I started all my
>> >> tests with format=raw,aio=native and immediately got coroutine
>> >> recursive. That is completely weird.
>> >>
>> >> So, what I did is the following:
>> >>
>> >> 1. Took latest master (nothing works)
>> >> 2. Did interactive rebase to 12c8720
>> >>    12c8720 2016-06-28 | Merge remote-tracking branch
>> >>    'remotes/stefanha/tags/block-pull-request' into staging [Peter
>> >>    Maydell]
>> >>
>> >>    this merge request includes all your patches related to
>> >>    virtio-blk and MQ support.
>> >>
>> >> 3. Applied 0ed93d84edab. Everything works fine.
>> >
>> > Have you tried qcow2 at this point? raw crashes with 1a62d0accdf85 doesn't
>> > mean qcow2 is fine without it.
>> >
>>
>> That's true. qcow2 IO path is different, and presence of the
>> patch 1a62d0accdf85 does not affect - coroutine still enters
>> recursively.
>>
>> But for me it is quite surprising that IO fragmentation (what
>> was done in 1a62d0accdf85) rises the misbehavior on raw IO path.
>
> Maybe the mystery with this change is your particular I/O pattern on the raw
> image is change thereafter, from ioq = 1 to ioq > 1 (from the linux-aio.c's
> PoV, due to fragmentation), then multiple coroutines are created for one big
> request, to trigger the crash.

Yes, could be. The only major difference in this patch is a loop over cut
requests.

--
Roman
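
To illustrate what "a loop over cut requests" can do to the queue depth,
here is a minimal standalone sketch. It is not the code from 1a62d0accdf85:
the function count_fragments() and its max_fragment parameter are
hypothetical names made up for this example, and the sizes used below are
arbitrary. It only shows how one big request turns into several
submissions, which is one way the lower layer could start seeing more than
one request in flight.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: count how many fragments a
 * simple "cut and loop" scheme would submit for a request of `bytes` bytes
 * when no single fragment may exceed `max_fragment` bytes.
 */
static unsigned count_fragments(uint64_t bytes, uint64_t max_fragment)
{
    unsigned submitted = 0;

    /* A zero limit cannot make progress, so treat it as "nothing to do". */
    if (max_fragment == 0) {
        return 0;
    }

    while (bytes > 0) {
        /* Cut off at most max_fragment bytes from the remaining request. */
        uint64_t chunk = bytes < max_fragment ? bytes : max_fragment;

        /* A real block layer would issue one I/O for this chunk here. */
        submitted++;
        bytes -= chunk;
    }

    return submitted;
}
```

With a limit of 64 KiB, a 4 KiB request stays a single submission, while a
256 KiB request becomes four of them. Whether that alone explains the
recursive coroutine entry on the raw path is still an open question.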