On Wed, Aug 31, 2022 at 10:13:08AM +0100, Richard W.M. Jones wrote:
> On Wed, Aug 31, 2022 at 09:45:45AM +0800, Ming Lei wrote:
> > On Tue, Aug 30, 2022 at 05:13:46PM +0100, Richard W.M. Jones wrote:
> > > On Tue, Aug 30, 2022 at 11:29:26PM +0800, Ming Lei wrote:
> > > > On Tue, Aug 30, 2022 at 03:38:50PM +0100, Richard W.M. Jones wrote:
> > > > > On Tue, Aug 30, 2022 at 03:12:23PM +0800, Ming Lei wrote:
> > > > > > The patch sent in last email may cause io hang on MQ, and follows
> > > > > > the fixed version:
> > > > >
> > > > > I split this into two commits and cleaned them up and posted them
> > > > > here:
> > > > >
> > > > > https://gitlab.com/rwmjones/libnbd/-/commits/nbdublk/
> > > > >
> > > > > Unfortunately this doesn't work for me. When I do various filesystem
> > > > > operations like git clone and a compile I see some subtle disk errors
> > > > > and eventually it deadlocks, so I guess there is some problem.
> > > >
> > > > OK, care to provide more details about the reproducer? Like how backend
> > > > is setup, MQ/SQ is used, disk size, ...
> > >
> > > My test script is attached. $1 == "ublk".
> > >
> > > It basically just clones a Linux repo and compiles it. It hangs
> > > either during the clone or early in the build, and there are various
> > > "scary messages" from git which might indicate disk corruption.
> > >
> > > The NBD server is:
> > >
> > >   nbdkit -f memory 24G
> > >
> > > running on the hypervisor ("nbd://pick").
> > >
> > > > I have cloned linux kernel source tree on nbdublk disk and built it with
> > > > fedora 36 config for ~20min, so far so good. In my setting, backend is
> > > > 'nbdkit file /dev/sda(virtio-scsi)', nbdublk is single queue.
> > >
> > > Can you see if you can reproduce a hang with the source from:
> > >
> > > https://gitlab.com/rwmjones/libnbd/-/commits/nbdublk/
> > >
> > > I may have made a mistake when rebasing your patch or fixing it up to
> > > remove compiler warnings.
> >
> > My test used the your tree directly. And I compared with it with
> > my native tree, basically same.
> >
> > Today I will setup & run the test by your approach.
>
> I tried it again now and it definitely deadlocks under load.

I can reproduce it. Please try the top patch in the aio branch, which
fixes the hang in my reproducer with your test setup:

https://github.com/ming1/ubdsrv/commits/aio

Thanks,
Ming