El jue., 18 nov. 2021 16:31, Hanna Reitz <hre...@redhat.com> escribió:
> On 18.11.21 14:50, Paolo Bonzini wrote: > > On 11/15/21 17:03, Hanna Reitz wrote: > >> > >> I only really see four solutions for this: > >> (1) We somehow make the amend job run in the main context under the > >> BQL and have it prevent all concurrent I/O access (seems bad) > >> (2) We can make the permission functions part of the I/O path (seems > >> wrong and probably impossible?) > >> (3) We can drop the permissions update and permanently require the > >> permissions that we need when updating keys (I think this might break > >> existing use cases) > >> (4) We can acquire the BQL around the permission update call and > >> perhaps that works? > >> > >> I don’t know how (4) would work but it’s basically the only > >> reasonable solution I can come up with. Would this be a way to call > >> a BQL function from an I/O function? > > > > I think that would deadlock: > > > > main I/O thread > > -------- ----- > > start bdrv_co_amend > > take BQL > > bdrv_drain > > ... hangs ... > > :/ > > Is there really nothing we can do? Forgive me if I’m talking complete > nonsense here (because frankly I don’t even really know what a bottom > half is exactly), but can’t we schedule some coroutine in the main > thread to do the perm notifications and wait for them in the I/O thread? > I think you still get a deadlock, just one with a longer chain. You still have a cycle of things depending on each other, but one of them is now the I/O thread waiting for the bottom half. Hmm... Perhaps. We would need to undo the permission change when the > job finishes, though, i.e. in JobDriver.prepare() or JobDriver.clean(). > Doing the change in qmp_x_blockdev_amend() would be asymmetric then, so > we’d probably want a new JobDriver method that runs in the main thread > before .run() is invoked. (Unfortunately, “.prepare()” is now taken > already...) > Ok at least it's feasible. Doesn’t solve the FUSE problem, but there we could try to just take the > RESIZE permission permanently and if that fails, we just don’t allow > truncates for that export. Not nice, but should work for common cases. > Yeah definitely not nice. Probably permissions could be protected by their own mutex, even a global one like the one we have for jobs. For now I suggest just ignoring the problem and adding a comment, since it's not really something that didn't exist. Paolo