On Fri, Jun 22, 2012 at 6:53 PM, Kevin Wolf <kw...@redhat.com> wrote: > Am 22.06.2012 10:20, schrieb Peter Crosthwaite: >> On Fri, Jun 22, 2012 at 5:49 PM, Kevin Wolf <kw...@redhat.com> wrote: >>> Am 22.06.2012 08:44, schrieb Peter A. G. Crosthwaite: >>>> The block layer assumes that it is the only user of coroutines - >>>> The qemu_in_coroutine() is used to determine if a function is in one of the >>>> block layers coroutines, which is flawed. I.E. If a client (e.g. a device >>>> or >>>> a machine model) of the block layer uses couroutine itself, the block layer >>>> will identify the callers coroutines as its own, and may falsely yield the >>>> calling coroutine (instead of creating its own to yield). >>>> >>>> AFAICT, there are no conflicts in the QEMU master here yet, but its kind >>>> of an >>>> issue, as anyone who comes along and used coroutines and the block layer >>>> together is going to run into some very obscure and hard to debug race >>>> conditions. >>>> >>>> Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwa...@petalogix.com> >>> >>> What does your coroutine caller look like that this is a problem? >> >> Its a machine model that instantiated some block devices concurrently. >> Theres some chicken-and-egg issues with the instantiation such that >> they have the happen concurrently. One device instantiates a block >> device (pflash_cfi_01) from coroutine context. block then check >> qemu_in_coroutine() which returns true so it uses my coroutine for its >> inner workings, whereas if it were a normal machine model it would >> kick of its own coroutine to do its block stuff. >> >> Does >>> it make assumptions about the number of yields or anything like that? >> >> Yes it does, but I thought that was the point of coroutines vs >> threads? coroutines you control yield behaviour explicitly whearas >> thread you cant? > > Well, I can see your point, although today no code in qemu makes use of > the property that the caller could in theory know where the coroutine > yields. I think it's also dangerous because there is a non-obvious > dependency of the caller on the internals of the coroutine. A simple > innocent looking refactoring of the coroutine might break assumptions > that are made here. > > Other code in qemu that uses coroutines only makes use of the fact that > coroutines can only be interrupted at known points,
So within the block layer, will the block coroutines yield anywhere or only at a qemu_coroutine_yield() site? Would the block layer break if a couroutine could be pre-empted by the OS anywhere? So lets continue this to an example of multiple clients of qemu-coroutine. The block layer is one client. It defines three possible yields points (qemu_co_yield() sites) in block.c. Lets call them A,B and C. The co-routine policy is that your thread of control will not be pre-empted until locations A, B or C are reached (I.E. coroutines can only be interrupted at known points). Alls fair and it works. Now here is where it breaks. I have a device or machine model or whatever (lets call it foo) that uses co-routines. It decides that it wants its coroutines to stop at only at points D,E and F for ease of synchronization. If those coroutines however make calls into to the block layer, the block layer will think its in its own co-routine and potentially yield at any of points A,B and C. Thing is, my foo system doesn't care about the block layer implementation, so it shouldnt have awareness of the fact it used co-routines too. But because it talks to block, it can get yielded as any call into the block API which is awkward considering the point of coroutines is they can only be interrupted at known points. You have essentially defeated the purpose or coroutines in the first place. Foo's coroutines are behaving like preemptible threads (while blocks are behaving like coroutines). The problem is solved with a simple piece of policy - dont yield someone elses co-routine. Thats this patch. Note that this is an RFC intended to start discussion - it needs a real implementation. Something along the lines of keeping track of the block layer coroutines and checking if in one of those specifically, rather then a blanket call to qemu_in_coroutine(). But I have this patch in my workarounds branch until we figure out a real solution. Regards, Peter so synchronisation > between multiple coroutines becomes easier. > > Kevin