On Fri, Jun 22, 2012 at 6:53 PM, Kevin Wolf <kw...@redhat.com> wrote:
> Am 22.06.2012 10:20, schrieb Peter Crosthwaite:
>> On Fri, Jun 22, 2012 at 5:49 PM, Kevin Wolf <kw...@redhat.com> wrote:
>>> Am 22.06.2012 08:44, schrieb Peter A. G. Crosthwaite:
>>>> The block layer assumes that it is the only user of coroutines -
>>>> The qemu_in_coroutine() is used to determine if a function is in one of the
>>>> block layers coroutines, which is flawed. I.E. If a client (e.g. a device 
>>>> or
>>>> a machine model) of the block layer uses couroutine itself, the block layer
>>>> will identify the callers coroutines as its own, and may falsely yield the
>>>> calling coroutine (instead of creating its own to yield).
>>>>
>>>> AFAICT, there are no conflicts in the QEMU master here yet, but its kind 
>>>> of an
>>>> issue, as anyone who comes along and used coroutines and the block layer
>>>> together is going to run into some very obscure and hard to debug race
>>>> conditions.
>>>>
>>>> Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwa...@petalogix.com>
>>>
>>> What does your coroutine caller look like that this is a problem?
>>
>> Its a machine model that instantiated some block devices concurrently.
>> Theres some chicken-and-egg issues with the instantiation such that
>> they have the happen concurrently. One device instantiates a block
>> device (pflash_cfi_01) from coroutine context. block then check
>> qemu_in_coroutine() which returns true so it uses my coroutine for its
>> inner workings, whereas if it were a normal machine model it would
>> kick of its own coroutine to do its block stuff.
>>
>>  Does
>>> it make assumptions about the number of yields or anything like that?
>>
>> Yes it does, but I thought that was the point of coroutines vs
>> threads? coroutines you control yield behaviour explicitly whearas
>> thread you cant?
>
> Well, I can see your point, although today no code in qemu makes use of
> the property that the caller could in theory know where the coroutine
> yields. I think it's also dangerous because there is a non-obvious
> dependency of the caller on the internals of the coroutine. A simple
> innocent looking refactoring of the coroutine might break assumptions
> that are made here.
>
> Other code in qemu that uses coroutines only makes use of the fact that
> coroutines can only be interrupted at known points,

So within the block layer, will the block coroutines yield anywhere or
only at a qemu_coroutine_yield() site? Would the block layer break if
a couroutine could be pre-empted by the OS anywhere?

So lets continue this to an example of multiple clients of
qemu-coroutine. The block layer is one client. It defines three
possible yields points (qemu_co_yield() sites) in block.c. Lets call
them A,B and C. The co-routine policy is that your thread of control
will not be pre-empted until locations  A, B or C are reached (I.E.
coroutines can only be interrupted at known points). Alls fair and it
works.

Now here is where it breaks. I have a device or machine model or
whatever (lets call it foo) that uses co-routines. It decides that it
wants its coroutines to stop at only at points D,E and F for ease of
synchronization. If those coroutines however make calls into to the
block layer, the block layer will think its in its own co-routine and
potentially yield at any of points A,B and C. Thing is, my foo system
doesn't care about the block layer implementation, so it shouldnt have
awareness of the fact it used co-routines too. But because it talks to
block, it can get yielded as any call into the block API which is
awkward considering the point of coroutines is they can only be
interrupted at known points. You have essentially defeated the purpose
or coroutines in the first place. Foo's coroutines are behaving like
preemptible threads (while blocks are behaving like coroutines).

The problem is solved with a simple piece of policy - dont yield
someone elses co-routine. Thats this patch. Note that this is an RFC
intended to start discussion - it needs a real implementation.
Something along the lines of keeping track of the block layer
coroutines and checking if in one of those specifically, rather then a
blanket call to qemu_in_coroutine(). But I have this patch in my
workarounds branch until we figure out a real solution.

Regards,
Peter

 so synchronisation
> between multiple coroutines becomes easier.
>
> Kevin

Reply via email to