On 10/23/2012 05:26 PM, Paolo Bonzini wrote:
>>> Yes, that's the point of doing things asynchronously---you do not need
>>> to do everything within stop_machine, you can start canceling AIO as
>>> soon as the OS sends the hot-unplug request.  Then you only proceed with
>>> stop_machine and freeing device memory when the first part has completed.
>> 
>> You cannot always cancel I/O (for example threaded I/O already in progress).
> 
> Yep, but we try to do this anyway today and nothing changes really.  The
> difference is between hotplug never completing and blocking
> synchronously, vs. hotplug never completing and not invoking the
> asynchronous callback.  I.e. really no difference at all.

Not cancelling is not the same as not completing; the request will
complete on its own eventually.  The question is whether the programming
model is synchronous or callback based.

> 
>>> In other words, isolate can complete asynchronously.
>> 
>> Can it?  I don't think so.
>> 
>> Here's how I see it:
>> 
>>  1. non-malicious guest stops driving device
>>  2. isolate()
>>  3. a malicious guest cannot drive the device at this point
>>  4. some kind of barrier to let the device, or drive activity from a
>> malicious guest, wind down
>>  5. destroy()
>> 
>> If you need to report the completion of step 2, it cannot be done
>> asynchronously.
> 
> In hardware everything is asynchronous anyway.  It will *look*
> synchronous, because if CPU#0 is stuck in a synchronous isolate(), and
> CPU#1 polls for the outcome, CPU#1 will lock on the BQL held by CPU#0.

That is fine.  isolate() is expensive but it is CPU bound; it does not
involve any I/O (unlike the barrier afterwards, which has to wait on any
I/O which we were not able to cancel).
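
To make the split concrete, here is a rough sketch of the sequence in C.
All names are made up for illustration (this is not QEMU code), and it
assumes the barrier can be modelled as a count of in-flight requests
that could not be cancelled.  isolate() runs synchronously in the eject
handler; destroy() only runs once the last outstanding request drains.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; none of these names are QEMU APIs. */
enum dev_state { DEV_ACTIVE, DEV_ISOLATED, DEV_DESTROYED };

struct device {
    enum dev_state state;
    int inflight;           /* requests that could not be cancelled */
};

/* Step 2: synchronous and CPU bound.  Once this returns, the guest
 * can no longer start new activity on the device. */
void dev_isolate(struct device *d)
{
    d->state = DEV_ISOLATED;
}

/* Step 5: release the device. */
void dev_destroy(struct device *d)
{
    d->state = DEV_DESTROYED;
}

/* Guest-side submission; refused once the device is isolated (step 3). */
bool dev_submit(struct device *d)
{
    if (d->state != DEV_ACTIVE) {
        return false;
    }
    d->inflight++;
    return true;
}

/* I/O completion.  The barrier (step 4) is satisfied when the last
 * in-flight request completes; only then does step 5 run. */
void dev_io_complete(struct device *d)
{
    d->inflight--;
    if (d->state == DEV_ISOLATED && d->inflight == 0) {
        dev_destroy(d);
    }
}

/* Eject handler: returns right after step 2, without waiting on I/O. */
void dev_eject(struct device *d)
{
    dev_isolate(d);
    if (d->inflight == 0) {
        dev_destroy(d);     /* nothing to wait for */
    }
}
```

In this model the eject path never blocks on I/O; a request already in
flight when eject arrives simply delays destroy() until it completes.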

> But our interfaces had better support asynchronicity, and indeed they
> do: after you write to the "eject" register, the "up" will show the
> device as present until after destroy is done.  This can be changed to
> show the device as present only until after step 4 is done.

Let's say we want to eject the hotplug hardware itself (just as an
example).  With refcounts, the callback that updates "up" will hold on
to it via refcounts.  With stop_machine(), you need to cancel that
callback, or wait for it somehow, or it can arrive after the
stop_machine() and bite you.
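
A minimal sketch of the refcount case, again with hypothetical names
rather than real QEMU interfaces.  The assumption is that scheduling
the callback takes a reference on the hotplug controller and the
callback drops it when it has run, so the object outlives the eject
for as long as the callback is pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted hotplug controller; not a QEMU API. */
struct hotplug_ctrl {
    int refcount;
    bool up;        /* the "up" bit shown to the guest */
    bool freed;     /* stands in for free() so the sketch can be inspected */
};

void ctrl_ref(struct hotplug_ctrl *c)
{
    c->refcount++;
}

void ctrl_unref(struct hotplug_ctrl *c)
{
    if (--c->refcount == 0) {
        c->freed = true;    /* real code would release the memory here */
    }
}

/* Scheduling the callback pins the controller it will touch. */
void schedule_up_update(struct hotplug_ctrl *c)
{
    ctrl_ref(c);
}

/* The callback runs later, updates "up", and drops its reference. */
void up_update_cb(struct hotplug_ctrl *c)
{
    c->up = false;
    ctrl_unref(c);
}
```

If the owner drops its reference while the callback is still pending,
the controller stays alive until up_update_cb() has run, so a late
callback has nothing stale to touch.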

> 
>> We may also want notification after step 4 (or 5); if the device holds
>> some host resource someone may want to know that it is ready for reuse.
> 
> I think guest notification should be after (4), while management
> notification should be after (5).

Yes. After (2) we can return from the eject mmio.

-- 
error compiling committee.c: too many arguments to function
