On 6/19/19 9:40 PM, Max Reitz wrote:
> On 14.06.19 11:22, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 19:03, Max Reitz wrote:
>>> [re-adding the original CCs, why not]
>>>
>>> On 13.06.19 16:30, Vladimir Sementsov-Ogievskiy wrote:
>>>> 13.06.2019 17:21, Max Reitz wrote:
>>>>> On 13.06.19 16:19, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> 13.06.2019 1:08, Max Reitz wrote:
>>>>>>> If the main loop cancels all block jobs while the block layer is not
>>>>>>> drained, this cancelling may not happen instantaneously. We can start
>>>>>>> a drained section before vm_shutdown(), which entails another
>>>>>>> bdrv_drain_all(); this nested bdrv_drain_all() will thus basically be
>>>>>>> a no-op.
>>>>>>>
>>>>>>> We do not have to end the drained section, because we actually do not
>>>>>>> want any requests to happen from this point on.

Your cover description was more helpful for understanding the big picture
than this commit description.

>>>>>>>
>>>>>>> Signed-off-by: Max Reitz <mre...@redhat.com>
>>>>>>> ---
>>>>>>> I don't know whether it actually makes sense to never end this
>>>>>>> drained section. It makes sense to me. Please correct me if I'm
>>>>>>> wrong.
>>>>>>> ---
>>>>>>>  vl.c | 11 +++++++++++
>>>>>>>  1 file changed, 11 insertions(+)
>>>>>>>
>>>>>>> diff --git a/vl.c b/vl.c
>>>>>>> index cd1fbc4cdc..3f8b3f74f5 100644
>>>>>>> --- a/vl.c
>>>>>>> +++ b/vl.c
>>>>>>> @@ -4538,6 +4538,17 @@ int main(int argc, char **argv, char **envp)
>>>>>>>       */
>>>>>>>      migration_shutdown();
>>>>>>>
>>>>>>> +    /*
>>>>>>> +     * We must cancel all block jobs while the block layer is drained,
>>>>>>> +     * or cancelling will be affected by throttling and thus may block
>>>>>>> +     * for an extended period of time.
>>>>>>> +     * vm_shutdown() will bdrv_drain_all(), so we may as well include
>>>>>>> +     * it in the drained section.
>>>>>>> +     * We do not need to end this section, because we do not want any
>>>>>>> +     * requests happening from here on anyway.
>>>>>>> +     */
>>>>>>> +    bdrv_drain_all_begin();

This change looks sane, but I'm no expert on the block layer.

>>>>>>> +
>>>>>>>      /* No more vcpu or device emulation activity beyond this point
>>>>>>>       */
>>>>>>>      vm_shutdown();
>>>>>>
>>>>>> So, actually, the problem is that we may wait for job requests twice:
>>>>>> on drain and then on cancel.
>>>>>
>>>>> We don’t wait on drain. When the throttle node is drained, it will
>>>>> ignore throttling (as noted in the cover letter).
>>>>>
>>>>> We do wait when cancelling a job while the throttle node isn’t
>>>>> drained, though. That’s the problem.
>>>>
>>>> Ah, I understand now.
>>>>
>>>> Is it safe to drain_begin before stopping the CPUs? We may then end up
>>>> with some I/O requests queued somewhere..
>>>
>>> Hm... Aren’t guest devices prohibited from issuing requests to the
>>> block layer while their respective block device is drained?
>>
>> It's at least a buggy place; I remember Denis Plotnikov sent a patch to
>> fix it and had a huge discussion with Kevin. Here it is:
>> https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00732.html
>
> Ah, I even have that in my inbox... The latest reply I see came in April:
>
> https://lists.nongnu.org/archive/html/qemu-block/2019-04/msg00243.html
>
> Where Kevin asked for an RFC patch in the current state.
>
> I’m not sure whether I should work around a potential bug here, if we
> can agree that it is a bug, and if it isn’t clear whether this place
> would actually be affected.

Hmm, so what is the outcome of this thread?

Apparently this is not a bug for Max, but a potential one for Vladimir?
Vladimir, are you NACKing this patch?

Thanks,

Phil.

>
> Max
>