On 2013-05-06 15:09, Paolo Bonzini wrote:
> Il 06/05/2013 14:06, Jan Kiszka ha scritto:
>> On 2013-05-06 13:47, Paolo Bonzini wrote:
>>> Il 06/05/2013 13:39, Jan Kiszka ha scritto:
>>>> On 2013-05-06 13:28, Paolo Bonzini wrote:
>>>>> Il 06/05/2013 13:11, Jan Kiszka ha scritto:
>>>>>> On 2013-05-06 12:58, Paolo Bonzini wrote:
>>>>>>> Il 06/05/2013 12:56, Jan Kiszka ha scritto:
>>>>>>>>> The problem is that even if I/O for a region is supposed to happen
>>>>>>>>> within the BQL, lookup can happen outside the BQL.  Lookup will use 
>>>>>>>>> the
>>>>>>>>> region even if it is just to discard it:
>>>>>>>>>
>>>>>>>>>            VCPU thread (under BQL)              device thread
>>>>>>>>>  
>>>>>>>>> --------------------------------------------------------------------------------------
>>>>>>>>>                                                 flatview_ref
>>>>>>>>>                                                 memory_region_find 
>>>>>>>>> returns d->mr
>>>>>>>>>                                                 
>>>>>>>>> memory_region_ref(d->mr) /* nop */
>>>>>>>>>            qdev_free(d)
>>>>>>>>>              object_unparent(d)
>>>>>>>>>                unrealize(d)
>>>>>>>>>                  memory_region_del_subregion(d->mr)
>>>>>>>>>                    FlatView updated, d->mr not in the new view
>>>>>>>>>
>>>>>>>>>                                                 flatview_unref
>>>>>>>>>                                                   
>>>>>>>>> memory_region_unref(d->mr)
>>>>>>>>>                                                     object_unref(d)
>>>>>>>>>                                                       free(d)
>>>>>>>>>                                                 if (!d->mr->is_ram) { 
>>>>>>>>>        /* BAD! */
>>>>>>>>>                                                   
>>>>>>>>> memory_region_unref(d->mr) /* nop */
>>>>>>>>>                                                   return error
>>>>>>>>>                                                 }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here, the memory region is dereferenced *before* we know that it is 
>>>>>>>>> BQL-free
>>>>>>>>> (in fact, exactly to ascertain whether it is BQL-free).
>>>>>>>>
>>>>>>>> Both flatview update and lookup *plus* locking type evaluation (i.e.
>>>>>>>> memory region dereferencing) always happen under the address space 
>>>>>>>> lock.
>>>>>>>> See Pingfan's patch.
>>>>>>>
>>>>>>> That's true of address_space_rw/map, but I don't think it holds for
>>>>>>> memory_region_find.
>>>>>>
>>>>>> It has to, or it would be broken: Either it is called on a region that
>>>>>> supports reference counting
>>>>>
>>>>> You cannot know that in advance, can you?  The address is decided by the
>>>>> guest.
>>>>
>>>> Need to help me again to get the context: In which case is this a
>>>> hot-path that we want to keep BQL-free? Current users of
>>>> memory_region_find appear to be all relatively slow paths, thus are fine
>>>> with staying under BQL.
>>>
>>> virtio-blk-dataplane is basically redoing memory_region_find with a
>>> separate data structure, exactly so that it can run outside the BQL
>>> before we get BQL-free MMIO dispatch.
>>>
>>> I can try to post patches later today that actually use
>>> memory_region_find instead.
>>
>> We could define its semantics as follows: return a reference to the
>> corresponding memory region, provide this is safe. A reference is safe when
>>  - the region supports BQL-free operation (thus provides an owner to
>>    apply reference counting on)
> 
> This doesn't really work.  Regions that are known not to disappear (most
> importantly, the main RAM region) also support BQL-free operation, but
> have no owner right now.

Those few are much easier to convert than a full set of PCI and other
hot-pluggable device, that's my point.

> 
> Also, memory_region_find cannot know if it's returning a valid result,
> and the callee cannot check it because the region may have disappeared
> already when it is returned.

Again, we hold the address space lock while checking the conditions. If
a region does not supports BQL-free mode and BQL is not held, we have an
error and return NULL (or bail out with a runtime error).

> 
> But I really would be surprised if adding an owner everywhere is so
> hard...  let's try that first, it would solve the problem.

If we can avoid it, that would only help the process. If we can't, ok.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Reply via email to