On 2012-11-16 14:29, Hans de Goede wrote:
> Hi,
> 
> On 11/15/2012 04:40 PM, Jan Kiszka wrote:
>> Hi Hans,
>>
>> On 2012-11-15 16:19, Hans de Goede wrote:
>>> Hi Jan,
>>>
>>> I just saw your $subject patch in Gerd's usb-next tree, and I've a question
>>> about it. The token should be enough to uniquely identify a device + ep,
>>> and unless a guest uses multiple qhs for a singe ep, that _should_ be 
>>> enough.
>>
>> But what disallows that the guest issues multiple requests (QH + series
>> of TDs) for a single endpoint? I'm not finding any trace in the spec
>> that disallows this.
> 
> It may not be explicitly forbidden in the spec, but the whole text of the
> spec implies a 1 on 1 relationship between qhs and eps, and this is how
> all major guests do it.

Was afraid of this...

> 
>> And my special guest is stumbling over that limitation in QEMU.
> 
> I guess you're not at liberty to disclose what guest this is ?
> 

Hairy legacy code with additions I'm not yet able to share.

> <snip>
> 
>>> The reason why I want to know, is that identifying the queue by qh has a
>>> disadvantage, to be precise I believe the following can then happen:
>>>
>>> 1) The guest queues up multiple requests for a queue
>>> 2) We execute one, go async
>>> 3) The guest changes it mind an unlinks the qh
>>> 4) The guest will think the queue is cancelled after frindex has
>>> changed by 1, but we keep it around for 64 frames (which sucks,
>>> and I want to improve on this, but we need to for various reasons)
>>> 5) The guests submits some new requests to the queue, using a
>>> new qh (which possibly has a different address).
>>> 6) We see the new qh, and execute a request from there
>>> 7) The 1st request on the old qh completes, and we execute the next
>>> 8) Now things are a mess as we're executing requests from the old
>>> (cancelled from the guest pov) and new queue intermixed...
>>>
>>> Using the token to identify the queue fixes this, cause we will
>>> find the same queue for the old and new qh, and uhci_queue_verify()
>>> will then fail because of the qh addr change, and then we cancel
>>> the old queue. IOW then we do the right thing.
>>
>> I was afraid of triggering some conflict but, as pointed out above, the
>> current code also does something wrong. It associates two different
>> active QH with the same queue and then triggers a failure for the first
>> QH as it wrongly thinks the guest has unlinked it.
>>
>>>
>>> So I'm wondering if there is another, potentially better fix for
>>> what you are seeing?
>>
>> /me too. I'm not fully understanding your scenario above yet,
>> specifically when and how the guest can correctly remove an active QH,
> 
> A guest can unlink an active qh when it wants to cancel the requests
> in there (ie a driver level timeout kicks in). This is a quite normal
> thing to do (unlike using multiple qhs for a single ep).
> 
> For ehci we can reliable detect this happening (except for interrupt
> endpoints, sigh) thanks to the "doorbell" mechanism. But for uhci it
> is harder, and things are further complicated by the fact that
> Linux' full speed recovery code for bulk endpoints temporarily unlinks
> and then relinks a pending requests to make some changes to the qh
> while requests are in flight ... (bad Linux, bad!).
> 
> So we rely on a combination of measures to detect cancels in the
> qemu uhci code, of which the qh validation is one, and this gets
> broken by your patch.

How should unlink be defined here? A qh is no longer present in any
chain of any frame? For how long?

> 
>> so I cannot provide good suggestions at this point.
> 
> I see 2 solutions at this point:
> 1) You modify your special guest if you can
> 2) We add a property to the uhci emulation to search for queues
> by address rather then by token, which would of course default
> to false.

Well, this particular thing worked on real hw, so I thought it would be
good to adjust QEMU. But as I'm struggling with my guest anyway, it may
turn out that we will no longer need this in the end - at least in this
case. Still, I'm with a bad feeling regarding the current heuristics for
identifying invalidated queues.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Reply via email to