On 2012-11-16 14:29, Hans de Goede wrote: > Hi, > > On 11/15/2012 04:40 PM, Jan Kiszka wrote: >> Hi Hans, >> >> On 2012-11-15 16:19, Hans de Goede wrote: >>> Hi Jan, >>> >>> I just saw your $subject patch in Gerd's usb-next tree, and I've a question >>> about it. The token should be enough to uniquely identify a device + ep, >>> and unless a guest uses multiple qhs for a singe ep, that _should_ be >>> enough. >> >> But what disallows that the guest issues multiple requests (QH + series >> of TDs) for a single endpoint? I'm not finding any trace in the spec >> that disallows this. > > It may not be explicitly forbidden in the spec, but the whole text of the > spec implies a 1 on 1 relationship between qhs and eps, and this is how > all major guests do it.
Was afraid of this... > >> And my special guest is stumbling over that limitation in QEMU. > > I guess you're not at liberty to disclose what guest this is ? > Hairy legacy code with additions I'm not yet able to share. > <snip> > >>> The reason why I want to know, is that identifying the queue by qh has a >>> disadvantage, to be precise I believe the following can then happen: >>> >>> 1) The guest queues up multiple requests for a queue >>> 2) We execute one, go async >>> 3) The guest changes it mind an unlinks the qh >>> 4) The guest will think the queue is cancelled after frindex has >>> changed by 1, but we keep it around for 64 frames (which sucks, >>> and I want to improve on this, but we need to for various reasons) >>> 5) The guests submits some new requests to the queue, using a >>> new qh (which possibly has a different address). >>> 6) We see the new qh, and execute a request from there >>> 7) The 1st request on the old qh completes, and we execute the next >>> 8) Now things are a mess as we're executing requests from the old >>> (cancelled from the guest pov) and new queue intermixed... >>> >>> Using the token to identify the queue fixes this, cause we will >>> find the same queue for the old and new qh, and uhci_queue_verify() >>> will then fail because of the qh addr change, and then we cancel >>> the old queue. IOW then we do the right thing. >> >> I was afraid of triggering some conflict but, as pointed out above, the >> current code also does something wrong. It associates two different >> active QH with the same queue and then triggers a failure for the first >> QH as it wrongly thinks the guest has unlinked it. >> >>> >>> So I'm wondering if there is another, potentially better fix for >>> what you are seeing? >> >> /me too. I'm not fully understanding your scenario above yet, >> specifically when and how the guest can correctly remove an active QH, > > A guest can unlink an active qh when it wants to cancel the requests > in there (ie a driver level timeout kicks in). This is a quite normal > thing to do (unlike using multiple qhs for a single ep). > > For ehci we can reliable detect this happening (except for interrupt > endpoints, sigh) thanks to the "doorbell" mechanism. But for uhci it > is harder, and things are further complicated by the fact that > Linux' full speed recovery code for bulk endpoints temporarily unlinks > and then relinks a pending requests to make some changes to the qh > while requests are in flight ... (bad Linux, bad!). > > So we rely on a combination of measures to detect cancels in the > qemu uhci code, of which the qh validation is one, and this gets > broken by your patch. How should unlink be defined here? A qh is no longer present in any chain of any frame? For how long? > >> so I cannot provide good suggestions at this point. > > I see 2 solutions at this point: > 1) You modify your special guest if you can > 2) We add a property to the uhci emulation to search for queues > by address rather then by token, which would of course default > to false. Well, this particular thing worked on real hw, so I thought it would be good to adjust QEMU. But as I'm struggling with my guest anyway, it may turn out that we will no longer need this in the end - at least in this case. Still, I'm with a bad feeling regarding the current heuristics for identifying invalidated queues. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux