RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> > My suspicion is that long SG lists are unusual - otherwise the > > ring expansion code would have been needed much earlier. > > > > Can anyone remember whether that was needed because of long SG lists > > or because of large numbers of outstanding requests? > > > > I've seen it for network cards - but only because usbnet sends > > down far too many tx buffers. > > usb-storage limits the maximum transfer size to 120K. That is a max of > 31 page-size segments if my math is right. That's probably why mass-storage > never saw a problem. I found some references to mass storage running out of ring space. My best guess as to why mass storage has never had a problem with the alignment at link TRBs is that the effect of getting it wrong is to split the transfer. Since almost all the transfers are of page aligned data that will only happen on a 1k boundary - where it is invisible to the target. For USB2 targets the link TRB only has to be 512 byte aligned. The constraint that mass storage applied for the usb2 controllers is enough to satisfy it. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> From: linux-usb-ow...@vger.kernel.org > [mailto:linux-usb-ow...@vger.kernel.org] On Behalf Of David Laight > Sent: Wednesday, November 20, 2013 9:16 AM > > > Ok, so the networking layer should be fine. However, with the current > > patch, if the mass storage driver sends down a scatter-gather list > > that's bigger than a ring segment, or needs to be split up so it doesn't > > cross 64K boundaries, then the URB submission will fail. We don't want > > that to happen. > > My suspicion is that long SG lists are unusual - otherwise the > ring expansion code would have been needed much earlier. > > Can anyone remember whether that was needed because of long SG lists > or because of large numbers of outstanding requests? > > I've seen it for network cards - but only because usbnet sends > down far too many tx buffers. usb-storage limits the maximum transfer size to 120K. That is a max of 31 page-size segments if my math is right. That's probably why mass-storage never saw a problem. -- Paul -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Wed, Nov 20, 2013 at 05:16:12PM -, David Laight wrote: > > Ok, so the networking layer should be fine. However, with the current > > patch, if the mass storage driver sends down a scatter-gather list > > that's bigger than a ring segment, or needs to be split up so it doesn't > > cross 64K boundaries, then the URB submission will fail. We don't want > > that to happen. > > My suspicion is that long SG lists are unusual - otherwise the > ring expansion code would have been needed much earlier. > > Can anyone remember whether that was needed because of long SG lists > or because of large numbers of outstanding requests? > > I've seen it for network cards - but only because usbnet sends > down far too many tx buffers. It was added because USB video and audio drivers queue isochronous URBs with lots of buffers for each service interval. So basically, lots of outstanding requests, not long SG lists was the driving motivation to add the ring expansion. Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> Ok, so the networking layer should be fine. However, with the current > patch, if the mass storage driver sends down a scatter-gather list > that's bigger than a ring segment, or needs to be split up so it doesn't > cross 64K boundaries, then the URB submission will fail. We don't want > that to happen. My suspicion is that long SG lists are unusual - otherwise the ring expansion code would have been needed much earlier. Can anyone remember whether that was needed because of long SG lists or because of large numbers of outstanding requests? I've seen it for network cards - but only because usbnet sends down far too many tx buffers. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> From: Alan Stern [mailto:st...@rowland.harvard.edu] > On Wed, 20 Nov 2013, David Laight wrote: > > > > From: Sarah Sharp > > ... > > > (Also, usb-storage aligns the block sizes to 512K, which explains why > > > we've never had an issue with TD fragments with that driver.) > > > > What is a 'block' in that context? > > I think Sarah means that usb-storage requires the block layer to align > its data buffers to 512-byte boundaries. (Note: 512 bytes, not 512K.) I did think it might be a typo... > Disk I/O naturally tends to be done in units of the page size, anyway, > although raw I/O can involve single sectors. > > If a user supplies an unaligned buffer, the block layer will set up a > bounce buffer. Ah, ok, some other systems only do that from byte-misaligned buffers (for general disk access). The ehci alignment rules force 512 byte alignment. That does mean that USB2 mass storage devices (attached to xhci) will never generate incorrectly aligned link TRBs. The ax88179_178a driver might still do so. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> From: Sarah Sharp [mailto:sarah.a.sh...@linux.intel.com] > On Wed, Nov 20, 2013 at 09:36:11AM -, David Laight wrote: > > > From: Alan Stern [mailto:st...@rowland.harvard.edu] > > > On Tue, 19 Nov 2013, Sarah Sharp wrote: > > > > > > > The xHCI driver can limit the number of sg-list entries through > > > > hcd->self.sg_tablesize. It's currently set to ~0, which is "however > > > > many entries you want. You could set that to the number of TRBs in a > > > > segment (minus one for the link TRB). > > > > ... > > Ok, so the networking layer should be fine. However, with the current > patch, if the mass storage driver sends down a scatter-gather list > that's bigger than a ring segment, or needs to be split up so it doesn't > cross 64K boundaries, then the URB submission will fail. We don't want > that to happen. > > At the very least, we should limit hcd->self.sg_tablesize in > drivers/usb/host/xhci.c to 63 (TRBS_PER_SEGMENT - 1). But we could > still potentially run into the 64K boundary issue in one or maybe all of > those entries. Would it be crazy to limit the number of entries to half > that (31)? It may impact performance, but it ensures that SCSI reads > and writes don't randomly fail. We can always increase the ring segment > size in a later patch. Unless there is a limit on the overall length of the transfer you'll always lose. An aligned 4MB transfer in a contiguous memory buffer requires 64 TRBs. So won't fit in a ring segment. Although I'd guess that such transfers are unusual. If you bet that crossing a 64k boundaries is unusual then 32 is probably a sane limit. Fragments over 16k can easily be split between ring segments - so a more complex check could allow for that - but it won't be quick to write. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Wed, Nov 20, 2013 at 09:46:08AM -, David Laight wrote: > > From: Sarah Sharp > ... > > (Also, usb-storage aligns the block sizes to 512K, which explains why > > we've never had an issue with TD fragments with that driver.) > > What is a 'block' in that context? > 512K sounds more like the value that very long transfers get chopped > up into. With 4k pages that might be 128 fragments. Sorry, I meant 512-byte boundaries. See Alan's comment in drivers/usb/storage/scsiglue.c: /* USB has unusual DMA-alignment requirements: Although the * starting address of each scatter-gather element doesn't matter, * the length of each element except the last must be divisible * by the Bulk maxpacket value. There's currently no way to * express this by block-layer constraints, so we'll cop out * and simply require addresses to be aligned at 512-byte * boundaries. This is okay since most block I/O involves * hardware sectors that are multiples of 512 bytes in length, * and since host controllers up through USB 2.0 have maxpacket * values no larger than 512. * * But it doesn't suffice for Wireless USB, where Bulk maxpacket * values can be as large as 2048. To make that work properly * will require changes to the block layer. */ blk_queue_update_dma_alignment(sdev->request_queue, (512 - 1)); > I'd have thought that the SG list would normally contain references > to a number of memory pages - so each would be 4k (on x86) aligned. > My suspicion is that the xhci controller will generate correct USB3 > data provided the link TRB is on a 1k boundary - so such data won't > be a problem. If the max burst size is less than four, and the scsi layer hands down 4k chunks, then the driver would still work without any modification for TD fragments, since MBP would be 4k and there would never be a link TRB in the middle of an MBP. However, the driver could be in violation of the spec if the burst size was greater than 4. I suspect what would happen is the host controller would read the TD, and do a shorter burst of 4 max-packet-sized 1k chunks, and then end the burst early. But I'm not a hardware engineer, and we can't count on how they designed it. I'm just trying to figure out why usb-storage worked for so many years without running into this issue. > If a user program does a direct transfer from the block device > (and that is done by locking down the user pages) then the buffer > could have an arbitrary alignment. Sure. In that case though, limiting the sg_tablesize so that TDs fit into one ring segment isn't going to help, because the block layer won't use it. I guess the transfers will just fail, until we can get a better fix in. Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Wed, 20 Nov 2013, David Laight wrote: > > From: Sarah Sharp > ... > > (Also, usb-storage aligns the block sizes to 512K, which explains why > > we've never had an issue with TD fragments with that driver.) > > What is a 'block' in that context? I think Sarah means that usb-storage requires the block layer to align its data buffers to 512-byte boundaries. (Note: 512 bytes, not 512K.) Disk I/O naturally tends to be done in units of the page size, anyway, although raw I/O can involve single sectors. If a user supplies an unaligned buffer, the block layer will set up a bounce buffer. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Wed, Nov 20, 2013 at 09:36:11AM -, David Laight wrote: > > From: Alan Stern [mailto:st...@rowland.harvard.edu] > > On Tue, 19 Nov 2013, Sarah Sharp wrote: > > > > > The xHCI driver can limit the number of sg-list entries through > > > hcd->self.sg_tablesize. It's currently set to ~0, which is "however > > > many entries you want. You could set that to the number of TRBs in a > > > segment (minus one for the link TRB). > > > > > > The usb-storage and uas drivers currently use sg_tablesize. Could the > > > network stack be taught to use sg_tablesize as well? > > > > The sg_tablesize you're talking about is a field in struct usb_bus > > (there's a similar field in struct scsi_host_template). It's not > > relevant to the network stack, since network interfaces aren't USB host > > controllers (or SCSI hosts). > > Ben said the largest number of fragments from the current network > stack will be 17, and that none of them will cross 32k boundaries. > So the network stack won't send down long SG lists. Ok, so the networking layer should be fine. However, with the current patch, if the mass storage driver sends down a scatter-gather list that's bigger than a ring segment, or needs to be split up so it doesn't cross 64K boundaries, then the URB submission will fail. We don't want that to happen. At the very least, we should limit hcd->self.sg_tablesize in drivers/usb/host/xhci.c to 63 (TRBS_PER_SEGMENT - 1). But we could still potentially run into the 64K boundary issue in one or maybe all of those entries. Would it be crazy to limit the number of entries to half that (31)? It may impact performance, but it ensures that SCSI reads and writes don't randomly fail. We can always increase the ring segment size in a later patch. Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> From: Eric Dumazet [mailto:eric.duma...@gmail.com] > On Wed, 2013-11-20 at 09:36 +, David Laight wrote: > > > Ben said the largest number of fragments from the current network > > stack will be 17, and that none of them will cross 32k boundaries. > > So the network stack won't send down long SG lists. > > Please note that skb->head itself _might_ cross a 32K or 64K boundary : > > skb->head is kmalloc() provided, and SLUB can be tweaked > (slub_max_order) to use very high order pages. Worth remembering... I suspect the number of fragments (inc 64k boundaries) is still limited because maximum data length is under 64k (for TSO) and the SLUB memory has to be physically contiguous (it is still limited even if not). At the moment the usb bulk tx code calculated the exact number of fragments needed. To do this it has to scan the sg list twice. It would seem more sensible to generate a quick upper bound (ie nfrags*2 + len/65536) and maybe calculate the exact number if this would exceed the maximum number. OTOH the max allowed could be documented that way. In any case this is all aside from the bug itself. David N�r��yb�X��ǧv�^�){.n�+{��^n�r���z���h�&���G���h�(�階�ݢj"���m��z�ޖ���f���h���~�m�
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Wed, 2013-11-20 at 09:36 +, David Laight wrote: > Ben said the largest number of fragments from the current network > stack will be 17, and that none of them will cross 32k boundaries. > So the network stack won't send down long SG lists. Please note that skb->head itself _might_ cross a 32K or 64K boundary : skb->head is kmalloc() provided, and SLUB can be tweaked (slub_max_order) to use very high order pages. -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> From: Sarah Sharp ... > (Also, usb-storage aligns the block sizes to 512K, which explains why > we've never had an issue with TD fragments with that driver.) What is a 'block' in that context? 512K sounds more like the value that very long transfers get chopped up into. With 4k pages that might be 128 fragments. I'd have thought that the SG list would normally contain references to a number of memory pages - so each would be 4k (on x86) aligned. My suspicion is that the xhci controller will generate correct USB3 data provided the link TRB is on a 1k boundary - so such data won't be a problem. If a user program does a direct transfer from the block device (and that is done by locking down the user pages) then the buffer could have an arbitrary alignment. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> From: Alan Stern [mailto:st...@rowland.harvard.edu] > On Tue, 19 Nov 2013, Sarah Sharp wrote: > > > The xHCI driver can limit the number of sg-list entries through > > hcd->self.sg_tablesize. It's currently set to ~0, which is "however > > many entries you want. You could set that to the number of TRBs in a > > segment (minus one for the link TRB). > > > > The usb-storage and uas drivers currently use sg_tablesize. Could the > > network stack be taught to use sg_tablesize as well? > > The sg_tablesize you're talking about is a field in struct usb_bus > (there's a similar field in struct scsi_host_template). It's not > relevant to the network stack, since network interfaces aren't USB host > controllers (or SCSI hosts). Ben said the largest number of fragments from the current network stack will be 17, and that none of them will cross 32k boundaries. So the network stack won't send down long SG lists. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Tue, 19 Nov 2013, Sarah Sharp wrote: > The xHCI driver can limit the number of sg-list entries through > hcd->self.sg_tablesize. It's currently set to ~0, which is "however > many entries you want. You could set that to the number of TRBs in a > segment (minus one for the link TRB). > > The usb-storage and uas drivers currently use sg_tablesize. Could the > network stack be taught to use sg_tablesize as well? The sg_tablesize you're talking about is a field in struct usb_bus (there's a similar field in struct scsi_host_template). It's not relevant to the network stack, since network interfaces aren't USB host controllers (or SCSI hosts). Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Mon, Nov 18, 2013 at 03:41:00PM -, David Laight wrote: > > > > -Original Message- > > From: Ben Hutchings [mailto:bhutchi...@solarflare.com] > > Sent: 18 November 2013 15:03 > > To: David Laight > > Cc: Alan Stern; Sarah Sharp; net...@vger.kernel.org; > > linux-usb@vger.kernel.org > > Subject: Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload > > burst. > > > > On Mon, 2013-11-18 at 09:48 +, David Laight wrote: > > > > > But the minimum fragment size is (probably) 4k. > > > > > For the network stack an OUT transfer might have a lot (and I mean > > > > > lots) > > > > > of fragments (there may be constraints, and linearising the skb is a > > > > > option). > > > > [...] > > > > > > > > The maximum number of fragments in the skb is going to be 17 (including > > > > the 'head' area). (I'm ignoring NETIF_F_FRAGLIST which is not normally > > > > supported by physical device drivers.) > > > > > > > > I don't know how many fragments that can end up as, at the USB level. > > > > > > If you assume that every fragment crosses a 64k boundary that would be 34. > > > OTOH I've not seen a fragment of a 64k TSO send crossing a 32k > > > boundary, and I think the 'head' area is constrained to be part of > > > a single (4k or larger) page. > > > > I don't know that it's possible at the moment, but I wouldn't recommend > > relying on that. > > The xhci (USB3) hardware supports SG, but a non-obvious alignment restriction > applies at the end of a ring segment. In effect this means that the number of > fragments mustn't exceed the size of the ring segment. > It would make the xchi driver simpler if excessively fragmented requests > could just be bounced. > Since ring entries are 16 bytes there isn't much reason to not use a 4k > 'page' for a ring and have (almost) 256 ring slots. > (The code currently uses multiple ring segments with 63 usable slots.) > > Looks like skb are constrained enough that a sensible limit can be applied. > > The other likely generator of fragmented requests is the mass storage code. > Most likely for dd(1) with a large block size. The xHCI driver can limit the number of sg-list entries through hcd->self.sg_tablesize. It's currently set to ~0, which is "however many entries you want. You could set that to the number of TRBs in a segment (minus one for the link TRB). The usb-storage and uas drivers currently use sg_tablesize. Could the network stack be taught to use sg_tablesize as well? (Also, usb-storage aligns the block sizes to 512K, which explains why we've never had an issue with TD fragments with that driver.) Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> -Original Message- > From: Ben Hutchings [mailto:bhutchi...@solarflare.com] > Sent: 18 November 2013 15:03 > To: David Laight > Cc: Alan Stern; Sarah Sharp; net...@vger.kernel.org; linux-usb@vger.kernel.org > Subject: Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload > burst. > > On Mon, 2013-11-18 at 09:48 +, David Laight wrote: > > > > But the minimum fragment size is (probably) 4k. > > > > For the network stack an OUT transfer might have a lot (and I mean lots) > > > > of fragments (there may be constraints, and linearising the skb is a > > > > option). > > > [...] > > > > > > The maximum number of fragments in the skb is going to be 17 (including > > > the 'head' area). (I'm ignoring NETIF_F_FRAGLIST which is not normally > > > supported by physical device drivers.) > > > > > > I don't know how many fragments that can end up as, at the USB level. > > > > If you assume that every fragment crosses a 64k boundary that would be 34. > > OTOH I've not seen a fragment of a 64k TSO send crossing a 32k > > boundary, and I think the 'head' area is constrained to be part of > > a single (4k or larger) page. > > I don't know that it's possible at the moment, but I wouldn't recommend > relying on that. The xhci (USB3) hardware supports SG, but a non-obvious alignment restriction applies at the end of a ring segment. In effect this means that the number of fragments mustn't exceed the size of the ring segment. It would make the xchi driver simpler if excessively fragmented requests could just be bounced. Since ring entries are 16 bytes there isn't much reason to not use a 4k 'page' for a ring and have (almost) 256 ring slots. (The code currently uses multiple ring segments with 63 usable slots.) Looks like skb are constrained enough that a sensible limit can be applied. The other likely generator of fragmented requests is the mass storage code. Most likely for dd(1) with a large block size. David
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Mon, 2013-11-18 at 09:48 +, David Laight wrote: > > > But the minimum fragment size is (probably) 4k. > > > For the network stack an OUT transfer might have a lot (and I mean lots) > > > of fragments (there may be constraints, and linearising the skb is a > > > option). > > [...] > > > > The maximum number of fragments in the skb is going to be 17 (including > > the 'head' area). (I'm ignoring NETIF_F_FRAGLIST which is not normally > > supported by physical device drivers.) > > > > I don't know how many fragments that can end up as, at the USB level. > > If you assume that every fragment crosses a 64k boundary that would be 34. > OTOH I've not seen a fragment of a 64k TSO send crossing a 32k > boundary, and I think the 'head' area is constrained to be part of > a single (4k or larger) page. I don't know that it's possible at the moment, but I wouldn't recommend relying on that. > Isn't there something odd about skb merged by receive offload? > I've not entirely sorted out the full structure of skb. There has been some work to allow for using both the frags array and frag list, but a driver will not see such an skb if it does not advertise the NETIF_F_FRAGLIST feature. Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> > But the minimum fragment size is (probably) 4k. > > For the network stack an OUT transfer might have a lot (and I mean lots) > > of fragments (there may be constraints, and linearising the skb is a > > option). > [...] > > The maximum number of fragments in the skb is going to be 17 (including > the 'head' area). (I'm ignoring NETIF_F_FRAGLIST which is not normally > supported by physical device drivers.) > > I don't know how many fragments that can end up as, at the USB level. If you assume that every fragment crosses a 64k boundary that would be 34. OTOH I've not seen a fragment of a 64k TSO send crossing a 32k boundary, and I think the 'head' area is constrained to be part of a single (4k or larger) page. Isn't there something odd about skb merged by receive offload? I've not entirely sorted out the full structure of skb. David
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Wed, 2013-11-13 at 16:58 +, David Laight wrote: [...] > > > > > You can split a bulk TD on a 1k boundary and the target won't know the > > > > > difference. > > > > > > > > The target won't know the difference, but the host will. Here's an > > > > example: Suppose the driver submits two URBs, each for a data-in > > > > transfer of 32 KB. You split each URB up into two 16-KB TDs; let's > > > > call them A, B, C, and D (where A and B make up the first URB, and C > > > > and D make up the second URB). > > > > > > I was thinking about OUT transfers, IN ones are unlikely to be badly > > > fragmented. > > > > Maybe not for the network stack, but OUT and IN end up equally > > fragmented for the storage stack. > > But the minimum fragment size is (probably) 4k. > For the network stack an OUT transfer might have a lot (and I mean lots) > of fragments (there may be constraints, and linearising the skb is a option). [...] The maximum number of fragments in the skb is going to be 17 (including the 'head' area). (I'm ignoring NETIF_F_FRAGLIST which is not normally supported by physical device drivers.) I don't know how many fragments that can end up as, at the USB level. Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> -Original Message- > From: Alan Stern [mailto:st...@rowland.harvard.edu] > Sent: 13 November 2013 16:21 > To: David Laight > Cc: Sarah Sharp; net...@vger.kernel.org; linux-usb@vger.kernel.org > Subject: RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload > burst. > > On Wed, 13 Nov 2013, David Laight wrote: > > > > > Since you don't really want to do all the work twice, the sensible > > > > way is to add each input fragment to the ring one a time. > > > > If you need to cross a link TRB and the last MBP boundary is within > > > > the previous data TRB then you can split the previous data TRB at the > > > > MBP boundary and continue. > > > > If the previous TRB is short then you'd need to go back through the > > > > earlier TRB until you found the one that contains a TRB boundary, > > > > split it, and write a link TRB in the following slot. > > > > > > Right. You could avoid doing a lot of work twice by using two passes. > > > In the first pass, keep track of the number of bytes allotted to each > > > TRB and the MBP boundaries, without doing anything else. That way, > > > you'll know where to put a Link TRB without any need for backtracking. > > > Then in the second pass, fill in the actual contents of the TRBs. > > > > No - you really don't want to process everything twice. > > What I described does not process anything twice. It calculates the > byte counts on the first pass and everything else on the second pass. > > > You could remember the TRB and offset of the last MBP boundary. > > Then you really _do_ end up processing the TRB's which follow the last > MBP boundary twice. Once when setting up the TD as normal, and then > again after you realize they need to be moved to a different ring > segment. That would only be a few of the TRBs, not all of them. The cheap option is to work out an upper bound for the number of TRB and the write a new LINK TRB if there isn't enough space. That does limit the maximum number of TRB in a URB to half the total ring (less if it has more than 2 fragments). > > > Unless the first TRB of the message is the first TRB of the ring > > > segment. Then you're in trouble... Hopefully, real URBs will never be > > > that badly fragmented. > > > > I suspect that ones from the network stack might be badly fragmented. > > The code needs to at least detect and error them. > > I'm not so sure about this. The network stack works okay with other > host controller drivers (like ehci-hcd), which have much stricter > requirements about fragmentation. They require that the length of each > SG element except the last one must be a multiple of the maxpacket > size. For high-speed bulk transfers, that means a multiple of 512. Yes, for other host controllers the network cards linearise the skb. it is only for xhci where where the buffer fragments from the skb can be given to the usb controller. This is probably also only enabled (at the moment) for the ASIX Ge card - which can also do TCP transmit segmentation offload (TSO). It is effectively a necessity for TSO since it would otherwise require 64k blocks of contiguous memory (or some horrid code that involves a misaligned copy). The SG code in usbnet was only added for 3.12. > > > > You can split a bulk TD on a 1k boundary and the target won't know the > > > > difference. > > > > > > The target won't know the difference, but the host will. Here's an > > > example: Suppose the driver submits two URBs, each for a data-in > > > transfer of 32 KB. You split each URB up into two 16-KB TDs; let's > > > call them A, B, C, and D (where A and B make up the first URB, and C > > > and D make up the second URB). > > > > I was thinking about OUT transfers, IN ones are unlikely to be badly > > fragmented. > > Maybe not for the network stack, but OUT and IN end up equally > fragmented for the storage stack. But the minimum fragment size is (probably) 4k. For the network stack an OUT transfer might have a lot (and I mean lots) of fragments (there may be constraints, and linearising the skb is a option). > > > Suppose you have only two ring segments, and a driver submits an URB > > > which is so fragmented that it requires more TRBs than you have room > > > for in those two segments. When do you want the interrupts to arrive? > > > Answer: At each segment crossing. > > > > You bounce the original request and fix the driver to submit URB > > with fewer fragments. > > In other words, you want t
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Wed, 13 Nov 2013, David Laight wrote: > > > that doesn't matter; we don't get an interrupt when a ring segment is > > > crossed. Instead we set the interrupt-on-completion flag on the last > > > TRB of the TD, not on any earlier fragment or link TRB. > > > > That's because you don't worry about handling URBs which require too > > many TRBs (i.e., more than are available). You just add more ring > > segments. Instead, you could re-use segments on the fly. > > > > For example, suppose you have only two ring segments and you get an URB > > which requires enough TRBs to fill up four segments. You could fill in > > the first two segments worth, and get an interrupt when the controller > > traverses the Link TRB between them. At that point you store the third > > set of TRBs in the first segment, which is now vacant. Similarly, when > > the second Link TRB is traversed, you fill in the fourth set of TRBs. > > That isn't going to work. Queuing URBs and processing them in pieces was _your_ suggestion. I merely repeated it to Sarah and filled in some detail. > It might work for very long TD, but for very fragmented ones the interrupt > rate would be stupid. Not if the individual SG elements are reasonably large -- say at least 512 bytes. They have to be that big if they are to work with ehci-hcd, so why not also for xhci-hcd? > In any case you'd definitely need enough ring > segments for the TRB that describe a single 'max packet size' block. That would be two TRBs (for maxpacket = 1024), meaning you'd need at least one ring segment. That's not much of a restriction. > > > Finally, it's interesting to note that the USB mass storage driver is > > > using scatter gather lists just fine without the driver following the TD > > > fragment rules. Or at least no one has reported any issues. I wonder > > > why it works? > > > > I'd guess this is because the hardware is actually a lot more flexible > > than the "No Link TRBs in the middle of a TD fragment" rule. > > With the hardware I have (Intel i7 Sandy bridge) Link TRB cannot > be placed at arbitrary boundaries. > I don't know whether the actual restriction is only to packet boundaries. That's what I had in mind by "more flexible" above. > I don't have a USB3 monitor and it would also require a more contrived > test than I've been doing. You wouldn't need a monitor to check it, but you would need some careful tests. > > The whole idea of TD fragments makes no sense to begin with. What > > point is there in grouping packets into MaxBurst-sized collections? > > It probably saved a few logic gates somewhere, either that or it is > a bug in some hardware implementation that got documented in the spec > instead of being fixed :-) I can believe the guess about it reflecting a bug. Particularly since it is so badly written. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Wed, 13 Nov 2013, David Laight wrote: > > > Since you don't really want to do all the work twice, the sensible > > > way is to add each input fragment to the ring one a time. > > > If you need to cross a link TRB and the last MBP boundary is within > > > the previous data TRB then you can split the previous data TRB at the > > > MBP boundary and continue. > > > If the previous TRB is short then you'd need to go back through the > > > earlier TRB until you found the one that contains a TRB boundary, > > > split it, and write a link TRB in the following slot. > > > > Right. You could avoid doing a lot of work twice by using two passes. > > In the first pass, keep track of the number of bytes allotted to each > > TRB and the MBP boundaries, without doing anything else. That way, > > you'll know where to put a Link TRB without any need for backtracking. > > Then in the second pass, fill in the actual contents of the TRBs. > > No - you really don't want to process everything twice. What I described does not process anything twice. It calculates the byte counts on the first pass and everything else on the second pass. > You could remember the TRB and offset of the last MBP boundary. Then you really _do_ end up processing the TRB's which follow the last MBP boundary twice. Once when setting up the TD as normal, and then again after you realize they need to be moved to a different ring segment. > > Unless the first TRB of the message is the first TRB of the ring > > segment. Then you're in trouble... Hopefully, real URBs will never be > > that badly fragmented. > > I suspect that ones from the network stack might be badly fragmented. > The code needs to at least detect and error them. I'm not so sure about this. The network stack works okay with other host controller drivers (like ehci-hcd), which have much stricter requirements about fragmentation. They require that the length of each SG element except the last one must be a multiple of the maxpacket size. For high-speed bulk transfers, that means a multiple of 512. > > > You can split a bulk TD on a 1k boundary and the target won't know the > > > difference. > > > > The target won't know the difference, but the host will. Here's an > > example: Suppose the driver submits two URBs, each for a data-in > > transfer of 32 KB. You split each URB up into two 16-KB TDs; let's > > call them A, B, C, and D (where A and B make up the first URB, and C > > and D make up the second URB). > > I was thinking about OUT transfers, IN ones are unlikely to be badly > fragmented. Maybe not for the network stack, but OUT and IN end up equally fragmented for the storage stack. > > Suppose you have only two ring segments, and a driver submits an URB > > which is so fragmented that it requires more TRBs than you have room > > for in those two segments. When do you want the interrupts to arrive? > > Answer: At each segment crossing. > > You bounce the original request and fix the driver to submit URB > with fewer fragments. In other words, you want to limit the number of SG elements the driver will accept. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> > that doesn't matter; we don't get an interrupt when a ring segment is > > crossed. Instead we set the interrupt-on-completion flag on the last > > TRB of the TD, not on any earlier fragment or link TRB. > > That's because you don't worry about handling URBs which require too > many TRBs (i.e., more than are available). You just add more ring > segments. Instead, you could re-use segments on the fly. > > For example, suppose you have only two ring segments and you get an URB > which requires enough TRBs to fill up four segments. You could fill in > the first two segments worth, and get an interrupt when the controller > traverses the Link TRB between them. At that point you store the third > set of TRBs in the first segment, which is now vacant. Similarly, when > the second Link TRB is traversed, you fill in the fourth set of TRBs. That isn't going to work. It might work for very long TD, but for very fragmented ones the interrupt rate would be stupid. In any case you'd definitely need enough ring segments for the TRB that describe a single 'max packet size' block. > > Finally, it's interesting to note that the USB mass storage driver is > > using scatter gather lists just fine without the driver following the TD > > fragment rules. Or at least no one has reported any issues. I wonder > > why it works? > > I'd guess this is because the hardware is actually a lot more flexible > than the "No Link TRBs in the middle of a TD fragment" rule. With the hardware I have (Intel i7 Sandy bridge) Link TRB cannot be placed at arbitrary boundaries. I don't know whether the actual restriction is only to packet boundaries. I don't have a USB3 monitor and it would also require a more contrived test than I've been doing. > The whole idea of TD fragments makes no sense to begin with. What > point is there in grouping packets into MaxBurst-sized collections? It probably saved a few logic gates somewhere, either that or it is a bug in some hardware implementation that got documented in the spec instead of being fixed :-) David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> > Since you don't really want to do all the work twice, the sensible > > way is to add each input fragment to the ring one a time. > > If you need to cross a link TRB and the last MBP boundary is within > > the previous data TRB then you can split the previous data TRB at the > > MBP boundary and continue. > > If the previous TRB is short then you'd need to go back through the > > earlier TRB until you found the one that contains a TRB boundary, > > split it, and write a link TRB in the following slot. > > Right. You could avoid doing a lot of work twice by using two passes. > In the first pass, keep track of the number of bytes allotted to each > TRB and the MBP boundaries, without doing anything else. That way, > you'll know where to put a Link TRB without any need for backtracking. > Then in the second pass, fill in the actual contents of the TRBs. No - you really don't want to process everything twice. You could remember the TRB and offset of the last MBP boundary. > > If you are within the first MBP then you'd need to replace the first > > TRB of the message with a link TRB. > > Unless the first TRB of the message is the first TRB of the ring > segment. Then you're in trouble... Hopefully, real URBs will never be > that badly fragmented. I suspect that ones from the network stack might be badly fragmented. The code needs to at least detect and error them. I don't think linux skb can be as fragmented as I've seen network buffers on other systems - 1 byte per buffer chained together to a maximal sized ethernet packet. > > > > For bulk data the link TRB can be forced at a packet boundary > > > > by splitting the TD up - the receiving end won't know the difference. > > > > > > That won't work. What happens if you split a TD up into two pieces and > > > the first piece receives a short packet? The host controller will > > > automatically move to the start of the second piece. That's not what > > > we want. > > > > You can split a bulk TD on a 1k boundary and the target won't know the > > difference. > > The target won't know the difference, but the host will. Here's an > example: Suppose the driver submits two URBs, each for a data-in > transfer of 32 KB. You split each URB up into two 16-KB TDs; let's > call them A, B, C, and D (where A and B make up the first URB, and C > and D make up the second URB). I was thinking about OUT transfers, IN ones are unlikely to be badly fragmented. > > > > There is no necessity for taking an interrupt from every link segment. > > > > > > Yes, there is. The HCD needs to know when the dequeue pointer has > > > moved beyond the end of the ring segment, so that it can start reusing > > > the TRB slots in that segment. > > > > You already know that because of the interrupts for the data packets > > themselves. > > I don't know what you mean by this. The host controller does not > generate an interrupt for each data packet. In generates interrupts > only for TRBs with the IOC bit set (which is generally the last TRB in > each TD). > > Suppose you have only two ring segments, and a driver submits an URB > which is so fragmented that it requires more TRBs than you have room > for in those two segments. When do you want the interrupts to arrive? > Answer: At each segment crossing. You bounce the original request and fix the driver to submit URB with fewer fragments. > > > > I would change the code to use a single segment (for coding simplicity) > > > > and queue bulk URB when there isn't enough ring space. > > > > URB with too many fragments could either be rejected, sent in sections, > > > > or partially linearised (and probably still sent in sections). > > > > > > Rejecting an URB is not feasible. Splitting it up into multiple TDs is > > > not acceptable, as explained above. Sending it in sections (i.e., > > > queueing only some of the TRBs at any time) would work, provided you > > > got at least two interrupts every time the queue wrapped around (which > > > suggests you might want at least two ring segments). > > > > Rejecting badly fragmented URB is almost certainly ok. You really don't > > want the hardware overhead of processing a TRB every few bytes. > > This would be especially bad on iommu systems. > > I disagree. Sure, it would be bad. But if you can handle it, you > should. You have to draw the line somewhere. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Tue, 12 Nov 2013, Sarah Sharp wrote: > > It comes down to a question of how often you want the controller to > > issue an interrupt. If a ring segment is 4 KB (one page), then it can > > hold 256 TRBs. With scatter-gather transfers, each SG element > > typically refers to something like a 2-page buffer (depends on how > > fragmented the memory is). Therefore a ring segment will describe > > somewhere around 512 pages of data, i.e., something like 2 MB. Since > > SuperSpeed is 500 MB/s, you'd end up getting in the vicinity of 250 > > interrupts every second just because of ring segment crossings. > > The driver is currently defined to have 64 TRBs per ring segment. But A TRB is 16 bytes, right? So a page can hold 256 TRBs. Why use only 64 per segment? > that doesn't matter; we don't get an interrupt when a ring segment is > crossed. Instead we set the interrupt-on-completion flag on the last > TRB of the TD, not on any earlier fragment or link TRB. That's because you don't worry about handling URBs which require too many TRBs (i.e., more than are available). You just add more ring segments. Instead, you could re-use segments on the fly. For example, suppose you have only two ring segments and you get an URB which requires enough TRBs to fill up four segments. You could fill in the first two segments worth, and get an interrupt when the controller traverses the Link TRB between them. At that point you store the third set of TRBs in the first segment, which is now vacant. Similarly, when the second Link TRB is traversed, you fill in the fourth set of TRBs. > > Using larger ring segments would help. > > Ring segments have to be physically contiguous, so I'm not sure if we > want to ask for segments that are bigger than a page. I've already got > a report from someone else about the ring expansion getting out of > control, so I would like to figure that out before we talk about using > even bigger segments. Maybe you can get away with fewer segments, if they are bigger. > Finally, it's interesting to note that the USB mass storage driver is > using scatter gather lists just fine without the driver following the TD > fragment rules. Or at least no one has reported any issues. I wonder > why it works? I'd guess this is because the hardware is actually a lot more flexible than the "No Link TRBs in the middle of a TD fragment" rule. The whole idea of TD fragments makes no sense to begin with. What point is there in grouping packets into MaxBurst-sized collections? The hardware does not have to finish one burst before beginning the next one. For example, suppose the MaxBurst size is 8. The host starts by bursting packets 1-8. When it receives the ACK for packet 4, the host could then burst packets 9-12. It doesn't have to wait for the ACK to packet 8. (Unless I have misunderstood the USB-3 spec.) If the host does this, the burst boundaries won't occur on MBP boundaries, and hence won't occur on TD fragment boundaries. The fragment boundaries will be essentially meaningless. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Tue, 12 Nov 2013, David Laight wrote: > > > If all the fragments are larger than the MBP (assume 16k) then > > > that would be relatively easy. However that is very dependant > > > on the source of the data. It might be true for disk data, but > > > is unlikely to be true for ethernet data. > > > > I don't quite understand your point. Are you saying that if all the > > TRBs are very short, you might need more than 64 TRBs to reach a 16-KB > > boundary? > > Since you don't really want to do all the work twice, the sensible > way is to add each input fragment to the ring one a time. > If you need to cross a link TRB and the last MBP boundary is within > the previous data TRB then you can split the previous data TRB at the > MBP boundary and continue. > If the previous TRB is short then you'd need to go back through the > earlier TRB until you found the one that contains a TRB boundary, > split it, and write a link TRB in the following slot. Right. You could avoid doing a lot of work twice by using two passes. In the first pass, keep track of the number of bytes allotted to each TRB and the MBP boundaries, without doing anything else. That way, you'll know where to put a Link TRB without any need for backtracking. Then in the second pass, fill in the actual contents of the TRBs. > If you are within the first MBP then you'd need to replace the first > TRB of the message with a link TRB. Unless the first TRB of the message is the first TRB of the ring segment. Then you're in trouble... Hopefully, real URBs will never be that badly fragmented. > And yes, if the data is really fragmented you might need a lot of > TRB for even 1k of data. Until somebody needs more than 16 TRBs for each KB of data, we should be okay. > > > For bulk data the link TRB can be forced at a packet boundary > > > by splitting the TD up - the receiving end won't know the difference. > > > > That won't work. What happens if you split a TD up into two pieces and > > the first piece receives a short packet? The host controller will > > automatically move to the start of the second piece. That's not what > > we want. > > You can split a bulk TD on a 1k boundary and the target won't know the > difference. The target won't know the difference, but the host will. Here's an example: Suppose the driver submits two URBs, each for a data-in transfer of 32 KB. You split each URB up into two 16-KB TDs; let's call them A, B, C, and D (where A and B make up the first URB, and C and D make up the second URB). Now suppose the device terminates the first transfer after only 7200 bytes, with a short packet. It then sends a complete 32 KB of data for the second transfer. What will happen in this case? 7200 bytes is somewhere in the middle of TD A. That TD will terminate early. The host controller will then proceed to TD B. As a result, the next 32 KB of data will be stored in TD B and C -- not in TD C and D as it should be. This example shows why you must not split IN URBs into multiple TDs. > > > There is no necessity for taking an interrupt from every link segment. > > > > Yes, there is. The HCD needs to know when the dequeue pointer has > > moved beyond the end of the ring segment, so that it can start reusing > > the TRB slots in that segment. > > You already know that because of the interrupts for the data packets > themselves. I don't know what you mean by this. The host controller does not generate an interrupt for each data packet. In generates interrupts only for TRBs with the IOC bit set (which is generally the last TRB in each TD). Suppose you have only two ring segments, and a driver submits an URB which is so fragmented that it requires more TRBs than you have room for in those two segments. When do you want the interrupts to arrive? Answer: At each segment crossing. > > > I would change the code to use a single segment (for coding simplicity) > > > and queue bulk URB when there isn't enough ring space. > > > URB with too many fragments could either be rejected, sent in sections, > > > or partially linearised (and probably still sent in sections). > > > > Rejecting an URB is not feasible. Splitting it up into multiple TDs is > > not acceptable, as explained above. Sending it in sections (i.e., > > queueing only some of the TRBs at any time) would work, provided you > > got at least two interrupts every time the queue wrapped around (which > > suggests you might want at least two ring segments). > > Rejecting badly fragmented URB is almost certainly ok. You really don't > want the hardware overhead of processing a TRB every few bytes. > This would be especially bad on iommu systems. I disagree. Sure, it would be bad. But if you can handle it, you should. > Before the ring expansion code was added there was an implicit > limit of (probably) 125 fragments for a URB. Exceeding this limit > wasn't the reason for adding the ring expansion code. > > And, as I've pointed out, both
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> > That's one way to do it. Or you could allow a Link TRB at an > > intermediate MBP boundary. > > I like this idea instead. The xHCI driver should be modified to be able > to handle link TRBs in the middle of the segments (the cancellation code > would have to be touched as well). We would keep a running count > of the number of bytes left in a TD fragment, as we fill in the TRBs. > If we find the TD fragment would span a link TRB, we backtrack to the > end of the last TD fragment, put in a link TRB, and then continue on the > next segment. I'd do that as a later change. It needs a fair amount of other stuff fixing first and the current code is rather broken - and needs a fix for stable. > > It comes down to a question of how often you want the controller to > > issue an interrupt. If a ring segment is 4 KB (one page), then it can > > hold 256 TRBs. With scatter-gather transfers, each SG element > > typically refers to something like a 2-page buffer (depends on how > > fragmented the memory is). Therefore a ring segment will describe > > somewhere around 512 pages of data, i.e., something like 2 MB. Since > > SuperSpeed is 500 MB/s, you'd end up getting in the vicinity of 250 > > interrupts every second just because of ring segment crossings. > > The driver is currently defined to have 64 TRBs per ring segment. But > that doesn't matter; we don't get an interrupt when a ring segment is > crossed. Instead we set the interrupt-on-completion flag on the last > TRB of the TD, not on any earlier fragment or link TRB. > > > Using larger ring segments would help. > > Ring segments have to be physically contiguous, so I'm not sure if we > want to ask for segments that are bigger than a page. I've already got > a report from someone else about the ring expansion getting out of > control, so I would like to figure that out before we talk about using > even bigger segments. I'd vote for a single segment containing either 128 or 256 TRBs. That is less than a single page. > Finally, it's interesting to note that the USB mass storage driver is > using scatter gather lists just fine without the driver following the TD > fragment rules. Or at least no one has reported any issues. I wonder > why it works? I suspect that breaking the rules just generates two TD. The mass storage driver is probably almost always presenting buffer fragments that are page sized and page aligned. In which case the split is invisible at the target. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> On Fri, Nov 08, 2013 at 11:07:41AM -, David Laight wrote: > > > While this change improves things a lot (it runs for 20 minutes rather > > > than > > > a few seconds), there is still something else wrong. > > > > Almost certainly caused by an unrelated local change. > > Ok, good, I was concerned there was another issue we were missing. My modified code wasn't setting the td_size properly. That is fixed and verified in the later patch I sent. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Fri, Nov 08, 2013 at 11:07:41AM -, David Laight wrote: > > While this change improves things a lot (it runs for 20 minutes rather than > > a few seconds), there is still something else wrong. > > Almost certainly caused by an unrelated local change. Ok, good, I was concerned there was another issue we were missing. Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Mon, Nov 11, 2013 at 03:34:53PM -0500, Alan Stern wrote: > On Mon, 11 Nov 2013, David Laight wrote: > > > > Suppose, for example, the MBP is 1024. If you have a TD with length > > > 1500, and if it had only one fragment, the last (and only) fragment's > > > length would not less than the MBP and it would not be an exact > > > multiple of the MBP. > > > > That doesn't matter - eg example 2 in figure 25 > > You're right. I do wish the spec had been written more clearly. > > > Reading it all again makes me think that a LINK trb is only > > allowed on the burst boundary (which might be 16k bytes). > > The only real way to implement that is to ensure that TD never > > contain LINK TRB. > > That's one way to do it. Or you could allow a Link TRB at an > intermediate MBP boundary. I like this idea instead. The xHCI driver should be modified to be able to handle link TRBs in the middle of the segments (the cancellation code would have to be touched as well). We would keep a running count of the number of bytes left in a TD fragment, as we fill in the TRBs. If we find the TD fragment would span a link TRB, we backtrack to the end of the last TD fragment, put in a link TRB, and then continue on the next segment. > It comes down to a question of how often you want the controller to > issue an interrupt. If a ring segment is 4 KB (one page), then it can > hold 256 TRBs. With scatter-gather transfers, each SG element > typically refers to something like a 2-page buffer (depends on how > fragmented the memory is). Therefore a ring segment will describe > somewhere around 512 pages of data, i.e., something like 2 MB. Since > SuperSpeed is 500 MB/s, you'd end up getting in the vicinity of 250 > interrupts every second just because of ring segment crossings. The driver is currently defined to have 64 TRBs per ring segment. But that doesn't matter; we don't get an interrupt when a ring segment is crossed. Instead we set the interrupt-on-completion flag on the last TRB of the TD, not on any earlier fragment or link TRB. > Using larger ring segments would help. Ring segments have to be physically contiguous, so I'm not sure if we want to ask for segments that are bigger than a page. I've already got a report from someone else about the ring expansion getting out of control, so I would like to figure that out before we talk about using even bigger segments. Finally, it's interesting to note that the USB mass storage driver is using scatter gather lists just fine without the driver following the TD fragment rules. Or at least no one has reported any issues. I wonder why it works? Sarah Sharp -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> > If all the fragments are larger than the MBP (assume 16k) then > > that would be relatively easy. However that is very dependant > > on the source of the data. It might be true for disk data, but > > is unlikely to be true for ethernet data. > > I don't quite understand your point. Are you saying that if all the > TRBs are very short, you might need more than 64 TRBs to reach a 16-KB > boundary? Since you don't really want to do all the work twice, the sensible way is to add each input fragment to the ring one a time. If you need to cross a link TRB and the last MBP boundary is within the previous data TRB then you can split the previous data TRB at the MBP boundary and continue. If the previous TRB is short then you'd need to go back through the earlier TRB until you found the one that contains a TRB boundary, split it, and write a link TRB in the following slot. If you are within the first MBP then you'd need to replace the first TRB of the message with a link TRB. And yes, if the data is really fragmented you might need a lot of TRB for even 1k of data. > > For bulk data the link TRB can be forced at a packet boundary > > by splitting the TD up - the receiving end won't know the difference. > > That won't work. What happens if you split a TD up into two pieces and > the first piece receives a short packet? The host controller will > automatically move to the start of the second piece. That's not what > we want. You can split a bulk TD on a 1k boundary and the target won't know the difference. > > There is no necessity for taking an interrupt from every link segment. > > Yes, there is. The HCD needs to know when the dequeue pointer has > moved beyond the end of the ring segment, so that it can start reusing > the TRB slots in that segment. You already know that because of the interrupts for the data packets themselves. > > I would change the code to use a single segment (for coding simplicity) > > and queue bulk URB when there isn't enough ring space. > > URB with too many fragments could either be rejected, sent in sections, > > or partially linearised (and probably still sent in sections). > > Rejecting an URB is not feasible. Splitting it up into multiple TDs is > not acceptable, as explained above. Sending it in sections (i.e., > queueing only some of the TRBs at any time) would work, provided you > got at least two interrupts every time the queue wrapped around (which > suggests you might want at least two ring segments). Rejecting badly fragmented URB is almost certainly ok. You really don't want the hardware overhead of processing a TRB every few bytes. This would be especially bad on iommu systems. Before the ring expansion code was added there was an implicit limit of (probably) 125 fragments for a URB. Exceeding this limit wasn't the reason for adding the ring expansion code. And, as I've pointed out, both bulk and isoc URB can be split unless they are using too many fragments for a short piece of data. The current code refuses to write into a TRB segment until it is empty - I think that restriction is only there so that it can add another segment when the space runs out. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Tue, 12 Nov 2013, David Laight wrote: > > You're right. I do wish the spec had been written more clearly. > > I've read a lot of hardware specs in my time ... > > > > Reading it all again makes me think that a LINK trb is only > > > allowed on the burst boundary (which might be 16k bytes). > > > The only real way to implement that is to ensure that TD never > > > contain LINK TRB. > > > > That's one way to do it. Or you could allow a Link TRB at an > > intermediate MBP boundary. > > If all the fragments are larger than the MBP (assume 16k) then > that would be relatively easy. However that is very dependant > on the source of the data. It might be true for disk data, but > is unlikely to be true for ethernet data. I don't quite understand your point. Are you saying that if all the TRBs are very short, you might need more than 64 TRBs to reach a 16-KB boundary? > For bulk data the link TRB can be forced at a packet boundary > by splitting the TD up - the receiving end won't know the difference. That won't work. What happens if you split a TD up into two pieces and the first piece receives a short packet? The host controller will automatically move to the start of the second piece. That's not what we want. > > It comes down to a question of how often you want the controller to > > issue an interrupt. If a ring segment is 4 KB (one page), then it can > > hold 256 TRBs. With scatter-gather transfers, each SG element > > typically refers to something like a 2-page buffer (depends on how > > fragmented the memory is). Therefore a ring segment will describe > > somewhere around 512 pages of data, i.e., something like 2 MB. Since > > SuperSpeed is 500 MB/s, you'd end up getting in the vicinity of 250 > > interrupts every second just because of ring segment crossings. > > 250 interrupts/sec is noise. Send/receive 13000 ethernet packets/sec > and then look at the interrupt rate! > > There is no necessity for taking an interrupt from every link segment. Yes, there is. The HCD needs to know when the dequeue pointer has moved beyond the end of the ring segment, so that it can start reusing the TRB slots in that segment. Suppose you have queued a bulk URB because there weren't enough free TRB slots. How else would you know when the occupied slots became available? > The current ring segments contain 64 entries, a strange choice > since they are created with 2 segments. > (The ring expansion code soon doubles that for my ethernet traffic.) > > I would change the code to use a single segment (for coding simplicity) > and queue bulk URB when there isn't enough ring space. > URB with too many fragments could either be rejected, sent in sections, > or partially linearised (and probably still sent in sections). Rejecting an URB is not feasible. Splitting it up into multiple TDs is not acceptable, as explained above. Sending it in sections (i.e., queueing only some of the TRBs at any time) would work, provided you got at least two interrupts every time the queue wrapped around (which suggests you might want at least two ring segments). Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> You're right. I do wish the spec had been written more clearly. I've read a lot of hardware specs in my time ... > > Reading it all again makes me think that a LINK trb is only > > allowed on the burst boundary (which might be 16k bytes). > > The only real way to implement that is to ensure that TD never > > contain LINK TRB. > > That's one way to do it. Or you could allow a Link TRB at an > intermediate MBP boundary. If all the fragments are larger than the MBP (assume 16k) then that would be relatively easy. However that is very dependant on the source of the data. It might be true for disk data, but is unlikely to be true for ethernet data. For bulk data the link TRB can be forced at a packet boundary by splitting the TD up - the receiving end won't know the difference. > It comes down to a question of how often you want the controller to > issue an interrupt. If a ring segment is 4 KB (one page), then it can > hold 256 TRBs. With scatter-gather transfers, each SG element > typically refers to something like a 2-page buffer (depends on how > fragmented the memory is). Therefore a ring segment will describe > somewhere around 512 pages of data, i.e., something like 2 MB. Since > SuperSpeed is 500 MB/s, you'd end up getting in the vicinity of 250 > interrupts every second just because of ring segment crossings. 250 interrupts/sec is noise. Send/receive 13000 ethernet packets/sec and then look at the interrupt rate! There is no necessity for taking an interrupt from every link segment. OTOH an interrupt is requested for every bulk TD. I'm sending ethernet data (with TSO) and each TD is just under 64k mostly made up of 3 or 4 fragments. The receive side is interrupting for every receive packet. > Using larger ring segments would help. The current ring segments contain 64 entries, a strange choice since they are created with 2 segments. (The ring expansion code soon doubles that for my ethernet traffic.) I would change the code to use a single segment (for coding simplicity) and queue bulk URB when there isn't enough ring space. URB with too many fragments could either be rejected, sent in sections, or partially linearised (and probably still sent in sections). David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Mon, 11 Nov 2013, David Laight wrote: > > Suppose, for example, the MBP is 1024. If you have a TD with length > > 1500, and if it had only one fragment, the last (and only) fragment's > > length would not less than the MBP and it would not be an exact > > multiple of the MBP. > > That doesn't matter - eg example 2 in figure 25 You're right. I do wish the spec had been written more clearly. > Reading it all again makes me think that a LINK trb is only > allowed on the burst boundary (which might be 16k bytes). > The only real way to implement that is to ensure that TD never > contain LINK TRB. That's one way to do it. Or you could allow a Link TRB at an intermediate MBP boundary. It comes down to a question of how often you want the controller to issue an interrupt. If a ring segment is 4 KB (one page), then it can hold 256 TRBs. With scatter-gather transfers, each SG element typically refers to something like a 2-page buffer (depends on how fragmented the memory is). Therefore a ring segment will describe somewhere around 512 pages of data, i.e., something like 2 MB. Since SuperSpeed is 500 MB/s, you'd end up getting in the vicinity of 250 interrupts every second just because of ring segment crossings. Using larger ring segments would help. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> From: Alan Stern [mailto:st...@rowland.harvard.edu] > On Fri, 8 Nov 2013, David Laight wrote: > ... > GET_MAX_PACKET always returns MaxPacketSize, and for USB-3 bulk > endpoints, MaxPacketSize is always 1024. MaxBurstSize can be anything > from 1 to 16. I've just read something that explained about bursts. > > > According to my version of the spec (Rev 1.0, dated 5/21/2010), if a TD > > > is larger than the MBP and its length isn't a multiple of the MBP, then > > > the last MBP boundary in the TD must also be a TRB boundary. This > > > follows from two statements in section 4.11.7.1: > > > > > > If the TD Transfer Size is an even multiple of the MBP then all > > > TD Fragments shall define exact multiples of MBP data bytes. > > > If not, then only the last TD Fragment shall define less than > > > MBP data (or the Residue) bytes. > > > > No, that doesn't stop there being only 1 TD fragment. > > (I think it means exact multiple of the MBP.) > > Suppose, for example, the MBP is 1024. If you have a TD with length > 1500, and if it had only one fragment, the last (and only) fragment's > length would not less than the MBP and it would not be an exact > multiple of the MBP. That doesn't matter - eg example 2 in figure 25 > I agree that the text is not as clear as it should be. Reading it all again makes me think that a LINK trb is only allowed on the burst boundary (which might be 16k bytes). The only real way to implement that is to ensure that TD never contain LINK TRB. There are other things that can be done at a TD fragment boundary (or only once per TD fragment) - but the code doesn't attempt any of them. The restriction might be there to simplify the hardware to retransmit as TRB burst. My USB3 ethernet card (ax88179_178a driver) is running a lot more reliably with these changes. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Fri, 8 Nov 2013, David Laight wrote: > > The whole idea of TD fragments is exceedingly cloudy. The definition > > doesn't make sense, and the spec doesn't say what the actual hardware > > restrictions are, i.e., what is the underlying reality that the TD > > fragment concept wants to express. > > I think that a TD fragment boundary exists whenever a TRB/dma boundary > lines up with the packet boundary. The spec doesn't say that. > > Even if you do your best to ignore the whole idea, there still appear > > to be certain restrictions you need to obey. In addition to the > > question of where Link TRBs are allowed, you also have to worry about > > TDs whose size isn't a multiple of the Max Burst Payload (MBP) size = > > MaxBurstSize * MaxPacketSize. > > I'm not certain what MaxBurstSize and MaxPacketSize are, but the important > number is the MBP. I'm fairly sure that is the value returned by > GET_MAX_PACKET() - 1024 for USB3 bulk endpoints. GET_MAX_PACKET always returns MaxPacketSize, and for USB-3 bulk endpoints, MaxPacketSize is always 1024. MaxBurstSize can be anything from 1 to 16. > > According to my version of the spec (Rev 1.0, dated 5/21/2010), if a TD > > is larger than the MBP and its length isn't a multiple of the MBP, then > > the last MBP boundary in the TD must also be a TRB boundary. This > > follows from two statements in section 4.11.7.1: > > > > If the TD Transfer Size is an even multiple of the MBP then all > > TD Fragments shall define exact multiples of MBP data bytes. > > If not, then only the last TD Fragment shall define less than > > MBP data (or the Residue) bytes. > > No, that doesn't stop there being only 1 TD fragment. > (I think it means exact multiple of the MBP.) Suppose, for example, the MBP is 1024. If you have a TD with length 1500, and if it had only one fragment, the last (and only) fragment's length would not less than the MBP and it would not be an exact multiple of the MBP. I agree that the text is not as clear as it should be. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> -Original Message- > From: Alan Stern [mailto:st...@rowland.harvard.edu] > Sent: 08 November 2013 16:46 > To: Sarah Sharp > Cc: David Laight; net...@vger.kernel.org; linux-usb@vger.kernel.org > Subject: Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload > burst. > > On Thu, 7 Nov 2013, Sarah Sharp wrote: > > > On Thu, Nov 07, 2013 at 05:20:49PM -, David Laight wrote: > > > > > > Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link > > > TRB > > > can only occur at a boundary between underlying USB frames (512 bytes for > > > 480M). > > > > Which version of the spec are you looking at? I'm looking at the > > updated version from 08/2012 and I don't see any such requirement. > > > > I do see that section says "A TD Fragment shall not span Transfer Ring > > Segments" where a TD fragment is defined as exact multiples of Max Burst > > Size * Max Packet Size bytes for the endpoint. Is that what you mean > > about USB frames? > > ... > > > The driver would also have to make sure that TD fragments didn't have > > link TRBs in them. That's an issue, since the spec decidedly unclear > > about what exactly comprises a TD fragment. Is the xHC host greedy, and > > will lump all multiples into one TD fragment? Or will it assume the TD > > fragment has ended once it consumes one multiple of Max Burst Size * Max > > Packet Size bytes? > > > > This ambiguity means it's basically impossible to figure out whether a > > TD with a link TRB in the middle is constructed properly. The xHCI spec > > architect didn't have a good solution to this problem, so I punted and > > never implemented TD fragments. If this is really an issue, it's going > > to be pretty complex to solve. > > This is something I wanted to ask you about also. > > The whole idea of TD fragments is exceedingly cloudy. The definition > doesn't make sense, and the spec doesn't say what the actual hardware > restrictions are, i.e., what is the underlying reality that the TD > fragment concept wants to express. I think that a TD fragment boundary exists whenever a TRB/dma boundary lines up with the packet boundary. > Even if you do your best to ignore the whole idea, there still appear > to be certain restrictions you need to obey. In addition to the > question of where Link TRBs are allowed, you also have to worry about > TDs whose size isn't a multiple of the Max Burst Payload (MBP) size = > MaxBurstSize * MaxPacketSize. I'm not certain what MaxBurstSize and MaxPacketSize are, but the important number is the MBP. I'm fairly sure that is the value returned by GET_MAX_PACKET() - 1024 for USB3 bulk endpoints. > According to my version of the spec (Rev 1.0, dated 5/21/2010), if a TD > is larger than the MBP and its length isn't a multiple of the MBP, then > the last MBP boundary in the TD must also be a TRB boundary. This > follows from two statements in section 4.11.7.1: > > If the TD Transfer Size is an even multiple of the MBP then all > TD Fragments shall define exact multiples of MBP data bytes. > If not, then only the last TD Fragment shall define less than > MBP data (or the Residue) bytes. No, that doesn't stop there being only 1 TD fragment. (I think it means exact multiple of the MBP.) > So if a TD is longer than MBP then it must contain at least two TD > fragments, and the last fragment must consist of the last Residue > bytes (i.e., the bytes beyond the last MBP boundary). > > Each TD Fragment is comprised of one or more TRBs. > > Hence the last MBP boundary in the TD, which marks the beginning of the > last TD fragment, must also be a TRB boundary. > > I have no idea whether violating this restriction will cause trouble > for real hardware. Violating the one for LINK TRBs is definitely a problem. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Thu, 7 Nov 2013, Sarah Sharp wrote: > On Thu, Nov 07, 2013 at 05:20:49PM -, David Laight wrote: > > > > Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link TRB > > can only occur at a boundary between underlying USB frames (512 bytes for > > 480M). > > Which version of the spec are you looking at? I'm looking at the > updated version from 08/2012 and I don't see any such requirement. > > I do see that section says "A TD Fragment shall not span Transfer Ring > Segments" where a TD fragment is defined as exact multiples of Max Burst > Size * Max Packet Size bytes for the endpoint. Is that what you mean > about USB frames? ... > The driver would also have to make sure that TD fragments didn't have > link TRBs in them. That's an issue, since the spec decidedly unclear > about what exactly comprises a TD fragment. Is the xHC host greedy, and > will lump all multiples into one TD fragment? Or will it assume the TD > fragment has ended once it consumes one multiple of Max Burst Size * Max > Packet Size bytes? > > This ambiguity means it's basically impossible to figure out whether a > TD with a link TRB in the middle is constructed properly. The xHCI spec > architect didn't have a good solution to this problem, so I punted and > never implemented TD fragments. If this is really an issue, it's going > to be pretty complex to solve. This is something I wanted to ask you about also. The whole idea of TD fragments is exceedingly cloudy. The definition doesn't make sense, and the spec doesn't say what the actual hardware restrictions are, i.e., what is the underlying reality that the TD fragment concept wants to express. Even if you do your best to ignore the whole idea, there still appear to be certain restrictions you need to obey. In addition to the question of where Link TRBs are allowed, you also have to worry about TDs whose size isn't a multiple of the Max Burst Payload (MBP) size = MaxBurstSize * MaxPacketSize. According to my version of the spec (Rev 1.0, dated 5/21/2010), if a TD is larger than the MBP and its length isn't a multiple of the MBP, then the last MBP boundary in the TD must also be a TRB boundary. This follows from two statements in section 4.11.7.1: If the TD Transfer Size is an even multiple of the MBP then all TD Fragments shall define exact multiples of MBP data bytes. If not, then only the last TD Fragment shall define less than MBP data (or the Residue) bytes. So if a TD is longer than MBP then it must contain at least two TD fragments, and the last fragment must consist of the last Residue bytes (i.e., the bytes beyond the last MBP boundary). Each TD Fragment is comprised of one or more TRBs. Hence the last MBP boundary in the TD, which marks the beginning of the last TD fragment, must also be a TRB boundary. I have no idea whether violating this restriction will cause trouble for real hardware. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> While this change improves things a lot (it runs for 20 minutes rather than > a few seconds), there is still something else wrong. Almost certainly caused by an unrelated local change. David -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
> From: Sarah Sharp [mailto:sarah.a.sh...@linux.intel.com] > On Thu, Nov 07, 2013 at 05:20:49PM -, David Laight wrote: > > > > Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link TRB > > can only occur at a boundary between underlying USB frames (512 bytes for > > 480M). > > Which version of the spec are you looking at? I'm looking at the > updated version from 08/2012 and I don't see any such requirement. The copy I downloaded last week is dated 5/21/10, seems the copy on the web site hasn't been updated. > I do see that section says "A TD Fragment shall not span Transfer Ring > Segments" where a TD fragment is defined as exact multiples of Max Burst > Size * Max Packet Size bytes for the endpoint. Is that what you mean > about USB frames? That is the clause, it needs some understanding. Figure 24 shows a correctly aligned link TRB - one where the TRB boundary is aligned with a packet boundary (512 bytes at 480M, 1k at 5G etc). > > If this isn't done the USB frames aren't formatted correctly and, for > > example, > > the USB3 ethernet ax88179_178a card will stop sending (while still > > receiving) > > when running a netperf tcp transmit test with (say) and 8k buffer. > > Which driver does that use? Is it using the scatter gather list > mechanism for URBs (i.e. urb->sg)? Or is it submitting multiple URBs > for fragmented buffers? Or is it submitting isochronous URBs with > multiple frames? Or is it submitting bulk URBs with one transfer buffer > that crosses the 64KB boundary? > > I'm trying to understand what the USB device driver is doing in order to > create multi-TRB TDs that would violate the TD fragment rules. The usb setup is done by net/usbnet.c, the ax88179_178a driver just adds a header to the ethernet frame. Since the hardware support TCP segment offload the urb->sg list is used, the first fragment seems to be 1578 bytes and is followed by two or three fragments (split on an 0x8000 boundary) making up the rest of the 65234 bytes. (I haven't yet seen a buffer that crossed a 64k boundary). I'm running netperf - which is doing a burst of 8k sends into a TCP socket. > > While this change improves things a lot (it runs for 20 minutes rather than > > a few seconds), there is still something else wrong. > > > > This should be a candidate for stable, the ax88179_178a driver defaults to > > gso and tso enabled so it passes a lot of fragmented skb to the USB stack. > > I want to understand what the larger issue is here before pushing > bandaid fixes. > > I don't think this is the right approach, because prepare_ring() can be > called for isochronous URBs that get mapped to multiple TDs. The TD > fragment rules only apply to one TD, so failing URB submission because > many isochronous TDs span more than one ring segment doesn't seem like > the right approach. Inserting no-op TRBs will also result in odd > isochronous behavior, since a no-op TRB means the xHC should not send or > request an isochronous packet for that service interval. I wasn't aware of that effect of NOP TRB on isoc data. Three obvious options: 1) Pass an extra parameter indicating that the caller will handle an TRB data alignment issues at the link TRB. 2) Add a test for bulk endpoints (ok until the isoc code gets modified to support SG data). 3) Copy the LINK TRB instead of adding NOPs (would require the rest of the code use the ring length instead of reading the TRB type). The second is probably the easiest way to isolate the change. However there could be problems on an isoc endpoint if the data crosses a 64k boundary. > The patch also doesn't address the underlying issue of constructing > multi-TRB TDs so that it fits the TD fragment rules. Your patch ensures > there are no link TRBs in the middle of a TD, but it doesn't ensure that > the TRBs in the TD are exact multiples of the Max Burst Size * Max > Packet Size bytes for the endpoint. That doesn't matter, look at figure 25. A TD can be any number of USB packets and any number or TRB. The constraint is that a TD cannot contain a LINK TRB. > Instead, new code in count_sg_trbs_needed() should break the TD into > proper TD fragments. The queueing code for all endpoints would also > have to follow those rules. This would have to happen while still > respecting the 64KB boundary rule. > > The driver would also have to make sure that TD fragments didn't have > link TRBs in them. That's an issue, since the spec decidedly unclear > about what exactly comprises a TD fragment. Is the xHC host greedy, and > will lump all multiples into one TD fragment? Or will it assume the TD > fragment has ended once it consumes one multiple of Max Burst Size * Max > Packet Size bytes? It only matters at a LINK TRB (unless you want to request an interrupt). The constraint seems to be that the data TRB before a link TRB must end on a MBP boundary. > This ambiguity means it's basically impossible to figure out whether a > TD with a link TRB
Re: [PATCH] usb: xhci: Link TRB must not occur with a USB payload burst.
On Thu, Nov 07, 2013 at 05:20:49PM -, David Laight wrote: > > Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link TRB > can only occur at a boundary between underlying USB frames (512 bytes for > 480M). Which version of the spec are you looking at? I'm looking at the updated version from 08/2012 and I don't see any such requirement. I do see that section says "A TD Fragment shall not span Transfer Ring Segments" where a TD fragment is defined as exact multiples of Max Burst Size * Max Packet Size bytes for the endpoint. Is that what you mean about USB frames? > If this isn't done the USB frames aren't formatted correctly and, for example, > the USB3 ethernet ax88179_178a card will stop sending (while still receiving) > when running a netperf tcp transmit test with (say) and 8k buffer. Which driver does that use? Is it using the scatter gather list mechanism for URBs (i.e. urb->sg)? Or is it submitting multiple URBs for fragmented buffers? Or is it submitting isochronous URBs with multiple frames? Or is it submitting bulk URBs with one transfer buffer that crosses the 64KB boundary? I'm trying to understand what the USB device driver is doing in order to create multi-TRB TDs that would violate the TD fragment rules. > While this change improves things a lot (it runs for 20 minutes rather than > a few seconds), there is still something else wrong. > > This should be a candidate for stable, the ax88179_178a driver defaults to > gso and tso enabled so it passes a lot of fragmented skb to the USB stack. I want to understand what the larger issue is here before pushing bandaid fixes. I don't think this is the right approach, because prepare_ring() can be called for isochronous URBs that get mapped to multiple TDs. The TD fragment rules only apply to one TD, so failing URB submission because many isochronous TDs span more than one ring segment doesn't seem like the right approach. Inserting no-op TRBs will also result in odd isochronous behavior, since a no-op TRB means the xHC should not send or request an isochronous packet for that service interval. The patch also doesn't address the underlying issue of constructing multi-TRB TDs so that it fits the TD fragment rules. Your patch ensures there are no link TRBs in the middle of a TD, but it doesn't ensure that the TRBs in the TD are exact multiples of the Max Burst Size * Max Packet Size bytes for the endpoint. Instead, new code in count_sg_trbs_needed() should break the TD into proper TD fragments. The queueing code for all endpoints would also have to follow those rules. This would have to happen while still respecting the 64KB boundary rule. The driver would also have to make sure that TD fragments didn't have link TRBs in them. That's an issue, since the spec decidedly unclear about what exactly comprises a TD fragment. Is the xHC host greedy, and will lump all multiples into one TD fragment? Or will it assume the TD fragment has ended once it consumes one multiple of Max Burst Size * Max Packet Size bytes? This ambiguity means it's basically impossible to figure out whether a TD with a link TRB in the middle is constructed properly. The xHCI spec architect didn't have a good solution to this problem, so I punted and never implemented TD fragments. If this is really an issue, it's going to be pretty complex to solve. > Signed-off-by: David Laight > --- > > Although I've got a USB2 analyser its trigger and trace stuff is almost > unusable! In any case this fails much faster on USB3 (Intel i7 cpu). I thought you were using a big endian system? > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > index 5480215..23abc9b 100644 > --- a/drivers/usb/host/xhci-ring.c > +++ b/drivers/usb/host/xhci-ring.c > @@ -2927,8 +2932,43 @@ static int prepare_ring(struct xhci_hcd *xhci, struct > xhci_ring *ep_ring, > } > > while (1) { > - if (room_on_ring(xhci, ep_ring, num_trbs)) > - break; > + if (room_on_ring(xhci, ep_ring, num_trbs)) { > + union xhci_trb *trb = ep_ring->enqueue; > + unsigned int usable = ep_ring->enq_seg->trbs + > + TRBS_PER_SEGMENT - 1 - trb; > + u32 nop_cmd; > + > + /* > + * Section 4.11.7.1 TD Fragments states that a link > + * TRB must only occur at the boundary between > + * data bursts (eg 512 bytes for 480M). > + * While it is possible to split a large fragment > + * we don't know the size yet. > + * Simplest solution is to fill the trb before the > + * LINK with nop commands. > + */ > + if (num_trbs == 1 || num_trbs <= usable || usable == 0) > + break; > + > + if