RE: Bug#733826: crazy loop xhci_hcd Too many fragments
> From: Alan Stern
> Subject: Re: Bug#733826: crazy loop xhci_hcd Too many fragments
>
> On Mon, 6 Jan 2014, Ben Hutchings wrote:
>
> > On Sat, 2014-01-04 at 05:44 +0800, jida...@jidanni.org wrote:
> ...
> > > # cat /var/log/syslog
> > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listen normally on 5 lo ::1 UDP 123
> > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listen normally on 6 eth0 fe80::2289:84ff:fe28:ad9 UDP 123
> > > Jan 1 06:57:38 jidanni5 ntpd[2822]: peers refreshed
> > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listening on routing socket on fd #23 for interface updates
> > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624680] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624695] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624704] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > 10 lines later... oops I mean an actual MILLION lines later
> >
> > Assuming my fix for the repetition is correct, the remaining problem is
> > why usb-storage is generating such large/fragmented urbs.
>
> usb-storage doesn't generate large or fragmented anything. It merely
> passes on the scatter-gather information it gets from the block layer.

Although not a real fix to the underlying problem, it seems that the
default ring size is far too small.
Any amount of network traffic also activates the ring expansion code.

IIRC each ring entry is 16 bytes, so increasing the ring size to 256
still keeps the rings to a single 4k page.
Whether anything regularly exceeds 255 fragments is another matter.

David

--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug#733826: crazy loop xhci_hcd Too many fragments
On Mon, Jan 06, 2014 at 03:52:24PM +, David Laight wrote:
> > From: Alan Stern
> > Subject: Re: Bug#733826: crazy loop xhci_hcd Too many fragments
> >
> > On Mon, 6 Jan 2014, Ben Hutchings wrote:
> >
> > > On Sat, 2014-01-04 at 05:44 +0800, jida...@jidanni.org wrote:
> > ...
> > > > # cat /var/log/syslog
> > > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listen normally on 5 lo ::1 UDP 123
> > > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listen normally on 6 eth0 fe80::2289:84ff:fe28:ad9 UDP 123
> > > > Jan 1 06:57:38 jidanni5 ntpd[2822]: peers refreshed
> > > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listening on routing socket on fd #23 for interface updates
> > > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624680] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624695] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624704] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > > 10 lines later... oops I mean an actual MILLION lines later
> > >
> > > Assuming my fix for the repetition is correct, the remaining problem is
> > > why usb-storage is generating such large/fragmented urbs.
> >
> > usb-storage doesn't generate large or fragmented anything. It merely
> > passes on the scatter-gather information it gets from the block layer.
>
> Although not a real fix to the underlying problem, it seems that the
> default ring size is far too small.

Did you mean ring segment size?

> Any amount of network traffic also activates the ring expansion code.
>
> IIRC each ring entry is 16 bytes, so increasing the ring size to 256
> still keeps the rings to a single 4k page.
> Whether anything regularly exceeds 255 fragments is another matter.

If so, yes, changing the segment size makes sense. TRBS_PER_SEGMENT
could be increased to 256.

I'm not sure if we should switch to using dma_alloc_coherent instead of
a DMA pool. Some systems could be using bigger than 4K pages, so we
should probably still stick with DMA pools.

Ben, can you change your patch to increase TRBS_PER_SEGMENT to 256?

Sarah Sharp
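
The segment-size arithmetic in the two messages above can be checked with a few lines of C. This is only a sketch, not kernel code: `struct trb`, `segment_bytes()` and `usable_trbs()` are hypothetical names, the 16-byte entry size is the figure David quotes ("IIRC"), and the assumption that one entry per segment is reserved for a Link TRB is inferred from the "max 63" in the log and the "255 fragments" remark.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one ring entry (TRB): four 32-bit words,
 * i.e. the 16 bytes mentioned in the thread. */
struct trb {
	uint32_t field[4];
};

/* Compile-time check that the stand-in really is 16 bytes. */
_Static_assert(sizeof(struct trb) == 16, "TRB stand-in must be 16 bytes");

/* Memory occupied by one ring segment holding trbs_per_segment entries. */
static size_t segment_bytes(unsigned int trbs_per_segment)
{
	return (size_t)trbs_per_segment * sizeof(struct trb);
}

/* Entries left for data, ASSUMING one entry per segment is reserved
 * for the Link TRB that chains to the next segment. */
static unsigned int usable_trbs(unsigned int trbs_per_segment)
{
	return trbs_per_segment - 1;
}
```

Under these assumptions a 64-entry segment leaves 63 usable entries, matching the limit in the log, and a 256-entry segment occupies exactly 4096 bytes with 255 usable entries, so a 79-fragment transfer would fit in one segment.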
Re: Bug#733826: crazy loop xhci_hcd Too many fragments
On Mon, Jan 06, 2014 at 10:06:33AM -0500, Alan Stern wrote:
> On Mon, 6 Jan 2014, Ben Hutchings wrote:
>
> > On Sat, 2014-01-04 at 05:44 +0800, jida...@jidanni.org wrote:
> > > BH == Ben Hutchings b...@decadent.org.uk writes:
> > >
> > > BH> And what were those error messages?
> > >
> > > BH> Which USB devices are you using (this is probably disk or network
> > > BH> related)?
> > >
> > > I had done an aptitude update on writing onto
> > > # fdisk -l
> > > Disk /dev/sdg: 3867 MB, 3867148288 bytes
> >
> > OK, the important thing is it's usb-storage.
> >
> > > 181 heads, 32 sectors/track, 1304 cylinders, total 7553024 sectors
> > > Units = sectors of 1 * 512 = 512 bytes
> > > Sector size (logical/physical): 512 bytes / 512 bytes
> > > I/O size (minimum/optimal): 512 bytes / 512 bytes
> > > Disk identifier: 0xc3072e18
> > > # mount
> > > Device Boot Start End Blocks Id System
> > > /dev/sdg1 32 868799 434384 83 Linux
> > > /dev/sdg2 868800 7553023 3342112 83 Linux
> > > /dev/sdg2 on /var/cache/apt/archives type ext3 (rw,noatime,errors=remount-ro,data=ordered)
> > > /dev/sdg1 on /var/lib/apt/lists type ext3 (rw,noatime,errors=remount-ro,data=ordered)
> > > # cat /var/log/syslog
> > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listen normally on 5 lo ::1 UDP 123
> > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listen normally on 6 eth0 fe80::2289:84ff:fe28:ad9 UDP 123
> > > Jan 1 06:57:38 jidanni5 ntpd[2822]: peers refreshed
> > > Jan 1 06:57:38 jidanni5 ntpd[2822]: Listening on routing socket on fd #23 for interface updates
> > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624680] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624695] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > Jan 1 07:04:49 jidanni5 kernel: [ 559.624704] xhci_hcd :00:14.0: Too many fragments 79, max 63
> > > 10 lines later... oops I mean an actual MILLION lines later
> >
> > Assuming my fix for the repetition is correct, the remaining problem is
> > why usb-storage is generating such large/fragmented urbs.
>
> usb-storage doesn't generate large or fragmented anything. It merely
> passes on the scatter-gather information it gets from the block layer.

And the block layer depends on drivers to tell it what their
scatter-gather capabilities are. The answer appears to be that xhci is
lying:

int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
{
[...]
	/* Accept arbitrarily long scatter-gather lists */
	hcd->self.sg_tablesize = ~0;

and this value gets copied up the stack to the block layer.

Ben.

--
Ben Hutchings
The two most common things in the universe are hydrogen and stupidity.
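
To make the effect of that assignment concrete, here is a minimal sketch, assuming `sg_tablesize` is an `unsigned int` (so that `~0` becomes the largest representable value). `struct fake_bus`, `sg_list_accepted()` and `fits_in_segment()` are hypothetical names for illustration only; they are not xhci, USB core or block-layer APIs.

```c
#include <limits.h>

/* Hypothetical stand-in for the structure carrying the advertised
 * scatter-gather limit; the field type is ASSUMED to be unsigned int. */
struct fake_bus {
	unsigned int sg_tablesize;
};

/* Would a layer that trusts the advertised limit accept a list with
 * nents entries? Returns 1 if yes, 0 if no. */
static int sg_list_accepted(const struct fake_bus *bus, unsigned int nents)
{
	return nents <= bus->sg_tablesize;
}

/* Does the same list fit within the per-segment maximum that the
 * host controller driver reports in the log ("max 63")? */
static int fits_in_segment(unsigned int nents, unsigned int max_per_segment)
{
	return nents <= max_per_segment;
}
```

With `sg_tablesize` set to `~0` the field holds `UINT_MAX`, so `sg_list_accepted()` returns 1 for the 79-entry list from the log, while `fits_in_segment(79, 63)` returns 0. That mismatch between the advertised limit and the real one is the "lying" described above.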
Re: Bug#733826: crazy loop xhci_hcd Too many fragments
On Sat, 2014-01-04 at 05:44 +0800, jida...@jidanni.org wrote:
> BH == Ben Hutchings b...@decadent.org.uk writes:
>
> BH> And what were those error messages?
>
> BH> Which USB devices are you using (this is probably disk or network
> BH> related)?
>
> I had done an aptitude update on writing onto
> # fdisk -l
> Disk /dev/sdg: 3867 MB, 3867148288 bytes

OK, the important thing is it's usb-storage.

> 181 heads, 32 sectors/track, 1304 cylinders, total 7553024 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0xc3072e18
> # mount
> Device Boot Start End Blocks Id System
> /dev/sdg1 32 868799 434384 83 Linux
> /dev/sdg2 868800 7553023 3342112 83 Linux
> /dev/sdg2 on /var/cache/apt/archives type ext3 (rw,noatime,errors=remount-ro,data=ordered)
> /dev/sdg1 on /var/lib/apt/lists type ext3 (rw,noatime,errors=remount-ro,data=ordered)
> # cat /var/log/syslog
> Jan 1 06:57:38 jidanni5 ntpd[2822]: Listen normally on 5 lo ::1 UDP 123
> Jan 1 06:57:38 jidanni5 ntpd[2822]: Listen normally on 6 eth0 fe80::2289:84ff:fe28:ad9 UDP 123
> Jan 1 06:57:38 jidanni5 ntpd[2822]: peers refreshed
> Jan 1 06:57:38 jidanni5 ntpd[2822]: Listening on routing socket on fd #23 for interface updates
> Jan 1 07:04:49 jidanni5 kernel: [ 559.624680] xhci_hcd :00:14.0: Too many fragments 79, max 63
> Jan 1 07:04:49 jidanni5 kernel: [ 559.624695] xhci_hcd :00:14.0: Too many fragments 79, max 63
> Jan 1 07:04:49 jidanni5 kernel: [ 559.624704] xhci_hcd :00:14.0: Too many fragments 79, max 63
> 10 lines later... oops I mean an actual MILLION lines later

Assuming my fix for the repetition is correct, the remaining problem is
why usb-storage is generating such large/fragmented urbs. (And how did
this work before the recent changes to Link TRBs? Or did it result in a
different failure mode?)

[...]
> Jan 1 07:04:58 jidanni5 kernel: [ 568.615784] usb 1-4.3: USB disconnect, device number 5
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622573] sd 7:0:0:0: [sdg] Unhandled error code
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622577] sd 7:0:0:0: [sdg]
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622579] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622581] sd 7:0:0:0: [sdg] CDB:
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622583] Write(10): 2a 00 00 06 85 0e 00 00 da 00

I think this is a write of 218 sectors, presumably 512 bytes each.

> Jan 1 07:04:58 jidanni5 kernel: [ 568.622591] end_request: I/O error, dev sdg, sector 427278
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622595] Buffer I/O error on device sdg1, logical block 213623
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622596] lost page write due to I/O error on sdg1
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622673] Aborting journal on device sdg1-8.
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622702] JBD2: Error -5 detected when updating journal superblock for sdg1-8.
> Jan 1 07:04:58 jidanni5 kernel: [ 568.622782] journal commit I/O error
[...]

Ben.

--
Ben Hutchings
Any smoothly functioning technology is indistinguishable from a rigged demo.
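
The reading of the CDB as a 218-sector write can be reproduced by decoding the bytes by hand. The sketch below assumes the standard SCSI WRITE(10) layout (opcode 0x2a, big-endian logical block address in bytes 2 to 5, big-endian transfer length in bytes 7 to 8); `struct write10`, `decode_write10()` and `logged_cdb` are hypothetical names, not kernel interfaces.

```c
#include <stdint.h>

/* Decoded fields of a WRITE(10) command descriptor block. */
struct write10 {
	uint32_t lba;     /* first logical block, from CDB bytes 2..5 */
	uint16_t blocks;  /* number of blocks, from CDB bytes 7..8 */
};

/* The ten CDB bytes copied from the log above. */
static const uint8_t logged_cdb[10] = {
	0x2a, 0x00, 0x00, 0x06, 0x85, 0x0e, 0x00, 0x00, 0xda, 0x00
};

/* Decode a WRITE(10) CDB. Returns 0 on success, or -1 if the opcode
 * in byte 0 is not 0x2a. Both multi-byte fields are big-endian. */
static int decode_write10(const uint8_t cdb[10], struct write10 *out)
{
	if (cdb[0] != 0x2a)
		return -1;

	out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
		   ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	out->blocks = (uint16_t)(((unsigned int)cdb[7] << 8) | cdb[8]);
	return 0;
}
```

Decoding `logged_cdb` gives a starting block of 0x0006850e = 427278, which is the sector named in the `end_request` line, and a length of 0x00da = 218 blocks. At 512 bytes per sector that is 111616 bytes.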
Re: Bug#733826: crazy loop xhci_hcd Too many fragments
BH == Ben Hutchings b...@decadent.org.uk writes:

BH> Assuming my fix for the repetition is correct, the remaining problem is
BH> why usb-storage is generating such large/fragmented urbs. (And how did
BH> this work before the recent changes to Link TRBs? Or did it result in a
BH> different failure mode?)

Well all I know is now in my
cat /etc/ppp/ip-up.d/z50apt-get
I added syncs,
sync; apt-get update; sync
and haven't had the problem again, and now using
$ uname -a
Linux jidanni3 3.12-1-686-pae #1 SMP Debian 3.12.6-2 (2013-12-29) i686 GNU/Linux