Re: Increasing TRBS_PER_SEGMENT causes DMAR/PTE faults
On Fri, Aug 07, 2015 at 05:39:55PM +0300, Mathias Nyman wrote:
>
> On 07.08.2015 15:40, linux-...@andraxin.se wrote:
> > Hi there,
>
> Hi

Hi again (a few months later, when I finally found some time)

> > Moving from 4.0.4 to 4.0.5, my USB 3.0 controller became practically useless
> > (and this same issue persists [at least] upto and including 4.1.3).
> > Here's an excerpt from my syslog demonstrating the problem:
> >
> ...
> >
> > Looking in ChangeLog-4.0.5 for xhci-related changes gave me 3 hits:
> >
> > (1) xhci: gracefully handle xhci_irq dead device
> > commit 948fa13504f80b9765d2b753691ab94c83a10341 upstream.
> >
> > (2) xhci: Solve full event ring by increasing TRBS_PER_SEGMENT to 256
> > commit 18cc2f4cbbaf825a4fedcf2d60fd388d291e0a38 upstream.
> >
> > (3) xhci: fix isoc endpoint dequeue from advancing too far on transaction
> > error
> > commit d104d0152a97fade389f47635b73a9ccc7295d0b upstream.
> >
> > The first one seemed fairly irrelevant. The third one mentions DMA, but
> > looking at the actual changes, I couldn't see an obvious connection to the
> > 'PTE read access' issue. So, I tested reverting the second patch in my
> > current (4.1.3) kernel and my USB 3.0 controller started working again.
> >
> > This is good enough for me, but I thought you might want to look more
> > closely
> > at the root cause, and ensure that the PTE is correctly setup even with the
> > increased TRBS_PER_SEGMENT.
> >
>
> We just found that the TRBS_PER_SEGMENT reveals an old off by one error in an
> upper boundary check.
> Previously a ring segment didn't use a full memory page, and each ring
> segment was allocated a new page, so the +1 off by one never caused any harm.
>
> Now that we use the full memory page the off by one actually allowed us going
> past the allocated page.
>
> This is fixed in patch
> commit 7895086afde2a05fa24a0e410d8e6b75ca7c8fdd
> xhci: fix off by one error in TRB DMA address boundary check
>
> Fix is now in Greg's usb-linus branch, and should end up in linus 4.2 kernel
> (rc6 earliest)
>
> Does that fix help in your case?

No, actually, it did not. At first I thought it did, but that was likely
because my patch reverting the TRBS_PER_SEGMENT adjustment was
(accidentally) still active when I did the testing.

I did more thorough tests with patch-free 4.1.3 and 4.1.4 yesterday, and
the results are as follows: fixing the off-by-one error alone did not help
for 4.1.3/4.1.4; I still had to revert the TRBS_PER_SEGMENT patch as well
to get a working system.

In fact, the same problem persists even in my current kernel (4.3.5),
except that there it is enough to revert the TRBS_PER_SEGMENT patch, since
the fix for the off-by-one error has long been incorporated into mainline.

I'm still interested in tracking this down, although the issue may be
quite specific to the combination of older HW in my stationary box
(GA-EQ45M-S2 motherboard with LogiLink PC0057 PCI-e 4-port USB3 card),
since I never had any problems running the exact same kernel on a variety
of other (more modern, mostly laptop) HW.

Regards,
--@;

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Increasing TRBS_PER_SEGMENT causes DMAR/PTE faults
Hi there,

Moving from 4.0.4 to 4.0.5, my USB 3.0 controller became practically useless
(and this same issue persists [at least] up to and including 4.1.3).
Here's an excerpt from my syslog demonstrating the problem:

-=-=-
Jun 24 23:15:22 eleven kernel: usb 2-4: new SuperSpeed USB device number 2 using xhci_hcd
Jun 24 23:15:22 eleven kernel: usb 2-4: New USB device found, idVendor=0781, idProduct=5588
Jun 24 23:15:22 eleven kernel: usb 2-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Jun 24 23:15:22 eleven kernel: usb 2-4: Product: ExtremePro
Jun 24 23:15:22 eleven kernel: usb 2-4: Manufacturer: SanDisk
Jun 24 23:15:22 eleven kernel: usb 2-4: SerialNumber: AA010602150045080168
Jun 24 23:15:22 eleven kernel: usb-storage 2-4:1.0: USB Mass Storage device detected
Jun 24 23:15:22 eleven kernel: scsi host8: usb-storage 2-4:1.0
Jun 24 23:15:21 eleven mtp-probe[19425]: checking bus 2, device 2: /sys/devices/pci:00/:00:1c.4/:03:00.0/usb2/2-4
Jun 24 23:15:21 eleven mtp-probe[19425]: bus: 2, device: 2 was not an MTP device
Jun 24 23:15:23 eleven kernel: scsi 8:0:0:0: Direct-Access SanDisk ExtremePro 0001 PQ: 0 ANSI: 6
Jun 24 23:15:23 eleven kernel: sd 8:0:0:0: [sdi] 250069680 512-byte logical blocks: (128 GB/119 GiB)
Jun 24 23:15:23 eleven kernel: sd 8:0:0:0: [sdi] Write Protect is off
Jun 24 23:15:23 eleven kernel: sd 8:0:0:0: [sdi] Mode Sense: 53 00 00 08
Jun 24 23:15:23 eleven kernel: sd 8:0:0:0: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Jun 24 23:15:23 eleven kernel: sdi: sdi1 sdi2 sdi3 sdi4
Jun 24 23:15:23 eleven kernel: sd 8:0:0:0: [sdi] Attached SCSI removable disk
Jun 24 23:15:23 eleven kernel: dmar: DRHD: handling fault status reg 2
Jun 24 23:15:23 eleven kernel: dmar: DMAR:[DMA Read] Request device [03:00.0] fault addr fffbd000 \x0aDMAR:[fault reason 06] PTE Read access is not set
Jun 24 23:15:23 eleven kernel: dmar: DRHD: handling fault status reg 2
Jun 24 23:15:23 eleven kernel: dmar: DMAR:[DMA Read] Request device [03:00.0] fault addr fffbd000 \x0aDMAR:[fault reason 06] PTE Read access is not set
Jun 24 23:15:23 eleven kernel: xhci_hcd :03:00.0: WARNING: Host System Error
Jun 24 23:16:13 eleven kernel: xhci_hcd :03:00.0: Stopped the command ring failed, maybe the host is dead
Jun 24 23:16:15 eleven kernel: xhci_hcd :03:00.0: Abort command ring failed
Jun 24 23:16:15 eleven kernel: xhci_hcd :03:00.0: HC died; cleaning up
Jun 24 23:16:15 eleven kernel: xhci_hcd :03:00.0: Timeout while waiting for setup device command
Jun 24 23:16:15 eleven kernel: usb 1-1: USB disconnect, device number 2
Jun 24 23:16:15 eleven kernel: usb 2-4: USB disconnect, device number 0
-=-=-

This is a VIA Technologies, Inc. VL80x xHCI USB 3.0 Controller inserted into
an older Gigabyte board with Intel Q35 chipset, VT-d enabled...

Looking in ChangeLog-4.0.5 for xhci-related changes gave me 3 hits:

(1) xhci: gracefully handle xhci_irq dead device
    commit 948fa13504f80b9765d2b753691ab94c83a10341 upstream.

(2) xhci: Solve full event ring by increasing TRBS_PER_SEGMENT to 256
    commit 18cc2f4cbbaf825a4fedcf2d60fd388d291e0a38 upstream.

(3) xhci: fix isoc endpoint dequeue from advancing too far on transaction error
    commit d104d0152a97fade389f47635b73a9ccc7295d0b upstream.

The first one seemed fairly irrelevant. The third one mentions DMA, but
looking at the actual changes, I couldn't see an obvious connection to the
'PTE read access' issue. So, I tested reverting the second patch in my
current (4.1.3) kernel and my USB 3.0 controller started working again.

This is good enough for me, but I thought you might want to look more closely
at the root cause, and ensure that the PTE is correctly set up even with the
increased TRBS_PER_SEGMENT.

Kind regards,
--@;

PS I'd _love_ to help, but I have very little screen time. I'm lucky if I can
turn on this box once a week... (That's also why it took me almost 2 months to
figure out what the problem was.)
DS