Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Prashant Sreedharan writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"): > On Fri, 2015-04-17 at 15:12 -0400, David Miller wrote: > > From: Konrad Rzeszutek Wilk > > Date: Fri, 17 Apr 2015 15:04:48 -0400 > > > A huge amount of NIC drivers use the DMA API, however if > > > compiled under 32-bit an very important part of the DMA API can > > > be ommitted leading to the drivers not working at all > > > (especially if used with 'swiotlb=force iommu=soft'). ... > > > As such enable this even when using 32-bit kernels. > > > > > > Reported-by: Ian Jackson > > > Signed-off-by: Konrad Rzeszutek Wilk Thanks. Tested-by: Ian Jackson I'd appreciate it if this patch could make its way into the 3.14.y stable branch, as well as all the other places it's applicable of course. I've included another copy below for convenience, with acks etc. from this email thread folded in. I have tested 3.14.34 plus just this patch, with my usual kernel configuration input and the affected machine, and this fixes the problem completely AFAICT. I have tested both baremetal 32-bit with `iommu=soft swiotlb=force' (which previously corrupted all received network packets) and 32-bit Linux under 64-bit Xen without any special options (which previously gave 25-30% packet loss). Thanks, Ian. >From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Fri, 17 Apr 2015 14:55:47 -0400 Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected A huge amount of NIC drivers use the DMA API, however if compiled under 32-bit an very important part of the DMA API can be ommitted leading to the drivers not working at all (especially if used with 'swiotlb=force iommu=soft'). As Prashant Sreedharan explains it: "the driver [tg3] uses DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma "mapping" and dma_unmap_addr() to get the "mapping" value. On most of the platforms this is a no-op, but ... with "iommu=soft and swiotlb=force" this house keeping is required, ... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the DMA address." As such enable this even when using 32-bit kernels. Reported-by: Ian Jackson Signed-off-by: Konrad Rzeszutek Wilk Acked-by: David S. Miller Acked-by: Prashant Sreedharan Tested-by: Ian Jackson --- arch/x86/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index b7d31ca..570c71d 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -177,7 +177,7 @@ config SBUS config NEED_DMA_MAP_STATE def_bool y - depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG + depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG || SWIOTLB config NEED_SG_DMA_LENGTH def_bool y -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
On Fri, 2015-04-17 at 15:12 -0400, David Miller wrote: > From: Konrad Rzeszutek Wilk > Date: Fri, 17 Apr 2015 15:04:48 -0400 > > > From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001 > > From: Konrad Rzeszutek Wilk > > Date: Fri, 17 Apr 2015 14:55:47 -0400 > > Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected > > > > A huge amount of NIC drivers use the DMA API, however if compiled > > under 32-bit an very important part of the DMA API can be ommitted leading > > to the drivers not working at all (especially if used with > > 'swiotlb=force iommu=soft'). > > > > As Prashant Sreedharan explains it: "the driver [tg3] uses > > DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma > > "mapping" and dma_unmap_addr() to get the "mapping" value. On most of > > the platforms this is a no-op, but ... with "iommu=soft and > > swiotlb=force" this house keeping is required, ... otherwise > > we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the > > DMA address." > > > > As such enable this even when using 32-bit kernels. > > > > Reported-by: Ian Jackson > > Signed-off-by: Konrad Rzeszutek Wilk > > Acked-by: David S. Miller Acked-by: Prashant Sreedharan -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
From: Konrad Rzeszutek Wilk Date: Fri, 17 Apr 2015 15:04:48 -0400 > From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001 > From: Konrad Rzeszutek Wilk > Date: Fri, 17 Apr 2015 14:55:47 -0400 > Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected > > A huge amount of NIC drivers use the DMA API, however if compiled > under 32-bit an very important part of the DMA API can be ommitted leading > to the drivers not working at all (especially if used with > 'swiotlb=force iommu=soft'). > > As Prashant Sreedharan explains it: "the driver [tg3] uses > DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma > "mapping" and dma_unmap_addr() to get the "mapping" value. On most of > the platforms this is a no-op, but ... with "iommu=soft and > swiotlb=force" this house keeping is required, ... otherwise > we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the > DMA address." > > As such enable this even when using 32-bit kernels. > > Reported-by: Ian Jackson > Signed-off-by: Konrad Rzeszutek Wilk Acked-by: David S. Miller -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
On Fri, Apr 17, 2015 at 10:46:09AM -0700, Michael Chan wrote: > On Fri, 2015-04-17 at 13:19 -0400, David Miller wrote: > > So the gist of the situation is, that NEED_DMA_MAP_STATE can be 'n' in > > situations where we might actually need it to be 'y' based upon kernel > > comman line boot options given. > > > > Right? > > Yes. Would this work ? Peter, Ingo, Thomas, pls see Prashant's thread: http://www.spinics.net/lists/netdev/msg325645.html http://www.spinics.net/lists/netdev/msg325774.html Thank you. >From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Fri, 17 Apr 2015 14:55:47 -0400 Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected A huge amount of NIC drivers use the DMA API, however if compiled under 32-bit an very important part of the DMA API can be ommitted leading to the drivers not working at all (especially if used with 'swiotlb=force iommu=soft'). As Prashant Sreedharan explains it: "the driver [tg3] uses DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma "mapping" and dma_unmap_addr() to get the "mapping" value. On most of the platforms this is a no-op, but ... with "iommu=soft and swiotlb=force" this house keeping is required, ... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the DMA address." As such enable this even when using 32-bit kernels. Reported-by: Ian Jackson Signed-off-by: Konrad Rzeszutek Wilk --- arch/x86/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index b7d31ca..570c71d 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -177,7 +177,7 @@ config SBUS config NEED_DMA_MAP_STATE def_bool y - depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG + depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG || SWIOTLB config NEED_SG_DMA_LENGTH def_bool y -- 2.1.0 > > > > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
On Fri, 2015-04-17 at 13:19 -0400, David Miller wrote: > So the gist of the situation is, that NEED_DMA_MAP_STATE can be 'n' in > situations where we might actually need it to be 'y' based upon kernel > comman line boot options given. > > Right? Yes. > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
From: Ian Jackson Date: Fri, 17 Apr 2015 17:29:28 +0100 > Prashant Sreedharan writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and > 3 more messages]"): >> Ok this is what is causing the problem, the driver uses >> DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma >> "mapping" and dma_unmap_addr() to get the "mapping" value. On most of >> the platforms this is a no-op, but it appears with "iommu=soft and >> swiotlb=force" this house keeping is required, when I pass the correct >> dma_addr instead of 0 while calling pci_unmap_/pci_dma_sync_ I don't see >> the corruption. ie If you set CONFIG_NEED_DMA_MAP_STATE=y in your kernel >> config you should not see the problem. Can you confirm ? Thanks > > That kernel config option is an automatically computed one: > > config NEED_DMA_MAP_STATE > def_bool y > depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG > > and grepping my .config shows: > > # CONFIG_INTEL_IOMMU is not set > # CONFIG_DMA_API_DEBUG is not set > > It's a 32-bit kernel so it hasn't got X86_64 enabled either. > > Arguably at least some of osstest's kernels should have INTEL_IOMMU > enabled to detect conflicts between Xen's use of the iommu and > possible attempts bo Linux to do the same thing, but not having it > enabled should not cause a driver bug. So the gist of the situation is, that NEED_DMA_MAP_STATE can be 'n' in situations where we might actually need it to be 'y' based upon kernel comman line boot options given. Right? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Prashant Sreedharan writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"): > Ok this is what is causing the problem, the driver uses > DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma > "mapping" and dma_unmap_addr() to get the "mapping" value. On most of > the platforms this is a no-op, but it appears with "iommu=soft and > swiotlb=force" this house keeping is required, when I pass the correct > dma_addr instead of 0 while calling pci_unmap_/pci_dma_sync_ I don't see > the corruption. ie If you set CONFIG_NEED_DMA_MAP_STATE=y in your kernel > config you should not see the problem. Can you confirm ? Thanks That kernel config option is an automatically computed one: config NEED_DMA_MAP_STATE def_bool y depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG and grepping my .config shows: # CONFIG_INTEL_IOMMU is not set # CONFIG_DMA_API_DEBUG is not set It's a 32-bit kernel so it hasn't got X86_64 enabled either. Arguably at least some of osstest's kernels should have INTEL_IOMMU enabled to detect conflicts between Xen's use of the iommu and possible attempts bo Linux to do the same thing, but not having it enabled should not cause a driver bug. Ian. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
On Thu, 2015-04-16 at 18:15 +0100, Ian Jackson wrote: > Michael Chan writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more > messages]"): > > On Thu, 2015-04-16 at 09:24 -0300, casca...@linux.vnet.ibm.com wrote: > > > Yes, this looks like the driver is not syncing the DMA buffers. Unmap is > > > supposed to synchronize as well. > > > > For small rx packets (< 256 bytes), we sync the DMA buffer before we > > copy the data to another SKB. For larger packets, we unmap the DMA > > buffer. Do we see the corruption in both cases? > > Yes, at least with swiotlb=force iommu=soft. Ok this is what is causing the problem, the driver uses DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma "mapping" and dma_unmap_addr() to get the "mapping" value. On most of the platforms this is a no-op, but it appears with "iommu=soft and swiotlb=force" this house keeping is required, when I pass the correct dma_addr instead of 0 while calling pci_unmap_/pci_dma_sync_ I don't see the corruption. ie If you set CONFIG_NEED_DMA_MAP_STATE=y in your kernel config you should not see the problem. Can you confirm ? Thanks -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
From: Michael Chan Date: Thu, 16 Apr 2015 09:39:13 -0700 > On Thu, 2015-04-16 at 09:24 -0300, casca...@linux.vnet.ibm.com wrote: >> Yes, this looks like the driver is not syncing the DMA buffers. Unmap is >> supposed to synchronize as well. >> > > For small rx packets (< 256 bytes), we sync the DMA buffer before we > copy the data to another SKB. For larger packets, we unmap the DMA > buffer. Do we see the corruption in both cases? I wonder about that prefetch which is done before the DMA sync. Also we should think about whether that DMA sync applies to the proper region. The 'len' is relative to "data+TG3_RX_OFFSET" yet if I read the code correctly we are effectively sync'ing from 'data' to 'data+len'. There is some bug hiding in here I think... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Michael Chan writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"): > On Thu, 2015-04-16 at 09:24 -0300, casca...@linux.vnet.ibm.com wrote: > > Yes, this looks like the driver is not syncing the DMA buffers. Unmap is > > supposed to synchronize as well. > > For small rx packets (< 256 bytes), we sync the DMA buffer before we > copy the data to another SKB. For larger packets, we unmap the DMA > buffer. Do we see the corruption in both cases? Yes, at least with swiotlb=force iommu=soft. Ian. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
On Thu, 2015-04-16 at 09:24 -0300, casca...@linux.vnet.ibm.com wrote: > Yes, this looks like the driver is not syncing the DMA buffers. Unmap is > supposed to synchronize as well. > For small rx packets (< 256 bytes), we sync the DMA buffer before we copy the data to another SKB. For larger packets, we unmap the DMA buffer. Do we see the corruption in both cases? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
On Thu, Apr 16, 2015 at 11:18:39AM +0100, Ian Jackson wrote: > Prashant writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more > messages]"): > > Ian, using your config we are able to recreate the problem that you are > > seeing. The driver finds the RX data buffer to be all zero, with a > > analyzer trace we are seeing the chip is DMA'ing valid RX data buffer > > contents to the host but once the driver tries to read this DMA area, it > > is seeing all zero's which is the reason of the corruption. This is only > > for the RX data buffer, the RX descriptor and status block update DMA > > regions are having valid contents. > > I am no expert on this area, but this suggests that the driver is > misoperating the Linux DMA management API. This is what I think > Konrad suspected when he suggested the `iommu=soft swiotlb=force' > command line option. > > Note in kernel-parameters.txt: > > swiotlb=[ARM,IA-64,PPC,MIPS,X86] > Format: { | force } > -- Number of I/O TLB slabs > force -- force using of bounce buffers even if they > wouldn't be automatically used by the kernel > > So with `swiotlb=force' the DMA is _expected_ to go to a bounce buffer > managed by the kernel DMA API. > > > This is unlikely to be a chip or driver issue, as the chip is doing the > > correct DMA but the corruption occurs before driver reads it. Would > > request iommu experts to take a look and suggest what can be done next. > > As I say above I think this is probably a driver bug. > Yes, this looks like the driver is not syncing the DMA buffers. Unmap is supposed to synchronize as well. Prashant, can you point to where in the code you see all zeroes after checking up the data? Cascardo. > I have seen identical symptoms on a >5yo desktop box under my desk and > on two brand new rackmount servers; I therefore doubt that it's a > hardware problem. > > Ian. > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Prashant writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"): > Ian, using your config we are able to recreate the problem that you are > seeing. The driver finds the RX data buffer to be all zero, with a > analyzer trace we are seeing the chip is DMA'ing valid RX data buffer > contents to the host but once the driver tries to read this DMA area, it > is seeing all zero's which is the reason of the corruption. This is only > for the RX data buffer, the RX descriptor and status block update DMA > regions are having valid contents. I am no expert on this area, but this suggests that the driver is misoperating the Linux DMA management API. This is what I think Konrad suspected when he suggested the `iommu=soft swiotlb=force' command line option. Note in kernel-parameters.txt: swiotlb=[ARM,IA-64,PPC,MIPS,X86] Format: { | force } -- Number of I/O TLB slabs force -- force using of bounce buffers even if they wouldn't be automatically used by the kernel So with `swiotlb=force' the DMA is _expected_ to go to a bounce buffer managed by the kernel DMA API. > This is unlikely to be a chip or driver issue, as the chip is doing the > correct DMA but the corruption occurs before driver reads it. Would > request iommu experts to take a look and suggest what can be done next. As I say above I think this is probably a driver bug. I have seen identical symptoms on a >5yo desktop box under my desk and on two brand new rackmount servers; I therefore doubt that it's a hardware problem. Ian. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
On 4/15/2015 3:54 AM, Ian Jackson wrote: Prashant writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"): I tried to reproduce the problem on 32 bit 3.14.34 stable kernel baremetal, with iommu=soft swiotlb=force but no luck, no drops or errors. I did not try with Xen 64 bit yet. Btw I need a pcie analyzer trace to confirm the problem. Is it feasible to capture at your end ? In private correspondence with Prashant we have established that Prashant was using a different kernel configuration. Prashant provided me with their kernel and module binaries, which work in my environment. I have also established that I can reproduce the problem with 3.14.37 (`iommu=soft swiotlb=force' baremetal => all rx wholly corrupted). The kernel config I was using is here: http://logs.test-lab.xenproject.org/osstest/logs/50387/build-i386-pvops/build/config As far as I am aware no-one at Broadcom has attempted a build and test using my kernel config. Ian, using your config we are able to recreate the problem that you are seeing. The driver finds the RX data buffer to be all zero, with a analyzer trace we are seeing the chip is DMA'ing valid RX data buffer contents to the host but once the driver tries to read this DMA area, it is seeing all zero's which is the reason of the corruption. This is only for the RX data buffer, the RX descriptor and status block update DMA regions are having valid contents. This is unlikely to be a chip or driver issue, as the chip is doing the correct DMA but the corruption occurs before driver reads it. Would request iommu experts to take a look and suggest what can be done next. We are more than willing to try any changes in the driver, I have added few more team members who will work with you if needed. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Prashant writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"): > I tried to reproduce the problem on 32 bit 3.14.34 stable kernel > baremetal, with iommu=soft swiotlb=force but no luck, no drops or > errors. I did not try with Xen 64 bit yet. Btw I need a pcie analyzer > trace to confirm the problem. Is it feasible to capture at your end ? In private correspondence with Prashant we have established that Prashant was using a different kernel configuration. Prashant provided me with their kernel and module binaries, which work in my environment. I have also established that I can reproduce the problem with 3.14.37 (`iommu=soft swiotlb=force' baremetal => all rx wholly corrupted). The kernel config I was using is here: http://logs.test-lab.xenproject.org/osstest/logs/50387/build-i386-pvops/build/config As far as I am aware no-one at Broadcom has attempted a build and test using my kernel config. Thanks, Ian. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html