Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
The only problem host found so far is a newer Asus 970FX motherboard, regardless of OS. We are seeing that some time after LTSSM finishes, but long before OS load, our PEX_DEVICE_CONTROL register is changed.

On the working motherboards, NO_SNOOP is enabled; if I read the spec correctly, this means that TLPs with no-snoop are permitted. On the non-working motherboards, NO_SNOOP is disabled; this is supposed to mean that TLPs with no-snoop are not permitted (again, if I understand correctly).

Is it possible that there has been another misinterpretation of this bit? Something regarding the generation of snoops on CSB when it is cleared? This bit could be a complete red herring; it was one of the few differences in the config space, however, so it was my best guess.

- Ben

On Sun, Mar 8, 2009 at 9:56 PM, Liu Dave-R63238 dave...@freescale.com wrote:

> Hi Ben,
>
> The second issue: you told me some hosts have the problem and some hosts work well. Which are the problem hosts? The issue seems to be that the hosts did set the NO SNOOP attribute bit in the TLP.
>
> PEX_DEVICE_CONTROL is a standard PCI configuration space register; it controls the behavior of the initiator's transactions. For the 8315, that means outbound, not inbound, transactions.
>
> Thanks,
> Dave
>
> From: Ben Menchaca [mailto:ben.mench...@gmail.com]
> Sent: Saturday, March 07, 2009 12:30 AM
> To: Liu Dave-R63238
> Cc: linuxppc-dev@ozlabs.org
> Subject: Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
>
> > Thank you for your help! That bit resolved all of the RDMA/WDMA coherency issues on the CSB side...except:
> >
> > We expose a 1MB region of memory from CSB via a BAR (BAR1, if it matters) to the Host. This region is also not behaving correctly with respect to coherency on SOME hosts; again, disabling our cache makes it work correctly on all hosts. We have set PEX_DEVICE_CONTROL in PCI-E Config Space (0x54) to 0x2010 (sorry about the endianness below). We thought that CLEARING the no-snoop bit here would indicate that snooping was required for this region...is this a similar issue?
> >
> > - Ben
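The PEX_DEVICE_CONTROL values quoted in this thread (0x2010, and 0x1020 with the bytes swapped) can be decoded with a small helper. The following is only a sketch: the field masks follow the Device Control register layout in the PCI Express Base Specification (they match Linux's `PCI_EXP_DEVCTL_*` constants), but the helper names are hypothetical, and treating 0x2010 as the native value is an assumption based on the "sorry about the endianness" remark.

```c
#include <stdint.h>

/* PCI Express Device Control register fields (per the PCIe Base Specification). */
#define DEVCTL_RELAX_EN   0x0010u  /* bit 4: Enable Relaxed Ordering */
#define DEVCTL_PAYLOAD    0x00e0u  /* bits 7:5: Max_Payload_Size */
#define DEVCTL_NOSNOOP_EN 0x0800u  /* bit 11: Enable No Snoop */
#define DEVCTL_READRQ     0x7000u  /* bits 14:12: Max_Read_Request_Size */

/* Swap the two bytes of a 16-bit value, e.g. for a dump read with the
 * wrong endianness. (Hypothetical helper name.) */
static inline uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Non-zero if the function is permitted to set the No Snoop attribute
 * on requests that it initiates. */
static inline int devctl_nosnoop_enabled(uint16_t devctl)
{
    return (devctl & DEVCTL_NOSNOOP_EN) != 0;
}

/* Max_Read_Request_Size in bytes: the field encodes 128 << n. */
static inline unsigned devctl_max_read_request(uint16_t devctl)
{
    return 128u << ((devctl & DEVCTL_READRQ) >> 12);
}
```

Under these assumptions, `swap16(0x1020)` gives 0x2010, which has Enable Relaxed Ordering set and Enable No Snoop clear. Note that the specification defines this bit as governing requests the function itself initiates, which is in line with Dave's remark that it concerns outbound rather than inbound transactions.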
Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
On Sat, Mar 7, 2009 at 3:51 PM, Benjamin Herrenschmidt b...@kernel.crashing.org wrote:

> On Fri, 2009-03-06 at 10:12 -0600, Ben Menchaca wrote:
> > Testing now...it looks like it (almost) works, though! Why does setting no-snoop cause snooping to work? More on the effect on setting that bit in a few minutes...need more testing.
>
> Maybe they got the documentation for that bit backward? :-)

For posterity...it does appear that this is the case. I don't have a bus analyzer to watch the transaction, but a JTAG trigger caught the update happening if and only if bit 3 was set in the (R|W)DMA descriptor. My FAE at FS said he is watching this thread, so hopefully some doc errata can be generated so others can avoid my confusion :-).

- Ben

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
Testing now...it looks like it (almost) works, though! Why does setting no-snoop cause snooping to work? More on the effect of setting that bit in a few minutes...need more testing.

ACR is 0x00030300.

- Ben

On Fri, Mar 6, 2009 at 12:30 AM, Liu Dave-R63238 dave...@freescale.com wrote:

> Did you enable the descriptor bit 3 to have a try?
>
> From: Ben Menchaca [mailto:ben.mench...@gmail.com]
> Sent: Friday, March 06, 2009 2:10 PM
> To: Liu Dave-R63238
> Cc: linuxppc-dev@ozlabs.org
> Subject: Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
>
> > I can look at ACR in the morning...although I can say with a fair amount of certainty that I have not changed it from the POR value.
> >
> > I will try enabling No Snoop for CSB in the descriptor (bit 3, yes?)...this seems a bit counterintuitive to me. What is the hope regarding these two? Some combination I am not seeing?
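The thread refers to "bit 3" of the DMA descriptor without saying whether bits are counted from the most or the least significant end of the word. Power Architecture documentation commonly numbers bit 0 as the most significant bit, so both readings are worth checking. The sketch below uses hypothetical helper names and makes no claim about which convention the PCI Express DMA descriptor documentation actually uses; it only shows how to build and test the mask either way.

```c
#include <stdint.h>

/* Mask for "bit n" of a 32-bit word under either numbering convention.
 * msb0 != 0: bit 0 is the most significant bit.
 * msb0 == 0: bit 0 is the least significant bit.
 * Valid for n in the range 0..31. */
static inline uint32_t bit_mask32(unsigned n, int msb0)
{
    return msb0 ? (UINT32_C(0x80000000) >> n) : (UINT32_C(1) << n);
}

/* Non-zero if bit n of word is set under the chosen convention. */
static inline int bit_is_set32(uint32_t word, unsigned n, int msb0)
{
    return (word & bit_mask32(n, msb0)) != 0;
}
```

For the descriptor word 0x0002AFF3 quoted in the thread, bit 3 is clear under both conventions (masks 0x00000008 and 0x10000000), which agrees with the statement that the no-snoop bit was cleared in that capture but does not settle which numbering applies.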
Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
Thank you for your help! That bit resolved all of the RDMA/WDMA coherency issues on the CSB side...except:

We expose a 1MB region of memory from CSB via a BAR (BAR1, if it matters) to the Host. This region is also not behaving correctly with respect to coherency on SOME hosts; again, disabling our cache makes it work correctly on all hosts. We have set PEX_DEVICE_CONTROL in PCI-E Config Space (0x54) to 0x2010 (sorry about the endianness below). We thought that CLEARING the no-snoop bit here would indicate that snooping was required for this region...is this a similar issue?

- Ben

On Fri, Mar 6, 2009 at 10:12 AM, Ben Menchaca ben.mench...@gmail.com wrote:

> Testing now...it looks like it (almost) works, though! Why does setting no-snoop cause snooping to work? More on the effect of setting that bit in a few minutes...need more testing.
>
> ACR is 0x00030300.
>
> - Ben
>
> On Fri, Mar 6, 2009 at 12:30 AM, Liu Dave-R63238 dave...@freescale.com wrote:
>
> > Did you enable the descriptor bit 3 to have a try?
83xx: Marking or Allocating Pages as Cache-Inhibited
I am working on a Freescale 8314e design, and the embedded device is configured as a PCI-e endpoint running a 2.6.27-5 kernel. For context, we have written a kernel module which, among other things, uses the RDMA/WDMA engine in the PCI-e IP block.

On the host side, these DMAs are coherent. However, on the embedded side, things are quite a bit less rosy; we must manually flush/invalidate cache lines for WDMA/RDMAs to occur successfully. After speaking with (several) FAEs at Freescale, we believe there is a configuration issue that is the cause, but we have yet to have anyone successfully point to it. Disabling the data cache altogether resolves the issue entirely, but of course, also completely tanks performance.

As a temporary workaround, I would like to simply mark the pages (obtained currently via dma_alloc_coherent) involved as cache-inhibited. I have attempted to do this via some snippets remaining in fec.c (va_to_pte, uncache_pte to set _PAGE_NO_CACHE, flush_tlb_page, then unmap_pte), but this is almost certainly braindead; va_to_pte is not a part of the 83xx source, as far as I can tell; 8xx only.

A quick pointer in the correct direction for marking pages as cache-inhibited on a 2.6.27-5 kernel would be appreciated, or if my approach to a workaround is flawed, a pointer to the correct way would be great.

Ben Menchaca
Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
1. BAT2 in Linux is set to WIMG=0010, and covers all 64M.
2. PEX_DEVICE_CONTROL in PCI-E Config Space (0x54): 0x1020
3. PEX_xDMA_CTRL is set to 0x0401 at the initiation of the DMA.
4. OWAR0 is set to 0xF005, so NSNP is 0.
5. The DMA descriptor (randomly chosen when I hit a trigger...just ignore the size...) contains 0002AFF3 at offset 0, so nosnoops are cleared.

Core is 400MHz, and CSB is 133MHz.

- Ben

On Thu, Mar 5, 2009 at 11:27 PM, Liu Dave-R63238 dave...@freescale.com wrote:

> And what is the setting of DMA descriptor bit 3?
>
> -----Original Message-----
> From: linuxppc-dev-bounces+daveliu=freescale@ozlabs.org On Behalf Of Liu Dave-R63238
> Sent: Friday, March 06, 2009 1:22 PM
> To: Ben Menchaca; linuxppc-dev@ozlabs.org
> Subject: RE: 83xx: Marking or Allocating Pages as Cache-Inhibited
>
> > Did you enable the snoop bit at PEX_WDMA_CTRL[SNOOP] and PEX_RDMA_CTRL[SNOOP]?
> > What are the frequency settings (core/CSB bus)?
> >
> > Thanks,
> > Dave
> >
> > From: linuxppc-dev-bounces+daveliu=freescale@ozlabs.org On Behalf Of Ben Menchaca
> > Sent: Friday, March 06, 2009 12:33 PM
> > To: linuxppc-dev@ozlabs.org
> > Subject: 83xx: Marking or Allocating Pages as Cache-Inhibited
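The WIMG=0010 value reported for BAT2 in this message can be read off bit by bit. The sketch below assumes the usual Power Architecture ordering of the four storage-attribute bits (W, I, M, G from most to least significant); the macro and function names are hypothetical and not taken from the kernel.

```c
#include <stdint.h>

/* WIMG storage-attribute bits, written W I M G from most to least significant. */
#define WIMG_W 0x8u  /* write-through */
#define WIMG_I 0x4u  /* caching inhibited */
#define WIMG_M 0x2u  /* memory coherence required */
#define WIMG_G 0x1u  /* guarded */

/* Non-zero if accesses through this mapping may be cached. */
static inline int wimg_cacheable(uint8_t wimg)
{
    return (wimg & WIMG_I) == 0;
}

/* Non-zero if the mapping requests hardware-enforced coherence. */
static inline int wimg_coherent(uint8_t wimg)
{
    return (wimg & WIMG_M) != 0;
}
```

Read this way, 0010 (0x2) is a cached, write-back mapping with memory coherence required, so the core side is already asking for coherent behavior. A cache-inhibited mapping, which is what the original post asks for as a workaround, would instead have the I bit set (for example 0110).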
Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
I can look at ACR in the morning...although I can say with a fair amount of certainty that I have not changed it from the POR value.

I will try enabling No Snoop for CSB in the descriptor (bit 3, yes?)...this seems a bit counterintuitive to me. What is the hope regarding these two? Some combination I am not seeing?

On Thu, Mar 5, 2009 at 11:40 PM, Liu Dave-R63238 dave...@freescale.com wrote:

> What is the value of the ACR register?
>
> From: Ben Menchaca [mailto:ben.mench...@gmail.com]
> Sent: Friday, March 06, 2009 1:38 PM
> To: Liu Dave-R63238
> Cc: linuxppc-dev@ozlabs.org
> Subject: Re: 83xx: Marking or Allocating Pages as Cache-Inhibited
>
> > 1. BAT2 in Linux is set to WIMG=0010, and covers all 64M.
> > 2. PEX_DEVICE_CONTROL in PCI-E Config Space (0x54): 0x1020
> > 3. PEX_xDMA_CTRL is set to 0x0401 at the initiation of the DMA.
> > 4. OWAR0 is set to 0xF005, so NSNP is 0.
> > 5. The DMA descriptor (randomly chosen when I hit a trigger...just ignore the size...) contains 0002AFF3 at offset 0, so nosnoops are cleared.
> >
> > Core is 400MHz, and CSB is 133MHz.
> >
> > - Ben