Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
> Atomic operations are undefined behavior on ARM for device or strongly > ordered memory types. So use write-combine variants for mappings. This > corresponds to normal, non-cacheable memory on ARM. For many other > architectures, this change should not change the mapping type. Hi, all I have met this problems on tegra soc. I tried to use pstore-ram, but the system will hang up when run the atomic_cmpxchg() in the buffer_start_add() or buffer_size_add(), whatever I use iomapped or vmapped regions. I tried to apply this patch set, it was failed with vmap, but if use iomapped rgiions, the ldrex/strex can work on this memory, and the ramoops driver can work. Does there have any updates for this issue? Or how can I debug it? Thanks. Wei. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On Tue, Apr 09, 2013 at 08:53:18PM -0700, Colin Cross wrote: > On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring wrote: > > - return ioremap(start, size); > > + return ioremap_wc(start, size); > > ioremap_wc corresponds to MT_DEVICE_WC, which is still device memory, > so I don't see how this helps solve the problem in the commit message. In reality it isn't, because there's no such thing as "write combining device memory" in the ARM memory model. There are three major memory types: strongly ordered, device and normal memory. Only normal memory can be cached in any way, which includes using write combining. #define ioremap_wc(cookie,size) __arm_ioremap((cookie), (size), MT_DEVICE_WC) [MT_DEVICE_WC] = { /* ioremap_wc */ .prot_pte = PROT_PTE_DEVICE | L_PTE_MT_DEV_WC, * n TR IR OR * BUFFERABLE 001 10 00 00 * DEV_WC 001 10 (see arch/arm/mm/proc-v7-2level.S for the rest of the table and its description.) So, DEV_WC is an alias for BUFFERABLE, which is normal memory, uncacheable in both inner and outer caches. This means that at the moment, ioremap_wc() memory has the same properties as system memory - with all the out of ordering effects you get there. I don't put any guarantee on this though - we may end up having to change it if we find a SoC needing this to really be device memory... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On Tue, Apr 16, 2013 at 01:58:27PM +0100, Rob Herring wrote: > On 04/16/2013 03:44 AM, Will Deacon wrote: > > On Tue, Apr 16, 2013 at 01:43:09AM +0100, Colin Cross wrote: > >> On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring wrote: > >>> Exclusive accesses still have further restrictions. From section 3.4.5: > >>> > >>> • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be > >>> performed to a memory region > >>>with the Device or Strongly-ordered memory attribute. Unless the > >>> implementation documentation explicitly > >>> states that LDREX and STREX operations to a memory region with the > >>> Device or Strongly-ordered attribute are > >>> permitted, the effect of such operations is UNPREDICTABLE. > >>> > >>> > >>> Given that it is implementation defined, I don't see how Linux can rely > >>> on that behavior. > >> > >> I see, the problem is that while noncached and writecombined appear to > >> be similar mappings, noncached is mapped in PRRR to strongly-ordered, > >> while writecombined is mapped to unbufferable normal memory. > >> > >> I think adding a wmb() to persistent_ram_write is going to be > >> expensive on cpus with outer caches like the L2X0, where wmb() will > >> result in a spinlock. Is there a real SoC where this doesn't work? > > > > A real SoC where exclusives don't work to memory not mapped as normal? Take > > your pick... > > This patch doesn't actually fix problems for me. Exclusives to DDR work > for any memory type for me as the DDR controller has an exclusive > monitor. It takes write-thru cache mapping to get internal RAM to work. I can't find any reference in the ARM ARM but I think you would need cacheable memory for the exclusives to work. A9 for example uses the cacheline exclusiveness to emulate the global monitor. -- Catalin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On 04/16/2013 03:44 AM, Will Deacon wrote: > On Tue, Apr 16, 2013 at 01:43:09AM +0100, Colin Cross wrote: >> On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring wrote: >>> Exclusive accesses still have further restrictions. From section 3.4.5: >>> >>> • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be >>> performed to a memory region >>>with the Device or Strongly-ordered memory attribute. Unless the >>> implementation documentation explicitly >>> states that LDREX and STREX operations to a memory region with the >>> Device or Strongly-ordered attribute are >>> permitted, the effect of such operations is UNPREDICTABLE. >>> >>> >>> Given that it is implementation defined, I don't see how Linux can rely >>> on that behavior. >> >> I see, the problem is that while noncached and writecombined appear to >> be similar mappings, noncached is mapped in PRRR to strongly-ordered, >> while writecombined is mapped to unbufferable normal memory. >> >> I think adding a wmb() to persistent_ram_write is going to be >> expensive on cpus with outer caches like the L2X0, where wmb() will >> result in a spinlock. Is there a real SoC where this doesn't work? > > A real SoC where exclusives don't work to memory not mapped as normal? Take > your pick... This patch doesn't actually fix problems for me. Exclusives to DDR work for any memory type for me as the DDR controller has an exclusive monitor. It takes write-thru cache mapping to get internal RAM to work. Rob -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On Tue, Apr 16, 2013 at 01:43:09AM +0100, Colin Cross wrote: > On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring wrote: > > Exclusive accesses still have further restrictions. From section 3.4.5: > > > > • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be > > performed to a memory region > >with the Device or Strongly-ordered memory attribute. Unless the > > implementation documentation explicitly > > states that LDREX and STREX operations to a memory region with the > > Device or Strongly-ordered attribute are > > permitted, the effect of such operations is UNPREDICTABLE. > > > > > > Given that it is implementation defined, I don't see how Linux can rely > > on that behavior. > > I see, the problem is that while noncached and writecombined appear to > be similar mappings, noncached is mapped in PRRR to strongly-ordered, > while writecombined is mapped to unbufferable normal memory. > > I think adding a wmb() to persistent_ram_write is going to be > expensive on cpus with outer caches like the L2X0, where wmb() will > result in a spinlock. Is there a real SoC where this doesn't work? A real SoC where exclusives don't work to memory not mapped as normal? Take your pick... Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring wrote: > On 04/15/2013 05:21 PM, Colin Cross wrote: >> On Wed, Apr 10, 2013 at 6:30 AM, Rob Herring wrote: >>> On 04/09/2013 10:53 PM, Colin Cross wrote: On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring wrote: > From: Rob Herring > > Atomic operations are undefined behavior on ARM for device or strongly > ordered memory types. So use write-combine variants for mappings. This > corresponds to normal, non-cacheable memory on ARM. For many other > architectures, this change should not change the mapping type. This is going to make ramconsole less reliable. A debugging printk followed by a __raw_writel that causes an immediate hard crash is likely to lose the last updates, including the most useful message, in the write buffers. >>> >>> It would have to be a write that hangs the bus. In my experience with >>> AXI, the bus doesn't actually hang until you hit max outstanding >>> transactions. >> >> I've seen many cases where a single write to device memory in an >> unclocked slave will completely and instantly hang all cpus, and the >> next write will never happen. >> >>> I think exclusive stores will limit the buffering, but that is probably >>> not architecturally guaranteed. >>> >>> I could put a wb() in at the end of persistent_ram_write. >>> Also, isn't this patch unnecessary after patch 3 in this set? >>> >>> It is still needed in the main memory case to be architecturally correct >>> to avoid multiple mappings of different memory types and exclusive >>> accesses to device memory. At least on an A9, it doesn't really seem to >>> matter. I could remove this for the ioremap case. >> >> According to my reading of the latest ARM ARM (Issue C, section >> A3.5.7), and Catalin's excellent explanation >> (http://lists.linaro.org/pipermail/linaro-dev/2012-February/010239.html), >> it is no longer considered unpredictable to have both cached and >> non-cached mappings to the same memory, as long as you use proper >> cache maintenance between accessing the two mappings. >> >> In pstore_ram the cached mapping will never be accessed (and we don't >> care about speculative accesses), so no cache maintenance is >> necessary. I don't see any need for this patch, and I see plenty of >> possible problems. > > Exclusive accesses still have further restrictions. From section 3.4.5: > > • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be > performed to a memory region >with the Device or Strongly-ordered memory attribute. Unless the > implementation documentation explicitly > states that LDREX and STREX operations to a memory region with the > Device or Strongly-ordered attribute are > permitted, the effect of such operations is UNPREDICTABLE. > > > Given that it is implementation defined, I don't see how Linux can rely > on that behavior. I see, the problem is that while noncached and writecombined appear to be similar mappings, noncached is mapped in PRRR to strongly-ordered, while writecombined is mapped to unbufferable normal memory. I think adding a wmb() to persistent_ram_write is going to be expensive on cpus with outer caches like the L2X0, where wmb() will result in a spinlock. Is there a real SoC where this doesn't work? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On 04/15/2013 05:21 PM, Colin Cross wrote: > On Wed, Apr 10, 2013 at 6:30 AM, Rob Herring wrote: >> On 04/09/2013 10:53 PM, Colin Cross wrote: >>> On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring wrote: From: Rob Herring Atomic operations are undefined behavior on ARM for device or strongly ordered memory types. So use write-combine variants for mappings. This corresponds to normal, non-cacheable memory on ARM. For many other architectures, this change should not change the mapping type. >>> >>> This is going to make ramconsole less reliable. A debugging printk >>> followed by a __raw_writel that causes an immediate hard crash is >>> likely to lose the last updates, including the most useful message, in >>> the write buffers. >> >> It would have to be a write that hangs the bus. In my experience with >> AXI, the bus doesn't actually hang until you hit max outstanding >> transactions. > > I've seen many cases where a single write to device memory in an > unclocked slave will completely and instantly hang all cpus, and the > next write will never happen. > >> I think exclusive stores will limit the buffering, but that is probably >> not architecturally guaranteed. >> >> I could put a wb() in at the end of persistent_ram_write. >> >>> Also, isn't this patch unnecessary after patch 3 in this set? >> >> It is still needed in the main memory case to be architecturally correct >> to avoid multiple mappings of different memory types and exclusive >> accesses to device memory. At least on an A9, it doesn't really seem to >> matter. I could remove this for the ioremap case. > > According to my reading of the latest ARM ARM (Issue C, section > A3.5.7), and Catalin's excellent explanation > (http://lists.linaro.org/pipermail/linaro-dev/2012-February/010239.html), > it is no longer considered unpredictable to have both cached and > non-cached mappings to the same memory, as long as you use proper > cache maintenance between accessing the two mappings. > > In pstore_ram the cached mapping will never be accessed (and we don't > care about speculative accesses), so no cache maintenance is > necessary. I don't see any need for this patch, and I see plenty of > possible problems. Exclusive accesses still have further restrictions. From section 3.4.5: • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be performed to a memory region with the Device or Strongly-ordered memory attribute. Unless the implementation documentation explicitly states that LDREX and STREX operations to a memory region with the Device or Strongly-ordered attribute are permitted, the effect of such operations is UNPREDICTABLE. Given that it is implementation defined, I don't see how Linux can rely on that behavior. Rob -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On Wed, Apr 10, 2013 at 6:30 AM, Rob Herring wrote: > On 04/09/2013 10:53 PM, Colin Cross wrote: >> On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring wrote: >>> From: Rob Herring >>> >>> Atomic operations are undefined behavior on ARM for device or strongly >>> ordered memory types. So use write-combine variants for mappings. This >>> corresponds to normal, non-cacheable memory on ARM. For many other >>> architectures, this change should not change the mapping type. >> >> This is going to make ramconsole less reliable. A debugging printk >> followed by a __raw_writel that causes an immediate hard crash is >> likely to lose the last updates, including the most useful message, in >> the write buffers. > > It would have to be a write that hangs the bus. In my experience with > AXI, the bus doesn't actually hang until you hit max outstanding > transactions. I've seen many cases where a single write to device memory in an unclocked slave will completely and instantly hang all cpus, and the next write will never happen. > I think exclusive stores will limit the buffering, but that is probably > not architecturally guaranteed. > > I could put a wb() in at the end of persistent_ram_write. > >> Also, isn't this patch unnecessary after patch 3 in this set? > > It is still needed in the main memory case to be architecturally correct > to avoid multiple mappings of different memory types and exclusive > accesses to device memory. At least on an A9, it doesn't really seem to > matter. I could remove this for the ioremap case. According to my reading of the latest ARM ARM (Issue C, section A3.5.7), and Catalin's excellent explanation (http://lists.linaro.org/pipermail/linaro-dev/2012-February/010239.html), it is no longer considered unpredictable to have both cached and non-cached mappings to the same memory, as long as you use proper cache maintenance between accessing the two mappings. In pstore_ram the cached mapping will never be accessed (and we don't care about speculative accesses), so no cache maintenance is necessary. I don't see any need for this patch, and I see plenty of possible problems. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On 04/09/2013 10:53 PM, Colin Cross wrote: > On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring wrote: >> From: Rob Herring >> >> Atomic operations are undefined behavior on ARM for device or strongly >> ordered memory types. So use write-combine variants for mappings. This >> corresponds to normal, non-cacheable memory on ARM. For many other >> architectures, this change should not change the mapping type. > > This is going to make ramconsole less reliable. A debugging printk > followed by a __raw_writel that causes an immediate hard crash is > likely to lose the last updates, including the most useful message, in > the write buffers. It would have to be a write that hangs the bus. In my experience with AXI, the bus doesn't actually hang until you hit max outstanding transactions. I think exclusive stores will limit the buffering, but that is probably not architecturally guaranteed. I could put a wb() in at the end of persistent_ram_write. > Also, isn't this patch unnecessary after patch 3 in this set? It is still needed in the main memory case to be architecturally correct to avoid multiple mappings of different memory types and exclusive accesses to device memory. At least on an A9, it doesn't really seem to matter. I could remove this for the ioremap case. Rob >> Signed-off-by: Rob Herring >> Cc: Anton Vorontsov >> Cc: Colin Cross >> Cc: Kees Cook >> Cc: Tony Luck >> Cc: linux-kernel@vger.kernel.org >> --- >> fs/pstore/ram_core.c |4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c >> index 0306303..e126d9f 100644 >> --- a/fs/pstore/ram_core.c >> +++ b/fs/pstore/ram_core.c >> @@ -337,7 +337,7 @@ static void *persistent_ram_vmap(phys_addr_t start, >> size_t size) >> page_start = start - offset_in_page(start); >> page_count = DIV_ROUND_UP(size + offset_in_page(start), PAGE_SIZE); >> >> - prot = pgprot_noncached(PAGE_KERNEL); >> + prot = pgprot_writecombine(PAGE_KERNEL); > Is this necessary? Won't pgprot_noncached already be normal memory? > >> pages = kmalloc(sizeof(struct page *) * page_count, GFP_KERNEL); >> if (!pages) { >> @@ -364,7 +364,7 @@ static void *persistent_ram_iomap(phys_addr_t start, >> size_t size) >> return NULL; >> } >> >> - return ioremap(start, size); >> + return ioremap_wc(start, size); > > ioremap_wc corresponds to MT_DEVICE_WC, which is still device memory, > so I don't see how this helps solve the problem in the commit message. > >> } >> >> static int persistent_ram_buffer_map(phys_addr_t start, phys_addr_t size, >> -- >> 1.7.10.4 >> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings
On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring wrote: > From: Rob Herring > > Atomic operations are undefined behavior on ARM for device or strongly > ordered memory types. So use write-combine variants for mappings. This > corresponds to normal, non-cacheable memory on ARM. For many other > architectures, this change should not change the mapping type. This is going to make ramconsole less reliable. A debugging printk followed by a __raw_writel that causes an immediate hard crash is likely to lose the last updates, including the most useful message, in the write buffers. Also, isn't this patch unnecessary after patch 3 in this set? > Signed-off-by: Rob Herring > Cc: Anton Vorontsov > Cc: Colin Cross > Cc: Kees Cook > Cc: Tony Luck > Cc: linux-kernel@vger.kernel.org > --- > fs/pstore/ram_core.c |4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c > index 0306303..e126d9f 100644 > --- a/fs/pstore/ram_core.c > +++ b/fs/pstore/ram_core.c > @@ -337,7 +337,7 @@ static void *persistent_ram_vmap(phys_addr_t start, > size_t size) > page_start = start - offset_in_page(start); > page_count = DIV_ROUND_UP(size + offset_in_page(start), PAGE_SIZE); > > - prot = pgprot_noncached(PAGE_KERNEL); > + prot = pgprot_writecombine(PAGE_KERNEL); Is this necessary? Won't pgprot_noncached already be normal memory? > pages = kmalloc(sizeof(struct page *) * page_count, GFP_KERNEL); > if (!pages) { > @@ -364,7 +364,7 @@ static void *persistent_ram_iomap(phys_addr_t start, > size_t size) > return NULL; > } > > - return ioremap(start, size); > + return ioremap_wc(start, size); ioremap_wc corresponds to MT_DEVICE_WC, which is still device memory, so I don't see how this helps solve the problem in the commit message. > } > > static int persistent_ram_buffer_map(phys_addr_t start, phys_addr_t size, > -- > 1.7.10.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC PATCH 1/3] pstore-ram: use write-combine mappings
From: Rob Herring Atomic operations are undefined behavior on ARM for device or strongly ordered memory types. So use write-combine variants for mappings. This corresponds to normal, non-cacheable memory on ARM. For many other architectures, this change should not change the mapping type. Signed-off-by: Rob Herring Cc: Anton Vorontsov Cc: Colin Cross Cc: Kees Cook Cc: Tony Luck Cc: linux-kernel@vger.kernel.org --- fs/pstore/ram_core.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c index 0306303..e126d9f 100644 --- a/fs/pstore/ram_core.c +++ b/fs/pstore/ram_core.c @@ -337,7 +337,7 @@ static void *persistent_ram_vmap(phys_addr_t start, size_t size) page_start = start - offset_in_page(start); page_count = DIV_ROUND_UP(size + offset_in_page(start), PAGE_SIZE); - prot = pgprot_noncached(PAGE_KERNEL); + prot = pgprot_writecombine(PAGE_KERNEL); pages = kmalloc(sizeof(struct page *) * page_count, GFP_KERNEL); if (!pages) { @@ -364,7 +364,7 @@ static void *persistent_ram_iomap(phys_addr_t start, size_t size) return NULL; } - return ioremap(start, size); + return ioremap_wc(start, size); } static int persistent_ram_buffer_map(phys_addr_t start, phys_addr_t size, -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/