Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Fri, 8 Aug 2008, Catalin Marinas wrote: There are already CPUs with weaker memory ordering model than ARM (e.g. Alpha) and they are supported by Linux. Of course, there may be problems with drivers since most of them are developed in x86. For the OMAP SoC, most of the drivers are specific to ARM. The devices just aren't available for x86 or other architectures. (With a few exceptions - smc91x and MUSB are the two that come to mind.) - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Thu, Aug 07, 2008 at 06:07:50PM -0500, Woodruff, Richard wrote: If I write a series of control register commands to device A, then write a go operation to device B, I would hope all of A's writes had completed before B gets the go. SO gives you this. DEVICE may not with out barriers. This, I think, is where the problem lies. Device regions of the same type *are* ordered with respect to each other. So, shared device accesses occur in program order. Unshared device accesses occur in program order. However, shared device accesses may occur out of order with unshared device accesses or memory accesses. Unshared device accesses may occur out of order with shared device accesses or memory accesses. So, if both device A and device B are mapped as shared devices, then accesses to both occur in program order. If device A is mapped as a shared device and device B as an unshared device, then you have to use read backs and possibly barriers to ensure ordering. Remember, a barrier only affects up to the CPU. It doesn't affect write posting downstream of the CPU. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] ARM MMU: add strongly-ordered memory type
On Thu, 7 Aug 2008, Catalin Marinas wrote: Many of the architecture people in ARM seem to be on holiday, I'll try to get clarification in about a week time. That would be really great. For reference, here is the assembly code in question: http://www.mail-archive.com/linux-omap@vger.kernel.org/msg01349.html It's the code following the omap34xx_sram_configure_core_dpll entry point. - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Thu, 2008-08-07 at 22:20 +0100, Russell King - ARM Linux wrote: On Thu, Aug 07, 2008 at 03:38:55PM -0500, Woodruff, Richard wrote: From: Russell King - ARM Linux [mailto:[EMAIL PROTECTED] Is DEVICE really safe for things other than FIFOs with out the use of barriers? As far as I'm aware, yes - and that comment is based solely upon the fact that no one has reported any problems with the kernel which have been tracked down to using the device memory type on ARMv6 and above... We do in some drivers today get spurious interrupts when DEVICE is used but don't see them when using SO. ... until now, or even that very sentence. That is our fault then I suppose for not discussing this on arm-linux. In OMAP2 and OMAP3 this has been observed. In vendor kernels where time stands still and lots of validation has happened we did stick with SO for OMAP2. On some internal kernels already we have gone to SO for OMAP3 as customers ramp and need the errors gone. The faster the system clocks the more it seems to show. To do that, and then ask about when Linux is going to start exploiting the weak memory types is a little unfair don't you think? There are already CPUs with weaker memory ordering model than ARM (e.g. Alpha) and they are supported by Linux. Of course, there may be problems with drivers since most of them are developed in x86. On the strongly-ordered instead of normal uncached memory, the unaligned accesses to SO memory which are not faulted have unpredictable behaviour (according to the ARM ARM, though some v7 implementations may not be bothered). If you use such memory for skbuff for example, is there any risk of unaligned accesses when the network packets are processed? Is there any other example that would make this fail? The thing with these effects, especially spurious IRQs is there usually are several reasons they show up and several ways to make them go away. In the beginning there have been lots then they drop off as the system software matures. Then if the program survives long enough to be optimized they start to show up again but in lesser numbers. This has been the OMAP2/3 experience so far. Going SO to control regions has stamped them out at this point. What you're therefore asking for is a weak memory ordering model which doesn't require any effort on the software programmers part - that's a CPU architecture thing which you'll need to talk to ARM about. x86 can do this for the most part because x86's development has been such that the hardware has had to work around the software to make improvements. On ARM, normally when there's updates, software has to work around the hardware. For ARM CPU (RISC architecture) to get faster while keeping the power consumption low, it must become weakly ordered (maybe CISC architectures can cope with this). Anyway, it seems that ia64 requires some barriers as well. That's more like evolution in the CPU field and while the software becomes more complex, the overall performance is better. That's not unexpected if you don't have the right barriers in place at the end of things such as IRQ controllers ack/mask functions. Yes. I've submitted patches (to linux-omap) and Catalin did submit patches (to arm-linux) for PIC barriers. In the past they have been rejected by Tony or you for different reasons. Tony last rejected it because he thought it should be generic at the ARM level. I don't recall what your last stance was. Looking back, I never commented on that patch. I did on the previous patch which was adding DSBs in a way which would break stuff. The patch to add them to the interrupt controllers has never been reposted. However, adding barriers may not be the correct answer for this. See Documentation/io_ordering.txt - reading back from a safe register on the target device ensures that the previous writes should hit the device before the read completes, without the overhead of a full barrier. This point is even more important if you have some form of write posting between the CPU and the device (eg, a PCI bus) - a DSB won't reach down to the target PCI device which may be behind some write-posting bridges. So, in the case of arch/arm/common/gic.c, we should be reading one of the gic control registers after the writes. In the case of arch/arm/mach-omap2/irq.c, reading the INTC_REVISION reg after masking should be a sufficient solution. I need to check in ARM when people come from holidays but a simple LDR might not be enough to guarantee that a CPSIE etc. happens after it. You may need to add either an LDR + CMP (or some other usage of the loaded register) or LDR + DSB. I agree that DSB alone is not enough. Use a dual mapping to manage a device (2 ioremaps). You use a SO mapping to write to registers of that device. Then when you go to write to its FIFO use a DEVICE mapping. I believe ARMv7 has some
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Fri, Aug 08, 2008 at 12:44:49PM +0100, Catalin Marinas wrote: There are already CPUs with weaker memory ordering model than ARM (e.g. Alpha) and they are supported by Linux. Of course, there may be problems with drivers since most of them are developed in x86. There are, and they are _constantly_ complaining about drivers not having the necessary barriers. Consider that for a moment - how long has Linux supported had weakly ordered architectures, and how long does it take to fix ordering problems... 10 or so years? So, in the case of arch/arm/common/gic.c, we should be reading one of the gic control registers after the writes. In the case of arch/arm/mach-omap2/irq.c, reading the INTC_REVISION reg after masking should be a sufficient solution. I need to check in ARM when people come from holidays but a simple LDR might not be enough to guarantee that a CPSIE etc. happens after it. You may need to add either an LDR + CMP (or some other usage of the loaded register) or LDR + DSB. I agree that DSB alone is not enough. Okay, I give up on this issue. Weak memory ordering seems to be a very very big can of worms. And then there's this: 14:07 rmk so we're back to making readl() itself do something with the data... which brings us back to that question about why bother with weak ordering 14:10 willy you can't have weak ordering for device control registers 14:11 rmk yes you can, provided they're ordered wrt each other. 14:11 willy weak ordering works great for SMP or for just covering up latency 14:12 willy no, you can't. see writel(); readl(); udelay(1); writel();. You didn't wait for 1 microsecond before accessing the device again. Or, to put it another way, it seems that on Linux _all_ devices must be strongly ordered or be seen to Linux as being strongly ordered (iow, readl and writel and friends _must_ have a barrier.) And of course, putting barriers into readl and writel, we might as well use strongly ordered mappings anyway, because that'll save us a few bytes of program memory. TBH, this is becoming soo much of a joke, it's untrue. Let's go back to having a strongly ordered memory model. Please. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] ARM MMU: add strongly-ordered memory type
Hello Catalin, On Wed, 6 Aug 2008, Catalin Marinas wrote: On Tue, 2008-08-05 at 07:15 -0500, Woodruff, Richard wrote: Is the controller allowed to write dirty cache lines out at any time it likes? Surely a better fix is to drain the cache of the changes before changing the clock for the SDRAM? - Previously the SRAM was marked as cached. _Execution_ using that attribute will generate line fetches to SRAM. This will cause displacement write-outs of resident DDR lines. Similarly, _load/store_ sequences in that code have the same effect. This cast outs dead lock the CPU and it can't fetch to progress. [...] * Flushing the entire L1 L2 cache frequently is very expensive and better not done if you don't have. Also, it is not sufficient for the context-restore path which needs to NOT live in DDR. The code need to execute in a non-cached region. I don't think there is any guarantee that dirty cache line won't be evicted to SDRAM even if your code uses uncached memory only. The CPU is allowed to do speculative reads from the normal cached memory and these reads may force a dirty cache line to be written back to memory. You may need to do at least a cache clean operation (invalidate not necessary). If we turn off speculative reads via a CP15 control register Z-bit write for the duration of the SRAM code execution, and use either normal non-cached memory or strongly-ordered memory for the SRAM code, will that effectively prevent any cache line writeback during that time? (assuming interrupts are disabled, that is). Also, a somewhat-related question about strongly-ordered memory regions: these are described as bracketing accesses to those regions with data memory barriers. In this case, are those barriers specific to the strongly-ordered pages, or will they affect all memory transactions, even to normal cached memory? - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Wed, Aug 06, 2008 at 07:21:14AM -0500, Woodruff, Richard wrote: Most of the weak memory attributes in newer ARMs are not exploited today in tree. I'll guess this was more a correctness and capability judgment from Russell. Not entirely true. We do as much as is safe to do - which is basically using 'device' mappings for devices and ioremap, and 'memory' mappings for the main memory, module and vmalloc mappings. What we don't do is mark DMA memory ask being normal uncached memory, thereby allowing that to be reordered with device accesses - we make it strongly ordered. The reason being that the kernel doesn't have barriers necessary to ensure that writes to DMA memory hit physical memory before the device access to enable DMA hits the DMA controller. Those kinds of bugs can be absolute hell to track down - think about a DMA controller accessing an uninitialised DMA descriptor, resulting in it scribbing over random bits of memory. The only real way to do this is to audit lots of drivers to ensure that: 1. DMA is not started until accesses to memory allocated by dma_alloc_coherent() have hit memory 2. accesses to dma_alloc_coherent() memory always read current data, even if the DMA controller has just updated the descriptor you're reading. Linux presently - and quite rightly - assumes that accesses to DMA coherent memory _are_ coherent with DMA. If not, the API would be a joke. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Thu, 2008-08-07 at 08:30 +0100, Russell King - ARM Linux wrote: What we don't do is mark DMA memory ask being normal uncached memory, thereby allowing that to be reordered with device accesses - we make it strongly ordered. The reason being that the kernel doesn't have barriers necessary to ensure that writes to DMA memory hit physical memory before the device access to enable DMA hits the DMA controller. We have mb() and related which provides the ordering (there is also mmiowb() but my understanding is that we don't need this on ARM). http://lwn.net/Articles/283776/ Those kinds of bugs can be absolute hell to track down - think about a DMA controller accessing an uninitialised DMA descriptor, resulting in it scribbing over random bits of memory. Yes, indeed, but ARM is not the only architecture with a weak memory ordering model so drivers should be fixed, in theory. Linux presently - and quite rightly - assumes that accesses to DMA coherent memory _are_ coherent with DMA. If not, the API would be a joke. As I understand it, the DMA mapping doesn't guarantee any ordering, drivers must use barriers. According to Documentation/DMA-mapping.txt: - Consistent DMA mappings which are usually mapped at driver initialization, unmapped at the end and for which the hardware should guarantee that the device and the CPU can access the data in parallel and will see updates made by each other without any explicit software flushing. [...] IMPORTANT: Consistent DMA memory does not preclude the usage of proper memory barriers. The CPU may reorder stores to consistent memory just as it may normal memory. -- Catalin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] ARM MMU: add strongly-ordered memory type
From: Russell King - ARM Linux [mailto:[EMAIL PROTECTED] Most of the weak memory attributes in newer ARMs are not exploited today in tree. I'll guess this was more a correctness and capability judgment from Russell. Not entirely true. We do as much as is safe to do - which is basically using 'device' mappings for devices and ioremap, and 'memory' mappings for the main memory, module and vmalloc mappings. Is DEVICE really safe for things other than FIFOs with out the use of barriers? Accesses are in order but they can be buffered. The bus is free to post/buffer these writes as long as it preserves the order (and it does). I want to know the write happened before moving on in some code. Yes some barrier can be added to the code to fix it but its not there in most drivers today. I recall you gave an explanation on list a while back where MPCORE had to use DEVICE or it wouldn't work. That restriction is not there for others. We do in some drivers today get spurious interrupts when DEVICE is used but don't see them when using SO. Originally the IC-Architect wanted two memory windows per device, one SO for register control and one DEVICE for FIFO access. Given that we do DMA (which doesn't care about how ARM sees the world) on the performance hungry devices not doing this was ok. What we don't do is mark DMA memory ask being normal uncached memory, thereby allowing that to be reordered with device accesses - we make it strongly ordered. The reason being that the kernel doesn't have barriers necessary to ensure that writes to DMA memory hit physical memory before the device access to enable DMA hits the DMA controller. For an experiment a couple years back we did convert the dma alloc pool addresses as NC. All worked -except- for OHCI-USB which started failing some tests. Regards, Richard W. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Thu, Aug 07, 2008 at 01:56:40PM -0500, Woodruff, Richard wrote: From: Russell King - ARM Linux [mailto:[EMAIL PROTECTED] Most of the weak memory attributes in newer ARMs are not exploited today in tree. I'll guess this was more a correctness and capability judgment from Russell. Not entirely true. We do as much as is safe to do - which is basically using 'device' mappings for devices and ioremap, and 'memory' mappings for the main memory, module and vmalloc mappings. Is DEVICE really safe for things other than FIFOs with out the use of barriers? As far as I'm aware, yes - and that comment is based solely upon the fact that no one has reported any problems with the kernel which have been tracked down to using the device memory type on ARMv6 and above... We do in some drivers today get spurious interrupts when DEVICE is used but don't see them when using SO. ... until now, or even that very sentence. That's not unexpected if you don't have the right barriers in place at the end of things such as IRQ controllers ack/mask functions. Can you give me more information - which OMAP platform, which IRQ controller, which device is easiest to provoke this behaviour, and I'll look at it. Originally the IC-Architect wanted two memory windows per device, one SO for register control and one DEVICE for FIFO access. Given that we do DMA (which doesn't care about how ARM sees the world) on the performance hungry devices not doing this was ok. I'm not sure what point you're making there. For an experiment a couple years back we did convert the dma alloc pool addresses as NC. All worked -except- for OHCI-USB which started failing some tests. If we go down the route of marking DMA as 'normal memory non-cacheable' we're going to have a never ending stream of drivers which don't work correctly. We're forever going to be bug hunting drivers, having to add barriers as required. Arguably those barriers should be there already, but if drivers are developed on platforms without weak ordering, authors just don't think about it, and _certainly_ can't test them. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] ARM MMU: add strongly-ordered memory type
From: Russell King - ARM Linux [mailto:[EMAIL PROTECTED] Is DEVICE really safe for things other than FIFOs with out the use of barriers? As far as I'm aware, yes - and that comment is based solely upon the fact that no one has reported any problems with the kernel which have been tracked down to using the device memory type on ARMv6 and above... We do in some drivers today get spurious interrupts when DEVICE is used but don't see them when using SO. ... until now, or even that very sentence. That is our fault then I suppose for not discussing this on arm-linux. In OMAP2 and OMAP3 this has been observed. In vendor kernels where time stands still and lots of validation has happened we did stick with SO for OMAP2. On some internal kernels already we have gone to SO for OMAP3 as customers ramp and need the errors gone. The faster the system clocks the more it seems to show. The thing with these effects, especially spurious IRQs is there usually are several reasons they show up and several ways to make them go away. In the beginning there have been lots then they drop off as the system software matures. Then if the program survives long enough to be optimized they start to show up again but in lesser numbers. This has been the OMAP2/3 experience so far. Going SO to control regions has stamped them out at this point. That's not unexpected if you don't have the right barriers in place at the end of things such as IRQ controllers ack/mask functions. Yes. I've submitted patches (to linux-omap) and Catalin did submit patches (to arm-linux) for PIC barriers. In the past they have been rejected by Tony or you for different reasons. Tony last rejected it because he thought it should be generic at the ARM level. I don't recall what your last stance was. It is consistently observed, that with these irq-controller barriers in place, spurious irqs go down (but not necessarily away). Our internal kernels still have this in them for OMAP2 and OMAP3. Can you give me more information - which OMAP platform, which IRQ controller, which device is easiest to provoke this behaviour, and I'll look at it. Lately, a full OMAP3 system running with the 3d-GFX driver is causing these. Camera driver operation has been one which raised them on and off. If it persists exporting an environment to you should be possible. I expect it will take time and coordination to do this. The pure linux-omap kernel has periodically seen spurious irqs with UART. However, if you use the irq-controller barriers they tend to go away. Originally the IC-Architect wanted two memory windows per device, one SO for register control and one DEVICE for FIFO access. Given that we do DMA (which doesn't care about how ARM sees the world) on the performance hungry devices not doing this was ok. I'm not sure what point you're making there. Use a dual mapping to manage a device (2 ioremaps). You use a SO mapping to write to registers of that device. Then when you go to write to its FIFO use a DEVICE mapping. Say TX IRQ happens at UART, I might check status bits through a SO mapping, but when it comes time to fill the FIFO I write to the DEVICE mapping. Maybe I can even arrange it such that I burst in order using the natural FIFO width. Even if you don't burst you can take advantage of the bus posting effects. Fill the FIFO and get out of there with out a big stall time. Like I said previously, a system likely will use DMA to the FIFO if performance matters, so not optimizing here has been the choice. For an experiment a couple years back we did convert the dma alloc pool addresses as NC. All worked -except- for OHCI-USB which started failing some tests. If we go down the route of marking DMA as 'normal memory non-cacheable' we're going to have a never ending stream of drivers which don't work correctly. We're forever going to be bug hunting drivers, having to add barriers as required. Arguably those barriers should be there already, but if drivers are developed on platforms without weak ordering, authors just don't think about it, and _certainly_ can't test them. Is this just the case for an attribute to be made available from an API change/addition to allow a driver to make use of it? The default can always be conservative. The trend is ARMs are depending more on pipeline and prefetch tricks to perform. For these tricks to work weak memory features need to be used at times. Regards, Richard W. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Thu, Aug 07, 2008 at 03:38:55PM -0500, Woodruff, Richard wrote: From: Russell King - ARM Linux [mailto:[EMAIL PROTECTED] Is DEVICE really safe for things other than FIFOs with out the use of barriers? As far as I'm aware, yes - and that comment is based solely upon the fact that no one has reported any problems with the kernel which have been tracked down to using the device memory type on ARMv6 and above... We do in some drivers today get spurious interrupts when DEVICE is used but don't see them when using SO. ... until now, or even that very sentence. That is our fault then I suppose for not discussing this on arm-linux. In OMAP2 and OMAP3 this has been observed. In vendor kernels where time stands still and lots of validation has happened we did stick with SO for OMAP2. On some internal kernels already we have gone to SO for OMAP3 as customers ramp and need the errors gone. The faster the system clocks the more it seems to show. To do that, and then ask about when Linux is going to start exploiting the weak memory types is a little unfair don't you think? The thing with these effects, especially spurious IRQs is there usually are several reasons they show up and several ways to make them go away. In the beginning there have been lots then they drop off as the system software matures. Then if the program survives long enough to be optimized they start to show up again but in lesser numbers. This has been the OMAP2/3 experience so far. Going SO to control regions has stamped them out at this point. What you're therefore asking for is a weak memory ordering model which doesn't require any effort on the software programmers part - that's a CPU architecture thing which you'll need to talk to ARM about. x86 can do this for the most part because x86's development has been such that the hardware has had to work around the software to make improvements. On ARM, normally when there's updates, software has to work around the hardware. That's not unexpected if you don't have the right barriers in place at the end of things such as IRQ controllers ack/mask functions. Yes. I've submitted patches (to linux-omap) and Catalin did submit patches (to arm-linux) for PIC barriers. In the past they have been rejected by Tony or you for different reasons. Tony last rejected it because he thought it should be generic at the ARM level. I don't recall what your last stance was. Looking back, I never commented on that patch. I did on the previous patch which was adding DSBs in a way which would break stuff. The patch to add them to the interrupt controllers has never been reposted. However, adding barriers may not be the correct answer for this. See Documentation/io_ordering.txt - reading back from a safe register on the target device ensures that the previous writes should hit the device before the read completes, without the overhead of a full barrier. This point is even more important if you have some form of write posting between the CPU and the device (eg, a PCI bus) - a DSB won't reach down to the target PCI device which may be behind some write-posting bridges. So, in the case of arch/arm/common/gic.c, we should be reading one of the gic control registers after the writes. In the case of arch/arm/mach-omap2/irq.c, reading the INTC_REVISION reg after masking should be a sufficient solution. But, not a barrier. However, if you use the irq-controller barriers they tend to go away. Great, so solving that should prevent them. Originally the IC-Architect wanted two memory windows per device, one SO for register control and one DEVICE for FIFO access. Given that we do DMA (which doesn't care about how ARM sees the world) on the performance hungry devices not doing this was ok. I'm not sure what point you're making there. Use a dual mapping to manage a device (2 ioremaps). You use a SO mapping to write to registers of that device. Then when you go to write to its FIFO use a DEVICE mapping. I believe ARMv7 has some restrictions on dual mapping of the same space with different types, so don't expect this technique to always work. Say TX IRQ happens at UART, I might check status bits through a SO mapping, but when it comes time to fill the FIFO I write to the DEVICE mapping. Why? Firstly, the read _has_ to complete before the program can continue. (If it hasn't completed, you don't have the data to decide what to do next.) Secondly, any previous device writes will have to complete before the read completes. So what does reading the status bits through a SO mapping gain you? The answer is, all other reads and writes previously issued by the program completing. Does that affect the status that the UART is giving you? For an experiment a couple years back we did convert the dma alloc pool addresses as NC. All worked -except- for OHCI-USB which started failing some tests. If we go down the route of
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Thu, Aug 07, 2008 at 10:20:33PM +0100, Russell King - ARM Linux wrote: In the case of arch/arm/mach-omap2/irq.c, reading the INTC_REVISION reg after masking should be a sufficient solution. And here's a patch to do exactly that. diff --git a/arch/arm/mach-omap2/irq.c b/arch/arm/mach-omap2/irq.c index 9ef15b3..27610b1 100644 --- a/arch/arm/mach-omap2/irq.c +++ b/arch/arm/mach-omap2/irq.c @@ -48,6 +48,7 @@ static struct omap_irq_bank { static void omap_ack_irq(unsigned int irq) { __raw_writel(0x1, irq_banks[0].base_reg + INTC_CONTROL); + __raw_readl(irq_banks[0].base_reg + INTC_REVISION); } static void omap_mask_irq(unsigned int irq) @@ -61,6 +62,7 @@ static void omap_mask_irq(unsigned int irq) } __raw_writel(1 irq, irq_banks[0].base_reg + INTC_MIR_SET0 + offset); + __raw_readl(irq_banks[0].base_reg + INTC_REVISION); } static void omap_unmask_irq(unsigned int irq) -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Mon, 2008-08-04 at 17:40 -0600, Paul Walmsley wrote: Add the MT_MEMORY_STRONGLY_ORDERED memory type for ARM strongly ordered memory. This is used on OMAP3 for on-board SRAM. On OMAP, SRAM is used for code that changes the SDRAM controller's clock, temporarily blocking access to SDRAM. During this period, as code executes from SRAM, the ARM cache controller can attempt to write dirty cache lines back to SDRAM to make room for SRAM cache lines, causing the MPU subsystem to hang. To avoid this, we mark SRAM as strongly- ordered memory. Why not use normal uncached memory? Strongly ordered is pretty inefficient as it cannot do any reordering or write buffer merging (it's like having a memory barrier before and after each instruction). Speculative accesses are not allowed either. Strongly ordered memory is not really meant for executing code from. + [MT_MEMORY_STRONGLY_ORDERED] = { + .prot_sect = PMD_TYPE_SECT | PMD_SECT_AP_WRITE | + PMD_SECT_UNCACHED, You can add PMD_SECT_TEX(1) for normal uncached memory. -- Catalin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] ARM MMU: add strongly-ordered memory type
On Tue, 2008-08-05 at 07:15 -0500, Woodruff, Richard wrote: Is the controller allowed to write dirty cache lines out at any time it likes? Surely a better fix is to drain the cache of the changes before changing the clock for the SDRAM? - Previously the SRAM was marked as cached. _Execution_ using that attribute will generate line fetches to SRAM. This will cause displacement write-outs of resident DDR lines. Similarly, _load/store_ sequences in that code have the same effect. This cast outs dead lock the CPU and it can't fetch to progress. [...] * Flushing the entire L1 L2 cache frequently is very expensive and better not done if you don't have. Also, it is not sufficient for the context-restore path which needs to NOT live in DDR. The code need to execute in a non-cached region. I don't think there is any guarantee that dirty cache line won't be evicted to SDRAM even if your code uses uncached memory only. The CPU is allowed to do speculative reads from the normal cached memory and these reads may force a dirty cache line to be written back to memory. You may need to do at least a cache clean operation (invalidate not necessary). -- Catalin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] ARM MMU: add strongly-ordered memory type
Why not use normal uncached memory? Strongly ordered is pretty inefficient as it cannot do any reordering or write buffer merging (it's like having a memory barrier before and after each instruction). Speculative accesses are not allowed either. Strongly ordered memory is not really meant for executing code from. It could be. This is a discussion we were having off line. The code in question is a small bit of assembly interacting with hardware mainly and has not been audited for full pipeline/buffering correctness. Most of the weak memory attributes in newer ARMs are not exploited today in tree. I'll guess this was more a correctness and capability judgment from Russell. Regards, Richard W. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ARM MMU: add strongly-ordered memory type
On Mon, Aug 04, 2008 at 05:40:57PM -0600, Paul Walmsley wrote: Add the MT_MEMORY_STRONGLY_ORDERED memory type for ARM strongly ordered memory. This is used on OMAP3 for on-board SRAM. On OMAP, SRAM is used for code that changes the SDRAM controller's clock, temporarily blocking access to SDRAM. During this period, as code executes from SRAM, the ARM cache controller can attempt to write dirty cache lines back to SDRAM to make room for SRAM cache lines, causing the MPU subsystem to hang. To avoid this, we mark SRAM as strongly- ordered memory. Is the controller allowed to write dirty cache lines out at any time it likes? Surely a better fix is to drain the cache of the changes before changing the clock for the SDRAM? Problem noted by Richard Woodruff [EMAIL PROTECTED]. Fix derived from the TI CDP codebase. Signed-off-by: Paul Walmsley [EMAIL PROTECTED] --- arch/arm/mm/mmu.c |5 + include/asm-arm/mach/map.h | 13 +++-- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index 2d6d682..5b56539 100644 --- a/arch/arm/mm/mmu.c +++ b/arch/arm/mm/mmu.c @@ -239,6 +239,11 @@ static struct mem_type mem_types[] = { .prot_sect = PMD_TYPE_SECT, .domain= DOMAIN_KERNEL, }, + [MT_MEMORY_STRONGLY_ORDERED] = { + .prot_sect = PMD_TYPE_SECT | PMD_SECT_AP_WRITE | + PMD_SECT_UNCACHED, + .domain= DOMAIN_KERNEL, + }, }; const struct mem_type *get_mem_type(unsigned int type) diff --git a/include/asm-arm/mach/map.h b/include/asm-arm/mach/map.h index 7ef3c83..8cb46b7 100644 --- a/include/asm-arm/mach/map.h +++ b/include/asm-arm/mach/map.h @@ -19,12 +19,13 @@ struct map_desc { }; /* types 0-3 are defined in asm/io.h */ -#define MT_CACHECLEAN4 -#define MT_MINICLEAN 5 -#define MT_LOW_VECTORS 6 -#define MT_HIGH_VECTORS 7 -#define MT_MEMORY8 -#define MT_ROM 9 +#define MT_CACHECLEAN4 +#define MT_MINICLEAN 5 +#define MT_LOW_VECTORS 6 +#define MT_HIGH_VECTORS 7 +#define MT_MEMORY8 +#define MT_ROM 9 +#define MT_MEMORY_STRONGLY_ORDERED 10 #define MT_NONSHARED_DEVICE MT_DEVICE_NONSHARED #define MT_IXP2000_DEVICEMT_DEVICE_IXP2000 --- List admin: http://lists.arm.linux.org.uk/mailman/listinfo/linux-arm-kernel FAQ:http://www.arm.linux.org.uk/mailinglists/faq.php Etiquette: http://www.arm.linux.org.uk/mailinglists/etiquette.php -- -- Ben Q: What's a light-year? A: One-third less calories than a regular year. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html