RE: [PATCH 6/6] ARM: l2x0: Optimise the range based operations
> -----Original Message-----
> From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap-ow...@vger.kernel.org] On Behalf Of Shilimkar, Santosh
> Sent: Tuesday, October 05, 2010 10:25 AM
> To: Russell King - ARM Linux
> Cc: linux-arm-ker...@lists.infradead.org; catalin.mari...@arm.com; t...@linutronix.de; linux-omap@vger.kernel.org
> Subject: RE: [PATCH 6/6] ARM: l2x0: Optimise the range based operations
>
> [...]
>
> > mmcblk0: error -5 transferring data, sector 1, nr 7, card status 0x900
> > end_request: I/O error, dev mmcblk0, sector 1
> > Buffer I/O error on device mmcblk0, logical block 0
> > port 1 high speed
> >
> > -5 is -EIO, which is a FIFO overrun error, so somehow these changes are
> > causing the CPU or bus accesses to be slower.
>
> I don't see the problem on OMAP MMC. Maybe the additional check is somehow
> making these operations a touch slower, which leads to the underrun.
> Will have a look at it again. Maybe for this merge window you can drop the
> 'Optimise the range based operations' and 'Determine cache size' patches.

The PL310 cache way size is given in KB, and the 'Determine cache size' patch did not account for that. I have updated the git tree with the refreshed patch.

Here is the updated patch:

--
From 8b351fbc4da738a0727854cb88933c4051657384 Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar <santosh.shilim...@ti.com>
Date: Sun, 11 Jul 2010 14:35:37 +0530
Subject: [PATCH 6/8 v2] ARM: l2x0: Determine the cache size

The cache size is needed to optimise range based maintenance operations.

Signed-off-by: Santosh Shilimkar <santosh.shilim...@ti.com>
Acked-by: Catalin Marinas <catalin.mari...@arm.com>
Acked-by: Linus Walleij <linus.wall...@stericsson.com>
---
 arch/arm/include/asm/hardware/cache-l2x0.h |    1 +
 arch/arm/mm/cache-l2x0.c                   |   13 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/hardware/cache-l2x0.h b/arch/arm/include/asm/hardware/cache-l2x0.h
index d833355..4633d2a 100644
--- a/arch/arm/include/asm/hardware/cache-l2x0.h
+++ b/arch/arm/include/asm/hardware/cache-l2x0.h
@@ -55,6 +55,7 @@
 #define L2X0_CACHE_ID_PART_MASK		(0xf << 6)
 #define L2X0_CACHE_ID_PART_L210		(1 << 6)
 #define L2X0_CACHE_ID_PART_L310		(3 << 6)
+#define L2X0_AUX_CTRL_WAY_SIZE_MASK	(0x3 << 17)

 #ifndef __ASSEMBLY__
 extern void __init l2x0_init(void __iomem *base, __u32 aux_val, __u32 aux_mask);
diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index 9310d61..262c752 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -28,6 +28,7 @@
 static void __iomem *l2x0_base;
 static DEFINE_SPINLOCK(l2x0_lock);
 static uint32_t l2x0_way_mask;	/* Bitmask of active ways */
+static uint32_t l2x0_size;

 static inline void cache_wait_way(void __iomem *reg, unsigned long mask)
 {
@@ -242,6 +243,7 @@ void __init l2x0_init(void __iomem *base, __u32 aux_val, __u32 aux_mask)
 {
 	__u32 aux;
 	__u32 cache_id;
+	__u32 way_size = 0;
 	int ways;
 	const char *type;

@@ -276,6 +278,13 @@ void __init l2x0_init(void __iomem *base, __u32 aux_val, __u32 aux_mask)
 	l2x0_way_mask = (1 << ways) - 1;

 	/*
+	 * L2 cache Size = Way size * Number of ways
+	 */
+	way_size = (aux & L2X0_AUX_CTRL_WAY_SIZE_MASK) >> 17;
+	way_size = 1 << (way_size + 3);
+	l2x0_size = ways * way_size * SZ_1K;
+
+	/*
 	 * Check if l2x0 controller is already enabled.
 	 * If you are booting from non-secure mode
 	 * accessing the below registers will fault.
@@ -300,6 +309,6 @@ void __init l2x0_init(void __iomem *base, __u32 aux_val, __u32 aux_mask)
 	outer_cache.disable = l2x0_disable;

 	printk(KERN_INFO "%s cache controller enabled\n", type);
-	printk(KERN_INFO "l2x0: %d ways, CACHE_ID 0x%08x, AUX_CTRL 0x%08x\n",
-			 ways, cache_id, aux);
+	printk(KERN_INFO "l2x0: %d ways, CACHE_ID 0x%08x, AUX_CTRL 0x%08x, Cache size: %d B\n",
+			ways, cache_id, aux, l2x0_size);
 }
--
1.6.0.4

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

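The size arithmetic in the patch above can be tried outside the kernel. The following is a minimal user-space sketch, not the driver code: the helper name `l2x0_cache_size`, the local `SZ_1K` definition and the shift macro are illustrative assumptions, while the mask value and the formula are copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mask as defined in the patch: way-size field starting at bit 17. */
#define L2X0_AUX_CTRL_WAY_SIZE_SHIFT	17u
#define L2X0_AUX_CTRL_WAY_SIZE_MASK	(0x3u << L2X0_AUX_CTRL_WAY_SIZE_SHIFT)
#define SZ_1K				1024u	/* local stand-in for the kernel constant */

/*
 * Hypothetical helper mirroring the patch:
 *   L2 cache size = way size * number of ways
 * The register field encodes the way size in KB as 2^(field + 3),
 * so the result has to be scaled by SZ_1K to get bytes.
 */
static uint32_t l2x0_cache_size(uint32_t aux, unsigned int ways)
{
	uint32_t field;
	uint32_t way_size_kb;

	assert(ways > 0);

	field = (aux & L2X0_AUX_CTRL_WAY_SIZE_MASK) >> L2X0_AUX_CTRL_WAY_SIZE_SHIFT;
	way_size_kb = 1u << (field + 3);	/* field 1 -> 16 KB, field 3 -> 64 KB */

	return ways * way_size_kb * SZ_1K;
}
```

With a made-up register value whose way-size field is 3 (`0x3u << 17`) and 8 ways, the helper returns 524288 bytes, i.e. 512 KB. Dropping the `SZ_1K` factor would yield 512, a figure in KB rather than bytes, which is the unit problem the refreshed patch addresses.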
Re: [PATCH 6/6] ARM: l2x0: Optimise the range based operations
On Tue, Sep 07, 2010 at 01:27:23PM +0530, Santosh Shilimkar wrote:
> For the big buffers which are in excess of cache size, the maintenance
> operations by PA are very slow. For such buffers the maintenance
> operations can be sped up by using the WAY based method.

This causes my Versatile Express to corrupt MMC transfers. Reverting both this and the 'Determine cache size' patches makes it work again. (Note that just reverting this one doesn't result in a working situation.)

Good boot:

L310 cache controller enabled
l2x0: 8 ways, CACHE_ID 0x41c3, AUX_CTRL 0x0246
...
mmci-pl18x mb:mmci: mmc0: MMCI rev 0 cfg 00 at 0x10005000 irq 41,42
aaci-pl041 mb:aaci: ARM AC'97 Interface at 0x10004000, irq 43, fifo 512
ALSA device list:
  #0: ARM AC'97 Interface at 0x10004000, irq 43
TCP cubic registered
mmc0: host does not support reading read-only switch. assuming write-enable.
mmc0: new SD card at address e624
mmcblk0: mmc0:e624 SD02G 1.89 GiB
NET: Registered protocol family 17
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 0
 mmcblk0: p1
Initalizing network drop monitor service
Waiting 5sec before mounting root device...
port 1 high speed

Bad boot (with just 'Determine cache size' patch applied):

L310 cache controller enabled
l2x0: 8 ways, CACHE_ID 0x41c3, AUX_CTRL 0x0246, Cache size: 512 KB
...
mmci-pl18x mb:mmci: mmc0: MMCI rev 0 cfg 00 at 0x10005000 irq 41,42
aaci-pl041 mb:aaci: ARM AC'97 Interface at 0x10004000, irq 43, fifo 512
ALSA device list:
  #0: ARM AC'97 Interface at 0x10004000, irq 43
mmc0: host does not support reading read-only switch. assuming write-enable.
mmc0: new SD card at address e624
mmcblk0: mmc0:e624 SD02G 1.89 GiB
TCP cubic registered
NET: Registered protocol family 17
mmcblk0: retrying using single block read
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 0
mmcblk0: error -5 transferring data, sector 0, nr 8, card status 0x900
Initalizing network drop monitor service
random garbage
mmcblk0: error -5 transferring data, sector 1, nr 7, card status 0x900
end_request: I/O error, dev mmcblk0, sector 1
Buffer I/O error on device mmcblk0, logical block 0
port 1 high speed

-5 is -EIO, which is a FIFO overrun error, so somehow these changes are causing the CPU or bus accesses to be slower.

RE: [PATCH 6/6] ARM: l2x0: Optimise the range based operations
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:li...@arm.linux.org.uk]
> Sent: Tuesday, October 05, 2010 2:53 AM
> To: Shilimkar, Santosh
> Cc: linux-arm-ker...@lists.infradead.org; catalin.mari...@arm.com; t...@linutronix.de; linux-omap@vger.kernel.org
> Subject: Re: [PATCH 6/6] ARM: l2x0: Optimise the range based operations
>
> On Tue, Sep 07, 2010 at 01:27:23PM +0530, Santosh Shilimkar wrote:
> > For the big buffers which are in excess of cache size, the maintenance
> > operations by PA are very slow. For such buffers the maintenance
> > operations can be sped up by using the WAY based method.
>
> This causes my Versatile Express to corrupt MMC transfers. Reverting
> both this and the 'Determine cache size' patches makes it work again.
> (Note that just reverting this one doesn't result in a working situation.)

MMC buffers are mostly smaller than 512 KB, so the optimization won't even be invoked.

> Good boot:
>
> L310 cache controller enabled
> l2x0: 8 ways, CACHE_ID 0x41c3, AUX_CTRL 0x0246
> ...
> mmci-pl18x mb:mmci: mmc0: MMCI rev 0 cfg 00 at 0x10005000 irq 41,42
> aaci-pl041 mb:aaci: ARM AC'97 Interface at 0x10004000, irq 43, fifo 512
> ALSA device list:
>   #0: ARM AC'97 Interface at 0x10004000, irq 43
> TCP cubic registered
> mmc0: host does not support reading read-only switch. assuming write-enable.
> mmc0: new SD card at address e624
> mmcblk0: mmc0:e624 SD02G 1.89 GiB
> NET: Registered protocol family 17
> VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 0
>  mmcblk0: p1
> Initalizing network drop monitor service
> Waiting 5sec before mounting root device...
> port 1 high speed
>
> Bad boot (with just 'Determine cache size' patch applied):
>
> L310 cache controller enabled
> l2x0: 8 ways, CACHE_ID 0x41c3, AUX_CTRL 0x0246, Cache size: 512 KB
> ...
> mmci-pl18x mb:mmci: mmc0: MMCI rev 0 cfg 00 at 0x10005000 irq 41,42
> aaci-pl041 mb:aaci: ARM AC'97 Interface at 0x10004000, irq 43, fifo 512
> ALSA device list:
>   #0: ARM AC'97 Interface at 0x10004000, irq 43
> mmc0: host does not support reading read-only switch. assuming write-enable.
> mmc0: new SD card at address e624
> mmcblk0: mmc0:e624 SD02G 1.89 GiB
> TCP cubic registered
> NET: Registered protocol family 17
> mmcblk0: retrying using single block read
> VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 0
> mmcblk0: error -5 transferring data, sector 0, nr 8, card status 0x900
> Initalizing network drop monitor service
> random garbage
> mmcblk0: error -5 transferring data, sector 1, nr 7, card status 0x900
> end_request: I/O error, dev mmcblk0, sector 1
> Buffer I/O error on device mmcblk0, logical block 0
> port 1 high speed
>
> -5 is -EIO, which is a FIFO overrun error, so somehow these changes are
> causing the CPU or bus accesses to be slower.

I don't see the problem on OMAP MMC. Maybe the additional check is somehow making these operations a touch slower, which leads to the underrun. Will have a look at it again. Maybe for this merge window you can drop the 'Optimise the range based operations' and 'Determine cache size' patches.

Regards,
Santosh
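
The size check discussed in this thread can be sketched as below. This is only an illustration of the idea (pick the WAY based method once the buffer is at least as large as the cache), not the code from the 'Optimise the range based operations' patch, which is not shown here. The enum, the function name `l2x0_pick_method` and the use of `>=` rather than a strict `>` are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the two maintenance strategies. */
enum l2x0_op_method {
	L2X0_OP_BY_PA,	/* walk the range line by line, by physical address */
	L2X0_OP_BY_WAY	/* operate on the whole cache, way by way */
};

/*
 * Assumed policy: once the range [start, end) covers at least the whole
 * cache, a per-line walk touches more lines than the cache holds, so a
 * single whole-cache (way based) operation should be cheaper.
 */
static enum l2x0_op_method l2x0_pick_method(uint32_t start, uint32_t end,
					    uint32_t cache_size)
{
	assert(start <= end);

	return (end - start >= cache_size) ? L2X0_OP_BY_WAY : L2X0_OP_BY_PA;
}
```

For the failing transfer in the log (`nr 8`, that is 8 sectors of 512 bytes, or 4 KB) and a 512 KB cache, this sketch selects `L2X0_OP_BY_PA`. That matches the observation that typical MMC buffers never reach the way based path, so the only added cost on that path would be the comparison itself.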