CCing the ARM custodian. Albert, what do you think of Alexey's comments below? Actually, having read it properly myself I think Alexey is confusing cache flushing with cache invalidation, I've left the CC in place though in case you have any thoughts on the matter.
On Fri, 2014-04-25 at 08:48 +0000, Alexey Brodkin wrote: > I thought a bit more about this situation and now I'm not that sure if > we need to align addresses we pass to cache invalidate/flush functions. > > Because IMHO drivers shouldn't care about specifics of particular > platform or architecture. Otherwise we'll need to patch each and every > driver only for cache invalidate/flush functions. I looked how these > functions are used in other drivers and see that in most of cases no > additional alignment precautions were implemented. People just pass > start and end addresses. > > In its turn platform and architecture provides cache invalidate/flush > functions implement its functionality depending on hardware specifics. > > For example on architectures that may only flush/invalidate with > granularity of 1 cache line cache invalidate/flush functions make sure > to start processing from the start of the cache line to which start > address falls and end processing when cache line where end address falls > is processed. > > I may assume that there're architectures that automatically understand > from which cache line to start and at which line to stop processing. > > But if your architecture requires cache line aligned addresses to be > used for start/end addresses you may look for examples in ARC > (http://git.denx.de/?p=u-boot/u-boot-arc.git;a=blob;f=arch/arc/cpu/arc700/cache.c),, > MIPS > (http://git.denx.de/?p=u-boot/u-boot-arc.git;a=blob;f=arch/mips/cpu/mips32/cpu.c), > SH > (http://git.denx.de/?p=u-boot/u-boot-arc.git;a=blob;f=arch/sh/cpu/sh4/cache.c), > > and what's interesting even implementation you use have semi-proper > start/end addresses handling - > http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arm/lib/cache-pl310.c This is the driver for one particular ARM cache controller and not the one used for the SoC. In any case it does "proper" start/end handling only for cache flush operations, not cache invalidate. Cache invalidate is a potentially destructive operation (throwing away data in the caches), having it operate on anything more than the precise region requested would be very surprising to almost anyone I think. > > Here's your invalidation procedure: > ============ > /* invalidate memory from start to stop-1 */ > void v7_outer_cache_inval_range(u32 start, u32 stop) > { > /* PL310 currently supports only 32 bytes cache line */ > u32 pa, line_size = 32; > > /* > * If start address is not aligned to cache-line do not > * invalidate the first cache-line > */ > if (start & (line_size - 1)) { > printf("ERROR: %s - start address is not aligned - 0x%08x\n", > __func__, start); > /* move to next cache line */ > start = (start + line_size - 1) & ~(line_size - 1); > } > > /* > * If stop address is not aligned to cache-line do not > * invalidate the last cache-line > */ > if (stop & (line_size - 1)) { > printf("ERROR: %s - stop address is not aligned - 0x%08x\n", > __func__, stop); > /* align to the beginning of this cache line */ > stop &= ~(line_size - 1); > } > > for (pa = start; pa < stop; pa = pa + line_size) > writel(pa, &pl310->pl310_inv_line_pa); > > pl310_cache_sync(); > } > ============ > > 1. I don't understand why start from the next cache line if start > address is not aligned to cache line boundary? I'd say that you want to > invalidate cache line that contains unaligned start address. Otherwise > first bytes won't be invalidated, right? > > 2. Why do we throw _error_ message. I may understand if you emit > _warning_ message in case of debug build (with DEBUG defined). Well in > current implementation (see 1) it could be error because behavior is > really dangerous. But if you start from correct cache line only warning > in debug mode makes sense (IMHO). > > 3. Stop/end address in contrast might need to be extended depending on > HW implementation (see above comment). > > And here's your flush procedure: > =========== > void v7_outer_cache_flush_range(u32 start, u32 stop) > { > /* PL310 currently supports only 32 bytes cache line */ > u32 pa, line_size = 32; > > /* > * Align to the beginning of cache-line - this ensures that > * the first 5 bits are 0 as required by PL310 TRM > */ > start &= ~(line_size - 1); > > for (pa = start; pa < stop; pa = pa + line_size) > writel(pa, &pl310->pl310_clean_inv_line_pa); > > pl310_cache_sync(); > } > =========== > > Which looks very correct to me. I'm wondering if there was a reason to > have so different implementation of functions that do very similar > things. I think you are missing the important differences between a cache flush and a cache invalidate. > So at this point I would ask you to modify cache invalidate function for > your architecture. This way you prevent mentioned issues with other > drivers. As I describe above, I don't think this would be at all wise. > > I'm not seeing any others, in practice or by eye-balling the code. > > > > > 4. Why don't you squeeze all 3 patches in 1 and name it like "fix > > > alignment issues with caches on some platforms"? Basically with all 3 > > > patches you fix one and only issue and application of any one of those 3 > > > patches doesn't solve your problem, right? > > > > These are the issues as I discovered them one by one. I can fold them if > > you like but doing them separately will aid bisection if one of them > > turns out to be wrong in some way. As you prefer. > > Keeping in mind things written above I'd say that patches 2 & 3 are not > needed at all, while patch 1 makes perfect sense and fixes an obvious > issue. > > Regards, > Alexey > _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot