Hello everyone, I'm working on a Cortex-A9 SoC equipped with 2 GB of RAM.
However, Linux is only given a fraction of that RAM to manage (typically 256 MB, via the mem= boot parameter), while the rest is managed by "OS-agnostic software". This "other memory" is meant to be shared between the different hardware blocks of the SoC.

We have a custom "memory_copy" kernel module to copy between "Linux-managed RAM" and "SoC-wide RAM", but its performance is underwhelming: 8.5 MB/s. Taking a closer look at the implementation, I spotted some inefficiencies:

1) data is first copied (in chunks) to a temporary kernel buffer;
2) for each word, a hardware remap is set up, the word is copied, then the remap is reset. (This hardware remap technique dates back to when we used MIPS.)

I thought I could make the implementation simpler and boost performance at the same time:

A) I used ioremap() to have Linux map the "SoC-wide RAM" physical addresses to virtual addresses usable from the module.
B) I then used copy_{to,from}_user() directly between the user-space buffer and the "SoC-wide RAM".

This approach is ~20x faster than the original; rough sketches of what I mean are appended at the end of this mail.

My main question is: is this safe/guaranteed to work all the time, as long as the "SoC-wide RAM" is indeed RAM, not memory-mapped registers?

Secondary thoughts/questions:

- We have routines for accesses in units of 8, 16 and 32 bits. Since we're dealing with memory, I don't think the width of the accesses matters for correctness, right?

- AFAIU, ioremap() maps the range as MT_DEVICE, i.e. uncached, no write-combining, all memory optimizations disabled. There might be some performance to gain by using cached accesses and manually flushing once the copy is done (last sketch below).

- I don't know whether copy_{to,from}_user() is optimized with SIMD/NEON; maybe there is some performance left on the table there?

Regards.
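
Sketch 1: roughly what the new path (A + B) boils down to. Untested sketch, write side only (the read side is symmetric with copy_to_user); SOC_RAM_PHYS_BASE, SOC_RAM_SIZE, soc_ram and soc_ram_write() are invented names, the real values and plumbing live in our board code.

#include <linux/errno.h>
#include <linux/io.h>
#include <linux/uaccess.h>

/* Invented values for illustration. Note the window has to fit in
 * vmalloc space, so this maps a slice, not the whole remaining RAM. */
#define SOC_RAM_PHYS_BASE	0x10000000UL
#define SOC_RAM_SIZE		(64UL << 20)

static void __iomem *soc_ram;

static int soc_ram_map(void)
{
	soc_ram = ioremap(SOC_RAM_PHYS_BASE, SOC_RAM_SIZE);	/* step A */
	return soc_ram ? 0 : -ENOMEM;
}

/* Step B: user buffer -> SoC-wide RAM, no bounce buffer, no per-word
 * remap. The __force cast silences sparse about mixing __iomem and
 * normal pointers, which is exactly what my main question is about. */
static int soc_ram_write(unsigned long off, const void __user *ubuf,
			 size_t len)
{
	if (off >= SOC_RAM_SIZE || len > SOC_RAM_SIZE - off)
		return -EINVAL;
	if (copy_from_user((void __force *)(soc_ram + off), ubuf, len))
		return -EFAULT;
	return 0;
}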
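
Sketch 2: for context on the width question, our width-specific routines essentially do this (same invented soc_ram as above):

/* 8/16/32-bit accessors, as in our current API (sketch).
 * readb()/readw()/readl() issue single loads of the given width; on
 * plain RAM I would expect the width to affect throughput only. */
static u8  soc_ram_read8(unsigned long off)  { return readb(soc_ram + off); }
static u16 soc_ram_read16(unsigned long off) { return readw(soc_ram + off); }
static u32 soc_ram_read32(unsigned long off) { return readl(soc_ram + off); }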
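
Sketch 3: the kind of thing I have in mind for the cached experiment. Very much a sketch: I'm not sure whether the mapping helper is spelled ioremap_cache() or ioremap_cached() on our kernel version, nor whether the two flush helpers below are exported to modules; bounds checks as in sketch 1 omitted.

#include <linux/uaccess.h>
#include <asm/cacheflush.h>	/* __cpuc_flush_dcache_area() */
#include <asm/outercache.h>	/* outer_flush_range(), for the PL310 next to the A9 */

static void __iomem *soc_ram_cached;	/* from ioremap_cache()/ioremap_cached() */

static int soc_ram_write_cached(unsigned long off, const void __user *ubuf,
				size_t len)
{
	void *dst = (void __force *)(soc_ram_cached + off);

	if (copy_from_user(dst, ubuf, len))
		return -EFAULT;

	/* The other SoC blocks don't snoop our caches, so clean the
	 * written range out of L1 and the outer L2 before handing the
	 * buffer over to them. */
	__cpuc_flush_dcache_area(dst, len);
	outer_flush_range(SOC_RAM_PHYS_BASE + off,
			  SOC_RAM_PHYS_BASE + off + len);
	return 0;
}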