> > What about run-time patching memcpy() after the caches are initialised?
>
> Yeah, that's the solution we use on 64-bit.
>
> It also means you can have cpu specific optimisations, which can be patched in
> or out using the cpu feature patching.

I've noticed x86 doing that. For newer Intel parts it patches in 'rep movsb',
but unfortunately memcpy_io is always #defined to memcpy.

For uncached targets the hardware can't optimise 'rep movsb', so you end up
with byte accesses. These can be rather slower than expected.
This also affects userspace copies to mmap()ed PCIe space.

	David
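
To illustrate the byte-access point, here is a rough sketch of a copy that uses aligned 64-bit stores for the bulk of the data instead of leaving the access width to memcpy(). The name copy_to_uncached() is hypothetical and not an existing kernel or libc function, and a plain volatile store only stands in for whatever MMIO accessor real code would use. It assumes the target accepts 8-byte writes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel or libc API): copy to a destination
 * that may be uncached, issuing aligned 64-bit stores wherever possible
 * so that only the unaligned head and tail become byte accesses.
 */
static void copy_to_uncached(volatile void *dst, const void *src, size_t len)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	/* Head: byte stores until the destination is 8-byte aligned. */
	while (len > 0 && ((uintptr_t)d & 7) != 0) {
		*d++ = *s++;
		len--;
	}

	/* Bulk: one aligned 64-bit store per iteration. */
	while (len >= 8) {
		uint64_t v;

		memcpy(&v, s, sizeof(v));	/* source may be unaligned */
		*(volatile uint64_t *)d = v;
		d += 8;
		s += 8;
		len -= 8;
	}

	/* Tail: remaining bytes, one at a time. */
	while (len > 0) {
		*d++ = *s++;
		len--;
	}
}
```

For a 37-byte copy to a destination that sits 3 bytes past an 8-byte boundary, this issues 5 byte stores for the head and 4 64-bit stores for the rest, rather than 37 byte accesses.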