Dear Andy Green,

In message <20130726065323.27333.82421.stgit@localhost.localdomain> you wrote:
> While studying the reason why kernel copy from NOR was so slow on our
> platform,
> I realized U-Boot is pulling it from 32-bit NOR in 8-bit chunks needlessly.
>
> bootm uses memmove() and that just takes the approach by default to move u8s
> around.
>
> This optimization prefers memcpy() implementation (done mostly in 32-bit reads
> and writes) if there's no overlap in source and dest, resulting in a huge
> speedup on our platform (480ms copy from 32-bit NOR ---> 140ms)

Sorry, but I dislike your patch.

Instead of making assumptions about the performance of memcpy() and
adding the overhead of an additional function call (which can be
expensive, especially for short copy operations), it would make more
sense to pull the "copy a word at a time" code from memcpy() into
memmove(), too.

On the other hand, if you really care about performance, then why do
you not make sure that you provide optimized implementations for such
functions and consequently #define __HAVE_ARCH_MEMMOVE (and
__HAVE_ARCH_MEMCPY)?

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de
It is dangerous to be sincere unless you are also stupid.
                                                - George Bernard Shaw
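
For illustration only, here is a rough sketch of what pulling the "copy a word at a time" code into memmove() could look like. It is not the actual U-Boot code. The name sketch_memmove() and the word_t typedef are made up for this example, and the may_alias attribute assumes a GCC or Clang toolchain. Word accesses are used only when source and destination share the same misalignment; everything else falls back to byte copies.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC/Clang extension so word accesses may alias byte buffers. */
typedef unsigned long __attribute__((__may_alias__)) word_t;

#define WORD_MASK (sizeof(word_t) - 1)

/* Hypothetical name, chosen to avoid clashing with the real memmove(). */
void *sketch_memmove(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	/* Word copies are only possible if both pointers can be aligned together. */
	int same_align = (((uintptr_t)d ^ (uintptr_t)s) & WORD_MASK) == 0;

	if ((uintptr_t)d <= (uintptr_t)s) {
		/* dest is at or below src: copy forwards */
		if (same_align) {
			/* bytes up to the first word boundary */
			while (count && ((uintptr_t)d & WORD_MASK)) {
				*d++ = *s++;
				count--;
			}
			/* bulk of the data, one word at a time */
			while (count >= sizeof(word_t)) {
				*(word_t *)d = *(const word_t *)s;
				d += sizeof(word_t);
				s += sizeof(word_t);
				count -= sizeof(word_t);
			}
		}
		/* remaining bytes (or everything, if misaligned) */
		while (count--)
			*d++ = *s++;
	} else {
		/* dest is above src: copy backwards from the end */
		d += count;
		s += count;
		if (same_align) {
			while (count && ((uintptr_t)d & WORD_MASK)) {
				*--d = *--s;
				count--;
			}
			while (count >= sizeof(word_t)) {
				d -= sizeof(word_t);
				s -= sizeof(word_t);
				*(word_t *)d = *(const word_t *)s;
				count -= sizeof(word_t);
			}
		}
		while (count--)
			*--d = *--s;
	}

	return dest;
}
```

This keeps everything inside one function, so short copies do not pay for an extra call. An architecture that provides its own optimized routine would instead #define __HAVE_ARCH_MEMMOVE and bypass a generic version like this entirely.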