DPDK team,

I am looking at the rte_memcpy.h implementation, and I am not sure whether the code in that file is written the way it is for a specific reason. I see superfluous type casts in the functions, and instead of using a loop to step through the offsets during a copy, the same function is invoked repeatedly, once per offset.
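To illustrate the pattern I mean, here is a simplified sketch. It is not the actual rte_memcpy.h source: copy16(), copy64_unrolled() and copy64_loop() are hypothetical names, and plain memcpy() stands in for the fixed-size copy routine.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte copy helper; memcpy() is only a stand-in here. */
static inline void copy16(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, 16);
}

/* Current style: one call per offset, with casts that change nothing
 * because the arguments already have the target pointer types. */
static inline void copy64_unrolled(uint8_t *dst, const uint8_t *src)
{
    copy16((uint8_t *)dst + 0 * 16, (const uint8_t *)src + 0 * 16);
    copy16((uint8_t *)dst + 1 * 16, (const uint8_t *)src + 1 * 16);
    copy16((uint8_t *)dst + 2 * 16, (const uint8_t *)src + 2 * 16);
    copy16((uint8_t *)dst + 3 * 16, (const uint8_t *)src + 3 * 16);
}

/* Proposed style: no casts, and a loop over the offsets. */
static inline void copy64_loop(uint8_t *dst, const uint8_t *src)
{
    for (size_t off = 0; off < 64; off += 16)
        copy16(dst + off, src + off);
}
```

Both functions copy the same 64 bytes. Whether a compiler emits identical code for the two forms depends on the compiler version and optimization level, so that is something to check rather than assume.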
I modified the code to remove the unnecessary type casts and to use a loop for the offset changes. I compared the code generated by gcc 4.8.2 before and after the change, and it looked the same in both cases. In addition, the memcpy performance test run via "make test" gave similar results.

I will send out a patch so you can review the changes and let me know whether they are acceptable.

Thanks,
Ravi