[dpdk-dev] [PATCH] Clean up rte_memcpy.h file

2015-04-15 Thread Ravi Kerur
On Tue, Apr 14, 2015 at 7:53 PM, Stephen Hemminger <stephen at networkplumber.org> wrote:
> On Tue, 14 Apr 2015 14:31:53 -0700
> Ravi Kerur wrote:
>
> > +
> > +	for (i = 0; i < 2; i++)
> > +		rte_mov32(dst + i * 32, src + i * 32);
> > }
> Unless you force compiler to unroll the l

[dpdk-dev] [PATCH] Clean up rte_memcpy.h file

2015-04-15 Thread Ravi Kerur
On Tue, Apr 14, 2015 at 11:32 PM, Pawel Wodkowski <pawelx.wodkowski at intel.com> wrote:
> On 2015-04-14 23:31, Ravi Kerur wrote:
>
>> +
>> +	for (i = 0; i < 8; i++) {
>> +		ymm = _mm256_loadu_si256((const __m256i *)(src + i * 32));
>> +

[dpdk-dev] [PATCH] Clean up rte_memcpy.h file

2015-04-15 Thread Pawel Wodkowski
On 2015-04-14 23:31, Ravi Kerur wrote:
> +
> +	for (i = 0; i < 8; i++) {
> +		ymm = _mm256_loadu_si256((const __m256i *)(src + i * 32));
> +		_mm256_storeu_si256((__m256i *)(dst + i * 32), ymm);
> +	}
> +
> 	n -= 256;
> -

[dpdk-dev] [PATCH] Clean up rte_memcpy.h file

2015-04-14 Thread Stephen Hemminger
On Tue, 14 Apr 2015 14:31:53 -0700 Ravi Kerur wrote:
> +
> +	for (i = 0; i < 2; i++)
> +		rte_mov32(dst + i * 32, src + i * 32);
> }

Unless you force compiler to unroll the loop, it will be slower.
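
For readers unfamiliar with the point raised in this reply, the following is a minimal sketch of one way to ask a compiler to unroll such a fixed-count loop. It is not taken from the patch or from DPDK: `UNROLL2`, `copy32`, and `copy64_unrolled` are hypothetical names, `copy32` is a portable `memcpy`-based stand-in for `rte_mov32`, and the pragmas assume Clang or GCC 8 or newer (on other compilers the macro expands to nothing). The sketch makes no performance claim; it only shows how the request can be expressed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper macro: request unrolling of the loop that follows.
 * Clang is tested first because it also defines __GNUC__. */
#if defined(__clang__)
#define UNROLL2 _Pragma("clang loop unroll(full)")
#elif defined(__GNUC__) && __GNUC__ >= 8
#define UNROLL2 _Pragma("GCC unroll 2")
#else
#define UNROLL2 /* no unroll hint available; plain loop */
#endif

/* Hypothetical stand-in for rte_mov32(): copies exactly 32 bytes. */
static inline void copy32(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 32);
}

/* Copies 64 bytes as two 32-byte blocks, with an explicit unroll hint. */
static inline void copy64_unrolled(uint8_t *dst, const uint8_t *src)
{
	int i;

	UNROLL2
	for (i = 0; i < 2; i++)
		copy32(dst + i * 32, src + i * 32);
}
```

Whether the hint changes the generated code depends on the compiler and optimization level, so inspecting the disassembly or benchmarking would be needed before drawing conclusions.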

[dpdk-dev] [PATCH] Clean up rte_memcpy.h file

2015-04-14 Thread Ravi Kerur
Remove unnecessary type casting in functions.
Use loop to adjust offset during copy instead of separate invocations.

Signed-off-by: Ravi Kerur
---
 .../common/include/arch/x86/rte_memcpy.h | 317 ++---
 1 file changed, 151 insertions(+), 166 deletions(-)

diff --git a/lib
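
The two changes named in the commit message can be illustrated with a small sketch. This is an assumed shape, not the actual diff (which is truncated above): `mov32`, `mov64_separate`, and `mov64_loop` are hypothetical names, and `mov32` uses `memcpy` in place of the SIMD loads and stores a real implementation would use. The "separate" form casts pointers that already have the right type and writes each offset by hand; the "loop" form drops the casts and computes the offset from the loop index.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 32-byte copy helper such as rte_mov32(). */
static inline void mov32(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 32);
}

/* Separate invocations: one call per block, with redundant casts
 * (dst and src already have the types they are cast to). */
static inline void mov64_separate(uint8_t *dst, const uint8_t *src)
{
	mov32((uint8_t *)dst + 0 * 32, (const uint8_t *)src + 0 * 32);
	mov32((uint8_t *)dst + 1 * 32, (const uint8_t *)src + 1 * 32);
}

/* Loop form: no casts, offset derived from the loop index. */
static inline void mov64_loop(uint8_t *dst, const uint8_t *src)
{
	int i;

	for (i = 0; i < 2; i++)
		mov32(dst + i * 32, src + i * 32);
}
```

Both functions copy the same 64 bytes; the difference is in readability and in how much the compiler must do to produce straight-line code.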