[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
On Mon, 21 Mar 2016 06:24:37 -0700 (PDT) Thomas Monjalon wrote: > 2016-03-21 13:21, Jan Viktorin: > > On Mon, 21 Mar 2016 13:42:31 +0800 > > Jianbo Liu wrote: > > > > > On 20 March 2016 at 03:58, Jan Viktorin > > > wrote: > > > > The flag is used to enable memcpy optimizations in EAL. As it is not > > > > always > > > > the performance benefit, the flag allows to disable it. > > > > > > > > Signed-off-by: Jan Viktorin > > > > --- > > > > config/defconfig_arm-armv7a-linuxapp-gcc | 1 + > > > > lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 8 ++-- > > > > 2 files changed, 7 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc > > > > b/config/defconfig_arm-armv7a-linuxapp-gcc > > > > index 96c3343..2c60c2c 100644 > > > > --- a/config/defconfig_arm-armv7a-linuxapp-gcc > > > > +++ b/config/defconfig_arm-armv7a-linuxapp-gcc > > > > @@ -36,6 +36,7 @@ CONFIG_RTE_ARCH="arm" > > > > CONFIG_RTE_ARCH_ARM=y > > > > CONFIG_RTE_ARCH_ARMv7=y > > > > CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9" > > > > +CONFIG_RTE_ARCH_ARM_NEON_MEMCPY=y > > > > > > > If it's not always benefit, why not disable here since it is common > > > armv7a config, and enable in your or other user's own config file? > > > > Jianbo, you are right. In that case, I'd just turn it off by default. > > And when there is a new platform-specific defconfig, it can enable it. > > > > Anyway, I am thinking of adding some comment into the rte_memcpy_32.h > > file describing the potential of the NEON code. What about: > > > > /* Enable in your defconfig to accelerate memcpy operations. Consider > >enabling this for Cortex-A15. For Cortex-A7 and Cortex-A9, It might > >accelerate short data copies (< 64 B). */ > > > > Thomas, do you consider this enough? > > Yes it is perfect. > Why not put it in defconfig_arm-armv7a-linuxapp-gcc? So, for now, I leave the patch as is and just add the comment. Jan
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
On 20 March 2016 at 03:58, Jan Viktorin wrote: > The flag is used to enable memcpy optimizations in EAL. As it is not always > the performance benefit, the flag allows to disable it. > > Signed-off-by: Jan Viktorin > --- > config/defconfig_arm-armv7a-linuxapp-gcc | 1 + > lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 8 ++-- > 2 files changed, 7 insertions(+), 2 deletions(-) > > diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc > b/config/defconfig_arm-armv7a-linuxapp-gcc > index 96c3343..2c60c2c 100644 > --- a/config/defconfig_arm-armv7a-linuxapp-gcc > +++ b/config/defconfig_arm-armv7a-linuxapp-gcc > @@ -36,6 +36,7 @@ CONFIG_RTE_ARCH="arm" > CONFIG_RTE_ARCH_ARM=y > CONFIG_RTE_ARCH_ARMv7=y > CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9" > +CONFIG_RTE_ARCH_ARM_NEON_MEMCPY=y > If it's not always benefit, why not disable here since it is common armv7a config, and enable in your or other user's own config file? Thanks! Jianbo
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
On Mon, 21 Mar 2016 13:42:31 +0800 Jianbo Liu wrote: > On 20 March 2016 at 03:58, Jan Viktorin wrote: > > The flag is used to enable memcpy optimizations in EAL. As it is not always > > the performance benefit, the flag allows to disable it. > > > > Signed-off-by: Jan Viktorin > > --- > > config/defconfig_arm-armv7a-linuxapp-gcc | 1 + > > lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 8 ++-- > > 2 files changed, 7 insertions(+), 2 deletions(-) > > > > diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc > > b/config/defconfig_arm-armv7a-linuxapp-gcc > > index 96c3343..2c60c2c 100644 > > --- a/config/defconfig_arm-armv7a-linuxapp-gcc > > +++ b/config/defconfig_arm-armv7a-linuxapp-gcc > > @@ -36,6 +36,7 @@ CONFIG_RTE_ARCH="arm" > > CONFIG_RTE_ARCH_ARM=y > > CONFIG_RTE_ARCH_ARMv7=y > > CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9" > > +CONFIG_RTE_ARCH_ARM_NEON_MEMCPY=y > > > If it's not always benefit, why not disable here since it is common > armv7a config, and enable in your or other user's own config file? Jianbo, you are right. In that case, I'd just turn it off by default. And when there is a new platform-specific defconfig, it can enable it. Anyway, I am thinking of adding some comment into the rte_memcpy_32.h file describing the potential of the NEON code. What about: /* Enable in your defconfig to accelerate memcpy operations. Consider enabling this for Cortex-A15. For Cortex-A7 and Cortex-A9, It might accelerate short data copies (< 64 B). */ Thomas, do you consider this enough? Jan > > Thanks! > Jianbo -- Jan Viktorin E-mail: Viktorin at RehiveTech.com System Architect Web:www.RehiveTech.com RehiveTech Brno, Czech Republic
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
2016-03-21 13:21, Jan Viktorin: > On Mon, 21 Mar 2016 13:42:31 +0800 > Jianbo Liu wrote: > > > On 20 March 2016 at 03:58, Jan Viktorin wrote: > > > The flag is used to enable memcpy optimizations in EAL. As it is not > > > always > > > the performance benefit, the flag allows to disable it. > > > > > > Signed-off-by: Jan Viktorin > > > --- > > > config/defconfig_arm-armv7a-linuxapp-gcc | 1 + > > > lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 8 ++-- > > > 2 files changed, 7 insertions(+), 2 deletions(-) > > > > > > diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc > > > b/config/defconfig_arm-armv7a-linuxapp-gcc > > > index 96c3343..2c60c2c 100644 > > > --- a/config/defconfig_arm-armv7a-linuxapp-gcc > > > +++ b/config/defconfig_arm-armv7a-linuxapp-gcc > > > @@ -36,6 +36,7 @@ CONFIG_RTE_ARCH="arm" > > > CONFIG_RTE_ARCH_ARM=y > > > CONFIG_RTE_ARCH_ARMv7=y > > > CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9" > > > +CONFIG_RTE_ARCH_ARM_NEON_MEMCPY=y > > > > > If it's not always benefit, why not disable here since it is common > > armv7a config, and enable in your or other user's own config file? > > Jianbo, you are right. In that case, I'd just turn it off by default. > And when there is a new platform-specific defconfig, it can enable it. > > Anyway, I am thinking of adding some comment into the rte_memcpy_32.h > file describing the potential of the NEON code. What about: > > /* Enable in your defconfig to accelerate memcpy operations. Consider >enabling this for Cortex-A15. For Cortex-A7 and Cortex-A9, It might >accelerate short data copies (< 64 B). */ > > Thomas, do you consider this enough? Yes it is perfect. Why not put it in defconfig_arm-armv7a-linuxapp-gcc?
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
2016-03-20 10:46, Jan Viktorin: > On Sun, 20 Mar 2016 10:41:10 +0100 > Jan Viktorin wrote: > > > On Sat, 19 Mar 2016 21:14:57 +0100 > > Thomas Monjalon wrote: > > > > > 2016-03-19 20:58, Jan Viktorin: > > > > The flag is used to enable memcpy optimizations in EAL. As it is not > > > > always > > > > the performance benefit, the flag allows to disable it. > > > > > > Ideally the default should be to choose the best optimization. > > > If it is not possible, it would help to have some comments explaining > > > how to choose wether enabling NEON memcpy or not. > > The related statistics are mentioned here: > > commit 04a2fde35daf5e9a271e72331a70b48b951d7568 > Author: Vlastimil Kosar > Date: Tue Nov 3 00:47:20 2015 +0100 > > eal/arm: add vector memcpy for ARMv7 > > It's quite difficult to easily summarize it, especially for so many > CPUs... If it is difficult for you, it will be impossible for the users of this config option. When someone will ask what is the best value for his CPU, what will you answer? At least, we can add a comment explaining that the performance is not always better, depending of the buffer size and the CPU.
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
2016-03-20 10:41, Jan Viktorin: > On Sat, 19 Mar 2016 21:14:57 +0100 > Thomas Monjalon wrote: > > > 2016-03-19 20:58, Jan Viktorin: > > > The flag is used to enable memcpy optimizations in EAL. As it is not > > > always > > > the performance benefit, the flag allows to disable it. > > > > Ideally the default should be to choose the best optimization. > > If it is not possible, it would help to have some comments explaining > > how to choose wether enabling NEON memcpy or not. > > Ok, we can rename the option to CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY, > delete it from the defconfig and change the test in rte_memcpy_32.h to > > #ifndef CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY > > Alternatively, to have a positive test like > > #ifdef CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY > > I can create a bigger change that moves the non-neon-memcpy up in the > file... > > Should I resend the whole series as v3? No, I don't think changing the name of the config or moving code will change anything. We just need to understand when it must be enabled or disabled.
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
On Sun, 20 Mar 2016 10:41:10 +0100 Jan Viktorin wrote: > On Sat, 19 Mar 2016 21:14:57 +0100 > Thomas Monjalon wrote: > > > 2016-03-19 20:58, Jan Viktorin: > > > The flag is used to enable memcpy optimizations in EAL. As it is not > > > always > > > the performance benefit, the flag allows to disable it. > > > > Ideally the default should be to choose the best optimization. > > If it is not possible, it would help to have some comments explaining > > how to choose wether enabling NEON memcpy or not. The related statistics are mentioned here: commit 04a2fde35daf5e9a271e72331a70b48b951d7568 Author: Vlastimil Kosar Date: Tue Nov 3 00:47:20 2015 +0100 eal/arm: add vector memcpy for ARMv7 It's quite difficult to easily summarize it, especially for so many CPUs... > > Ok, we can rename the option to CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY, > delete it from the defconfig and change the test in rte_memcpy_32.h to > > #ifndef CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY > > Alternatively, to have a positive test like > > #ifdef CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY > > I can create a bigger change that moves the non-neon-memcpy up in the > file... > > Should I resend the whole series as v3? > > Regards > Jan -- Jan ViktorinE-mail: Viktorin at RehiveTech.com System ArchitectWeb:www.RehiveTech.com RehiveTech Brno, Czech Republic
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
On Sat, 19 Mar 2016 21:14:57 +0100 Thomas Monjalon wrote: > 2016-03-19 20:58, Jan Viktorin: > > The flag is used to enable memcpy optimizations in EAL. As it is not always > > the performance benefit, the flag allows to disable it. > > Ideally the default should be to choose the best optimization. > If it is not possible, it would help to have some comments explaining > how to choose wether enabling NEON memcpy or not. Ok, we can rename the option to CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY, delete it from the defconfig and change the test in rte_memcpy_32.h to #ifndef CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY Alternatively, to have a positive test like #ifdef CONFIG_RTE_ARCH_ARM_AVOID_NEON_MEMCPY I can create a bigger change that moves the non-neon-memcpy up in the file... Should I resend the whole series as v3? Regards Jan
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
2016-03-19 20:58, Jan Viktorin: > The flag is used to enable memcpy optimizations in EAL. As it is not always > the performance benefit, the flag allows to disable it. Ideally the default should be to choose the best optimization. If it is not possible, it would help to have some comments explaining how to choose wether enabling NEON memcpy or not.
[dpdk-dev] [PATCH v3 4/4] eal/arm: introduce CONFIG_RTE_ARCH_ARM_NEON_MEMCPY
The flag is used to enable memcpy optimizations in EAL. As it is not always the performance benefit, the flag allows to disable it. Signed-off-by: Jan Viktorin --- config/defconfig_arm-armv7a-linuxapp-gcc | 1 + lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 8 ++-- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc index 96c3343..2c60c2c 100644 --- a/config/defconfig_arm-armv7a-linuxapp-gcc +++ b/config/defconfig_arm-armv7a-linuxapp-gcc @@ -36,6 +36,7 @@ CONFIG_RTE_ARCH="arm" CONFIG_RTE_ARCH_ARM=y CONFIG_RTE_ARCH_ARMv7=y CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9" +CONFIG_RTE_ARCH_ARM_NEON_MEMCPY=y CONFIG_RTE_FORCE_INTRINSICS=y CONFIG_RTE_ARCH_STRICT_ALIGN=y diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h index ad8bc65..988125b 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h @@ -42,7 +42,11 @@ extern "C" { #include "generic/rte_memcpy.h" -#ifdef RTE_MACHINE_CPUFLAG_NEON +#ifdef RTE_ARCH_ARM_NEON_MEMCPY + +#ifndef RTE_MACHINE_CPUFLAG_NEON +#error "Cannot optimize memcpy by NEON as the CPU seems to not support this" +#endif /* ARM NEON Intrinsics are used to copy data */ #include @@ -325,7 +329,7 @@ rte_memcpy_func(void *dst, const void *src, size_t n) return memcpy(dst, src, n); } -#endif /* RTE_MACHINE_CPUFLAG_NEON */ +#endif /* RTE_ARCH_ARM_NEON_MEMCPY */ #ifdef __cplusplus } -- 2.7.0