> -----Original Message-----
> From: Gavin Hu <[email protected]>
> Sent: Tuesday, July 23, 2019 9:14 PM
> To: [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> Jerin Jacob Kollanukkaran <[email protected]>; Pavan Nikhilesh
> Bhagavatula <[email protected]>;
> [email protected]; [email protected]
> Subject: [EXT] [PATCH v3 4/5] spinlock: use wfe to reduce contention on
> aarch64
>
> In acquiring a spinlock, cores repeatedly poll the lock variable.
> This polling is replaced by the rte_wait_until_equal API.
>
> A 5~10% performance gain was measured by running spinlock_autotest on
> 14 isolated cores of ThunderX2.
>
> Signed-off-by: Gavin Hu <[email protected]>
> Reviewed-by: Ruifeng Wang <[email protected]>
> Reviewed-by: Phil Yang <[email protected]>
> Reviewed-by: Steve Capper <[email protected]>
> Reviewed-by: Ola Liljedahl <[email protected]>
> Reviewed-by: Honnappa Nagarahalli <[email protected]>
> Tested-by: Pavan Nikhilesh <[email protected]>
> ---
> .../common/include/arch/arm/rte_spinlock.h | 25 ++++++++++++++++++++++
> .../common/include/generic/rte_spinlock.h | 2 +-
> 2 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
> b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
> index 1a6916b..f25d17f 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
> @@ -16,6 +16,31 @@ extern "C" {
> #include <rte_common.h>
> #include "generic/rte_spinlock.h"
>
> +/* armv7a does support WFE, but an explicit wake-up signal using SEV is
> + * required (must be preceded by DSB to drain the store buffer) and
> + * this is less performant, so keep armv7a implementation unchanged.
> + */
> +#if defined(RTE_USE_WFE) && defined(RTE_ARCH_ARM64)
See below. Please avoid complicated conditional-compilation logic like this;
it hurts scalability and readability (see the sketch after the #endif below).
> +static inline void
> +rte_spinlock_lock(rte_spinlock_t *sl)
> +{
> + unsigned int tmp;
> + /* http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.
> + * faqs/ka16809.html
> + */
> + asm volatile(
> + "sevl\n"
> + "1: wfe\n"
> + "2: ldaxr %w[tmp], %w[locked]\n"
> + "cbnz %w[tmp], 1b\n"
> + "stxr %w[tmp], %w[one], %w[locked]\n"
> + "cbnz %w[tmp], 2b\n"
> + : [tmp] "=&r" (tmp), [locked] "+Q"(sl->locked)
> + : [one] "r" (1)
> + : "cc", "memory");
> +}
> +#endif
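Also, purely as an illustration: the commit message says the polling is
replaced by the rte_wait_until_equal API from patch 1/5, so (assuming a
rte_wait_until_equal_32(addr, expected, memorder) style helper, which is my
assumption here, not taken from this patch) the lock could probably be
written with no arch-specific asm at all:

/* Sketch, not a tested implementation: assumes a
 * rte_wait_until_equal_32() helper from patch 1/5 that maps to WFE on
 * aarch64 and to plain polling elsewhere.
 */
static inline void
rte_spinlock_lock(rte_spinlock_t *sl)
{
	/* Try to take the lock; while it is held, wait until the lock
	 * word reads back as free before retrying.
	 */
	while (!rte_atomic32_cmpset((volatile uint32_t *)&sl->locked, 0, 1))
		rte_wait_until_equal_32((volatile uint32_t *)&sl->locked, 0,
				__ATOMIC_RELAXED);
}

That would keep the WFE/SEVL details inside the wait helper and out of every
lock implementation.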
> +
> static inline int rte_tm_supported(void)
> {
> return 0;
> diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h
> b/lib/librte_eal/common/include/generic/rte_spinlock.h
> index 87ae7a4..cf4f15b 100644
> --- a/lib/librte_eal/common/include/generic/rte_spinlock.h
> +++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
> @@ -57,7 +57,7 @@ rte_spinlock_init(rte_spinlock_t *sl)
> static inline void
> rte_spinlock_lock(rte_spinlock_t *sl);
>
> -#ifdef RTE_FORCE_INTRINSICS
> +#if defined(RTE_FORCE_INTRINSICS) && !defined(RTE_USE_WFE)
I would like to avoid hacking up generic code to meet an architecture-specific
requirement. For example, if someone enables RTE_USE_WFE for armv7 this will
break, and it will make it painful for a new architecture to use
RTE_FORCE_INTRINSICS.
It looks like the time has come to disable RTE_FORCE_INTRINSICS for arm64.
Since this patch is targeted for the next release, how about enabling native
implementations on arm64 for the code currently guarded by
RTE_FORCE_INTRINSICS (spinlock, ticketlock, etc.), the same way x86 does?
A rough sketch of what I mean is below.
If you don't have the bandwidth to convert all of the blocks, let us know;
we can collaborate, and Marvell can take up some of the RTE_FORCE_INTRINSICS
conversion for the next release.
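
To make this concrete, here is a rough, untested sketch (the guard and
placement are illustrative only) of a native arm64 rte_spinlock_lock() kept
entirely in arch/arm/rte_spinlock.h, mirroring what the generic intrinsics
path does today, so the generic header would not need an RTE_USE_WFE
exception:

/* Illustrative sketch: native arm64 implementation, enabled once arm64
 * stops defining RTE_FORCE_INTRINSICS (needs rte_pause.h for rte_pause()).
 */
#ifndef RTE_FORCE_INTRINSICS
static inline void
rte_spinlock_lock(rte_spinlock_t *sl)
{
	int exp = 0;

	while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
		/* Wait for the lock to be released before retrying;
		 * this inner loop is where WFE could be used.
		 */
		while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
			rte_pause();
		exp = 0;
	}
}
#endif

The same pattern would extend to trylock/unlock and to ticketlock.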
> static inline void
> rte_spinlock_lock(rte_spinlock_t *sl)
> {
> --
> 2.7.4