Re: [PATCH] i386: Add reduce_*_ep[i|u][8|16] series intrinsics

2023-04-18 Thread Hongtao Liu via Gcc-patches
On Tue, Apr 18, 2023 at 3:13 PM Hu, Lin1 via Gcc-patches
 wrote:
>
> More details: the Intrinsics Guide adds these 128/256-bit intrinsics as follows:
> https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=reduce__expand=5814.
>
> So we intend to enable these intrinsics for GCC-14.
>
> -----Original Message-----
> From: Gcc-patches On Behalf Of Hu, Lin1 via Gcc-patches
> Sent: Tuesday, April 18, 2023 3:03 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Add reduce_*_ep[i|u][8|16] series intrinsics
>
> Hi all,
>
> The patch aims to support reduce_*_ep[i|u][8|16] series intrinsics, and has 
> been tested on x86_64-pc-linux-gnu. OK for trunk?
Ok.
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> * config/i386/avx2intrin.h
> (_MM_REDUCE_OPERATOR_BASIC_EPI16): New macro.
> (_MM_REDUCE_OPERATOR_MAX_MIN_EP16): Ditto.
> (_MM256_REDUCE_OPERATOR_BASIC_EPI16): Ditto.
> (_MM256_REDUCE_OPERATOR_MAX_MIN_EP16): Ditto.
> (_MM_REDUCE_OPERATOR_BASIC_EPI8): Ditto.
> (_MM_REDUCE_OPERATOR_MAX_MIN_EP8): Ditto.
> (_MM256_REDUCE_OPERATOR_BASIC_EPI8): Ditto.
> (_MM256_REDUCE_OPERATOR_MAX_MIN_EP8): Ditto.
> (_mm_reduce_add_epi16): New intrinsics.
> (_mm_reduce_mul_epi16): Ditto.
> (_mm_reduce_and_epi16): Ditto.
> (_mm_reduce_or_epi16): Ditto.
> (_mm_reduce_max_epi16): Ditto.
> (_mm_reduce_max_epu16): Ditto.
> (_mm_reduce_min_epi16): Ditto.
> (_mm_reduce_min_epu16): Ditto.
> (_mm256_reduce_add_epi16): Ditto.
> (_mm256_reduce_mul_epi16): Ditto.
> (_mm256_reduce_and_epi16): Ditto.
> (_mm256_reduce_or_epi16): Ditto.
> (_mm256_reduce_max_epi16): Ditto.
> (_mm256_reduce_max_epu16): Ditto.
> (_mm256_reduce_min_epi16): Ditto.
> (_mm256_reduce_min_epu16): Ditto.
> (_mm_reduce_add_epi8): Ditto.
> (_mm_reduce_mul_epi8): Ditto.
> (_mm_reduce_and_epi8): Ditto.
> (_mm_reduce_or_epi8): Ditto.
> (_mm_reduce_max_epi8): Ditto.
> (_mm_reduce_max_epu8): Ditto.
> (_mm_reduce_min_epi8): Ditto.
> (_mm_reduce_min_epu8): Ditto.
> (_mm256_reduce_add_epi8): Ditto.
> (_mm256_reduce_mul_epi8): Ditto.
> (_mm256_reduce_and_epi8): Ditto.
> (_mm256_reduce_or_epi8): Ditto.
> (_mm256_reduce_max_epi8): Ditto.
> (_mm256_reduce_max_epu8): Ditto.
> (_mm256_reduce_min_epi8): Ditto.
> (_mm256_reduce_min_epu8): Ditto.
> * config/i386/avx512vlbwintrin.h:
> (_mm_mask_reduce_add_epi16): Ditto.
> (_mm_mask_reduce_mul_epi16): Ditto.
> (_mm_mask_reduce_and_epi16): Ditto.
> (_mm_mask_reduce_or_epi16): Ditto.
> (_mm_mask_reduce_max_epi16): Ditto.
> (_mm_mask_reduce_max_epu16): Ditto.
> (_mm_mask_reduce_min_epi16): Ditto.
> (_mm_mask_reduce_min_epu16): Ditto.
> (_mm256_mask_reduce_add_epi16): Ditto.
> (_mm256_mask_reduce_mul_epi16): Ditto.
> (_mm256_mask_reduce_and_epi16): Ditto.
> (_mm256_mask_reduce_or_epi16): Ditto.
> (_mm256_mask_reduce_max_epi16): Ditto.
> (_mm256_mask_reduce_max_epu16): Ditto.
> (_mm256_mask_reduce_min_epi16): Ditto.
> (_mm256_mask_reduce_min_epu16): Ditto.
> (_mm_mask_reduce_add_epi8): Ditto.
> (_mm_mask_reduce_mul_epi8): Ditto.
> (_mm_mask_reduce_and_epi8): Ditto.
> (_mm_mask_reduce_or_epi8): Ditto.
> (_mm_mask_reduce_max_epi8): Ditto.
> (_mm_mask_reduce_max_epu8): Ditto.
> (_mm_mask_reduce_min_epi8): Ditto.
> (_mm_mask_reduce_min_epu8): Ditto.
> (_mm256_mask_reduce_add_epi8): Ditto.
> (_mm256_mask_reduce_mul_epi8): Ditto.
> (_mm256_mask_reduce_and_epi8): Ditto.
> (_mm256_mask_reduce_or_epi8): Ditto.
> (_mm256_mask_reduce_max_epi8): Ditto.
> (_mm256_mask_reduce_max_epu8): Ditto.
> (_mm256_mask_reduce_min_epi8): Ditto.
> (_mm256_mask_reduce_min_epu8): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512vlbw-reduce-op-1.c: New test.
> ---
>  gcc/config/i386/avx2intrin.h  | 347 ++
>  gcc/config/i386/avx512vlbwintrin.h| 256 +
>  .../gcc.target/i386/avx512vlbw-reduce-op-1.c  | 206 +++
>  3 files changed, 809 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512vlbw-reduce-op-1.c
>
> diff --git a/gcc/config/i386/avx2intrin.h b/gcc/config/i386/avx2intrin.h
> index 1b9c8169a96..9b8c13b7233 100644
> --- a/gcc/config/i386/avx2intrin.h
> +++ b/gcc/config/i386/avx2intrin.h
> @@ -1915,6 +1915,353 @@ _mm256_mask_i64gather_epi32 (__m128i __src, int const *__base,
>(int) (SCALE))
>  #endif  /* __OPTIMIZE__ */
>
> +#define _MM_REDUCE_OPERATOR_BASIC_EPI16(op) \
> +  __v8hi __T1 = 

RE: [PATCH] i386: Add reduce_*_ep[i|u][8|16] series intrinsics

2023-04-18 Thread Hu, Lin1 via Gcc-patches
More details: the Intrinsics Guide adds these 128/256-bit intrinsics as follows:
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=reduce__expand=5814.

So we intend to enable these intrinsics for GCC-14.

-----Original Message-----
From: Gcc-patches On Behalf Of Hu, Lin1 via Gcc-patches
Sent: Tuesday, April 18, 2023 3:03 PM
To: gcc-patches@gcc.gnu.org
Cc: Liu, Hongtao ; ubiz...@gmail.com
Subject: [PATCH] i386: Add reduce_*_ep[i|u][8|16] series intrinsics

Hi all,

The patch aims to support reduce_*_ep[i|u][8|16] series intrinsics, and has 
been tested on x86_64-pc-linux-gnu. OK for trunk?
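
For readers who want the idea without the macros: judging from the visible
part of _MM_REDUCE_OPERATOR_BASIC_EPI16 in the diff, the reduction folds the
upper half of the vector onto the lower half, then repeats on the halved
width until lane 0 holds the result. The following is only a scalar sketch
of that pattern for the add case, not the code from the patch; the names
model_reduce_add_epi16 and LANES are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 8 /* a 128-bit vector holds eight 16-bit lanes */

/* Hypothetical scalar model of a halving (tree) reduction for add.
   Each pass adds lane i + width into lane i, mirroring a
   shuffle-then-op step; after log2(LANES) passes lane 0 is the sum.  */
static int16_t
model_reduce_add_epi16 (const int16_t w[LANES])
{
  int16_t t[LANES];

  assert ((LANES & (LANES - 1)) == 0); /* lane count must be a power of two */
  memcpy (t, w, sizeof t);

  for (int width = LANES / 2; width >= 1; width /= 2)
    for (int i = 0; i < width; i++)
      /* Add in unsigned arithmetic so overflow wraps like the vector op;
         the conversion back to int16_t is modular with GCC.  */
      t[i] = (int16_t) (uint16_t) ((uint16_t) t[i] + (uint16_t) t[i + width]);

  return t[0];
}
```

With the input {1, 2, 3, 4, 5, 6, 7, 8} the passes leave {6, 8, 10, 12},
then {16, 20}, then 36 in lane 0. The vector version does the same work
with three shuffles and three element-wise operations instead of seven
scalar additions.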

BRs,
Lin

gcc/ChangeLog:

* config/i386/avx2intrin.h
(_MM_REDUCE_OPERATOR_BASIC_EPI16): New macro.
(_MM_REDUCE_OPERATOR_MAX_MIN_EP16): Ditto.
(_MM256_REDUCE_OPERATOR_BASIC_EPI16): Ditto.
(_MM256_REDUCE_OPERATOR_MAX_MIN_EP16): Ditto.
(_MM_REDUCE_OPERATOR_BASIC_EPI8): Ditto.
(_MM_REDUCE_OPERATOR_MAX_MIN_EP8): Ditto.
(_MM256_REDUCE_OPERATOR_BASIC_EPI8): Ditto.
(_MM256_REDUCE_OPERATOR_MAX_MIN_EP8): Ditto.
(_mm_reduce_add_epi16): New intrinsics.
(_mm_reduce_mul_epi16): Ditto.
(_mm_reduce_and_epi16): Ditto.
(_mm_reduce_or_epi16): Ditto.
(_mm_reduce_max_epi16): Ditto.
(_mm_reduce_max_epu16): Ditto.
(_mm_reduce_min_epi16): Ditto.
(_mm_reduce_min_epu16): Ditto.
(_mm256_reduce_add_epi16): Ditto.
(_mm256_reduce_mul_epi16): Ditto.
(_mm256_reduce_and_epi16): Ditto.
(_mm256_reduce_or_epi16): Ditto.
(_mm256_reduce_max_epi16): Ditto.
(_mm256_reduce_max_epu16): Ditto.
(_mm256_reduce_min_epi16): Ditto.
(_mm256_reduce_min_epu16): Ditto.
(_mm_reduce_add_epi8): Ditto.
(_mm_reduce_mul_epi8): Ditto.
(_mm_reduce_and_epi8): Ditto.
(_mm_reduce_or_epi8): Ditto.
(_mm_reduce_max_epi8): Ditto.
(_mm_reduce_max_epu8): Ditto.
(_mm_reduce_min_epi8): Ditto.
(_mm_reduce_min_epu8): Ditto.
(_mm256_reduce_add_epi8): Ditto.
(_mm256_reduce_mul_epi8): Ditto.
(_mm256_reduce_and_epi8): Ditto.
(_mm256_reduce_or_epi8): Ditto.
(_mm256_reduce_max_epi8): Ditto.
(_mm256_reduce_max_epu8): Ditto.
(_mm256_reduce_min_epi8): Ditto.
(_mm256_reduce_min_epu8): Ditto.
* config/i386/avx512vlbwintrin.h:
(_mm_mask_reduce_add_epi16): Ditto.
(_mm_mask_reduce_mul_epi16): Ditto.
(_mm_mask_reduce_and_epi16): Ditto.
(_mm_mask_reduce_or_epi16): Ditto.
(_mm_mask_reduce_max_epi16): Ditto.
(_mm_mask_reduce_max_epu16): Ditto.
(_mm_mask_reduce_min_epi16): Ditto.
(_mm_mask_reduce_min_epu16): Ditto.
(_mm256_mask_reduce_add_epi16): Ditto.
(_mm256_mask_reduce_mul_epi16): Ditto.
(_mm256_mask_reduce_and_epi16): Ditto.
(_mm256_mask_reduce_or_epi16): Ditto.
(_mm256_mask_reduce_max_epi16): Ditto.
(_mm256_mask_reduce_max_epu16): Ditto.
(_mm256_mask_reduce_min_epi16): Ditto.
(_mm256_mask_reduce_min_epu16): Ditto.
(_mm_mask_reduce_add_epi8): Ditto.
(_mm_mask_reduce_mul_epi8): Ditto.
(_mm_mask_reduce_and_epi8): Ditto.
(_mm_mask_reduce_or_epi8): Ditto.
(_mm_mask_reduce_max_epi8): Ditto.
(_mm_mask_reduce_max_epu8): Ditto.
(_mm_mask_reduce_min_epi8): Ditto.
(_mm_mask_reduce_min_epu8): Ditto.
(_mm256_mask_reduce_add_epi8): Ditto.
(_mm256_mask_reduce_mul_epi8): Ditto.
(_mm256_mask_reduce_and_epi8): Ditto.
(_mm256_mask_reduce_or_epi8): Ditto.
(_mm256_mask_reduce_max_epi8): Ditto.
(_mm256_mask_reduce_max_epu8): Ditto.
(_mm256_mask_reduce_min_epi8): Ditto.
(_mm256_mask_reduce_min_epu8): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512vlbw-reduce-op-1.c: New test.
---
 gcc/config/i386/avx2intrin.h  | 347 ++
 gcc/config/i386/avx512vlbwintrin.h| 256 +
 .../gcc.target/i386/avx512vlbw-reduce-op-1.c  | 206 +++
 3 files changed, 809 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512vlbw-reduce-op-1.c

diff --git a/gcc/config/i386/avx2intrin.h b/gcc/config/i386/avx2intrin.h
index 1b9c8169a96..9b8c13b7233 100644
--- a/gcc/config/i386/avx2intrin.h
+++ b/gcc/config/i386/avx2intrin.h
@@ -1915,6 +1915,353 @@ _mm256_mask_i64gather_epi32 (__m128i __src, int const *__base,
   (int) (SCALE))
 #endif  /* __OPTIMIZE__ */
 
+#define _MM_REDUCE_OPERATOR_BASIC_EPI16(op) \
+  __v8hi __T1 = (__v8hi)__W; \
+  __v8hi __T2 = __builtin_shufflevector (__T1, __T1, 4, 5, 6, 7, 4, 5, 6, 7); \
+  __v8hi __T3 = __T1 op __T2; \
+  __v8hi __T4 = __builtin_shufflevector (__T3, __T3, 2, 3, 2, 3, 4, 5, 6, 7); \
+  __v8hi __T5 = __T3 op __T4; \
+  __v8hi __T6 = __builtin_shufflevector (__T5, __T5, 1,