Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Richard Biener
On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>
> From: Pan Li 
>
> Update in v2:
>
> * Fix one ICE of type assertion.
> * Adjust some test cases for aarch64 sve and riscv vector.
>
> Original log:
>
> The vectoriable_call has one restriction of the size of data type.
> Aka DF to DI is allowed but SF to DI isn't. You may see below message
> when try to vectorize function call like lrintf.
>
> void
> test_lrintf (long *out, float *in, unsigned count)
> {
>   for (unsigned i = 0; i < count; i++)
> out[i] = __builtin_lrintf (in[i]);
> }
>
> lrintf.c:5:26: missed: couldn't vectorize loop
> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>
> Then the standard name pattern like lrintmn2 cannot work for different
> data type size like SF => DI. This patch would like to remove this data
> type size check and unblock the standard name like lrintmn2.
>
> The below test are passed for this patch.
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression tests.
>
> gcc/ChangeLog:
>
> * internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
> * tree-vect-stmts.cc (vectorizable_call): Remove size check.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
> * gcc.target/aarch64/sve/clz_1.c: Ditto.
> * gcc.target/aarch64/sve/popcount_1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/internal-fn.cc  |  3 ++-
>  gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>  gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>  gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>  .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>  gcc/tree-vect-stmts.cc  | 13 -
>  6 files changed, 6 insertions(+), 21 deletions(-)
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 61d5a9e4772..17c0f4c3805 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
> unsigned int noutputs,
> emit_move_insn (lhs_rtx, ops[0].value);
>else
> {
> - gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
> +  || VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs)));

Can you explain why this is necessary?  In particular what is lhs_rtx
mode vs ops[0].value mode?

>   convert_move (lhs_rtx, ops[0].value, 0);

I'm not sure convert_move handles vector modes correctly.  Richard
probably added this code, CCed.

Richard.

> }
>  }
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> index bdc9856faaf..940d08bbc7b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> @@ -18,5 +18,4 @@ clrsb_64 (unsigned int *restrict dst, uint64_t *restrict 
> src, int size)
>  }
>
>  /* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, 
> z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, 
> z[0-9]+\.s\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> index 0c7a4e6d768..58b8ff406d2 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> @@ -18,5 +18,4 @@ clz_64 (unsigned int *restrict dst, uint64_t *restrict src, 
> int size)
>  }
>
>  /* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.s, p[0-7]/m, 
> z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, 
> z[0-9]+\.s\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
> index dfb6f4ac7a5..0eba898307c 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
> @@ -18,5 +18,4 @@ popcount_64 (unsigned int *restrict dst, uint64_t *restrict 
> src, int size)
>  }
>
>  /* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.s, p[0-7]/m, 
> z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, 
> z[0-9]+\.s\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, 
>

RE: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Li, Pan2
Thanks Richard for comments.

> Can you explain why this is necessary?  In particular what is lhs_rtx
> mode vs ops[0].value mode?

For testcase gcc.target/aarch64/sve/popcount_1.c, the rtl are list as below.

The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
The ops[0].value is (reg:VNx2DI 104).

The restriction removing make the vector rtl enter expand_fn_using_insn and of 
course hit the INTEGER_P assertion.

Pan

-Original Message-
From: Richard Biener  
Sent: Thursday, October 26, 2023 4:38 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; Liu, Hongtao 
; Richard Sandiford 
Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>
> From: Pan Li 
>
> Update in v2:
>
> * Fix one ICE of type assertion.
> * Adjust some test cases for aarch64 sve and riscv vector.
>
> Original log:
>
> The vectoriable_call has one restriction of the size of data type.
> Aka DF to DI is allowed but SF to DI isn't. You may see below message
> when try to vectorize function call like lrintf.
>
> void
> test_lrintf (long *out, float *in, unsigned count)
> {
>   for (unsigned i = 0; i < count; i++)
> out[i] = __builtin_lrintf (in[i]);
> }
>
> lrintf.c:5:26: missed: couldn't vectorize loop
> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>
> Then the standard name pattern like lrintmn2 cannot work for different
> data type size like SF => DI. This patch would like to remove this data
> type size check and unblock the standard name like lrintmn2.
>
> The below test are passed for this patch.
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression tests.
>
> gcc/ChangeLog:
>
> * internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
> * tree-vect-stmts.cc (vectorizable_call): Remove size check.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
> * gcc.target/aarch64/sve/clz_1.c: Ditto.
> * gcc.target/aarch64/sve/popcount_1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/internal-fn.cc  |  3 ++-
>  gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>  gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>  gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>  .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>  gcc/tree-vect-stmts.cc  | 13 -
>  6 files changed, 6 insertions(+), 21 deletions(-)
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 61d5a9e4772..17c0f4c3805 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
> unsigned int noutputs,
> emit_move_insn (lhs_rtx, ops[0].value);
>else
> {
> - gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
> +  || VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs)));

Can you explain why this is necessary?  In particular what is lhs_rtx
mode vs ops[0].value mode?

>   convert_move (lhs_rtx, ops[0].value, 0);

I'm not sure convert_move handles vector modes correctly.  Richard
probably added this code, CCed.

Richard.

> }
>  }
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> index bdc9856faaf..940d08bbc7b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> @@ -18,5 +18,4 @@ clrsb_64 (unsigned int *restrict dst, uint64_t *restrict 
> src, int size)
>  }
>
>  /* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, 
> z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, 
> z[0-9]+\.s\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> index 0c7a4e6d768..58b8ff406d2 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> @@ -18,5 +18,4 @@ clz_64 (unsigned int *restrict dst, uint64_t *restrict src, 
> int size)
>  }
>
>  /* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.s, p[0-7]/m, 
> z[0-

Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Richard Biener



> Am 26.10.2023 um 13:59 schrieb Li, Pan2 :
> 
> Thanks Richard for comments.
> 
>> Can you explain why this is necessary?  In particular what is lhs_rtx
>> mode vs ops[0].value mode?
> 
> For testcase gcc.target/aarch64/sve/popcount_1.c, the rtl are list as below.
> 
> The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
> The ops[0].value is (reg:VNx2DI 104).
> 
> The restriction removing make the vector rtl enter expand_fn_using_insn and 
> of course hit the INTEGER_P assertion.

But I think this shows we mid-selected the optab, a convert_move is certainly 
not correct unconditionally here (the target might not support that)

> Pan
> 
> -Original Message-
> From: Richard Biener  
> Sent: Thursday, October 26, 2023 4:38 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
> ; kito.ch...@gmail.com; Liu, Hongtao 
> ; Richard Sandiford 
> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
> 
>> On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>> 
>> From: Pan Li 
>> 
>> Update in v2:
>> 
>> * Fix one ICE of type assertion.
>> * Adjust some test cases for aarch64 sve and riscv vector.
>> 
>> Original log:
>> 
>> The vectoriable_call has one restriction of the size of data type.
>> Aka DF to DI is allowed but SF to DI isn't. You may see below message
>> when try to vectorize function call like lrintf.
>> 
>> void
>> test_lrintf (long *out, float *in, unsigned count)
>> {
>>  for (unsigned i = 0; i < count; i++)
>>out[i] = __builtin_lrintf (in[i]);
>> }
>> 
>> lrintf.c:5:26: missed: couldn't vectorize loop
>> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>> 
>> Then the standard name pattern like lrintmn2 cannot work for different
>> data type size like SF => DI. This patch would like to remove this data
>> type size check and unblock the standard name like lrintmn2.
>> 
>> The below test are passed for this patch.
>> 
>> * The x86 bootstrap and regression test.
>> * The aarch64 regression test.
>> * The risc-v regression tests.
>> 
>> gcc/ChangeLog:
>> 
>>* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
>>* tree-vect-stmts.cc (vectorizable_call): Remove size check.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
>>* gcc.target/aarch64/sve/clz_1.c: Ditto.
>>* gcc.target/aarch64/sve/popcount_1.c: Ditto.
>>* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>> 
>> Signed-off-by: Pan Li 
>> ---
>> gcc/internal-fn.cc  |  3 ++-
>> gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>> gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>> gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>> .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>> gcc/tree-vect-stmts.cc  | 13 -
>> 6 files changed, 6 insertions(+), 21 deletions(-)
>> 
>> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
>> index 61d5a9e4772..17c0f4c3805 100644
>> --- a/gcc/internal-fn.cc
>> +++ b/gcc/internal-fn.cc
>> @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
>> unsigned int noutputs,
>>emit_move_insn (lhs_rtx, ops[0].value);
>>   else
>>{
>> - gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
>> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
>> +  || VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs)));
> 
> Can you explain why this is necessary?  In particular what is lhs_rtx
> mode vs ops[0].value mode?
> 
>>  convert_move (lhs_rtx, ops[0].value, 0);
> 
> I'm not sure convert_move handles vector modes correctly.  Richard
> probably added this code, CCed.
> 
> Richard.
> 
>>}
>> }
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c 
>> b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
>> index bdc9856faaf..940d08bbc7b 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
>> @@ -18,5 +18,4 @@ clrsb_64 (unsigned int *restrict dst, uint64_t *restrict 
>> src, int size)
>> }
>> 
>> /* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, 
>> z[0-9]+\.s\n} 1 } } */
>> -/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-

RE: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Li, Pan2
> But I think this shows we mid-selected the optab, a convert_move is certainly 
> not correct unconditionally here (the target might not support that)

Make sense, we can wait a while for the confirmation from Richard S.

If convert_move is not designed for Vector (looks like mostly up to a point), I 
am not sure if we can fix the assertion like below

...
else If (VECTOR_INTERGER_TYPE (TREE_TYPE(lhs)))
  return;
else
  {
gcc_checking_assert (INTEGRAL_TYPE_TYPE_P (TREE_TYPE (lhs)));
convert_move (lhs_rtx, ops[0].value, 0);
  }

Aka bypass the vector here, but I am afraid this change may make the llrintf 
(SF => DI) not working on standard name.
Let me have a try and keep you posted.

Pan
  

-Original Message-
From: Richard Biener  
Sent: Thursday, October 26, 2023 10:00 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; Liu, Hongtao 
; Richard Sandiford 
Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer



> Am 26.10.2023 um 13:59 schrieb Li, Pan2 :
> 
> Thanks Richard for comments.
> 
>> Can you explain why this is necessary?  In particular what is lhs_rtx
>> mode vs ops[0].value mode?
> 
> For testcase gcc.target/aarch64/sve/popcount_1.c, the rtl are list as below.
> 
> The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
> The ops[0].value is (reg:VNx2DI 104).
> 
> The restriction removing make the vector rtl enter expand_fn_using_insn and 
> of course hit the INTEGER_P assertion.

But I think this shows we mid-selected the optab, a convert_move is certainly 
not correct unconditionally here (the target might not support that)

> Pan
> 
> -Original Message-
> From: Richard Biener  
> Sent: Thursday, October 26, 2023 4:38 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
> ; kito.ch...@gmail.com; Liu, Hongtao 
> ; Richard Sandiford 
> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
> 
>> On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>> 
>> From: Pan Li 
>> 
>> Update in v2:
>> 
>> * Fix one ICE of type assertion.
>> * Adjust some test cases for aarch64 sve and riscv vector.
>> 
>> Original log:
>> 
>> The vectoriable_call has one restriction of the size of data type.
>> Aka DF to DI is allowed but SF to DI isn't. You may see below message
>> when try to vectorize function call like lrintf.
>> 
>> void
>> test_lrintf (long *out, float *in, unsigned count)
>> {
>>  for (unsigned i = 0; i < count; i++)
>>out[i] = __builtin_lrintf (in[i]);
>> }
>> 
>> lrintf.c:5:26: missed: couldn't vectorize loop
>> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>> 
>> Then the standard name pattern like lrintmn2 cannot work for different
>> data type size like SF => DI. This patch would like to remove this data
>> type size check and unblock the standard name like lrintmn2.
>> 
>> The below test are passed for this patch.
>> 
>> * The x86 bootstrap and regression test.
>> * The aarch64 regression test.
>> * The risc-v regression tests.
>> 
>> gcc/ChangeLog:
>> 
>>* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
>>* tree-vect-stmts.cc (vectorizable_call): Remove size check.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
>>* gcc.target/aarch64/sve/clz_1.c: Ditto.
>>* gcc.target/aarch64/sve/popcount_1.c: Ditto.
>>* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>> 
>> Signed-off-by: Pan Li 
>> ---
>> gcc/internal-fn.cc  |  3 ++-
>> gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>> gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>> gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>> .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>> gcc/tree-vect-stmts.cc  | 13 -
>> 6 files changed, 6 insertions(+), 21 deletions(-)
>> 
>> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
>> index 61d5a9e4772..17c0f4c3805 100644
>> --- a/gcc/internal-fn.cc
>> +++ b/gcc/internal-fn.cc
>> @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
>> unsigned int noutputs,
>>emit_move_insn (lhs_rtx, ops[0].value);
>>   else
>>{
>> - gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
>> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
>> +   

Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Richard Sandiford
Richard Biener  writes:
>> Am 26.10.2023 um 13:59 schrieb Li, Pan2 :
>> 
>> Thanks Richard for comments.
>> 
>>> Can you explain why this is necessary?  In particular what is lhs_rtx
>>> mode vs ops[0].value mode?
>> 
>> For testcase gcc.target/aarch64/sve/popcount_1.c, the rtl are list as below.
>> 
>> The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
>> The ops[0].value is (reg:VNx2DI 104).
>> 
>> The restriction removing make the vector rtl enter expand_fn_using_insn and 
>> of course hit the INTEGER_P assertion.
>
> But I think this shows we mid-selected the optab, a convert_move is certainly 
> not correct unconditionally here (the target might not support that)

Agreed.  Allowing TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out)
makes sense if the called function allows the input and output modes
to vary.  That's true for internal functions that eventually map to
two-mode optabs.  But we can't remove the condition for calls to
other functions, at least not without some fix-ups.

ISTM that the problem being hit is the one described by the removed
comment.

In other words, I don't think simply removing the test from the vectoriser
is correct.  It needs to be replaced by something more selective.

Thanks,
Richard

>> Pan
>> 
>> -Original Message-
>> From: Richard Biener  
>> Sent: Thursday, October 26, 2023 4:38 PM
>> To: Li, Pan2 
>> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
>> ; kito.ch...@gmail.com; Liu, Hongtao 
>> ; Richard Sandiford 
>> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
>> 
>>> On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>>> 
>>> From: Pan Li 
>>> 
>>> Update in v2:
>>> 
>>> * Fix one ICE of type assertion.
>>> * Adjust some test cases for aarch64 sve and riscv vector.
>>> 
>>> Original log:
>>> 
>>> The vectoriable_call has one restriction of the size of data type.
>>> Aka DF to DI is allowed but SF to DI isn't. You may see below message
>>> when try to vectorize function call like lrintf.
>>> 
>>> void
>>> test_lrintf (long *out, float *in, unsigned count)
>>> {
>>>  for (unsigned i = 0; i < count; i++)
>>>out[i] = __builtin_lrintf (in[i]);
>>> }
>>> 
>>> lrintf.c:5:26: missed: couldn't vectorize loop
>>> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>>> 
>>> Then the standard name pattern like lrintmn2 cannot work for different
>>> data type size like SF => DI. This patch would like to remove this data
>>> type size check and unblock the standard name like lrintmn2.
>>> 
>>> The below test are passed for this patch.
>>> 
>>> * The x86 bootstrap and regression test.
>>> * The aarch64 regression test.
>>> * The risc-v regression tests.
>>> 
>>> gcc/ChangeLog:
>>> 
>>>* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
>>>* tree-vect-stmts.cc (vectorizable_call): Remove size check.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>>* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
>>>* gcc.target/aarch64/sve/clz_1.c: Ditto.
>>>* gcc.target/aarch64/sve/popcount_1.c: Ditto.
>>>* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>>> 
>>> Signed-off-by: Pan Li 
>>> ---
>>> gcc/internal-fn.cc  |  3 ++-
>>> gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>>> gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>>> gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>>> .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>>> gcc/tree-vect-stmts.cc  | 13 -
>>> 6 files changed, 6 insertions(+), 21 deletions(-)
>>> 
>>> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
>>> index 61d5a9e4772..17c0f4c3805 100644
>>> --- a/gcc/internal-fn.cc
>>> +++ b/gcc/internal-fn.cc
>>> @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
>>> unsigned int noutputs,
>>>emit_move_insn (lhs_rtx, ops[0].value);
>>>   else
>>>{
>>> - gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
>>> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
>>> +  || VECTOR_INTEGER_TY

RE: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Li, Pan2
Thanks Richard S for comments.

> In other words, I don't think simply removing the test from the vectoriser
> is correct.  It needs to be replaced by something more selective.

Does it mean we need to check if the internal fun allow different modes/sizes 
here?

For example, standard name lrintmn2 (m, n mode) is allowed here, while rintm2 
(only m mode) isn't.

Pan

-Original Message-
From: Richard Sandiford  
Sent: Friday, October 27, 2023 1:47 AM
To: Richard Biener 
Cc: Li, Pan2 ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; Liu, Hongtao 
Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

Richard Biener  writes:
>> Am 26.10.2023 um 13:59 schrieb Li, Pan2 :
>> 
>> Thanks Richard for comments.
>> 
>>> Can you explain why this is necessary?  In particular what is lhs_rtx
>>> mode vs ops[0].value mode?
>> 
>> For testcase gcc.target/aarch64/sve/popcount_1.c, the rtl are list as below.
>> 
>> The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
>> The ops[0].value is (reg:VNx2DI 104).
>> 
>> The restriction removing make the vector rtl enter expand_fn_using_insn and 
>> of course hit the INTEGER_P assertion.
>
> But I think this shows we mid-selected the optab, a convert_move is certainly 
> not correct unconditionally here (the target might not support that)

Agreed.  Allowing TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out)
makes sense if the called function allows the input and output modes
to vary.  That's true for internal functions that eventually map to
two-mode optabs.  But we can't remove the condition for calls to
other functions, at least not without some fix-ups.

ISTM that the problem being hit is the one described by the removed
comment.

In other words, I don't think simply removing the test from the vectoriser
is correct.  It needs to be replaced by something more selective.

Thanks,
Richard

>> Pan
>> 
>> -Original Message-
>> From: Richard Biener  
>> Sent: Thursday, October 26, 2023 4:38 PM
>> To: Li, Pan2 
>> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
>> ; kito.ch...@gmail.com; Liu, Hongtao 
>> ; Richard Sandiford 
>> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
>> 
>>> On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>>> 
>>> From: Pan Li 
>>> 
>>> Update in v2:
>>> 
>>> * Fix one ICE of type assertion.
>>> * Adjust some test cases for aarch64 sve and riscv vector.
>>> 
>>> Original log:
>>> 
>>> The vectoriable_call has one restriction of the size of data type.
>>> Aka DF to DI is allowed but SF to DI isn't. You may see below message
>>> when try to vectorize function call like lrintf.
>>> 
>>> void
>>> test_lrintf (long *out, float *in, unsigned count)
>>> {
>>>  for (unsigned i = 0; i < count; i++)
>>>out[i] = __builtin_lrintf (in[i]);
>>> }
>>> 
>>> lrintf.c:5:26: missed: couldn't vectorize loop
>>> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>>> 
>>> Then the standard name pattern like lrintmn2 cannot work for different
>>> data type size like SF => DI. This patch would like to remove this data
>>> type size check and unblock the standard name like lrintmn2.
>>> 
>>> The below test are passed for this patch.
>>> 
>>> * The x86 bootstrap and regression test.
>>> * The aarch64 regression test.
>>> * The risc-v regression tests.
>>> 
>>> gcc/ChangeLog:
>>> 
>>>* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
>>>* tree-vect-stmts.cc (vectorizable_call): Remove size check.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>>* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
>>>* gcc.target/aarch64/sve/clz_1.c: Ditto.
>>>* gcc.target/aarch64/sve/popcount_1.c: Ditto.
>>>* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>>> 
>>> Signed-off-by: Pan Li 
>>> ---
>>> gcc/internal-fn.cc  |  3 ++-
>>> gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>>> gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>>> gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>>> .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>>> gcc/tree-vect-stmts.cc  | 13 -
>>> 6 fi

Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-27 Thread Richard Biener
On Fri, Oct 27, 2023 at 4:17 AM Li, Pan2  wrote:
>
> Thanks Richard S for comments.
>
> > In other words, I don't think simply removing the test from the vectoriser
> > is correct.  It needs to be replaced by something more selective.
>
> Does it mean we need to check if the internal fun allow different modes/sizes 
> here?
>
> For example, standard name lrintmn2 (m, n mode) is allowed here, while rintm2 
> (only m mode) isn't.

We need to check whether the "size" of the LHS is somehow participating in the
optab query.  I think the direct_internal_fn_info type0/1 members,
when -1 say that.
If none is -1 (and -2) then the LHS has to match one of the arguments (if there
are two different I'm not sure which we'd pick).

So patch-wise the existing check can probably be skipped when
vectorizable_internal_function
returns an IFN but that function should have the very same check when
vectype_out isn't
participating in the optab selection.

Richard.

>
> Pan
>
> -Original Message-
> From: Richard Sandiford 
> Sent: Friday, October 27, 2023 1:47 AM
> To: Richard Biener 
> Cc: Li, Pan2 ; gcc-patches@gcc.gnu.org; 
> juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
> kito.ch...@gmail.com; Liu, Hongtao 
> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
>
> Richard Biener  writes:
> >> Am 26.10.2023 um 13:59 schrieb Li, Pan2 :
> >>
> >> Thanks Richard for comments.
> >>
> >>> Can you explain why this is necessary?  In particular what is lhs_rtx
> >>> mode vs ops[0].value mode?
> >>
> >> For testcase gcc.target/aarch64/sve/popcount_1.c, the rtl are list as 
> >> below.
> >>
> >> The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
> >> The ops[0].value is (reg:VNx2DI 104).
> >>
> >> The restriction removing make the vector rtl enter expand_fn_using_insn 
> >> and of course hit the INTEGER_P assertion.
> >
> > But I think this shows we mid-selected the optab, a convert_move is 
> > certainly not correct unconditionally here (the target might not support 
> > that)
>
> Agreed.  Allowing TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out)
> makes sense if the called function allows the input and output modes
> to vary.  That's true for internal functions that eventually map to
> two-mode optabs.  But we can't remove the condition for calls to
> other functions, at least not without some fix-ups.
>
> ISTM that the problem being hit is the one described by the removed
> comment.
>
> In other words, I don't think simply removing the test from the vectoriser
> is correct.  It needs to be replaced by something more selective.
>
> Thanks,
> Richard
>
> >> Pan
> >>
> >> -Original Message-
> >> From: Richard Biener 
> >> Sent: Thursday, October 26, 2023 4:38 PM
> >> To: Li, Pan2 
> >> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
> >> ; kito.ch...@gmail.com; Liu, Hongtao 
> >> ; Richard Sandiford 
> >> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of 
> >> vectorizer
> >>
> >>> On Thu, Oct 26, 2023 at 4:18 AM  wrote:
> >>>
> >>> From: Pan Li 
> >>>
> >>> Update in v2:
> >>>
> >>> * Fix one ICE of type assertion.
> >>> * Adjust some test cases for aarch64 sve and riscv vector.
> >>>
> >>> Original log:
> >>>
> >>> The vectoriable_call has one restriction of the size of data type.
> >>> Aka DF to DI is allowed but SF to DI isn't. You may see below message
> >>> when try to vectorize function call like lrintf.
> >>>
> >>> void
> >>> test_lrintf (long *out, float *in, unsigned count)
> >>> {
> >>>  for (unsigned i = 0; i < count; i++)
> >>>out[i] = __builtin_lrintf (in[i]);
> >>> }
> >>>
> >>> lrintf.c:5:26: missed: couldn't vectorize loop
> >>> lrintf.c:5:26: missed: not vectorized: unsupported data-type
> >>>
> >>> Then the standard name pattern like lrintmn2 cannot work for different
> >>> data type size like SF => DI. This patch would like to remove this data
> >>> type size check and unblock the standard name like lrintmn2.
> >>>
> >>> The below test are passed for this patch.
> >>>
> >>> * The x86 bootstrap and regression test.
> >>> * The aarch64 regression test.
> >>> * The risc-v regression tests.
&