>   if (!targetm.modes_tieable_p (src_int_mode, src_mode))
>     return NULL_RTX;
>   if (!targetm.modes_tieable_p (int_mode, mode))
>     return NULL_RTX;

Yes, will return NULL_RTX for in the first if, given src_int_mode is E_DImode 
while src_mode is 
E_V2SFmode and mode is E_V4QImode. The extract_low_bits convert the modes 
E_V2SFmode/E_V4QImode 
to E_DImode/E_SImode in advance before tieable checking, validate_subreg and 
gen_lowpart.

Not sure if my understanding is correct but looks extract_low_bits cannot take 
care of vector modes 
up to a point because vector modes are always untieable to its' int mode, and 
then return NULL_RTX.

Pan

-----Original Message-----
From: Li, Pan2 
Sent: Wednesday, February 28, 2024 9:41 AM
To: Jeff Law <jeffreya...@gmail.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang <yanzhang.w...@intel.com>; rdapp....@gmail.com; Liu, Hongtao 
<hongtao....@intel.com>
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

> Pan, can you confirm what path we take through extract_low_bits?

Thanks Jeff for comments, will have a try soon and keep you posted.

Pan

-----Original Message-----
From: Jeff Law <jeffreya...@gmail.com> 
Sent: Tuesday, February 27, 2024 11:03 PM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang <yanzhang.w...@intel.com>; rdapp....@gmail.com; Liu, Hongtao 
<hongtao....@intel.com>
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val



On 2/26/24 07:22, pan2...@intel.com wrote:
> From: Pan Li <pan2...@intel.com>
> 
> We allowed vector type for get_stored_val when read is less than or
> equal to store in previous.  Unfortunately, we missed to adjust the
> validate_subreg part accordingly.  When the vector type's size is
> less than vector register, it will be considered as invalid in the
> validate_subreg.
> 
> Consider the validate_subreg is kind of a can with worms and we are
> in stage 4.  We will fix the issue from the DES side, and make sure
> the subreg is valid for both the read_mode and store_mode before
> perform the real gen_lowpart.
> 
> The below test are passed for this patch:
> 
> * The x86 bootstrap test.
> * The x86 regression test.
> * The riscv regression test.
> * The aarch64 regression test.
> 
> gcc/ChangeLog:
> 
>       * dse.cc (get_stored_val): Add validate_subreg check before
>       perform the gen_lowpart for rtl.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.dg/tree-ssa/ssa-fre-44.c: Add compile option to trigger
>       the ICE.
>       * gcc.target/riscv/rvv/base/bug-6.c: New test.
> 
> Signed-off-by: Pan Li <pan2...@intel.com>
> ---
>   gcc/dse.cc                                    |  4 +++-
>   gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c    |  2 +-
>   .../gcc.target/riscv/rvv/base/bug-6.c         | 22 +++++++++++++++++++
>   3 files changed, 26 insertions(+), 2 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c
> 
> diff --git a/gcc/dse.cc b/gcc/dse.cc
> index edc7a1dfecf..1596da91da0 100644
> --- a/gcc/dse.cc
> +++ b/gcc/dse.cc
> @@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode 
> read_mode,
>                                copy_rtx (store_info->const_rhs));
>     else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
>       && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE 
> (store_mode))
> -    && targetm.modes_tieable_p (read_mode, store_mode))
> +    && targetm.modes_tieable_p (read_mode, store_mode)
> +    && validate_subreg (read_mode, store_mode, copy_rtx (store_info->rhs),
> +                     subreg_lowpart_offset (read_mode, store_mode)))
>       read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
>     else
>       read_reg = extract_low_bits (read_mode, store_mode,

So we're just changing whether or not we call gen_lowpart directly or go 
through extract_low_bits, which may in turn generate subreg, call 
gen_lowpart itself and a few other things.

I'm guessing that extract_low_bits is going to return NULL in this case 
via this code (specifically the second test).

>   if (!targetm.modes_tieable_p (src_int_mode, src_mode))
>     return NULL_RTX;
>   if (!targetm.modes_tieable_p (int_mode, mode))
>     return NULL_RTX;


Pan, can you confirm what path we take through extract_low_bits?

One might argue that we should just call into extract_low_bits 
unconditionally since it'll ultimately call gen_lowpart when it safely 
can.  The downside is that's a bigger change than I'd like at this stage 
in our development cycle.

I wouldn't be surprised if other direct uses of gen_lowpart have similar 
problems.





> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
> index f79b4c142ae..624a00a4f32 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
> @@ -1,5 +1,5 @@
>   /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-fre1" } */
> +/* { dg-options "-O -fdump-tree-fre1 -O3 -ftree-vectorize" } */
>   
>   struct A { float x, y; };
>   struct B { struct A u; };
So this may compromise the original intent of this test.  What I would 
suggest instead is to create a new test with the dg-do & dg-options you 
want with a #include "ssa-fre-44.c".

So to move forward.  Let's confirm the path we take through 
extract_low_bits matches expectations and fixup the testsuite change.

Jeff

Reply via email to