On 2/26/24 07:22, pan2...@intel.com wrote:
From: Pan Li <pan2...@intel.com>
We allowed vector type for get_stored_val when read is less than or
equal to store in previous. Unfortunately, we missed to adjust the
validate_subreg part accordingly. When the vector type's size is
less than vector register, it will be considered as invalid in the
validate_subreg.
Consider the validate_subreg is kind of a can with worms and we are
in stage 4. We will fix the issue from the DES side, and make sure
the subreg is valid for both the read_mode and store_mode before
perform the real gen_lowpart.
The below test are passed for this patch:
* The x86 bootstrap test.
* The x86 regression test.
* The riscv regression test.
* The aarch64 regression test.
gcc/ChangeLog:
* dse.cc (get_stored_val): Add validate_subreg check before
perform the gen_lowpart for rtl.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/ssa-fre-44.c: Add compile option to trigger
the ICE.
* gcc.target/riscv/rvv/base/bug-6.c: New test.
Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/dse.cc | 4 +++-
gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c | 2 +-
.../gcc.target/riscv/rvv/base/bug-6.c | 22 +++++++++++++++++++
3 files changed, 26 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c
diff --git a/gcc/dse.cc b/gcc/dse.cc
index edc7a1dfecf..1596da91da0 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode
read_mode,
copy_rtx (store_info->const_rhs));
else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
&& known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
- && targetm.modes_tieable_p (read_mode, store_mode))
+ && targetm.modes_tieable_p (read_mode, store_mode)
+ && validate_subreg (read_mode, store_mode, copy_rtx (store_info->rhs),
+ subreg_lowpart_offset (read_mode, store_mode)))
read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
else
read_reg = extract_low_bits (read_mode, store_mode,
So we're just changing whether or not we call gen_lowpart directly or go
through extract_low_bits, which may in turn generate subreg, call
gen_lowpart itself and a few other things.
I'm guessing that extract_low_bits is going to return NULL in this case
via this code (specifically the second test).
if (!targetm.modes_tieable_p (src_int_mode, src_mode))
return NULL_RTX;
if (!targetm.modes_tieable_p (int_mode, mode))
return NULL_RTX;
Pan, can you confirm what path we take through extract_low_bits?
One might argue that we should just call into extract_low_bits
unconditionally since it'll ultimately call gen_lowpart when it safely
can. The downside is that's a bigger change than I'd like at this stage
in our development cycle.
I wouldn't be surprised if other direct uses of gen_lowpart have similar
problems.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
index f79b4c142ae..624a00a4f32 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre1" } */
+/* { dg-options "-O -fdump-tree-fre1 -O3 -ftree-vectorize" } */
struct A { float x, y; };
struct B { struct A u; };
So this may compromise the original intent of this test. What I would
suggest instead is to create a new test with the dg-do & dg-options you
want with a #include "ssa-fre-44.c".
So to move forward. Let's confirm the path we take through
extract_low_bits matches expectations and fixup the testsuite change.
Jeff