On 2/26/24 07:22, pan2...@intel.com wrote:
From: Pan Li <pan2...@intel.com>

We allowed vector type for get_stored_val when read is less than or
equal to store in previous.  Unfortunately, we missed to adjust the
validate_subreg part accordingly.  When the vector type's size is
less than vector register, it will be considered as invalid in the
validate_subreg.

Consider the validate_subreg is kind of a can with worms and we are
in stage 4.  We will fix the issue from the DES side, and make sure
the subreg is valid for both the read_mode and store_mode before
perform the real gen_lowpart.

The below test are passed for this patch:

* The x86 bootstrap test.
* The x86 regression test.
* The riscv regression test.
* The aarch64 regression test.

gcc/ChangeLog:

        * dse.cc (get_stored_val): Add validate_subreg check before
        perform the gen_lowpart for rtl.

gcc/testsuite/ChangeLog:

        * gcc.dg/tree-ssa/ssa-fre-44.c: Add compile option to trigger
        the ICE.
        * gcc.target/riscv/rvv/base/bug-6.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
  gcc/dse.cc                                    |  4 +++-
  gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c    |  2 +-
  .../gcc.target/riscv/rvv/base/bug-6.c         | 22 +++++++++++++++++++
  3 files changed, 26 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c

diff --git a/gcc/dse.cc b/gcc/dse.cc
index edc7a1dfecf..1596da91da0 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode 
read_mode,
                                 copy_rtx (store_info->const_rhs));
    else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
      && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
-    && targetm.modes_tieable_p (read_mode, store_mode))
+    && targetm.modes_tieable_p (read_mode, store_mode)
+    && validate_subreg (read_mode, store_mode, copy_rtx (store_info->rhs),
+                       subreg_lowpart_offset (read_mode, store_mode)))
      read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
    else
      read_reg = extract_low_bits (read_mode, store_mode,

So we're just changing whether or not we call gen_lowpart directly or go through extract_low_bits, which may in turn generate subreg, call gen_lowpart itself and a few other things.

I'm guessing that extract_low_bits is going to return NULL in this case via this code (specifically the second test).

  if (!targetm.modes_tieable_p (src_int_mode, src_mode))
    return NULL_RTX;
  if (!targetm.modes_tieable_p (int_mode, mode))
    return NULL_RTX;


Pan, can you confirm what path we take through extract_low_bits?

One might argue that we should just call into extract_low_bits unconditionally since it'll ultimately call gen_lowpart when it safely can. The downside is that's a bigger change than I'd like at this stage in our development cycle.

I wouldn't be surprised if other direct uses of gen_lowpart have similar problems.





diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
index f79b4c142ae..624a00a4f32 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre1" } */
+/* { dg-options "-O -fdump-tree-fre1 -O3 -ftree-vectorize" } */
struct A { float x, y; };
  struct B { struct A u; };
So this may compromise the original intent of this test. What I would suggest instead is to create a new test with the dg-do & dg-options you want with a #include "ssa-fre-44.c".

So to move forward. Let's confirm the path we take through extract_low_bits matches expectations and fixup the testsuite change.

Jeff

Reply via email to