On 07/31/2016 04:44 AM, Bernd Edlinger wrote:

like this?

Index: emit-rtl.c
===================================================================
--- emit-rtl.c  (revision 238891)
+++ emit-rtl.c  (working copy)
@@ -1156,7 +1156,11 @@
      {
  #if defined(POINTERS_EXTEND_UNSIGNED)
        if (((GET_CODE (x) == SIGN_EXTEND && POINTERS_EXTEND_UNSIGNED)
-          || (GET_CODE (x) != SIGN_EXTEND && ! POINTERS_EXTEND_UNSIGNED))
+          || (GET_CODE (x) == ZERO_EXTEND && ! POINTERS_EXTEND_UNSIGNED)
+          || (paradoxical_subreg_p (x)
+              && ! (SUBREG_PROMOTED_VAR_P (x)
+                    && SUBREG_CHECK_PROMOTED_SIGN (x,
+                                                   POINTERS_EXTEND_UNSIGNED))))
          && !targetm.have_ptr_extend ())
        can_be_reg_pointer = false;
  #endif

In the test case of pr71779 the subreg is no promoted var, so this
has no influence at this time.  Also I have not POINTERS_EXTEND_SIGNED
target, but for the symmetry it ought to check explicitly for
ZERO_EXTEND as well, and allow the pointer value to pass thru a
TRUNCATE.
I believe MIPS is likely the only target that extends signed, a few need special extension code (ia64/s390). But this looks reasonable to me. I don't think it's worth the complication of dealing with truncation.




I debugged the cse again, to see how it works and why it mis-compiles
this example.

I found out that the trouble starts one instruction earlier:

[ Fixing the missing bits from the insn using your later message... ]

(insn 19 40 20 2 (set (subreg:DI (reg:QI 93) 0)
         (const_int 255 [0xff])) pr.c:8 50 {*movdi_aarch64}
      (expr_list:REG_DEAD (reg:DI 110 [ D.3037 ])
         (nil)))



cse_main sees the constant value and maps:
(reg:QI 93)  =>  (const_int 255 [0xff])
OK. This looks OK to me. We know unambiguously that (reg 93) has the value 0xff -- that's because (reg 93) is a QImode register. There are no outside QImode.




plus (I mean that is wrong):
(subreg:DI (reg:QI 93) 0)  =>  (const_int 255 [0xff])

When the next insn is scanned

(insn 20 19 21 2 (set (reg:DI 94)
         (subreg:DI (reg:QI 93) 0)) pr.c:8 50 {*movdi_aarch64}
      (expr_list:REG_DEAD (reg:QI 93)
         (nil)))

I see fold_rtx (subreg:DI (reg:QI 93) 0))
return (const_int 255 [0xff]) which is wrong.
I think this is the key point where things have gone wrong. While we know the QImode bits are 0xff the bits outside QImode are undefined. So we can't legitimately return 0xff when folding that rtx.




now cse maps:
(reg:DI 94)  =>  (const_int 255 [0xff])
And now we propagate the mistake from the previous step....





Now I think I found a better place for a patch, where the first bogus
mapping is recorded:

Index: cse.c
===================================================================
--- cse.c       (revision 238891)
+++ cse.c       (working copy)
@@ -5898,15 +5898,7 @@ cse_insn (rtx_insn *insn)
            || GET_MODE (dest) == BLKmode
            /* If we didn't put a REG_EQUAL value or a source into the hash
               table, there is no point is recording DEST.  */
-           || sets[i].src_elt == 0
-           /* If DEST is a paradoxical SUBREG and SRC is a ZERO_EXTEND
-              or SIGN_EXTEND, don't record DEST since it can cause
-              some tracking to be wrong.
-
-              ??? Think about this more later.  */
-           || (paradoxical_subreg_p (dest)
-               && (GET_CODE (sets[i].src) == SIGN_EXTEND
-                   || GET_CODE (sets[i].src) == ZERO_EXTEND)))
+           || sets[i].src_elt == 0)
          continue;

        /* STRICT_LOW_PART isn't part of the value BEING set,
@@ -5925,6 +5917,11 @@ cse_insn (rtx_insn *insn)
              sets[i].dest_hash = HASH (dest, GET_MODE (dest));
            }

+       /* If DEST is a paradoxical SUBREG, don't record DEST since it can
+          cause some tracking to be wrong.  */
+       if (paradoxical_subreg_p (dest))
+         continue;
+
        elt = insert (dest, sets[i].src_elt,
                      sets[i].dest_hash, GET_MODE (dest));
Instead of saying "cause some tracking to be wrong", it might be better to say "the bits outside the mode of GET_MODE (SUBREG_REG (dest)) are undefined".





So apparently there was already an attempt of a fix for a similar bug,
and svn blame points to:

svn log -v -r8354
------------------------------------------------------------------------
r8354 | kenner | 1994-10-28 23:55:05 +0100 (Fri, 28 Oct 1994) | 3 lines
Changed paths:
    M /trunk/gcc/cse.c

(cse_insn): Don't record a DEST a paradoxical SUBREG and SRC is a
SIGN_EXTEND or ZERO_EXTEND.

------------------------------------------------------------------------

This way we can still map the underlying QI register to 255 but
not the SUBREG if it is a paradoxical subreg.

In the test case this patch still works (output code does not change).

What do you think?
Looks like you've probably nailed it. It'll be interesting see if there's any fallout (though our RTL optimizer testing is pretty weak, so even if there were, I doubt we'd catch it).

ISTM that this would be a great test for the unit testing framework that David M. has been building. You should look for David's simplify-rtx tests (posted, but not committed pending some thinking about a few issues) as a guide.

jeff

Reply via email to