Hi Ulrich,
  In http://gcc.gnu.org/ml/gcc-patches/2004-07/msg01557.html, you
changed the return value of find_reloads_address to be tristate, in the
process modifying the meaning of a win from LEGITIMIZE_RELOAD_ADDRESS.
Prior to your change, a win meant that LEGITIMIZE_RELOAD_ADDRESS had
guaranteed that the address would match one of the extra memory
constraints if it didn't match some other constraint.  After your
change, a win means that the address as a whole might still need further
reloads.  This has caused me a little trouble, because I can't find a
way of telling reload to leave the address alone.

Perhaps I ought to explain the problem I'm trying to solve.  In pr24997,
we see reload trying to fix an altivec address that looks like

 ((rb+ri)+const)&-16

When this whole expression is put into a register, we get an ICE because
no ppc insn matches a complex expression like this.  I figured I could
help reload a little by stripping off the AND and returning
(rb+const)+ri from LEGITIMIZE_RELOAD_ADDRESS, requesting a reload into a
base reg for (rb+const). (*)  After this had been reloaded, the address
would be rb2+ri, which is a valid indexed address.  This works insofar
as the ICE is cured and GCC generates valid code.  However, we don't use
an indexed address.  Instead, we get an indirect address, because
find_reloads fails to match the constraints for the insn and then
reloads rb2+ri into another base reg.
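
As a sanity check on the re-association itself, here is a small
standalone C sketch.  It is not GCC code: the function names are made up
for illustration, and registers are modelled as plain 64-bit integers.
It only shows that reloading rb+const into a base reg rb2 and then
indexing by ri yields the same 16-byte-aligned effective address as the
original expression.

```c
#include <stdint.h>

/* Effective address of the original form: ((rb + ri) + c) & -16.  */
static uint64_t
ea_original (uint64_t rb, uint64_t ri, int64_t c)
{
  return ((rb + ri) + (uint64_t) c) & ~(uint64_t) 15;
}

/* Effective address after reloading rb + c into a base reg rb2 and
   using the indexed form rb2 + ri.  The AND models the masking of
   the low four bits of the sum.  */
static uint64_t
ea_reloaded (uint64_t rb, uint64_t ri, int64_t c)
{
  uint64_t rb2 = rb + (uint64_t) c;	/* the pushed reload */
  return (rb2 + ri) & ~(uint64_t) 15;
}
```

The two functions agree for any inputs because unsigned addition is
associative and commutative modulo 2^64, so only the shape of the
address changes, not its value.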

Now, I don't really fault your change to the constraint matching because
it certainly seemed fragile before, and there is no way to distinguish
between alternatives.  I don't advocate changing things back to the way
they were, because targets may have changed their LEGITIMIZE_RELOAD_ADDRESS
according to the new semantics.  What I'd like from you or other reload
experts is an indication of the right way to fix this problem.  ;-)

I can see the following options:

a) Before matching constraints in find_reloads, substitute dummy regs
for any reloads that have been identified.  I'm not sure how much work
is involved in doing this, or whether it is even possible.  It sounds
like this would be the best solution technically, as then the output
of LEGITIMIZE_RELOAD_ADDRESS is properly checked.

b) Modify LEGITIMIZE_RELOAD_ADDRESS to return a constraint letter that
the address is guaranteed to match after reloading.  A bit of mechanical
work changing all targets.

c) Modify the ppc 'Z' constraint to match the indexed address that
reload generates.  This would rely on the pattern we generate in
LEGITIMIZE_RELOAD_ADDRESS never being generated elsewhere.

d) Hacks like the patch below, that effectively perform the reload
substitution with a dummy reg.  I fear this isn't proper, even though it
seems to work.


(*) This is exactly what code in find_reloads_address does on
encountering an invalid indexed address.  The trouble is that its
transformation isn't valid until the reloads are done, and we check
constraints before doing the substitutions.  :-(
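
To make the ordering problem concrete, here is a toy model in C.  None
of these types or functions exist in GCC; toy_addr, toy_indexed_ok and
toy_substitute are hypothetical stand-ins for an address rtx, an
indexed-address constraint check, and the reload substitution step.

```c
#include <stdbool.h>

/* Toy address shapes; these are NOT GCC's rtx codes.  */
enum toy_kind { TOY_REG, TOY_REG_PLUS_CONST, TOY_PLUS };

struct toy_addr
{
  enum toy_kind kind;
  const struct toy_addr *op0, *op1;	/* used only for TOY_PLUS */
};

/* Stand-in for an indexed-address constraint: accept reg + reg only.  */
static bool
toy_indexed_ok (const struct toy_addr *x)
{
  return x->kind == TOY_PLUS
	 && x->op0->kind == TOY_REG
	 && x->op1->kind == TOY_REG;
}

/* Stand-in for reload substitution: replace a base operand that is
   not yet a register with the reload register.  */
static struct toy_addr
toy_substitute (const struct toy_addr *x, const struct toy_addr *reload_reg)
{
  struct toy_addr r = *x;
  if (r.kind == TOY_PLUS && r.op0->kind != TOY_REG)
    r.op0 = reload_reg;
  return r;
}
```

In this model, (rb+const)+ri fails toy_indexed_ok as returned, and
passes only once toy_substitute has put a register in place of
rb+const.  That is the sense in which checking constraints before
substitution rejects an address that would be fine afterwards.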

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 107416)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3354,19 +3360,68 @@ rs6000_legitimize_reload_address (rtx x,
 
   /* Reload an offset address wrapped by an AND that represents the
      masking of the lower bits.  Strip the outer AND and let reload
-     convert the offset address into an indirect address.  */
+     convert the offset address into an indirect address.  Do the
+     same for indexed addresses with an offset.  */
   if (TARGET_ALTIVEC
       && ALTIVEC_VECTOR_MODE (mode)
       && GET_CODE (x) == AND
-      && GET_CODE (XEXP (x, 0)) == PLUS
-      && GET_CODE (XEXP (XEXP (x, 0), 0)) == REG
-      && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT
       && GET_CODE (XEXP (x, 1)) == CONST_INT
       && INTVAL (XEXP (x, 1)) == -16)
     {
-      x = XEXP (x, 0);
-      *win = 1;
-      return x;
+      rtx ad = XEXP (x, 0);
+      if (GET_CODE (ad) == PLUS
+         && GET_CODE (XEXP (ad, 1)) == CONST_INT)
+       {
+         if (GET_CODE (XEXP (ad, 0)) == REG)
+           {
+             x = ad;
+             *win = 1;
+             return x;
+           }
+         else if (GET_CODE (XEXP (ad, 0)) == PLUS
+                  && GET_CODE (XEXP (XEXP (ad, 0), 0)) == REG
+                  && GET_CODE (XEXP (XEXP (ad, 0), 1)) == REG)
+           {
+#if 1
+             rtx rb = XEXP (XEXP (ad, 0), 0);
+             rtx ri = XEXP (XEXP (ad, 0), 1);
+             rtx c = XEXP (ad, 1);
+
+             /* Use an indexed address as in the original instruction,
+                but reload rb+c part.  Generate the final form of the
+                address here, so that we match Z constraint.  Use r1
+                as the base reg.  It will be replaced with the actual
+                reload reg later.  */
+             x = gen_rtx_PLUS (GET_MODE (ad),
+                               gen_rtx_REG (GET_MODE (ad), 1),
+                               ri);
+             ad = gen_rtx_PLUS (GET_MODE (ad), rb, c);
+             push_reload (ad, NULL_RTX, &XEXP (x, 0), NULL,
+                          BASE_REG_CLASS, GET_MODE (ad), VOIDmode, 0, 0,
+                          opnum, (enum reload_type) type);
+#elif 0
+             rtx rb = XEXP (XEXP (ad, 0), 0);
+             rtx ri = XEXP (XEXP (ad, 0), 1);
+             rtx c = XEXP (ad, 1);
+
+             /* Use an indexed address as in the original instruction,
+                but reload rb+c part.  */
+             x = gen_rtx_PLUS (GET_MODE (ad),
+                               gen_rtx_PLUS (GET_MODE (ad), rb, c),
+                               ri);
+             push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
+                          BASE_REG_CLASS, GET_MODE (ad), VOIDmode, 0, 0,
+                          opnum, (enum reload_type) type);
+#else
+             x = ad;
+             push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
+                          BASE_REG_CLASS, GET_MODE (ad), VOIDmode, 0, 0,
+                          opnum, (enum reload_type) type);
+#endif
+             *win = 1;
+             return x;
+           }
+       }
     }
 
   if (TARGET_TOC

The three different hacks above (the #if 1, #elif 0 and #else branches,
in that order) generate the following code from the testcase in pr24997.
"first" is the best, since it keeps the indexed lvx.

first               second              third
    add 8,7,11          add 8,7,11          add 8,7,11
    add 9,0,11          add 9,0,11          add 9,0,11
    addi 0,11,12        addi 0,11,12        addi 0,11,12
    lvx 12,5,11
                                            add 11,11,1
    neg 9,9             neg 9,9             neg 9,9
    add 7,5,0           add 7,5,0           add 7,5,0
.LVL621:            .LVL621:            .LVL621:
    lvsr 11,0,9         lvsr 11,0,9         lvsr 11,0,9
                        add 9,5,11
                                            addi 9,11,16512
                        lvx 12,0,9          lvx 12,0,9
    li 9,0              li 9,0              li 9,0
    blt- 7,.L669        blt- 7,.L669        blt- 7,.L669
.L489:              .L489:              .L489:
