Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-29 Thread Ulrich Weigand
H.J. Lu wrote:
 * reload.c (struct replacement): Remove SUBREG_LOC member.
 (push_reload): Do not set it.
 (push_replacement): Likewise.
 (subst_reload): Remove dead code.
 (copy_replacements): Remove assertion.
 (copy_replacements_1): Do not handle SUBREG_LOC.
 (move_replacements): Likewise.
 (find_replacement): Remove dead code.  Use 
  reload_adjust_reg_for_mode.
 Detect subregs via recursive descent instead of via SUBREG_LOC.
 
 
  It works much better.  I am testing it now.
 
 
 It works.  There are no regressions on Linux/ia32 nor Linux/x86-64.
 Can you check it in and mention PR rtl-optimization/49114 ChangeLog?

OK, I've checked the patch in now.  Thanks for testing!

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-28 Thread H.J. Lu
On Mon, Jun 27, 2011 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jun 27, 2011 at 3:19 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jun 27, 2011 at 3:08 PM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:

 reload generates:

 (insn 914 912 0 (set (reg:SI 0 ax)
         (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
             (const_int 8 [0x8]))) 248 {*lea_1_x32}
      (nil))

 from

 insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));

 Interesting.  The pseudo should have been replaced by the
 hard register (reg:DI 1) during the preceding call to
      op0 = find_replacement (XEXP (in, 0));
 (since reload 0 should have pushed a replacement record.)

 Interestingly enough, in the final output that replacement *is*
 performed in the REG_EQUIV note:

 (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
        (plus:SI (reg:SI 1 dx)
            (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
     (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
            (const_int 8 [0x8]))
        (nil)))

 which is why I hadn't expected this to be a problem here.

 Can you try to find out why the find_replacement doesn't work
 with your test case?


 I will investigate.  Could (reg:SI 1 dx) vs  (subreg:SI (reg:DI 1 dx) 0)
 a problem?


 find_replacement never checks subreg:

 Breakpoint 3, find_replacement (loc=0x7068ab00)
    at /export/gnu/import/git/gcc-x32/gcc/reload.c:6411
 6411          if (reloadreg  r-where == loc)
 (reg:DI 0 ax)
 (reg/v/f:DI 182 [ b ])
 (gdb) call debug_rtx (*loc)
 (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
 (gdb)


This patch checks SUBREG pointer if Pmode != ptr_mode.  OK
for trunk?

Thanks.

-- 
H.J.
---
2011-06-28  H.J. Lu  hongjiu...@intel.com

PR rtl-optimization/49114
* reload.c (find_replacement): Properly handle SUBREG pointers.

diff --git a/gcc/reload.c b/gcc/reload.c
index 3ad46b9..829e45b 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -6415,6 +6415,36 @@ find_replacement (rtx *loc)

  return reloadreg;
}
+  else if (Pmode != ptr_mode
+   !r-subreg_loc
+   reloadreg
+   (r-mode == Pmode || GET_MODE (reloadreg) == Pmode)
+   REG_P (reloadreg)
+   GET_CODE (*loc) == SUBREG
+   REG_P (SUBREG_REG (*loc))
+   REG_POINTER (SUBREG_REG (*loc))
+   GET_MODE (*loc) == ptr_mode
+   r-where == SUBREG_REG (*loc))
+   {
+ int offset;
+
+ if (r-mode != VOIDmode  GET_MODE (reloadreg) != r-mode)
+   reloadreg = gen_rtx_REG (r-mode, REGNO (reloadreg));
+
+ if ((WORDS_BIG_ENDIAN || BYTES_BIG_ENDIAN)
+  GET_MODE_SIZE (Pmode)  GET_MODE_SIZE (ptr_mode))
+   {
+ offset = GET_MODE_SIZE (Pmode) - GET_MODE_SIZE (ptr_mode);
+ if (! BYTES_BIG_ENDIAN)
+   offset = (offset / UNITS_PER_WORD) * UNITS_PER_WORD;
+ else if (! WORDS_BIG_ENDIAN)
+   offset %= UNITS_PER_WORD;
+   }
+  else
+offset = 0;
+
+ return gen_rtx_SUBREG (ptr_mode, reloadreg, offset);
+   }
   else if (reloadreg  r-subreg_loc == loc)
{
  /* RELOADREG must be either a REG or a SUBREG.


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-28 Thread Ulrich Weigand
H.J. Lu wrote:
  find_replacement never checks subreg:
 
  Breakpoint 3, find_replacement (loc=3D0x7068ab00)
  =A0 =A0at /export/gnu/import/git/gcc-x32/gcc/reload.c:6411
  6411 =A0 =A0 =A0 =A0 =A0if (reloadreg  r-where =3D=3D loc)
  (reg:DI 0 ax)
  (reg/v/f:DI 182 [ b ])
  (gdb) call debug_rtx (*loc)
  (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
  (gdb)
 
 
 This seems to work.  Does it make any senses?

Ah, I see.  This was supposed to be handled via the SUBREG_LOC
member of the replacement struct.  Unfortunately, it turns out
that this is no longer reliably set these days ...

At first I was concerned that this might also cause problems at
the other location where replacements are processed, subst_reloads.
However, it turns out that code in subst_reloads is dead these days
anyway, as the reloadreg is *always* a REG, and never a SUBREG.

Once that code (and similar code in find_replacement that tries
to handle SUBREG reloadregs) is removed, the only remaining user
of the SUBREG_LOC field is actually find_replacement.  But here
we're doing a recursive descent through an RTL anyway, so we
always know we're replacing inside a SUBREG.

This makes the whole SUBREG_LOC field obsolete.

The patch below implements those changes (untested so far).
Can you verify that this works for you as well?

Thanks,
Ulrich


ChangeLog:

* reload.c (struct replacement): Remove SUBREG_LOC member.
(push_reload): Do not set it.
(push_replacement): Likewise.
(subst_reload): Remove dead code.
(copy_replacements): Remove assertion.
(copy_replacements_1): Do not handle SUBREG_LOC.
(move_replacements): Likewise.
(find_replacement): Remove dead code.  Detect subregs via
recursive descent instead of via SUBREG_LOC.

Index: gcc/reload.c
===
*** gcc/reload.c(revision 175580)
--- gcc/reload.c(working copy)
*** static int replace_reloads;
*** 158,165 
  struct replacement
  {
rtx *where; /* Location to store in */
-   rtx *subreg_loc;/* Location of SUBREG if WHERE is inside
-  a SUBREG; 0 otherwise.  */
int what;   /* which reload this is for */
enum machine_mode mode; /* mode it must have */
  };
--- 158,163 
*** push_reload (rtx in, rtx out, rtx *inloc
*** 1496,1502 
{
  struct replacement *r = replacements[n_replacements++];
  r-what = i;
- r-subreg_loc = in_subreg_loc;
  r-where = inloc;
  r-mode = inmode;
}
--- 1494,1499 
*** push_reload (rtx in, rtx out, rtx *inloc
*** 1505,1511 
  struct replacement *r = replacements[n_replacements++];
  r-what = i;
  r-where = outloc;
- r-subreg_loc = out_subreg_loc;
  r-mode = outmode;
}
  }
--- 1502,1507 
*** push_replacement (rtx *loc, int reloadnu
*** 1634,1640 
struct replacement *r = replacements[n_replacements++];
r-what = reloadnum;
r-where = loc;
-   r-subreg_loc = 0;
r-mode = mode;
  }
  }
--- 1630,1635 
*** subst_reloads (rtx insn)
*** 6287,6319 
  if (GET_MODE (reloadreg) != r-mode  r-mode != VOIDmode)
reloadreg = reload_adjust_reg_for_mode (reloadreg, r-mode);
  
! /* If we are putting this into a SUBREG and RELOADREG is a
!SUBREG, we would be making nested SUBREGs, so we have to fix
!this up.  Note that r-where == SUBREG_REG (*r-subreg_loc).  */
! 
! if (r-subreg_loc != 0  GET_CODE (reloadreg) == SUBREG)
!   {
! if (GET_MODE (*r-subreg_loc)
! == GET_MODE (SUBREG_REG (reloadreg)))
!   *r-subreg_loc = SUBREG_REG (reloadreg);
! else
!   {
! int final_offset =
!   SUBREG_BYTE (*r-subreg_loc) + SUBREG_BYTE (reloadreg);
! 
! /* When working with SUBREGs the rule is that the byte
!offset must be a multiple of the SUBREG's mode.  */
! final_offset = (final_offset /
! GET_MODE_SIZE (GET_MODE (*r-subreg_loc)));
! final_offset = (final_offset *
! GET_MODE_SIZE (GET_MODE (*r-subreg_loc)));
! 
! *r-where = SUBREG_REG (reloadreg);
! SUBREG_BYTE (*r-subreg_loc) = final_offset;
!   }
!   }
! else
!   *r-where = reloadreg;
}
/* If reload got no reg and isn't optional, something's wrong.  */
else
--- 6282,6288 
  if (GET_MODE (reloadreg) != r-mode  r-mode != VOIDmode)
reloadreg = reload_adjust_reg_for_mode (reloadreg, r-mode);
  
! *r-where = reloadreg;
}
/* If reload got no reg and isn't optional, 

Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-28 Thread H.J. Lu
On Tue, Jun 28, 2011 at 7:24 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:
  find_replacement never checks subreg:
 
  Breakpoint 3, find_replacement (loc=3D0x7068ab00)
  =A0 =A0at /export/gnu/import/git/gcc-x32/gcc/reload.c:6411
  6411 =A0 =A0 =A0 =A0 =A0if (reloadreg  r-where =3D=3D loc)
  (reg:DI 0 ax)
  (reg/v/f:DI 182 [ b ])
  (gdb) call debug_rtx (*loc)
  (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
  (gdb)
 

 This seems to work.  Does it make any senses?

 Ah, I see.  This was supposed to be handled via the SUBREG_LOC
 member of the replacement struct.  Unfortunately, it turns out
 that this is no longer reliably set these days ...

 At first I was concerned that this might also cause problems at
 the other location where replacements are processed, subst_reloads.
 However, it turns out that code in subst_reloads is dead these days
 anyway, as the reloadreg is *always* a REG, and never a SUBREG.

 Once that code (and similar code in find_replacement that tries
 to handle SUBREG reloadregs) is removed, the only remaining user
 of the SUBREG_LOC field is actually find_replacement.  But here
 we're doing a recursive descent through an RTL anyway, so we
 always know we're replacing inside a SUBREG.

 This makes the whole SUBREG_LOC field obsolete.

 The patch below implements those changes (untested so far).
 Can you verify that this works for you as well?

 Thanks,
 Ulrich


 ChangeLog:

        * reload.c (struct replacement): Remove SUBREG_LOC member.
        (push_reload): Do not set it.
        (push_replacement): Likewise.
        (subst_reload): Remove dead code.
        (copy_replacements): Remove assertion.
        (copy_replacements_1): Do not handle SUBREG_LOC.
        (move_replacements): Likewise.
        (find_replacement): Remove dead code.  Detect subregs via
        recursive descent instead of via SUBREG_LOC.

 Index: gcc/reload.c
 ===
 *** gcc/reload.c        (revision 175580)
 --- gcc/reload.c        (working copy)
 *** static int replace_reloads;
 *** 158,165 
  struct replacement
  {
    rtx *where;                 /* Location to store in */
 -   rtx *subreg_loc;            /* Location of SUBREG if WHERE is inside
 -                                  a SUBREG; 0 otherwise.  */
    int what;                   /* which reload this is for */
    enum machine_mode mode;     /* mode it must have */
  };
 --- 158,163 
 *** push_reload (rtx in, rtx out, rtx *inloc
 *** 1496,1502 
        {
          struct replacement *r = replacements[n_replacements++];
          r-what = i;
 -         r-subreg_loc = in_subreg_loc;
          r-where = inloc;
          r-mode = inmode;
        }
 --- 1494,1499 
 *** push_reload (rtx in, rtx out, rtx *inloc
 *** 1505,1511 
          struct replacement *r = replacements[n_replacements++];
          r-what = i;
          r-where = outloc;
 -         r-subreg_loc = out_subreg_loc;
          r-mode = outmode;
        }
      }
 --- 1502,1507 
 *** push_replacement (rtx *loc, int reloadnu
 *** 1634,1640 
        struct replacement *r = replacements[n_replacements++];
        r-what = reloadnum;
        r-where = loc;
 -       r-subreg_loc = 0;
        r-mode = mode;
      }
  }
 --- 1630,1635 
 *** subst_reloads (rtx insn)
 *** 6287,6319 
          if (GET_MODE (reloadreg) != r-mode  r-mode != VOIDmode)
            reloadreg = reload_adjust_reg_for_mode (reloadreg, r-mode);

 !         /* If we are putting this into a SUBREG and RELOADREG is a
 !            SUBREG, we would be making nested SUBREGs, so we have to fix
 !            this up.  Note that r-where == SUBREG_REG (*r-subreg_loc).  */
 !
 !         if (r-subreg_loc != 0  GET_CODE (reloadreg) == SUBREG)
 !           {
 !             if (GET_MODE (*r-subreg_loc)
 !                 == GET_MODE (SUBREG_REG (reloadreg)))
 !               *r-subreg_loc = SUBREG_REG (reloadreg);
 !             else
 !               {
 !                 int final_offset =
 !                   SUBREG_BYTE (*r-subreg_loc) + SUBREG_BYTE (reloadreg);
 !
 !                 /* When working with SUBREGs the rule is that the byte
 !                    offset must be a multiple of the SUBREG's mode.  */
 !                 final_offset = (final_offset /
 !                                 GET_MODE_SIZE (GET_MODE (*r-subreg_loc)));
 !                 final_offset = (final_offset *
 !                                 GET_MODE_SIZE (GET_MODE (*r-subreg_loc)));
 !
 !                 *r-where = SUBREG_REG (reloadreg);
 !                 SUBREG_BYTE (*r-subreg_loc) = final_offset;
 !               }
 !           }
 !         else
 !           *r-where = reloadreg;
        }
        /* If reload got no reg and isn't optional, something's wrong.  */
        else
 --- 6282,6288 
          if (GET_MODE (reloadreg) != r-mode  r-mode != VOIDmode)
            reloadreg = 

Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-28 Thread H.J. Lu
On Tue, Jun 28, 2011 at 7:47 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Jun 28, 2011 at 7:24 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:
  find_replacement never checks subreg:
 
  Breakpoint 3, find_replacement (loc=3D0x7068ab00)
  =A0 =A0at /export/gnu/import/git/gcc-x32/gcc/reload.c:6411
  6411 =A0 =A0 =A0 =A0 =A0if (reloadreg  r-where =3D=3D loc)
  (reg:DI 0 ax)
  (reg/v/f:DI 182 [ b ])
  (gdb) call debug_rtx (*loc)
  (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
  (gdb)
 

 This seems to work.  Does it make any senses?

 Ah, I see.  This was supposed to be handled via the SUBREG_LOC
 member of the replacement struct.  Unfortunately, it turns out
 that this is no longer reliably set these days ...

 At first I was concerned that this might also cause problems at
 the other location where replacements are processed, subst_reloads.
 However, it turns out that code in subst_reloads is dead these days
 anyway, as the reloadreg is *always* a REG, and never a SUBREG.

 Once that code (and similar code in find_replacement that tries
 to handle SUBREG reloadregs) is removed, the only remaining user
 of the SUBREG_LOC field is actually find_replacement.  But here
 we're doing a recursive descent through an RTL anyway, so we
 always know we're replacing inside a SUBREG.

 This makes the whole SUBREG_LOC field obsolete.

 The patch below implements those changes (untested so far).
 Can you verify that this works for you as well?

 Thanks,
 Ulrich


 ChangeLog:

        * reload.c (struct replacement): Remove SUBREG_LOC member.
        (push_reload): Do not set it.
        (push_replacement): Likewise.
        (subst_reload): Remove dead code.
        (copy_replacements): Remove assertion.
        (copy_replacements_1): Do not handle SUBREG_LOC.
        (move_replacements): Likewise.
        (find_replacement): Remove dead code.  Detect subregs via
        recursive descent instead of via SUBREG_LOC.

 Index: gcc/reload.c
 ===
 *** gcc/reload.c        (revision 175580)
 --- gcc/reload.c        (working copy)
 *** static int replace_reloads;
 *** 158,165 
  struct replacement
  {
    rtx *where;                 /* Location to store in */
 -   rtx *subreg_loc;            /* Location of SUBREG if WHERE is inside
 -                                  a SUBREG; 0 otherwise.  */
    int what;                   /* which reload this is for */
    enum machine_mode mode;     /* mode it must have */
  };
 --- 158,163 
 *** push_reload (rtx in, rtx out, rtx *inloc
 *** 1496,1502 
        {
          struct replacement *r = replacements[n_replacements++];
          r-what = i;
 -         r-subreg_loc = in_subreg_loc;
          r-where = inloc;
          r-mode = inmode;
        }
 --- 1494,1499 
 *** push_reload (rtx in, rtx out, rtx *inloc
 *** 1505,1511 
          struct replacement *r = replacements[n_replacements++];
          r-what = i;
          r-where = outloc;
 -         r-subreg_loc = out_subreg_loc;
          r-mode = outmode;
        }
      }
 --- 1502,1507 
 *** push_replacement (rtx *loc, int reloadnu
 *** 1634,1640 
        struct replacement *r = replacements[n_replacements++];
        r-what = reloadnum;
        r-where = loc;
 -       r-subreg_loc = 0;
        r-mode = mode;
      }
  }
 --- 1630,1635 
 *** subst_reloads (rtx insn)
 *** 6287,6319 
          if (GET_MODE (reloadreg) != r-mode  r-mode != VOIDmode)
            reloadreg = reload_adjust_reg_for_mode (reloadreg, r-mode);

 !         /* If we are putting this into a SUBREG and RELOADREG is a
 !            SUBREG, we would be making nested SUBREGs, so we have to fix
 !            this up.  Note that r-where == SUBREG_REG (*r-subreg_loc).  
 */
 !
 !         if (r-subreg_loc != 0  GET_CODE (reloadreg) == SUBREG)
 !           {
 !             if (GET_MODE (*r-subreg_loc)
 !                 == GET_MODE (SUBREG_REG (reloadreg)))
 !               *r-subreg_loc = SUBREG_REG (reloadreg);
 !             else
 !               {
 !                 int final_offset =
 !                   SUBREG_BYTE (*r-subreg_loc) + SUBREG_BYTE (reloadreg);
 !
 !                 /* When working with SUBREGs the rule is that the byte
 !                    offset must be a multiple of the SUBREG's mode.  */
 !                 final_offset = (final_offset /
 !                                 GET_MODE_SIZE (GET_MODE (*r-subreg_loc)));
 !                 final_offset = (final_offset *
 !                                 GET_MODE_SIZE (GET_MODE (*r-subreg_loc)));
 !
 !                 *r-where = SUBREG_REG (reloadreg);
 !                 SUBREG_BYTE (*r-subreg_loc) = final_offset;
 !               }
 !           }
 !         else
 !           *r-where = reloadreg;
        }
        /* If reload got no reg and isn't optional, something's wrong.  */
        else
 --- 6282,6288 
          if (GET_MODE 

Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-28 Thread Ulrich Weigand
H.J. Lu wrote:
  it doesn't work;
 
  allocation.f: In function 'allocation':
  allocation.f:1048:0: internal compiler error: in subreg_get_info, at
  rtlanal.c:3235
  Please submit a full bug report,
  with preprocessed source if appropriate.
  See http://gcc.gnu.org/bugs.html for instructions.

  since subreg_regno_offset only works on hard registers.

Hmm, OK.  That look like another latent bug in the original code ...

 + if (r-mode != VOIDmode  GET_MODE (reloadreg) != r-mode)
 +   reloadreg = gen_rtx_REG (r-mode, REGNO (reloadreg));

(As an aside, this is wrong; it's already wrong in the place where you
copied it from.  This should now use reload_adjust_reg_for_mode just
like subst_reload does.)

 +
 + if ((WORDS_BIG_ENDIAN || BYTES_BIG_ENDIAN)
 +  GET_MODE_SIZE (Pmode)  GET_MODE_SIZE (ptr_mode))
 +   {
 + offset = GET_MODE_SIZE (Pmode) - GET_MODE_SIZE (ptr_mode);
 + if (! BYTES_BIG_ENDIAN)
 +   offset = (offset / UNITS_PER_WORD) * UNITS_PER_WORD;
 + else if (! WORDS_BIG_ENDIAN)
 +   offset %= UNITS_PER_WORD;
 +   }
 +  else
 +offset = 0;
 +
 + return gen_rtx_SUBREG (ptr_mode, reloadreg, offset);
 
 works for me.

This doesn't seem correct either, it completely ignores the SUBREG_BYTE
of the original SUBREG ...   Also, I don't quite see why this should
have anything special for Pmode / ptr_mode.

It seems simplest to just use simplify_gen_subreg here.  Can you try
the following version?

Thanks,
Ulrich


ChangeLog:

* reload.c (struct replacement): Remove SUBREG_LOC member.
(push_reload): Do not set it.
(push_replacement): Likewise.
(subst_reload): Remove dead code.
(copy_replacements): Remove assertion.
(copy_replacements_1): Do not handle SUBREG_LOC.
(move_replacements): Likewise.
(find_replacement): Remove dead code.  Use reload_adjust_reg_for_mode.
Detect subregs via recursive descent instead of via SUBREG_LOC.


Index: gcc/reload.c
===
*** gcc/reload.c(revision 175580)
--- gcc/reload.c(working copy)
*** static int replace_reloads;
*** 158,165 
  struct replacement
  {
rtx *where; /* Location to store in */
-   rtx *subreg_loc;/* Location of SUBREG if WHERE is inside
-  a SUBREG; 0 otherwise.  */
int what;   /* which reload this is for */
enum machine_mode mode; /* mode it must have */
  };
--- 158,163 
*** push_reload (rtx in, rtx out, rtx *inloc
*** 1496,1502 
{
  struct replacement *r = replacements[n_replacements++];
  r-what = i;
- r-subreg_loc = in_subreg_loc;
  r-where = inloc;
  r-mode = inmode;
}
--- 1494,1499 
*** push_reload (rtx in, rtx out, rtx *inloc
*** 1505,1511 
  struct replacement *r = replacements[n_replacements++];
  r-what = i;
  r-where = outloc;
- r-subreg_loc = out_subreg_loc;
  r-mode = outmode;
}
  }
--- 1502,1507 
*** push_replacement (rtx *loc, int reloadnu
*** 1634,1640 
struct replacement *r = replacements[n_replacements++];
r-what = reloadnum;
r-where = loc;
-   r-subreg_loc = 0;
r-mode = mode;
  }
  }
--- 1630,1635 
*** subst_reloads (rtx insn)
*** 6287,6319 
  if (GET_MODE (reloadreg) != r-mode  r-mode != VOIDmode)
reloadreg = reload_adjust_reg_for_mode (reloadreg, r-mode);
  
! /* If we are putting this into a SUBREG and RELOADREG is a
!SUBREG, we would be making nested SUBREGs, so we have to fix
!this up.  Note that r-where == SUBREG_REG (*r-subreg_loc).  */
! 
! if (r-subreg_loc != 0  GET_CODE (reloadreg) == SUBREG)
!   {
! if (GET_MODE (*r-subreg_loc)
! == GET_MODE (SUBREG_REG (reloadreg)))
!   *r-subreg_loc = SUBREG_REG (reloadreg);
! else
!   {
! int final_offset =
!   SUBREG_BYTE (*r-subreg_loc) + SUBREG_BYTE (reloadreg);
! 
! /* When working with SUBREGs the rule is that the byte
!offset must be a multiple of the SUBREG's mode.  */
! final_offset = (final_offset /
! GET_MODE_SIZE (GET_MODE (*r-subreg_loc)));
! final_offset = (final_offset *
! GET_MODE_SIZE (GET_MODE (*r-subreg_loc)));
! 
! *r-where = SUBREG_REG (reloadreg);
! SUBREG_BYTE (*r-subreg_loc) = final_offset;
!   }
!   }
! else
!   *r-where = reloadreg;
}
/* If reload got no reg and isn't optional, 

Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-28 Thread H.J. Lu
On Tue, Jun 28, 2011 at 8:19 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:
  it doesn't work;
 
  allocation.f: In function 'allocation':
  allocation.f:1048:0: internal compiler error: in subreg_get_info, at
  rtlanal.c:3235
  Please submit a full bug report,
  with preprocessed source if appropriate.
  See http://gcc.gnu.org/bugs.html for instructions.

  since subreg_regno_offset only works on hard registers.

 Hmm, OK.  That look like another latent bug in the original code ...

 +         if (r-mode != VOIDmode  GET_MODE (reloadreg) != r-mode)
 +           reloadreg = gen_rtx_REG (r-mode, REGNO (reloadreg));

 (As an aside, this is wrong; it's already wrong in the place where you
 copied it from.  This should now use reload_adjust_reg_for_mode just
 like subst_reload does.)

 +
 +         if ((WORDS_BIG_ENDIAN || BYTES_BIG_ENDIAN)
 +              GET_MODE_SIZE (Pmode)  GET_MODE_SIZE (ptr_mode))
 +           {
 +             offset = GET_MODE_SIZE (Pmode) - GET_MODE_SIZE (ptr_mode);
 +             if (! BYTES_BIG_ENDIAN)
 +               offset = (offset / UNITS_PER_WORD) * UNITS_PER_WORD;
 +             else if (! WORDS_BIG_ENDIAN)
 +               offset %= UNITS_PER_WORD;
 +           }
 +          else
 +            offset = 0;
 +
 +         return gen_rtx_SUBREG (ptr_mode, reloadreg, offset);

 works for me.

 This doesn't seem correct either, it completely ignores the SUBREG_BYTE
 of the original SUBREG ...   Also, I don't quite see why this should
 have anything special for Pmode / ptr_mode.

 It seems simplest to just use simplify_gen_subreg here.  Can you try
 the following version?

 Thanks,
 Ulrich


 ChangeLog:

        * reload.c (struct replacement): Remove SUBREG_LOC member.
        (push_reload): Do not set it.
        (push_replacement): Likewise.
        (subst_reload): Remove dead code.
        (copy_replacements): Remove assertion.
        (copy_replacements_1): Do not handle SUBREG_LOC.
        (move_replacements): Likewise.
        (find_replacement): Remove dead code.  Use reload_adjust_reg_for_mode.
        Detect subregs via recursive descent instead of via SUBREG_LOC.



It works much better.  I am testing it now.

Thanks.

-- 
H.J.


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread Ulrich Weigand
H.J. Lu wrote:
 On Mon, Jun 27, 2011 at 7:47 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
  The actual problem
  here is that this part of gen_reload takes the approach to transform
 
   out - op0 + op1
 
  into
 
   out - op0
   out - out + op1
 
  which is invalid if writing to out clobbers op1.

 The problem is reload 0 puts OP1 in OUT. Adding
 
 gcc_assert (!reg_overlap_mentioned_p (out, op1));
 
 doesn't help in reload 2.  How can I check if OP1 overlaps with
 OUT in previous reload?

Sorry, I don't understand how previous reloads come into play here.
gen_reload is supposed to load in (which happens to be of the
form op0 + op1) into out, which means it is of course supposed
to clobber out (as long as that doesn't implictly clobber op0
or op1 before they're used).  Any conflicts with other reloads ought
to have been resolved earlier.

Can you elaborate?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread H.J. Lu
On Mon, Jun 27, 2011 at 11:28 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:
 On Mon, Jun 27, 2011 at 7:47 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
  The actual problem
  here is that this part of gen_reload takes the approach to transform
 
   out - op0 + op1
 
  into
 
   out - op0
   out - out + op1
 
  which is invalid if writing to out clobbers op1.

 The problem is reload 0 puts OP1 in OUT. Adding

 gcc_assert (!reg_overlap_mentioned_p (out, op1));

 doesn't help in reload 2.  How can I check if OP1 overlaps with
 OUT in previous reload?

 Sorry, I don't understand how previous reloads come into play here.
 gen_reload is supposed to load in (which happens to be of the
 form op0 + op1) into out, which means it is of course supposed
 to clobber out (as long as that doesn't implictly clobber op0
 or op1 before they're used).  Any conflicts with other reloads ought
 to have been resolved earlier.

 Can you elaborate?


Here is the output from reload:

Reloads for insn # 588
Reload 0: reload_in (DI) = (reg/v/f:DI 182 [ b ])
GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0)
reload_in_reg: (reg/v/f:DI 182 [ b ])
reload_reg_rtx: (reg:DI 1 dx)
Reload 1: reload_in (DI) = (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182
[ b ]) 0)
(const_int 8 [0x8])))
GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0)
reload_in_reg: (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 [ b
]) 0)
(const_int 8 [0x8])))
reload_reg_rtx: (reg:DI 1 dx)
Reload 2: reload_out (DF) = (mem:DF (zero_extend:DI (plus:SI (subreg:SI
(reg/v/f:DI 182 [ b ]) 0)
(const_int 8
[0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
A64])
NO_REGS, RELOAD_FOR_OUTPUT (opnum = 0), optional
reload_out_reg: (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI
182 [ b ]) 0)
(const_int 8
[0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
A64])

leads to

(insn 1017 587 1020 34 (set (reg:DI 1 dx)
(mem/c:DI (plus:DI (reg/f:DI 7 sp)
(const_int 112 [0x70])) [5 %sfp+-208 S8 A64])) spooles.c:291 62
{*movdi_internal_rex64}
 (nil))

(insn 1020 1017 1022 34 (set (reg:SI 1 dx)
(const_int 8 [0x8])) spooles.c:291 64 {*movsi_internal}
 (nil))

(insn 1022 1020 1023 34 (set (reg:SI 1 dx)
(reg:SI 1 dx)) spooles.c:291 64 {*movsi_internal}
 (nil))

(insn 1023 1022 1024 34 (set (reg:SI 1 dx)
(plus:SI (reg:SI 1 dx)
(const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
 (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
(const_int 8 [0x8]))
(nil)))

(insn 1024 1023 588 34 (set (reg:DI 1 dx)
(zero_extend:DI (reg:SI 1 dx))) spooles.c:291 112
{*zero_extendsidi2_rex64}
 (expr_list:REG_EQUIV (zero_extend:DI (plus:SI (subreg:SI (reg:DI 1 dx) 0)
(const_int 8 [0x8])))
(nil)))

(insn 588 1024 589 34 (set (mem:DF (reg:DI 1 dx) [4 MEM[base: b_96(D), index:
D.15020_278, step: 8, offset: 0B]+0 S8 A64])
(reg:DF 0 ax [orig:340 D.14980 ] [340])) spooles.c:291 106
{*movdf_internal_rex64}
 (nil))

Reload 0 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for input.
However, reload 2
puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for output.without checking what
reload 0 did.

-- 
H.J.


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread Ulrich Weigand
H.J. Lu wrote:

 Reloads for insn # 588
 Reload 0: reload_in (DI) =3D (reg/v/f:DI 182 [ b ])
 GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum =3D 0)
 reload_in_reg: (reg/v/f:DI 182 [ b ])
 reload_reg_rtx: (reg:DI 1 dx)
 Reload 1: reload_in (DI) =3D (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:D=
 I 182
 [ b ]) 0)
 (const_int 8 [0x8])=
 ))
 GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum =3D 0)
 reload_in_reg: (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 =
 [ b
 ]) 0)
 (const_int 8 [0x8])=
 ))
 reload_reg_rtx: (reg:DI 1 dx)
 Reload 2: reload_out (DF) =3D (mem:DF (zero_extend:DI (plus:SI (subreg:SI
 (reg/v/f:DI 182 [ b ]) 0)
 (const_int 8
 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
 A64])
 NO_REGS, RELOAD_FOR_OUTPUT (opnum =3D 0), optional
 reload_out_reg: (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/=
 f:DI
 182 [ b ]) 0)
 (const_int 8
 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
 A64])
 
 leads to
 

 (insn 1017 587 1020 34 (set (reg:DI 1 dx)
 (mem/c:DI (plus:DI (reg/f:DI 7 sp)
 (const_int 112 [0x70])) [5 %sfp+-208 S8 A64])) spooles.c:29=
 1 62
 {*movdi_internal_rex64}
  (nil))

So this is the reload insn generated from reload 0.  So far so good.

 (insn 1020 1017 1022 34 (set (reg:SI 1 dx)
 (const_int 8 [0x8])) spooles.c:291 64 {*movsi_internal}
  (nil))
 
 (insn 1022 1020 1023 34 (set (reg:SI 1 dx)
 (reg:SI 1 dx)) spooles.c:291 64 {*movsi_internal}
  (nil))
 
 (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
 (plus:SI (reg:SI 1 dx)
 (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
  (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
 (const_int 8 [0x8]))
 (nil)))
 
 (insn 1024 1023 588 34 (set (reg:DI 1 dx)
 (zero_extend:DI (reg:SI 1 dx))) spooles.c:291 112
 {*zero_extendsidi2_rex64}
  (expr_list:REG_EQUIV (zero_extend:DI (plus:SI (subreg:SI (reg:DI 1 dx)=
  0)
 (const_int 8 [0x8])))
 (nil)))

All these reload insns are generated from reload 1.

 (insn 588 1024 589 34 (set (mem:DF (reg:DI 1 dx) [4 MEM[base: b_96(D), inde=
 x:
 D.15020_278, step: 8, offset: 0B]+0 S8 A64])
 (reg:DF 0 ax [orig:340 D.14980 ] [340])) spooles.c:291 106
 {*movdf_internal_rex64}
  (nil))

This is the original reloaded insn.

 Reload 0 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for input.
 However, reload 2
 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for output.without checking w=
 hat
 reload 0 did.

Reload 2 is an optional reload which reload chose not to utilize, so this
is not really relevant here in any case.  There is no output reload.

The wrong code above originates from how reload 1 is handled:

gen_reload is called to load the ZERO_EXTEND into (reg:DI 1).  This triggers
the unary predicate path, which recurses into gen_reload to load the operand
of the ZERO_EXTEND (reg:SI 1), and subsequently generates insn 1024.

The recursive call loads (plus:SI (subreg:SI (reg:DI 1)) (const_int 8)) into
(reg:SI 1).  It attempts to do that in a single SET and fails (for some
reason).  It then attempts to load the constant (const_int 8) into the
destination register (insn 1020) [** which is broken **], and re-tries.
This still fails, so it falls through to the last attempt, which is to
instead copy the subreg to the destination (which results in insn 1022
as the subreg is optimized away at this point), followed by adding the
constant.

Note that the point marked with [** which is broken **] is the place
I pointed out in the previous mail.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread H.J. Lu
On Mon, Jun 27, 2011 at 11:59 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:

 Reloads for insn # 588
 Reload 0: reload_in (DI) =3D (reg/v/f:DI 182 [ b ])
         GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum =3D 0)
         reload_in_reg: (reg/v/f:DI 182 [ b ])
         reload_reg_rtx: (reg:DI 1 dx)
 Reload 1: reload_in (DI) =3D (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:D=
 I 182
 [ b ]) 0)
                                                         (const_int 8 [0x8])=
 ))
         GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum =3D 0)
         reload_in_reg: (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 =
 [ b
 ]) 0)
                                                         (const_int 8 [0x8])=
 ))
         reload_reg_rtx: (reg:DI 1 dx)
 Reload 2: reload_out (DF) =3D (mem:DF (zero_extend:DI (plus:SI (subreg:SI
 (reg/v/f:DI 182 [ b ]) 0)
                                                             (const_int 8
 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
 A64])
         NO_REGS, RELOAD_FOR_OUTPUT (opnum =3D 0), optional
         reload_out_reg: (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/=
 f:DI
 182 [ b ]) 0)
                                                             (const_int 8
 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
 A64])

 leads to


 (insn 1017 587 1020 34 (set (reg:DI 1 dx)
         (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                 (const_int 112 [0x70])) [5 %sfp+-208 S8 A64])) spooles.c:29=
 1 62
 {*movdi_internal_rex64}
      (nil))

 So this is the reload insn generated from reload 0.  So far so good.

 (insn 1020 1017 1022 34 (set (reg:SI 1 dx)
         (const_int 8 [0x8])) spooles.c:291 64 {*movsi_internal}
      (nil))

 (insn 1022 1020 1023 34 (set (reg:SI 1 dx)
         (reg:SI 1 dx)) spooles.c:291 64 {*movsi_internal}
      (nil))

 (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
         (plus:SI (reg:SI 1 dx)
             (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
      (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
             (const_int 8 [0x8]))
         (nil)))

 (insn 1024 1023 588 34 (set (reg:DI 1 dx)
         (zero_extend:DI (reg:SI 1 dx))) spooles.c:291 112
 {*zero_extendsidi2_rex64}
      (expr_list:REG_EQUIV (zero_extend:DI (plus:SI (subreg:SI (reg:DI 1 dx)=
  0)
                 (const_int 8 [0x8])))
         (nil)))

 All these reload insns are generated from reload 1.

 (insn 588 1024 589 34 (set (mem:DF (reg:DI 1 dx) [4 MEM[base: b_96(D), inde=
 x:
 D.15020_278, step: 8, offset: 0B]+0 S8 A64])
         (reg:DF 0 ax [orig:340 D.14980 ] [340])) spooles.c:291 106
 {*movdf_internal_rex64}
      (nil))

 This is the original reloaded insn.

 Reload 0 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for input.
 However, reload 2
 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for output.without checking w=
 hat
 reload 0 did.

 Reload 2 is an optional reload which reload chose not to utilize, so this
 is not really relevant here in any case.  There is no output reload.

 The wrong code above originates from how reload 1 is handled:

 gen_reload is called to load the ZERO_EXTEND into (reg:DI 1).  This triggers
 the unary predicate path, which recurses into gen_reload to load the operand
 of the ZERO_EXTEND (reg:SI 1), and subsequently generates insn 1024.

 The recursive call loads (plus:SI (subreg:SI (reg:DI 1)) (const_int 8)) into
 (reg:SI 1).  It attempts to do that in a single SET and fails (for some
 reason).  It then attempts to load the constant (const_int 8) into the
 destination register (insn 1020) [** which is broken **], and re-tries.
 This still fails, so it falls through to the last attempt, which is to
 instead copy the subreg to the destination (which results in insn 1022
 as the subreg is optimized away at this point), followed by adding the
 constant.

 Note that the point marked with [** which is broken **] is the place
 I pointed out in the previous mail.


reload generates:

(insn 914 912 0 (set (reg:SI 0 ax)
(plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
(const_int 8 [0x8]))) 248 {*lea_1_x32}
 (nil))

from

insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));

Since (reg/v/f:DI 182 [ b ]) is a pseudo register, it is
rejected by

  if ((strict  ! REG_OK_FOR_BASE_STRICT_P (reg))
  || (! strict  ! REG_OK_FOR_BASE_NONSTRICT_P (reg)))
/* Base is not valid.  */
return false;

in ix86_legitimate_address_p.

-- 
H.J.


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread H.J. Lu
On Mon, Jun 27, 2011 at 1:42 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jun 27, 2011 at 11:59 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:

 Reloads for insn # 588
 Reload 0: reload_in (DI) =3D (reg/v/f:DI 182 [ b ])
         GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum =3D 0)
         reload_in_reg: (reg/v/f:DI 182 [ b ])
         reload_reg_rtx: (reg:DI 1 dx)
 Reload 1: reload_in (DI) =3D (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:D=
 I 182
 [ b ]) 0)
                                                         (const_int 8 [0x8])=
 ))
         GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum =3D 0)
         reload_in_reg: (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 =
 [ b
 ]) 0)
                                                         (const_int 8 [0x8])=
 ))
         reload_reg_rtx: (reg:DI 1 dx)
 Reload 2: reload_out (DF) =3D (mem:DF (zero_extend:DI (plus:SI (subreg:SI
 (reg/v/f:DI 182 [ b ]) 0)
                                                             (const_int 8
 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
 A64])
         NO_REGS, RELOAD_FOR_OUTPUT (opnum =3D 0), optional
         reload_out_reg: (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/=
 f:DI
 182 [ b ]) 0)
                                                             (const_int 8
 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
 A64])

 leads to


 (insn 1017 587 1020 34 (set (reg:DI 1 dx)
         (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                 (const_int 112 [0x70])) [5 %sfp+-208 S8 A64])) spooles.c:29=
 1 62
 {*movdi_internal_rex64}
      (nil))

 So this is the reload insn generated from reload 0.  So far so good.

 (insn 1020 1017 1022 34 (set (reg:SI 1 dx)
         (const_int 8 [0x8])) spooles.c:291 64 {*movsi_internal}
      (nil))

 (insn 1022 1020 1023 34 (set (reg:SI 1 dx)
         (reg:SI 1 dx)) spooles.c:291 64 {*movsi_internal}
      (nil))

 (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
         (plus:SI (reg:SI 1 dx)
             (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
      (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
             (const_int 8 [0x8]))
         (nil)))

 (insn 1024 1023 588 34 (set (reg:DI 1 dx)
         (zero_extend:DI (reg:SI 1 dx))) spooles.c:291 112
 {*zero_extendsidi2_rex64}
      (expr_list:REG_EQUIV (zero_extend:DI (plus:SI (subreg:SI (reg:DI 1 dx)=
  0)
                 (const_int 8 [0x8])))
         (nil)))

 All these reload insns are generated from reload 1.

 (insn 588 1024 589 34 (set (mem:DF (reg:DI 1 dx) [4 MEM[base: b_96(D), inde=
 x:
 D.15020_278, step: 8, offset: 0B]+0 S8 A64])
         (reg:DF 0 ax [orig:340 D.14980 ] [340])) spooles.c:291 106
 {*movdf_internal_rex64}
      (nil))

 This is the original reloaded insn.

 Reload 0 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for input.
 However, reload 2
 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for output.without checking w=
 hat
 reload 0 did.

 Reload 2 is an optional reload which reload chose not to utilize, so this
 is not really relevant here in any case.  There is no output reload.

 The wrong code above originates from how reload 1 is handled:

 gen_reload is called to load the ZERO_EXTEND into (reg:DI 1).  This triggers
 the unary predicate path, which recurses into gen_reload to load the 
 operand
 of the ZERO_EXTEND (reg:SI 1), and subsequently generates insn 1024.

 The recursive call loads (plus:SI (subreg:SI (reg:DI 1)) (const_int 8)) into
 (reg:SI 1).  It attempts to do that in a single SET and fails (for some
 reason).  It then attempts to load the constant (const_int 8) into the
 destination register (insn 1020) [** which is broken **], and re-tries.
 This still fails, so it falls through to the last attempt, which is to
 instead copy the subreg to the destination (which results in insn 1022
 as the subreg is optimized away at this point), followed by adding the
 constant.

 Note that the point marked with [** which is broken **] is the place
 I pointed out in the previous mail.


 reload generates:

 (insn 914 912 0 (set (reg:SI 0 ax)
        (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
            (const_int 8 [0x8]))) 248 {*lea_1_x32}
     (nil))

 from

 insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));

 Since (reg/v/f:DI 182 [ b ]) is a pseudo register, it is
 rejected by

      if ((strict  ! REG_OK_FOR_BASE_STRICT_P (reg))
          || (! strict  ! REG_OK_FOR_BASE_NONSTRICT_P (reg)))
        /* Base is not valid.  */
        return false;

 in ix86_legitimate_address_p.


Even if I added

+;; Use by reload
+(define_insn *lea_0_x32
+  [(set (match_operand:SI 0 register_operand =r)
+  (plus:SI
+(match_operand:SI 1 register_operand r)
+(match_operand:SI 2 immediate_operand Yl)))]
+  TARGET_X32
+  lea{l}\t{%a1, %0|%0, %a1}
+  [(set_attr type lea)
+   (set_attr mode SI)])

to generate

(insn 914 912 0 (set (reg:SI 0 ax)
   (plus:SI (subreg:SI (reg/v/f:DI 182 

Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread Ulrich Weigand
H.J. Lu wrote:

 reload generates:
 
 (insn 914 912 0 (set (reg:SI 0 ax)
 (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
 (const_int 8 [0x8]))) 248 {*lea_1_x32}
  (nil))
 
 from
 
 insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));

Interesting.  The pseudo should have been replaced by the
hard register (reg:DI 1) during the preceding call to
  op0 = find_replacement (XEXP (in, 0));
(since reload 0 should have pushed a replacement record.)

Interestingly enough, in the final output that replacement *is*
performed in the REG_EQUIV note:

(insn 1023 1022 1024 34 (set (reg:SI 1 dx)
(plus:SI (reg:SI 1 dx)
(const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
 (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
(const_int 8 [0x8]))
(nil)))

which is why I hadn't expected this to be a problem here.

Can you try to find out why the find_replacement doesn't work
with your test case?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread H.J. Lu
On Mon, Jun 27, 2011 at 3:08 PM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:

 reload generates:

 (insn 914 912 0 (set (reg:SI 0 ax)
         (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
             (const_int 8 [0x8]))) 248 {*lea_1_x32}
      (nil))

 from

 insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));

 Interesting.  The pseudo should have been replaced by the
 hard register (reg:DI 1) during the preceding call to
      op0 = find_replacement (XEXP (in, 0));
 (since reload 0 should have pushed a replacement record.)

 Interestingly enough, in the final output that replacement *is*
 performed in the REG_EQUIV note:

 (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
        (plus:SI (reg:SI 1 dx)
            (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
     (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
            (const_int 8 [0x8]))
        (nil)))

 which is why I hadn't expected this to be a problem here.

 Can you try to find out why the find_replacement doesn't work
 with your test case?


I will investigate.  Could (reg:SI 1 dx) vs  (subreg:SI (reg:DI 1 dx) 0)
a problem?


-- 
H.J.


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread H.J. Lu
On Mon, Jun 27, 2011 at 3:19 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jun 27, 2011 at 3:08 PM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:

 reload generates:

 (insn 914 912 0 (set (reg:SI 0 ax)
         (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
             (const_int 8 [0x8]))) 248 {*lea_1_x32}
      (nil))

 from

 insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));

 Interesting.  The pseudo should have been replaced by the
 hard register (reg:DI 1) during the preceding call to
      op0 = find_replacement (XEXP (in, 0));
 (since reload 0 should have pushed a replacement record.)

 Interestingly enough, in the final output that replacement *is*
 performed in the REG_EQUIV note:

 (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
        (plus:SI (reg:SI 1 dx)
            (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
     (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
            (const_int 8 [0x8]))
        (nil)))

 which is why I hadn't expected this to be a problem here.

 Can you try to find out why the find_replacement doesn't work
 with your test case?


 I will investigate.  Could (reg:SI 1 dx) vs  (subreg:SI (reg:DI 1 dx) 0)
 a problem?


find_replacement never checks subreg:

Breakpoint 3, find_replacement (loc=0x7068ab00)
at /export/gnu/import/git/gcc-x32/gcc/reload.c:6411
6411  if (reloadreg  r-where == loc)
(reg:DI 0 ax)
(reg/v/f:DI 182 [ b ])
(gdb) call debug_rtx (*loc)
(subreg:SI (reg/v/f:DI 182 [ b ]) 0)
(gdb)


-- 
H.J.


Re: PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const

2011-06-27 Thread H.J. Lu
On Mon, Jun 27, 2011 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jun 27, 2011 at 3:19 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jun 27, 2011 at 3:08 PM, Ulrich Weigand uweig...@de.ibm.com wrote:
 H.J. Lu wrote:

 reload generates:

 (insn 914 912 0 (set (reg:SI 0 ax)
         (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
             (const_int 8 [0x8]))) 248 {*lea_1_x32}
      (nil))

 from

 insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));

 Interesting.  The pseudo should have been replaced by the
 hard register (reg:DI 1) during the preceding call to
      op0 = find_replacement (XEXP (in, 0));
 (since reload 0 should have pushed a replacement record.)

 Interestingly enough, in the final output that replacement *is*
 performed in the REG_EQUIV note:

 (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
        (plus:SI (reg:SI 1 dx)
            (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
     (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
            (const_int 8 [0x8]))
        (nil)))

 which is why I hadn't expected this to be a problem here.

 Can you try to find out why the find_replacement doesn't work
 with your test case?


 I will investigate.  Could (reg:SI 1 dx) vs  (subreg:SI (reg:DI 1 dx) 0)
 a problem?


 find_replacement never checks subreg:

 Breakpoint 3, find_replacement (loc=0x7068ab00)
    at /export/gnu/import/git/gcc-x32/gcc/reload.c:6411
 6411          if (reloadreg  r-where == loc)
 (reg:DI 0 ax)
 (reg/v/f:DI 182 [ b ])
 (gdb) call debug_rtx (*loc)
 (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
 (gdb)


This seems to work.  Does it make any senses?

Thanks.


-- 
H.J.
---
diff --git a/gcc/reload.c b/gcc/reload.c
index 3ad46b9..bdc47d3 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -6415,6 +6415,26 @@ find_replacement (rtx *loc)

  return reloadreg;
}
+  else if (reloadreg
+   REG_P (reloadreg)
+   !r-subreg_loc
+   GET_CODE (*loc) == SUBREG
+   r-where == SUBREG_REG (*loc))
+   {
+ int offset;
+
+ if (r-mode != VOIDmode  GET_MODE (reloadreg) != r-mode)
+   reloadreg = gen_rtx_REG (r-mode, REGNO (reloadreg));
+
+ offset = (GET_MODE_SIZE (GET_MODE (reloadreg))
+   - GET_MODE_SIZE (GET_MODE (*loc)));
+
+ if (! BYTES_BIG_ENDIAN)
+   offset = (offset / UNITS_PER_WORD) * UNITS_PER_WORD;
+ else if (! WORDS_BIG_ENDIAN)
+   offset %= UNITS_PER_WORD;
+ return gen_rtx_SUBREG (GET_MODE (*loc), reloadreg, offset);
+   }
   else if (reloadreg  r-subreg_loc == loc)
{
  /* RELOADREG must be either a REG or a SUBREG.