On Fri, Aug 13, 2021 at 9:21 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Fri, Aug 13, 2021 at 2:48 AM Hongyu Wang <hongyu.w...@intel.com> wrote:
> >
> > Hi,
> >
> > For lea + zero_extendsidi insns, if the dest of the lea and the src of
> > the zext are the same, combine them into a single leal on 64-bit
> > targets, since writing a 32-bit register zero-extends it automatically.
> >
> > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > Ok for master?
> >
> > gcc/ChangeLog:
> >
> >         PR target/101716
> >         * config/i386/i386.md (*lea<mode>_zext): New define_insn.
> >         (define_peephole2): New peephole2 to combine zero_extend
> >         with lea.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR target/101716
> >         * gcc.target/i386/pr101716.c: New test.
>
> This form should be covered by ix86_decompose_address via
> address_no_seg_operand predicate. Combine creates:
>
> Trying 6 -> 7:
>    6: {r86:DI=r87:DI<<0x1;clobber flags:CC;}
>      REG_DEAD r87:DI
>      REG_UNUSED flags:CC
>    7: r85:DI=zero_extend(r86:DI#0)
>      REG_DEAD r86:DI
> Failed to match this instruction:
> (set (reg:DI 85)
>    (and:DI (ashift:DI (reg:DI 87)
>            (const_int 1 [0x1]))
>        (const_int 4294967294 [0xfffffffe])))
>
> which does not fit:
>
>       else if (GET_CODE (addr) == AND
>            && const_32bit_mask (XEXP (addr, 1), DImode))
>
> After reload, we lose SUBREG, so REE does not trigger on:
>
> (insn 17 3 7 2 (set (reg:DI 0 ax [86])
>        (mult:DI (reg:DI 5 di [87])
>            (const_int 2 [0x2]))) "pr101716.c":4:13 204 {*leadi}
>     (nil))
> (insn 7 17 13 2 (set (reg:DI 0 ax [85])
>        (zero_extend:DI (reg:SI 0 ax [86]))) "pr101716.c":4:19 136 {*zero_extendsidi2}
>     (nil))
>
> So the question is whether the combine pass really needs to zero-extend
> with 0xfffffffe. The left shift << 1 already guarantees a zero in the
> LSB, so 0xffffffff would be better and in line with the canonical
> zero-extension RTX.
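
To illustrate the point about the two masks, here is a minimal sketch in
plain C. It is not GCC code, and the function names are made up for this
example. It only shows that, once the value has been shifted left by one,
masking with 0xfffffffe and masking with 0xffffffff give the same result,
because bit 0 is already zero.

```c
#include <assert.h>
#include <stdint.h>

/* The form combine produced: the shifted value masked with 0xfffffffe.  */
uint64_t
shl1_mask_fffffffe (uint64_t x)
{
  return (x << 1) & UINT64_C (0xfffffffe);
}

/* The canonical zero-extension form: the same shift masked with
   0xffffffff.  */
uint64_t
shl1_mask_ffffffff (uint64_t x)
{
  return (x << 1) & UINT64_C (0xffffffff);
}

/* Return 1 if both forms agree for X, 0 otherwise.  */
int
masks_agree (uint64_t x)
{
  return shl1_mask_fffffffe (x) == shl1_mask_ffffffff (x);
}

/* Check a few illustrative inputs, including ones with bit 0 and
   bit 31 set before the shift.  */
void
check_mask_samples (void)
{
  assert (masks_agree (UINT64_C (0)));
  assert (masks_agree (UINT64_C (1)));
  assert (masks_agree (UINT64_C (0x7fffffff)));
  assert (masks_agree (UINT64_C (0x80000001)));
  assert (masks_agree (UINT64_C (0xffffffffffffffff)));
}
```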

Also, ix86_decompose_address accepts an ASHIFT RTX when the ASHIFT is
embedded in a PLUS chain, but a naked ASHIFT is rejected (cf. the
call in ix86_legitimate_address_p) for some (historic?) reason. It
looks to me that this restriction is not necessary, since
ix86_legitimize_address can canonicalize ASHIFT RTXes without
problems. The attached patch, which survives bootstrap and regtest,
may help in your case.
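
For readers unfamiliar with the ASHIFT handling, the shift-to-scale step
can be modelled roughly as below. This is a toy sketch, not the real
ix86_decompose_address: struct toy_address and toy_decompose_shift are
hypothetical names, and only the mapping from shift count to scale factor
is modelled (a count of 0..3 becomes a scale of 1, 2, 4 or 8, anything
larger fails).

```c
#include <assert.h>

/* Hypothetical stand-in for the scale field of struct ix86_address.  */
struct toy_address
{
  long scale;
};

/* Toy model: map a left-shift count to an address scale factor.
   Return 0 if the count cannot be expressed as a scale, 1 otherwise.
   OUT is only written on success.  */
int
toy_decompose_shift (unsigned long shift_count, struct toy_address *out)
{
  if (shift_count > 3)
    return 0;
  out->scale = 1L << shift_count;
  return 1;
}

/* Exercise the model on a valid and an invalid shift count.  */
void
toy_selfcheck (void)
{
  struct toy_address a = { 0 };

  assert (toy_decompose_shift (1, &a) == 1);
  assert (a.scale == 2);
  assert (toy_decompose_shift (4, &a) == 0);
}
```

With the patch, a successful decomposition of a naked ASHIFT returns 1
like any other address form, rather than the special value -1.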

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 4d4ab6a03d6..9395716dd60 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10018,8 +10018,7 @@ ix86_live_on_entry (bitmap regs)
 
 /* Extract the parts of an RTL expression that is a valid memory address
    for an instruction.  Return 0 if the structure of the address is
-   grossly off.  Return -1 if the address contains ASHIFT, so it is not
-   strictly valid, but still used for computing length of lea instruction.  */
+   grossly off.  */
 
 int
 ix86_decompose_address (rtx addr, struct ix86_address *out)
@@ -10029,7 +10028,6 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)
   HOST_WIDE_INT scale = 1;
   rtx scale_rtx = NULL_RTX;
   rtx tmp;
-  int retval = 1;
   addr_space_t seg = ADDR_SPACE_GENERIC;
 
   /* Allow zero-extended SImode addresses,
@@ -10179,7 +10177,6 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)
       if ((unsigned HOST_WIDE_INT) scale > 3)
        return 0;
       scale = 1 << scale;
-      retval = -1;
     }
   else
     disp = addr;                       /* displacement */
@@ -10252,7 +10249,7 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)
   out->scale = scale;
   out->seg = seg;
 
-  return retval;
+  return 1;
 }
 
 /* Return cost of the memory address x.
@@ -10765,7 +10762,7 @@ ix86_legitimate_address_p (machine_mode, rtx addr, bool strict)
   HOST_WIDE_INT scale;
   addr_space_t seg;
 
-  if (ix86_decompose_address (addr, &parts) <= 0)
+  if (ix86_decompose_address (addr, &parts) == 0)
     /* Decomposition failed.  */
     return false;
 
