On Tue, Nov 17, 2015 at 05:21:01PM +0800, Bin Cheng wrote:
> Hi,
> GIMPLE IVO needs to call backend interface to calculate costs for addr
> expressions like below:
>    FORM1: "r73 + r74 + 16380"
>    FORM2: "r73 << 2 + r74 + 16380"
> 
> They are invalid address expressions on AArch64, so they will be legitimized
> by aarch64_legitimize_address.  Below is what we got from that function:
> 
> For FORM1, the address expression is legitimized into below insn sequence
> and rtx:
>    r84:DI=r73:DI+r74:DI
>    r85:DI=r84:DI+0x3000
>    r83:DI=r85:DI
>    "r83 + 4092"
> 
> For FORM2, the address expression is legitimized into below insn sequence
> and rtx:
>    r108:DI=r73:DI<<0x2
>    r109:DI=r108:DI+r74:DI
>    r110:DI=r109:DI+0x3000
>    r107:DI=r110:DI
>    "r107 + 4092"
> 
> So the costs computed are 12/16 respectively.  The high cost prevents IVO
> from choosing right candidates.  Besides cost computation, I also think the
> legitimization is bad in terms of code generation.
> The root cause in aarch64_legitimize_address can be described by its
> comment:
>    /* Try to split X+CONST into Y=X+(CONST & ~mask), Y+(CONST&mask),
>       where mask is selected by alignment and size of the offset.
>       We try to pick as large a range for the offset as possible to
>       maximize the chance of a CSE.  However, for aligned addresses
>       we limit the range to 4k so that structures with different sized
>       elements are likely to use the same base.  */
> I think the split of CONST is intended for REG+CONST where the const offset
> is not in the range of AArch64's addressing modes.  Unfortunately, it
> doesn't explicitly handle/reject "REG+REG+CONST" and "REG+REG<<SCALE+CONST"
> when the CONST is in the range of addressing modes.  As a result, these two
> cases fall through to this logic, resulting in sub-optimal results.
> 
> It's obvious we can do the legitimization below instead:
> FORM1:
>    r83:DI=r73:DI+r74:DI
>    "r83 + 16380"
> FORM2:
>    r107:DI=0x3ffc
>    r106:DI=r74:DI+r107:DI
>       REG_EQUAL r74:DI+0x3ffc
>    "r106 + r73 << 2"
> 
> This patch handles these two cases as described.
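
For anyone reading along, the arithmetic behind the quoted sequences can be
checked with a small standalone sketch. This is not GCC code: split_offset
and fits_scaled_uimm12 are hypothetical names, the 0xfff mask is assumed from
the 0x3000/4092 split shown above, and the 4-byte access size is an
assumption suggested by the "<< 2" scale in FORM2.

```cpp
#include <cstdint>

// Hypothetical helper, not a GCC function: the two halves produced by
// splitting CONST into (CONST & ~mask) and (CONST & mask).
struct SplitOffset
{
  int64_t high;  // folded into the base register: Y = X + high
  int64_t low;   // left in the address: Y + low
};

constexpr SplitOffset
split_offset (int64_t offset, int64_t mask)
{
  return { offset & ~mask, offset & mask };
}

// Hypothetical helper: true if OFFSET fits the AArch64 unsigned scaled
// 12-bit immediate form for an access of SIZE bytes (assumed 4 below).
constexpr bool
fits_scaled_uimm12 (int64_t offset, int64_t size)
{
  return offset >= 0 && offset % size == 0 && offset / size <= 4095;
}

// 16380 == 0x3ffc splits into 0x3000 + 4092 under a 0xfff mask, which
// matches the "+0x3000" insn and the "+ 4092" address quoted above.
static_assert (split_offset (16380, 0xfff).high == 0x3000, "high part");
static_assert (split_offset (16380, 0xfff).low == 4092, "low part");

// Under the 4-byte assumption, 16380 is already a valid immediate offset,
// so the split is unnecessary once the two registers have been added.
static_assert (fits_scaled_uimm12 (16380, 4), "16380 is in range");
static_assert (!fits_scaled_uimm12 (16384, 4), "16384 is out of range");
```

Under those assumptions, FORM1 needs only the single add shown in the
proposed legitimization, rather than the add, add-0x3000, and move sequence
currently generated.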

Thanks for the description, it made the patch very easy to review. I only
have a style comment.

> Bootstrapped and tested on AArch64 along with the other patch.  Is it OK?
> 
> 2015-11-04  Bin Cheng  <bin.ch...@arm.com>
>           Jiong Wang  <jiong.w...@arm.com>
> 
>       * config/aarch64/aarch64.c (aarch64_legitimize_address): Handle
>       address expressions like REG+REG+CONST and REG+NON_REG+CONST.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 5c8604f..47875ac 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -4710,6 +4710,51 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
>      {
>        HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
>        HOST_WIDE_INT base_offset;
> +      rtx op0 = XEXP (x,0);
> +
> +      if (GET_CODE (op0) == PLUS)
> +     {
> +       rtx op0_ = XEXP (op0, 0);
> +       rtx op1_ = XEXP (op0, 1);

I don't see this trailing _ on a variable name in many places in the source
tree (mostly in the Go frontend), and certainly not in the aarch64 backend.
Can we pick a different name for op0_ and op1_?

> +
> +       /* RTX pattern in the form of (PLUS (PLUS REG, REG), CONST) will
> +          reach here, the 'CONST' may be valid in which case we should
> +          not split.  */
> +       if (REG_P (op0_) && REG_P (op1_))
> +         {
> +           machine_mode addr_mode = GET_MODE (op0);
> +           rtx addr = gen_reg_rtx (addr_mode);
> +
> +           rtx ret = plus_constant (addr_mode, addr, offset);
> +           if (aarch64_legitimate_address_hook_p (mode, ret, false))
> +             {
> +               emit_insn (gen_adddi3 (addr, op0_, op1_));
> +               return ret;
> +             }
> +         }
> +       /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST)
> +          will reach here.  If (PLUS REG, NON_REG) is valid addr expr,
> +          we split it into Y=REG+CONST, Y+NON_REG.  */
> +       else if (REG_P (op0_) || REG_P (op1_))
> +         {
> +           machine_mode addr_mode = GET_MODE (op0);
> +           rtx addr = gen_reg_rtx (addr_mode);
> +
> +           /* Switch to make sure that register is in op0_.  */
> +           if (REG_P (op1_))
> +             std::swap (op0_, op1_);
> +
> +           rtx ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
> +           if (aarch64_legitimate_address_hook_p (mode, ret, false))
> +             {
> +               addr = force_operand (plus_constant (addr_mode,
> +                                                    op0_, offset),
> +                                     NULL_RTX);
> +               ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
> +               return ret;
> +             }

The logic here is a bit hairy to follow: you construct a PLUS RTX to check
aarch64_legitimate_address_hook_p, then construct a different PLUS RTX
to use as the return value. This can probably be clarified by choosing a
name other than ret for the temporary address expression you construct.

It would also be good to take some of your detailed description and write
that here. Certainly I found the explicit examples in the cover letter
easier to follow than:

> +       /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST)
> +          will reach here.  If (PLUS REG, NON_REG) is valid addr expr,
> +          we split it into Y=REG+CONST, Y+NON_REG.  */

Otherwise this patch is OK.

Thanks,
James
