Hi Richard,

Your patch below increases code size on aarch64-linux-gnu with -Os on SPEC2k6 
400.perlbench and 453.povray -- by 1% and 2%, respectively.

400.perlbench,perlbench_base.default,   101,939261,951221
453.povray,povray_base.default,         102,707807,721399
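
For reference, here is a small sketch of how the growth can be recomputed from
the two size figures in each row. It assumes that the last two columns are the
code size in bytes before and after the patch, and that the third column is
their ratio in percent; the table itself does not label the columns. The
helper name growth_percent is made up for this example.

```c
#include <stdio.h>

/* Percentage growth from size_before to size_after.
   Assumption: both arguments are code sizes in bytes.  */
static double
growth_percent (unsigned long size_before, unsigned long size_after)
{
  return ((double) size_after - (double) size_before) * 100.0
	 / (double) size_before;
}

/* Print one row in a form similar to the table above.  */
static void
report_growth (const char *benchmark, unsigned long size_before,
	       unsigned long size_after)
{
  printf ("%s: %lu -> %lu bytes (+%.2f%%)\n", benchmark, size_before,
	  size_after, growth_percent (size_before, size_after));
}
```

Calling report_growth ("400.perlbench", 939261, 951221) and
report_growth ("453.povray", 707807, 721399) from main gives roughly 1.27%
and 1.92%, which round to the 1% and 2% quoted above.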

Would you please check whether these can be avoided?  [Let me know if you need 
help reproducing this problem.]

Thank you,

--
Maxim Kuvyrkov
https://www.linaro.org

> On Jan 27, 2020, at 7:41 PM, Richard Sandiford <richard.sandif...@arm.com> 
> wrote:
> 
> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
> 
> Failed to match this instruction:
> (set (reg:SI 95)
>    (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
>                (const_int 3 [0x3])
>                (const_int 0 [0])) 0)
>        (const_int 19 [0x13])))
> 
> If we perform the natural simplification to:
> 
> (set (reg:SI 95)
>    (ashift:SI (sign_extract:SI (reg:SI 97)
>                (const_int 3 [0x3])
>                (const_int 0 [0]))
>        (const_int 19 [0x13])))
> 
> then the pattern matches.  And it turns out that we do have a
> simplification like that already, but it would only kick in for
> extractions from a reg, not a subreg.  E.g.:
> 
> (set (reg:SI 95)
>    (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
>                (const_int 3 [0x3])
>                (const_int 0 [0])) 0)
>        (const_int 19 [0x13])))
> 
> would simplify to:
> 
> (set (reg:SI 95)
>    (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
>                (const_int 3 [0x3])
>                (const_int 0 [0]))
>        (const_int 19 [0x13])))
> 
> IMO the subreg case is even more obviously a simplification
> than the bare reg case, since the net effect is to remove
> either one or two subregs, rather than simply change the
> position of a subreg/truncation.
> 
> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
> for -m32 on x86_64-linux-gnu, because we could then simplify
> a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
> pattern did already seem to want to handle QImode extractions:
> 
>  "ix86_match_ccmode (insn, CCNOmode)
>   && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
>       || GET_MODE (operands[2]) == SImode
>       || GET_MODE (operands[2]) == HImode
>       || GET_MODE (operands[2]) == QImode)
> 
> but I'm not sure how often the QI case would trigger in practice,
> since the zero_extract mode was restricted to HI and above.  I checked
> the other x86 patterns and couldn't see any other instances of this.
> 
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
> OK to install?
> 
> Richard
> 
> 
> 2020-01-27  Richard Sandiford  <richard.sandif...@arm.com>
> 
> gcc/
>       PR rtl-optimization/87763
>       * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>       simplification to handle subregs as well as bare regs.
>       * config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
> ---
> gcc/config/i386/i386.md | 2 +-
> gcc/simplify-rtx.c      | 4 +++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 6e9c9bd2fb6..a125ab350bb 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -8927,7 +8927,7 @@ (define_insn "*testqi_ext_2"
> (define_insn_and_split "*testqi_ext_3"
>   [(set (match_operand 0 "flags_reg_operand")
>         (match_operator 1 "compare_operator"
> -       [(zero_extract:SWI248
> +       [(zero_extract:SWI
>            (match_operand 2 "nonimmediate_operand" "rm")
>            (match_operand 3 "const_int_operand" "n")
>            (match_operand 4 "const_int_operand" "n"))
> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> index eff1d07a253..db4f9339c15 100644
> --- a/gcc/simplify-rtx.c
> +++ b/gcc/simplify-rtx.c
> @@ -736,7 +736,9 @@ simplify_truncation (machine_mode mode, rtx op,
>      (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without
>      changing len.  */
>   if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT)
> -      && REG_P (XEXP (op, 0))
> +      && (REG_P (XEXP (op, 0))
> +       || (SUBREG_P (XEXP (op, 0))
> +           && REG_P (SUBREG_REG (XEXP (op, 0)))))
>       && GET_MODE (XEXP (op, 0)) == GET_MODE (op)
>       && CONST_INT_P (XEXP (op, 1))
>       && CONST_INT_P (XEXP (op, 2)))
> -- 
> 2.17.1
> 
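
To make the quoted RTL concrete, here is a minimal C model of the first
example: a 3-bit sign_extract at position 0 followed by an ashift by 19,
computed once through DImode with a truncating subreg and once directly in
SImode. This is only an illustration of why the two forms agree, not code
from GCC. All function names are invented for the example, and the upper
bits of the paradoxical subreg are modelled with an arbitrary fill value,
since they are undefined in RTL.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low LEN bits of X in 64 bits (1 <= LEN <= 63).
   Models (sign_extract:DI X (const_int LEN) (const_int 0)).  */
static int64_t
sign_extract_low64 (uint64_t x, unsigned len)
{
  uint64_t sign = UINT64_C (1) << (len - 1);
  uint64_t field = x & ((UINT64_C (1) << len) - 1);
  return (int64_t) (field ^ sign) - (int64_t) sign;
}

/* Same operation in 32 bits (1 <= LEN <= 31).
   Models (sign_extract:SI X (const_int LEN) (const_int 0)).  */
static int32_t
sign_extract_low32 (uint32_t x, unsigned len)
{
  uint32_t sign = UINT32_C (1) << (len - 1);
  uint32_t field = x & ((UINT32_C (1) << len) - 1);
  return (int32_t) (field ^ sign) - (int32_t) sign;
}

/* Unsimplified form: widen the SI value, extract in DImode, keep the
   low 32 bits, then shift.  HIGH_FILL stands in for the undefined
   upper half of the paradoxical subreg.  */
static uint32_t
shift_via_di (uint32_t reg, uint32_t high_fill)
{
  uint64_t wide = ((uint64_t) high_fill << 32) | reg;
  uint32_t low = (uint32_t) (uint64_t) sign_extract_low64 (wide, 3);
  return low << 19;
}

/* Simplified form: extract directly in SImode, then shift.  */
static uint32_t
shift_via_si (uint32_t reg)
{
  return (uint32_t) sign_extract_low32 (reg, 3) << 19;
}

/* Check that both forms agree for every value of the 3-bit field,
   with unrelated bits set in the rest of the register.  */
static void
check_forms_agree (void)
{
  for (uint32_t field = 0; field < 8; field++)
    {
      uint32_t reg = 0xabcd1230u | field;
      assert (shift_via_di (reg, 0xdeadbeefu) == shift_via_si (reg));
    }
}
```

Only the low 3 bits of the input reach the result in either form, which is
why the intermediate DImode step and its subregs can be dropped without
changing the value.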
