Hi Richard,

Your patch below increases code size on aarch64-linux-gnu with -Os on SPEC2k6 400.perlbench and 453.povray, by 1% and 2% respectively.
400.perlbench,perlbench_base.default, 101,939261,951221
453.povray,povray_base.default, 102,707807,721399

Would you please check whether these can be avoided?  [Let me know if you need help reproducing this problem.]

Thank you,

--
Maxim Kuvyrkov
https://www.linaro.org

> On Jan 27, 2020, at 7:41 PM, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
>
> Failed to match this instruction:
> (set (reg:SI 95)
>     (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
>                 (const_int 3 [0x3])
>                 (const_int 0 [0])) 0)
>         (const_int 19 [0x13])))
>
> If we perform the natural simplification to:
>
>     (set (reg:SI 95)
>          (ashift:SI (sign_extract:SI (reg:SI 97)
>                                      (const_int 3 [0x3])
>                                      (const_int 0 [0])) 0)
>                     (const_int 19 [0x13])))
>
> then the pattern matches.  And it turns out that we do have a
> simplification like that already, but it would only kick in for
> extractions from a reg, not a subreg.  E.g.:
>
>     (set (reg:SI 95)
>          (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
>                                                 (const_int 3 [0x3])
>                                                 (const_int 0 [0])) 0)
>                     (const_int 19 [0x13])))
>
> would simplify to:
>
>     (set (reg:SI 95)
>          (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
>                                      (const_int 3 [0x3])
>                                      (const_int 0 [0])) 0)
>                     (const_int 19 [0x13])))
>
> IMO the subreg case is even more obviously a simplification
> than the bare reg case, since the net effect is to remove
> either one or two subregs, rather than simply change the
> position of a subreg/truncation.
>
> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
> for -m32 on x86_64-linux-gnu, because we could then simplify
> a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
> pattern did already seem to want to handle QImode extractions:
>
>   "ix86_match_ccmode (insn, CCNOmode)
>    && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
>        || GET_MODE (operands[2]) == SImode
>        || GET_MODE (operands[2]) == HImode
>        || GET_MODE (operands[2]) == QImode)
>
> but I'm not sure how often the QI case would trigger in practice,
> since the zero_extract mode was restricted to HI and above.  I checked
> the other x86 patterns and couldn't see any other instances of this.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
> OK to install?
>
> Richard
>
>
> 2020-01-27  Richard Sandiford  <richard.sandif...@arm.com>
>
> gcc/
>     PR rtl-optimization/87763
>     * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>     simplification to handle subregs as well as bare regs.
>     * config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
> ---
>  gcc/config/i386/i386.md | 2 +-
>  gcc/simplify-rtx.c      | 4 +++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 6e9c9bd2fb6..a125ab350bb 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -8927,7 +8927,7 @@ (define_insn "*testqi_ext_2"
>  (define_insn_and_split "*testqi_ext_3"
>    [(set (match_operand 0 "flags_reg_operand")
>          (match_operator 1 "compare_operator"
> -         [(zero_extract:SWI248
> +         [(zero_extract:SWI
>             (match_operand 2 "nonimmediate_operand" "rm")
>             (match_operand 3 "const_int_operand" "n")
>             (match_operand 4 "const_int_operand" "n"))
> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> index eff1d07a253..db4f9339c15 100644
> --- a/gcc/simplify-rtx.c
> +++ b/gcc/simplify-rtx.c
> @@ -736,7 +736,9 @@ simplify_truncation (machine_mode mode, rtx op,
>       (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without
>       changing len.  */
>    if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT)
> -      && REG_P (XEXP (op, 0))
> +      && (REG_P (XEXP (op, 0))
> +          || (SUBREG_P (XEXP (op, 0))
> +              && REG_P (SUBREG_REG (XEXP (op, 0)))))
>        && GET_MODE (XEXP (op, 0)) == GET_MODE (op)
>        && CONST_INT_P (XEXP (op, 1))
>        && CONST_INT_P (XEXP (op, 2)))
> --
> 2.17.1
>
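
For reference, here is a rough C sketch of why the simplification quoted above preserves the value in the lsl_asr_sbfiz.c case (len 3, pos 0, then a shift by 19). It is only a model: it assumes bit 0 is the least significant bit and stands in for the RTL operations with plain integer arithmetic. The function names are hypothetical and do not exist in GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of (sign_extract:DI x len pos): take LEN bits of X starting at
   bit POS and sign-extend them to 64 bits.  Assumes 1 <= len and
   pos + len <= 64.  The result is returned as a raw bit pattern.  */
static uint64_t
sign_extract_di (uint64_t x, unsigned len, unsigned pos)
{
  uint64_t mask = len >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << len) - 1;
  uint64_t field = (x >> pos) & mask;
  uint64_t sign = UINT64_C (1) << (len - 1);
  return (field ^ sign) - sign;
}

/* Same model for (sign_extract:SI x len pos), with 32-bit operands.
   Assumes 1 <= len and pos < 32.  */
static uint32_t
sign_extract_si (uint32_t x, unsigned len, unsigned pos)
{
  uint32_t mask = len >= 32 ? ~UINT32_C (0) : (UINT32_C (1) << len) - 1;
  uint32_t field = (x >> pos) & mask;
  uint32_t sign = UINT32_C (1) << (len - 1);
  return (field ^ sign) - sign;
}

/* Return 1 if truncating the DImode extraction to SImode gives the same
   bits as extracting directly from the low 32 bits of X.  */
static int
truncated_extract_matches (uint64_t x, unsigned len, unsigned pos)
{
  uint32_t outer = (uint32_t) sign_extract_di (x, len, pos);
  uint32_t inner = sign_extract_si ((uint32_t) x, len, pos);
  return outer == inner;
}

/* Check the len 3, pos 0 case, with and without the shift by 19, for a
   few sample inputs (including ones with the upper 32 bits set).  */
static void
self_check (void)
{
  static const uint64_t samples[] = {
    0, 1, 3, 4, 5, 7, UINT64_C (0xdeadbeefcafef00d), ~UINT64_C (0)
  };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      uint64_t x = samples[i];
      uint32_t a = (uint32_t) sign_extract_di (x, 3, 0) << 19;
      uint32_t b = sign_extract_si ((uint32_t) x, 3, 0) << 19;
      assert (truncated_extract_matches (x, 3, 0));
      assert (a == b);
    }
}
```

The two forms agree here because the extracted field lies entirely within the low 32 bits. If the field crosses bit 32 (for example len 4 at pos 30), the narrow extraction can no longer see all of it, which is presumably what the "if possible without changing len" wording in the simplify_truncation comment guards against.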