Jeff Law <l...@redhat.com> writes: > On Mon, 2020-01-27 at 16:41 +0000, Richard Sandiford wrote: >> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have: >> >> Failed to match this instruction: >> (set (reg:SI 95) >> (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> If we perform the natural simplification to: >> >> (set (reg:SI 95) >> (ashift:SI (sign_extract:SI (reg:SI 97) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> then the pattern matches. And it turns out that we do have a >> simplification like that already, but it would only kick in for >> extractions from a reg, not a subreg. E.g.: > Yea. I ran into similar problems with the extract/extend bits in > combine. And we know it's a fairly general problem that we don't > handle SUBREGs anywhere near as consistently as REGs. > > >> >> (set (reg:SI 95) >> (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> would simplify to: >> >> (set (reg:SI 95) >> (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> IMO the subreg case is even more obviously a simplification >> than the bare reg case, since the net effect is to remove >> either one or two subregs, rather than simply change the >> position of a subreg/truncation. >> >> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c >> for -m32 on x86_64-linux-gnu, because we could then simplify >> a :HI zero_extract to a :QI one. The associated *testqi_ext_3 >> pattern did already seem to want to handle QImode extractions: >> >> "ix86_match_ccmode (insn, CCNOmode) >> && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode) >> || GET_MODE (operands[2]) == SImode >> || GET_MODE (operands[2]) == HImode >> || GET_MODE (operands[2]) == QImode) >> >> but I'm not sure how often the QI case would trigger in practice, >> since the zero_extract mode was restricted to HI and above. I checked >> the other x86 patterns and couldn't see any other instances of this. >> >> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu, >> OK to install? >> >> Richard >> >> >> 2020-01-27 Richard Sandiford <richard.sandif...@arm.com> >> >> gcc/ >> PR rtl-optimization/87763 >> * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract >> simplification to handle subregs as well as bare regs. >> * config/i386/i386.md (*testqi_ext_3): Match QI extracts too. > Do you need to check for and reject paradoxicals here? If not, OK as- > is. If you need to check, then that's pre-approved as well.
Thanks, pushed. On the paradoxical thing: the outer subreg is always non-paradoxical for simplify_truncation, so we don't need to check that specifically. For the inner subreg we need to handle the paradoxical case for the PR. I wondered at one point about punting for non-paradoxical inner subregs, but there didn't seem to be a good reason to keep an expression like (truncate (op (truncate...))) over (op (truncate ...)). If it turns out there is a good reason though, changing SUBREG_P to paradoxical_subreg_p would be fine in terms of this PR. Richard