Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt
Hello! > 2015-11-09 Segher Boessenkool> > * gcc/simplify-rtx.c (simplify_truncation): Simplify TRUNCATE > of AND of [LA]SHIFTRT. Revision r230164 (the above patch) regressed: FAIL: gcc.target/alpha/pr42269-1.c scan-assembler-not addl on alpha-linux-gnu. The difference starts in combine, where before the patch, we were able to combine insns: (insn 7 6 8 2 (set (reg:DI 82) (lshiftrt:DI (reg:DI 81 [ x ]) (const_int 16 [0x10]))) pr42269-1.c:8 66 {lshrdi3} (expr_list:REG_DEAD (reg:DI 81 [ x ]) (nil))) (insn 8 7 11 2 (set (reg:DI 70 [ _2 ]) (sign_extend:DI (subreg:SI (reg:DI 82) 0))) pr42269-1.c:8 2 {*extendsidi2_1} (expr_list:REG_DEAD (reg:DI 82) (nil))) to: Trying 7 -> 8: Successfully matched this instruction: (set (reg:DI 70 [ _2 ]) (zero_extract:DI (reg/v:DI 80 [ x ]) (const_int 16 [0x10]) (const_int 16 [0x10]))) allowing combination of insns 7 and 8 original costs 4 + 4 = 8 replacement cost 4 deferring deletion of insn with uid = 7. modifying insn i3 8: r70:DI=zero_extract(r80:DI,0x10,0x10) deferring rescan insn with uid = 8. After the patch, the combination fails: Trying 7 -> 8: Failed to match this instruction: (set (reg:DI 70 [ _2 ]) (sign_extend:DI (lshiftrt:SI (subreg:SI (reg/v:DI 80 [ x ]) 0) (const_int 16 [0x10] Uros.
Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt
On Fri, Nov 13, 2015 at 3:36 PM, Segher Boessenkoolwrote: > Hi! > > On Fri, Nov 13, 2015 at 11:02:55AM +0100, Uros Bizjak wrote: >> on alpha-linux-gnu. >> >> The difference starts in combine, where before the patch, we were able >> to combine insns: >> >> (insn 7 6 8 2 (set (reg:DI 82) >> (lshiftrt:DI (reg:DI 81 [ x ]) >> (const_int 16 [0x10]))) pr42269-1.c:8 66 {lshrdi3} >> (expr_list:REG_DEAD (reg:DI 81 [ x ]) >> (nil))) >> (insn 8 7 11 2 (set (reg:DI 70 [ _2 ]) >> (sign_extend:DI (subreg:SI (reg:DI 82) 0))) pr42269-1.c:8 2 >> {*extendsidi2_1} >> (expr_list:REG_DEAD (reg:DI 82) >> (nil))) >> >> to: >> >> Trying 7 -> 8: >> Successfully matched this instruction: >> (set (reg:DI 70 [ _2 ]) >> (zero_extract:DI (reg/v:DI 80 [ x ]) >> (const_int 16 [0x10]) >> (const_int 16 [0x10]))) >> allowing combination of insns 7 and 8 >> original costs 4 + 4 = 8 >> replacement cost 4 >> deferring deletion of insn with uid = 7. >> modifying insn i3 8: r70:DI=zero_extract(r80:DI,0x10,0x10) >> deferring rescan insn with uid = 8. >> >> After the patch, the combination fails: >> >> Trying 7 -> 8: >> Failed to match this instruction: >> (set (reg:DI 70 [ _2 ]) >> (sign_extend:DI (lshiftrt:SI (subreg:SI (reg/v:DI 80 [ x ]) 0) >> (const_int 16 [0x10] > > Somehow, before the patch, it decided to do a zero-extension (where the > combined insns had a sign extension). Was that even correct? Maybe > many bits of reg 80 (or, hrm, 81 in the orig?!) are known zero? Oops, this analysis is wrong. I'll re-do the analysis in reported PR 68330 [1]. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68330 Uros.
Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt
Hi! On Fri, Nov 13, 2015 at 11:02:55AM +0100, Uros Bizjak wrote: > on alpha-linux-gnu. > > The difference starts in combine, where before the patch, we were able > to combine insns: > > (insn 7 6 8 2 (set (reg:DI 82) > (lshiftrt:DI (reg:DI 81 [ x ]) > (const_int 16 [0x10]))) pr42269-1.c:8 66 {lshrdi3} > (expr_list:REG_DEAD (reg:DI 81 [ x ]) > (nil))) > (insn 8 7 11 2 (set (reg:DI 70 [ _2 ]) > (sign_extend:DI (subreg:SI (reg:DI 82) 0))) pr42269-1.c:8 2 > {*extendsidi2_1} > (expr_list:REG_DEAD (reg:DI 82) > (nil))) > > to: > > Trying 7 -> 8: > Successfully matched this instruction: > (set (reg:DI 70 [ _2 ]) > (zero_extract:DI (reg/v:DI 80 [ x ]) > (const_int 16 [0x10]) > (const_int 16 [0x10]))) > allowing combination of insns 7 and 8 > original costs 4 + 4 = 8 > replacement cost 4 > deferring deletion of insn with uid = 7. > modifying insn i3 8: r70:DI=zero_extract(r80:DI,0x10,0x10) > deferring rescan insn with uid = 8. > > After the patch, the combination fails: > > Trying 7 -> 8: > Failed to match this instruction: > (set (reg:DI 70 [ _2 ]) > (sign_extend:DI (lshiftrt:SI (subreg:SI (reg/v:DI 80 [ x ]) 0) > (const_int 16 [0x10] Somehow, before the patch, it decided to do a zero-extension (where the combined insns had a sign extension). Was that even correct? Maybe many bits of reg 80 (or, hrm, 81 in the orig?!) are known zero? Segher
Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt
On Tue, Nov 10, 2015 at 10:04:30PM +0100, Bernd Schmidt wrote: > On 11/10/2015 06:44 PM, Segher Boessenkool wrote: > > >Yes I know. All the rest of the code around is it like this though. > >Do you want this written in a saner way? > > I won't object to leaving it as-is for now, but in the future it would > be good to keep this in mind. With the trunc_int_for_mode it ended up hugging the righthand margin, so I did clean it up after all. Please see attached (committed). Segher diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index 17568ba..c4fc42a 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -714,6 +714,34 @@ simplify_truncation (machine_mode mode, rtx op, return simplify_gen_binary (ASHIFT, mode, XEXP (XEXP (op, 0), 0), XEXP (op, 1)); + /* Likewise (truncate:QI (and:SI (lshiftrt:SI (x:SI) C) C2)) into + (and:QI (lshiftrt:QI (truncate:QI (x:SI)) C) C2) for suitable C + and C2. */ + if (GET_CODE (op) == AND + && (GET_CODE (XEXP (op, 0)) == LSHIFTRT + || GET_CODE (XEXP (op, 0)) == ASHIFTRT) + && CONST_INT_P (XEXP (XEXP (op, 0), 1)) + && CONST_INT_P (XEXP (op, 1))) +{ + rtx op0 = (XEXP (XEXP (op, 0), 0)); + rtx shift_op = XEXP (XEXP (op, 0), 1); + rtx mask_op = XEXP (op, 1); + unsigned HOST_WIDE_INT shift = UINTVAL (shift_op); + unsigned HOST_WIDE_INT mask = UINTVAL (mask_op); + + if (shift < precision + /* If doing this transform works for an X with all bits set, +it works for any X. */ + && ((GET_MODE_MASK (mode) >> shift) & mask) +== ((GET_MODE_MASK (op_mode) >> shift) & mask) + && (op0 = simplify_gen_unary (TRUNCATE, mode, op0, op_mode)) + && (op0 = simplify_gen_binary (LSHIFTRT, mode, op0, shift_op))) + { + mask_op = GEN_INT (trunc_int_for_mode (mask, mode)); + return simplify_gen_binary (AND, mode, op0, mask_op); + } +} + /* Recognize a word extraction from a multi-word subreg. */ if ((GET_CODE (op) == LSHIFTRT || GET_CODE (op) == ASHIFTRT) -- 1.9.3
Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt
On Tue, Nov 10, 2015 at 12:16:09PM +0100, Bernd Schmidt wrote: > On 11/09/2015 08:33 AM, Segher Boessenkool wrote: > >If we have > > > > (truncate:M1 (and:M2 (lshiftrt:M2 (x:M2) C) C2)) > > > >we can write it instead as > > > > (and:M1 (lshiftrt:M1 (truncate:M1 (x:M2)) C) C2) > > > > > >+ /* Likewise (truncate:QI (and:SI (lshiftrt:SI (x:SI) C) C2)) into > >+ (and:QI (lshiftrt:QI (truncate:QI (x:SI)) C) C2) for suitable C > >+ and C2. */ > >+ if (GET_CODE (op) == AND > >+ && (GET_CODE (XEXP (op, 0)) == LSHIFTRT > >+ || GET_CODE (XEXP (op, 0)) == ASHIFTRT) > >+ && CONST_INT_P (XEXP (XEXP (op, 0), 1)) > >+ && CONST_INT_P (XEXP (op, 1)) > >+ && UINTVAL (XEXP (XEXP (op, 0), 1)) < precision > >+ && ((GET_MODE_MASK (mode) >> UINTVAL (XEXP (XEXP (op, 0), 1))) > >+ & UINTVAL (XEXP (op, 1))) > >+ == ((GET_MODE_MASK (op_mode) >> UINTVAL (XEXP (XEXP (op, 0), 1))) > >+ & UINTVAL (XEXP (op, 1 > > In general this would be easier to read if there were intermediate > variables called shift_amount and mask. Yes I know. All the rest of the code around is it like this though. Do you want this written in a saner way? > I'm not entirely sure what the > last condition here is supposed to test. It tests whether moving the truncate inside will give the same result. It essentially looks if it works for an x with all bits set; if that works, it works for any x. > Is it related to... > > >+return simplify_gen_binary (AND, mode, op0, XEXP (op, 1)); > > ... the fact that here I think you'd have to trunc_int_for_mode the AND > amount for the smaller mode? Ugh yes, I still have to do that for it to be valid RTL in all cases. Thanks for catching it. Segher
Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt
On 11/10/2015 06:44 PM, Segher Boessenkool wrote: Yes I know. All the rest of the code around is it like this though. Do you want this written in a saner way? I won't object to leaving it as-is for now, but in the future it would be good to keep this in mind. I'm not entirely sure what the last condition here is supposed to test. It tests whether moving the truncate inside will give the same result. It essentially looks if it works for an x with all bits set; if that works, it works for any x. Yeah, I figured afterwards that must have been the purpose of the test but I was thinking of other constants because of the trunc_int_for_mode thing. (I probably would have written "(and_const >> shift_amount) & ~small_mask == 0" but yours should be ok too). You might want to use your description as a comment. ... the fact that here I think you'd have to trunc_int_for_mode the AND amount for the smaller mode? Ugh yes, I still have to do that for it to be valid RTL in all cases. Thanks for catching it. So FAOD the patch is OK with that change. Bernd
Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt
On 11/09/2015 08:33 AM, Segher Boessenkool wrote: If we have (truncate:M1 (and:M2 (lshiftrt:M2 (x:M2) C) C2)) we can write it instead as (and:M1 (lshiftrt:M1 (truncate:M1 (x:M2)) C) C2) + /* Likewise (truncate:QI (and:SI (lshiftrt:SI (x:SI) C) C2)) into + (and:QI (lshiftrt:QI (truncate:QI (x:SI)) C) C2) for suitable C + and C2. */ + if (GET_CODE (op) == AND + && (GET_CODE (XEXP (op, 0)) == LSHIFTRT + || GET_CODE (XEXP (op, 0)) == ASHIFTRT) + && CONST_INT_P (XEXP (XEXP (op, 0), 1)) + && CONST_INT_P (XEXP (op, 1)) + && UINTVAL (XEXP (XEXP (op, 0), 1)) < precision + && ((GET_MODE_MASK (mode) >> UINTVAL (XEXP (XEXP (op, 0), 1))) + & UINTVAL (XEXP (op, 1))) +== ((GET_MODE_MASK (op_mode) >> UINTVAL (XEXP (XEXP (op, 0), 1))) +& UINTVAL (XEXP (op, 1 In general this would be easier to read if there were intermediate variables called shift_amount and mask. I'm not entirely sure what the last condition here is supposed to test. Is it related to... + return simplify_gen_binary (AND, mode, op0, XEXP (op, 1)); ... the fact that here I think you'd have to trunc_int_for_mode the AND amount for the smaller mode? Bernd
[PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt
If we have (truncate:M1 (and:M2 (lshiftrt:M2 (x:M2) C) C2)) we can write it instead as (and:M1 (lshiftrt:M1 (truncate:M1 (x:M2)) C) C2) (if that is valid, of course), which has smaller modes for the binary ops, and the truncate can often simplify further (if "x" is a register, for example). This fixes gcc.target/powerpc/20050603-3.c for -m32 -mpowerpc64; also that test is currently restricted to ilp32, but we can run it with lp64 just fine, in which case it fixes that, too. Bootstrapped and tested on powerpc64-linux (-m32,-m32/-mpowerpc64,-m64). Is this okay for trunk? Segher 2015-11-09 Segher Boessenkool* gcc/simplify-rtx.c (simplify_truncation): Simplify TRUNCATE of AND of [LA]SHIFTRT. --- gcc/simplify-rtx.c | 25 + 1 file changed, 25 insertions(+) diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index 17568ba..1adb393 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -714,6 +714,31 @@ simplify_truncation (machine_mode mode, rtx op, return simplify_gen_binary (ASHIFT, mode, XEXP (XEXP (op, 0), 0), XEXP (op, 1)); + /* Likewise (truncate:QI (and:SI (lshiftrt:SI (x:SI) C) C2)) into + (and:QI (lshiftrt:QI (truncate:QI (x:SI)) C) C2) for suitable C + and C2. */ + if (GET_CODE (op) == AND + && (GET_CODE (XEXP (op, 0)) == LSHIFTRT + || GET_CODE (XEXP (op, 0)) == ASHIFTRT) + && CONST_INT_P (XEXP (XEXP (op, 0), 1)) + && CONST_INT_P (XEXP (op, 1)) + && UINTVAL (XEXP (XEXP (op, 0), 1)) < precision + && ((GET_MODE_MASK (mode) >> UINTVAL (XEXP (XEXP (op, 0), 1))) + & UINTVAL (XEXP (op, 1))) +== ((GET_MODE_MASK (op_mode) >> UINTVAL (XEXP (XEXP (op, 0), 1))) +& UINTVAL (XEXP (op, 1 +{ + rtx op0 = simplify_gen_unary (TRUNCATE, mode, XEXP (XEXP (op, 0), 0), + op_mode); + if (op0) + { + op0 = simplify_gen_binary (LSHIFTRT, mode, op0, +XEXP (XEXP (op, 0), 1)); + if (op0) + return simplify_gen_binary (AND, mode, op0, XEXP (op, 1)); + } +} + /* Recognize a word extraction from a multi-word subreg. */ if ((GET_CODE (op) == LSHIFTRT || GET_CODE (op) == ASHIFTRT) -- 1.9.3