> So, the question is if the combine pass really needs to zero-extend
> with 0xfffffffe, the left shift << 1 guarantees zero in the LSB, so
> 0xffffffff should be better and in line with canonical zero-extension
> RTX.

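The equivalence claimed in the quoted paragraph can be checked with a small C model. This is only a sketch in plain C arithmetic, not GCC's RTL code; the function names (mask_narrow, mask_canonical, zext_low32, forms_agree) are made up for illustration and do not exist in GCC.

```c
#include <stdint.h>

/* Mask as it appears in the combine dump: bit 0 is dropped because
   (x << 1) can never have it set.  */
uint64_t mask_narrow (uint64_t x)
{
  return (x << 1) & UINT64_C (0xfffffffe);
}

/* Canonical 32-bit zero-extension mask.  */
uint64_t mask_canonical (uint64_t x)
{
  return (x << 1) & UINT64_C (0xffffffff);
}

/* Model of a 32-bit result that is implicitly zero-extended to
   64 bits: keep the low 32 bits of x << 1, then widen.  */
uint64_t zext_low32 (uint64_t x)
{
  return (uint64_t) (uint32_t) (x << 1);
}

/* Returns 1 if all three forms give the same value for X.  */
int forms_agree (uint64_t x)
{
  return mask_narrow (x) == mask_canonical (x)
         && mask_canonical (x) == zext_low32 (x);
}
```

Since bit 0 of (x << 1) is always zero, forms_agree should return 1 for every input, which is why either constant describes the same value.
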
The shift mask is generated in simplify_shift_const_1:

  mask_rtx = gen_int_mode (nonzero_bits (varop, int_varop_mode),
                           int_result_mode);
  rtx count_rtx = gen_int_shift_amount (int_result_mode, count);
  mask_rtx
    = simplify_const_binary_operation (code, int_result_mode,
                                       mask_rtx, count_rtx);

Can we adjust the count for ashift if nonzero_bits overlaps it?

> Also, ix86_decompose_address accepts ASHIFT RTX when ASHIFT is
> embedded in the PLUS chain, but naked ASHIFT is rejected (c.f. the
> call in ix86_legitimate_address_p) for some (historic?) reason. It
> looks to me that this restriction is not necessary, since
> ix86_legitimize_address can canonicalize ASHIFT RTXes without
> problems. The attached patch that survives bootstrap and regtest can
> help in your case.

We have a split to transform ashift to mult, so I'm afraid it would not
help with this issue.

On Mon, Aug 16, 2021 at 4:12 PM Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Aug 13, 2021 at 9:21 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Fri, Aug 13, 2021 at 2:48 AM Hongyu Wang <hongyu.w...@intel.com> wrote:
> > >
> > > Hi,
> > >
> > > For lea + zero_extendsidi insns, if dest of lea and src of zext are the
> > > same, combine them with single leal under 64bit target since 32bit
> > > register will be automatically zero-extended.
> > >
> > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > > Ok for master?
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR target/101716
> > >         * config/i386/i386.md (*lea<mode>_zext): New define_insn.
> > >         (define_peephole2): New peephole2 to combine zero_extend
> > >         with lea.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         PR target/101716
> > >         * gcc.target/i386/pr101716.c: New test.
> >
> > This form should be covered by ix86_decompose_address via
> > address_no_seg_operand predicate.
> > Combine creates:
> >
> > Trying 6 -> 7:
> >     6: {r86:DI=r87:DI<<0x1;clobber flags:CC;}
> >       REG_DEAD r87:DI
> >       REG_UNUSED flags:CC
> >     7: r85:DI=zero_extend(r86:DI#0)
> >       REG_DEAD r86:DI
> > Failed to match this instruction:
> > (set (reg:DI 85)
> >     (and:DI (ashift:DI (reg:DI 87)
> >             (const_int 1 [0x1]))
> >         (const_int 4294967294 [0xfffffffe])))
> >
> > which does not fit:
> >
> >       else if (GET_CODE (addr) == AND
> >                && const_32bit_mask (XEXP (addr, 1), DImode))
> >
> > After reload, we lose SUBREG, so REE does not trigger on:
> >
> > (insn 17 3 7 2 (set (reg:DI 0 ax [86])
> >         (mult:DI (reg:DI 5 di [87])
> >             (const_int 2 [0x2]))) "pr101716.c":4:13 204 {*leadi}
> >      (nil))
> > (insn 7 17 13 2 (set (reg:DI 0 ax [85])
> >         (zero_extend:DI (reg:SI 0 ax [86]))) "pr101716.c":4:19 136 {*zero_extendsidi2}
> >      (nil))
> >
> > So, the question is if the combine pass really needs to zero-extend
> > with 0xfffffffe, the left shift << 1 guarantees zero in the LSB, so
> > 0xffffffff should be better and in line with canonical zero-extension
> > RTX.
>
> Also, ix86_decompose_address accepts ASHIFT RTX when ASHIFT is
> embedded in the PLUS chain, but naked ASHIFT is rejected (c.f. the
> call in ix86_legitimate_address_p) for some (historic?) reason. It
> looks to me that this restriction is not necessary, since
> ix86_legitimize_address can canonicalize ASHIFT RTXes without
> problems. The attached patch that survives bootstrap and regtest can
> help in your case.
>
> Uros.