https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656
Bug ID: 92656
Summary: The zero_extend insn can't be eliminated in the combine pass
Product: gcc
Version: 9.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bina2374 at gmail dot com
Target Milestone: ---
Target: RISC-V

I compiled the C function below with "-march=rv32imafc -mabi=ilp32f -mtune=sifive-7-series -O2 -funroll-loops", and GCC 9.2 emits more slli/srli instructions than GCC 8.3. The extra instructions are marked with ###.

========== C Source ==========
unsigned short foo(unsigned short val, unsigned short result)
{
  unsigned char i = 0;
  unsigned char data = (unsigned char)(val >> 8);

  for (i = 0; i < 3; i++) {
    data >>= 1;
    if (data & 1)
      result ^= 0x4002;
    result >>= 1;
  }
  return result;
}

========== Assembly GCC 9.2 ==========
foo:
        li      a5,16384
        srli    a4,a0,9
        addi    t0,a5,2
        andi    t1,a4,1
        xor     a3,a1,t0
        srli    a2,a0,10
        bne     t1,zero,1f; mv a3,a1; 1: # movcc
        srli    t2,a3,1
        andi    a6,a2,1
        xor     a1,t2,t0
        srli    a0,a0,11
        slli    a7,a1,16    ###
        andi    t5,a0,1
        srli    t3,a7,16    ###
        bne     a6,zero,1f; mv t3,t2; 1: # movcc
        srli    t4,t3,1
        slli    t6,t4,16    ###
        srli    a5,t6,16    ###
        xor     t0,a5,t0
        slli    a4,t0,16    ###
        srli    t1,a4,16    ###
        bne     t5,zero,1f; mv t1,a5; 1: # movcc
        srli    a0,t1,1
        ret

========== Assembly GCC 8.3 ==========
foo:
        srli    a0,a0,8
        li      a5,16384
        addi    t0,a5,2
        srli    a2,a0,1
        xor     a3,a1,t0
        andi    t1,a2,1
        bne     t1,zero,1f; mv a3,a1; 1: # movcc
        srli    a4,a0,2
        srli    t2,a3,1
        andi    a1,a4,1
        xor     a6,t2,t0
        bne     a1,zero,1f; mv a6,t2; 1: # movcc
        srli    a7,a0,3
        srli    t3,a6,1
        andi    t4,a7,1
        xor     t5,t3,t0
        bne     t4,zero,1f; mv t5,t3; 1: # movcc
        srli    a0,t5,1
        ret

When the combiner tries to combine the zero_extend insn with another insn, subst cannot simplify the pattern using the rule below, because the rule's last condition (the nonzero_bits check) is not met.

In simplify-rtx.c:

      /* (zero_extend:M (subreg:N <X:O>)) is <X:O> (for M == O) or
         (zero_extend:M <X:O>), if X doesn't have any non-zero bits outside
         of mode N.  E.g.
         (zero_extend:SI (subreg:QI (and:SI (reg:SI) (const_int 63)) 0)) is
         (and:SI (reg:SI) (const_int 63)).  */
      if (partial_subreg_p (op)
          && is_a <scalar_int_mode> (mode, &int_mode)
          && is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (op)), &op0_mode)
          && GET_MODE_PRECISION (op0_mode) <= HOST_BITS_PER_WIDE_INT
          && GET_MODE_PRECISION (int_mode) >= GET_MODE_PRECISION (op0_mode)
          && subreg_lowpart_p (op)
          && (nonzero_bits (SUBREG_REG (op), op0_mode)
              & ~GET_MODE_MASK (GET_MODE (op))) == 0)
        {
          if (GET_MODE_PRECISION (int_mode) == GET_MODE_PRECISION (op0_mode))
            return SUBREG_REG (op);
          return simplify_gen_unary (ZERO_EXTEND, int_mode, SUBREG_REG (op),
                                     op0_mode);
        }

By the way, I also noticed that this issue could be caused by 2-to-2 combination (https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=263067).