https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89865
Bug ID: 89865
Summary: [9 Regression] FAIL: gcc.target/i386/pr49095.c scan-assembler-times \\\\), % 45
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---

There are two issues at play that interfere with the expected match count of
the scan-assembler-times expression.

Please consider this testcase, simplified from gcc.target/i386/pr49095.c:

void foo (void *);

char *
hcharplus (char *x)
{
  *x += 24;
  if (!*x)
    foo (x);
  return x;
}

Current gcc trunk generates (-Os -fno-shrink-wrap -mregparm=2 -m32):

hcharplus:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        movb    (%eax), %cl
        leal    24(%ecx), %edx
        movb    %dl, (%eax)
        testb   %dl, %dl
        jne     .L7
        movl    %eax, -12(%ebp)
        call    foo
        movl    -12(%ebp), %eax
.L7:
        leave
        ret

Please note the sequence:

        movb    (%eax), %cl
        leal    24(%ecx), %edx
        movb    %dl, (%eax)
        testb   %dl, %dl

which is expected to be handled by the following peephole2 pattern:

;; Likewise for instances where we have a lea pattern.
(define_peephole2
  [(set (match_operand:SWI 0 "register_operand")
        (match_operand:SWI 1 "memory_operand"))
   (set (match_operand:SWI 3 "register_operand")
        (plus:SWI (match_dup 0)
                  (match_operand:SWI 2 "<nonmemory_operand>")))
   (set (match_dup 1) (match_dup 3))
   (set (reg FLAGS_REG) (compare (match_dup 3) (const_int 0)))]
  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
   && peep2_reg_dead_p (4, operands[3])
   && (rtx_equal_p (operands[0], operands[3])
       || peep2_reg_dead_p (2, operands[0]))
   && !reg_overlap_mentioned_p (operands[0], operands[1])
   && !reg_overlap_mentioned_p (operands[3], operands[1])
   && !reg_overlap_mentioned_p (operands[0], operands[2])
   && (<MODE>mode != QImode
       || immediate_operand (operands[2], QImode)
       || any_QIreg_operand (operands[2], QImode))
   && ix86_match_ccmode (peep2_next_insn (3), CCGOCmode)"
  [(parallel [(set (match_dup 4) (match_dup 6))
              (set (match_dup 1) (match_dup 5))])]
{
  operands[4] = SET_DEST (PATTERN (peep2_next_insn (3)));
  operands[5] = gen_rtx_PLUS (<MODE>mode, copy_rtx (operands[1]),
                              operands[2]);
  operands[6] = gen_rtx_COMPARE (GET_MODE (operands[4]),
                                 copy_rtx (operands[5]),
                                 const0_rtx);
})

However, the above pattern does not look for the correct mode of the LEA insn
and doesn't take into account that the input and output registers can differ
for LEA.
We have the following sequence before the peephole2 pass:

(insn 25 6 28 2 (set (reg:QI 2 cx [91])
        (mem:QI (reg/v/f:SI 0 ax [orig:87 x ] [87]) [0 *x_7(D)+0 S1 A8])) "ra.c":24:6 69 {*movqi_internal}
     (nil))
(insn 28 25 8 2 (set (reg:SI 1 dx [orig:85 _4 ] [85])
        (plus:SI (reg:SI 2 cx [91])
            (const_int 24 [0x18]))) "ra.c":24:6 186 {*leasi}
     (expr_list:REG_DEAD (reg:SI 2 cx [91])
        (nil)))
(insn 8 28 9 2 (set (mem:QI (reg/v/f:SI 0 ax [orig:87 x ] [87]) [0 *x_7(D)+0 S1 A8])
        (reg:QI 1 dx [orig:85 _4 ] [85])) "ra.c":24:6 69 {*movqi_internal}
     (nil))
(insn 9 8 10 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:QI 1 dx [orig:85 _4 ] [85])
            (const_int 0 [0]))) "ra.c":25:6 5 {*cmpqi_ccno_1}
     (expr_list:REG_DEAD (reg:QI 1 dx [orig:85 _4 ] [85])
        (nil)))

From the above sequence, it can be seen that the LEA insn in the peephole2
pattern should use the LEAMODE mode attribute instead of the SWI mode
iterator: (insn 28) operates in SImode, while the load, the store and the
compare are in QImode. Also, the input reg of (insn 28) should only match the
regno of the output of (insn 25), because the two differ in mode, and the
output reg of (insn 28) should likewise match the regno of the register used
in (insn 8) and (insn 9).

The other issue with the pr49095.c test is that we now spill a call-used
register around the call:

        movl    %eax, -12(%ebp)
        call    foo
        movl    -12(%ebp), %eax

where gcc-8 used a call-preserved register to save the value around the call:

        movl    %eax, %ebx
        call    foo
        movl    %ebx, %eax

However, the gcc-8 approach requires the call-preserved register %ebx to be
saved in the callee function, so the new approach saves a push/pop pair.

In any case, the new assembly changes the result of the scan-assembler-times
dg directive, as

        movl    -12(%ebp), %eax

triggers the scan-assembler-times regexp.