Re: [PATCH, i386]: Avoid partial reg stall with arith insn + setCC + movzbl sequence

2012-03-12 Thread Paolo Bonzini
On 12/03/2012 09:52, Uros Bizjak wrote:
 +(define_peephole2
 +  [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" ""))
 +              (match_operand 4 "" "")])
 +   (set (match_operand:QI 1 "register_operand" "")
 +        (match_operator:QI 2 "ix86_comparison_operator"
 +          [(reg FLAGS_REG) (const_int 0)]))
 +   (set (match_operand 3 "q_regs_operand" "")
 +        (zero_extend (match_dup 1)))]
 +  "(peep2_reg_dead_p (3, operands[1])
 +    || operands_match_p (operands[1], operands[3]))
 +   && ! reg_overlap_mentioned_p (operands[3], operands[0])"

I understand that you're assuming the shape of operands[4] to be the
same as operands[3], but would it be preferable to add another overlap
check on operands[4]?

For example the transformation is invalid if you had an overlap between
operands[3] and the destination of operands[4].

Paolo


Re: [PATCH, i386]: Avoid partial reg stall with arith insn + setCC + movzbl sequence

2012-03-12 Thread Uros Bizjak
On Mon, Mar 12, 2012 at 11:13 AM, Paolo Bonzini bonz...@gnu.org wrote:
 On 12/03/2012 09:52, Uros Bizjak wrote:
 +(define_peephole2
 +  [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" ""))
 +              (match_operand 4 "" "")])
 +   (set (match_operand:QI 1 "register_operand" "")
 +        (match_operator:QI 2 "ix86_comparison_operator"
 +          [(reg FLAGS_REG) (const_int 0)]))
 +   (set (match_operand 3 "q_regs_operand" "")
 +        (zero_extend (match_dup 1)))]
 +  "(peep2_reg_dead_p (3, operands[1])
 +    || operands_match_p (operands[1], operands[3]))
 +   && ! reg_overlap_mentioned_p (operands[3], operands[0])"

 I understand that you're assuming the shape of operands[4] to be the
 same as operands[3], but would it be preferable to add another overlap
 check on operands[4]?

 For example the transformation is invalid if you had an overlap between
 operands[3] and the destination of operands[4].

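The destination of operands[4] _always_ matches one of the operands
inside operands[0]. All arithmetic insns that set flags are destructive
on x86, so the existing overlap check of operands[3] against operands[0]
already covers the destination of operands[4].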
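
The argument above can be checked on a toy model. This is only a
sketch, not GCC code: the names ArithInsn, overlaps_flag_inputs and
overlaps_dest are hypothetical, registers are plain strings (so
sub-register overlap is ignored), and the model simply assumes that
every flag-setting arithmetic insn has the destructive form
dest = dest OP src.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ArithInsn:
    """Toy flag-setting arithmetic insn of the form dest = dest OP src."""
    dest: str  # register written by the insn (destination of operands[4])
    src: str   # second input register

    def flag_inputs(self):
        # Registers mentioned in the expression that sets FLAGS (operands[0]).
        # The insn is assumed destructive, so dest is always one of them.
        return {self.dest, self.src}


def overlaps_flag_inputs(reg, insn):
    """Stand-in for reg_overlap_mentioned_p (operands[3], operands[0])."""
    return reg in insn.flag_inputs()


def overlaps_dest(reg, insn):
    """The proposed extra check against the destination of operands[4]."""
    return reg == insn.dest


REGS = ["eax", "ebx", "ecx", "edx"]


def extra_check_is_redundant():
    # Whenever the extra check would fire, the existing check fires as well.
    return all(
        overlaps_flag_inputs(reg, ArithInsn(dest, src))
        for dest, src, reg in product(REGS, repeat=3)
        if overlaps_dest(reg, ArithInsn(dest, src))
    )


if __name__ == "__main__":
    print(extra_check_is_redundant())
```

In this model the extra check never rejects a case that the existing
check accepts. That conclusion holds only as long as the destructive
assumption does.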

Uros.