Hi!

The combiner combines
(set (reg:SI x) (and:SI (reg:SI y) (const_int 1234)))
(set (reg:DI z) (zero_extend:DI (reg:SI x)))
into
(set (reg:DI z) (and:DI (subreg:DI (reg:SI y) 0) (const_int 1234)))
which unfortunately isn't the best form on x86_64 from the RA POV, because
if y needs to be moved around, with the paradoxical subreg it is copied
around as DImode, which is one byte longer than an SImode copy.  We only
need the low 32 bits of it, though.
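As an illustration only (not part of the patch), here is a small C model of why the two forms compute the same value. It assumes the AND constant has its upper 32 bits clear, which is my reading of what x86_64_zext_immediate_operand accepts. The names and_si_zext, and_di_paradoxical, upper_garbage and check_forms_agree are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models (zero_extend:DI (and:SI y c)): AND in 32 bits, then zero-extend. */
static uint64_t and_si_zext(uint32_t y, uint32_t c)
{
    return (uint64_t)(y & c);
}

/* Models (and:DI (subreg:DI y 0) c): y occupies the low half of a 64-bit
   register whose upper half is undefined.  `upper_garbage` is a hypothetical
   stand-in for those undefined bits.  */
static uint64_t and_di_paradoxical(uint32_t y, uint32_t upper_garbage,
                                   uint64_t c)
{
    uint64_t wide = ((uint64_t)upper_garbage << 32) | y;
    return wide & c;
}

/* Checks that both forms agree for a constant whose upper 32 bits are
   zero, whatever the undefined upper half of the register holds.  */
static void check_forms_agree(void)
{
    const uint64_t c = 1234;
    const uint32_t ys[] = { 0, 1, 1234, 0xdeadbeefu, 0xffffffffu };
    const uint32_t garbage[] = { 0, 0x12345678u, 0xffffffffu };

    for (size_t i = 0; i < sizeof ys / sizeof ys[0]; i++)
        for (size_t j = 0; j < sizeof garbage / sizeof garbage[0]; j++)
            assert(and_di_paradoxical(ys[i], garbage[j], c)
                   == and_si_zext(ys[i], (uint32_t)c));
}
```

Because the constant masks off everything above bit 31, the undefined upper half never reaches the result, so doing the AND in SImode and zero-extending afterwards loses nothing.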
The following patch fixes it, by splitting what combiner creates before RA
into andsi_1_zext pattern.  My initial version of the patch did it for all
DImode operands[1], but that unfortunately regressed pr49095.c where
mem &= const with flag setting is no longer being peepholed.  So, this patch
limits it to whatever combiner creates.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-06-12  Jakub Jelinek  <ja...@redhat.com>

        PR target/53639
        * config/i386/i386.md (*anddi_1 into *andsi_1_zext splitter): New.

--- gcc/config/i386/i386.md.jj  2012-06-12 09:46:26.449838954 +0200
+++ gcc/config/i386/i386.md     2012-06-12 12:22:58.972769925 +0200
@@ -7933,6 +7933,18 @@ (define_insn "*andqi_1_slp"
   [(set_attr "type" "alu1")
    (set_attr "mode" "QI")])
 
+;; Turn *anddi_1 into *andsi_1_zext if possible.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (and:DI (subreg:DI (match_operand:SI 1 "register_operand") 0)
+                (match_operand:DI 2 "x86_64_zext_immediate_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_64BIT"
+  [(parallel [(set (match_dup 0)
+                   (zero_extend:DI (and:SI (match_dup 1) (match_dup 2))))
+              (clobber (reg:CC FLAGS_REG))])]
+  "operands[2] = gen_lowpart (SImode, operands[2]);")
+
 (define_split
   [(set (match_operand:SWI248 0 "register_operand")
         (and:SWI248 (match_operand:SWI248 1 "nonimmediate_operand")

        Jakub