On Thu, Mar 8, 2012 at 3:22 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Fri, Mar 2, 2012 at 10:02 PM, H.J. Lu <hongjiu...@intel.com> wrote:
>
>> This patch uses word_mode instead of Pmode in loop expand since
>> word_mode may have bigger size than Pmode.  OK for trunk?
>>
>> Thanks.
>>
>> H.J.
>> ---
>> 2012-03-02  H.J. Lu  <hongjiu...@intel.com>
>>
>>      * config/i386/i386.c (ix86_expand_movmem): Use word_mode instead
>>      of Pmode on loop.
>>      (ix86_expand_setmem): Likewise.
>
> Jan, can you please comment on the changes in this patch?
>

Here is a complete updated patch to use word_mode in ix86_expand_movmem
and ix86_expand_setmem.  It also fixes ix86_zero_extend_to_Pmode to
handle Pmode != DImode.  OK for trunk?

Thanks.

--
H.J.
---
2012-03-10  H.J. Lu  <hongjiu...@intel.com>

        * config/i386/i386.c (ix86_zero_extend_to_Pmode): Handle
        Pmode != DImode.
        (ix86_expand_movmem): Use word_mode for size needed for loop.
        (ix86_expand_setmem): Likewise.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index bc144a9..a51c6b4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21031,7 +21031,11 @@ ix86_zero_extend_to_Pmode (rtx exp)
   if (GET_MODE (exp) == Pmode)
     return copy_to_mode_reg (Pmode, exp);
   r = gen_reg_rtx (Pmode);
-  emit_insn (gen_zero_extendsidi2 (r, exp));
+  if (Pmode == DImode)
+    emit_insn (gen_zero_extendsidi2 (r, exp));
+  else
+    emit_move_insn (r,
+                    simplify_gen_subreg (Pmode, exp, GET_MODE (exp), 0));
   return r;
 }
 
@@ -22060,11 +22064,11 @@ ix86_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp,
       gcc_unreachable ();
     case loop:
       need_zero_guard = true;
-      size_needed = GET_MODE_SIZE (Pmode);
+      size_needed = GET_MODE_SIZE (word_mode);
       break;
     case unrolled_loop:
       need_zero_guard = true;
-      size_needed = GET_MODE_SIZE (Pmode) * (TARGET_64BIT ? 4 : 2);
+      size_needed = GET_MODE_SIZE (word_mode) * (TARGET_64BIT ? 4 : 2);
       break;
     case rep_prefix_8_byte:
       size_needed = 8;
@@ -22230,13 +22234,13 @@ ix86_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp,
       break;
     case loop:
       expand_set_or_movmem_via_loop (dst, src, destreg, srcreg, NULL,
-                                     count_exp, Pmode, 1, expected_size);
+                                     count_exp, word_mode, 1, expected_size);
       break;
     case unrolled_loop:
       /* Unroll only by factor of 2 in 32bit mode, since we don't have enough
          registers for 4 temporaries anyway.  */
       expand_set_or_movmem_via_loop (dst, src, destreg, srcreg, NULL,
-                                     count_exp, Pmode, TARGET_64BIT ? 4 : 2,
+                                     count_exp, word_mode, TARGET_64BIT ? 4 : 2,
                                      expected_size);
       break;
     case rep_prefix_8_byte:
@@ -22448,11 +22452,11 @@ ix86_expand_setmem (rtx dst, rtx count_exp, rtx val_exp, rtx align_exp,
       gcc_unreachable ();
     case loop:
       need_zero_guard = true;
-      size_needed = GET_MODE_SIZE (Pmode);
+      size_needed = GET_MODE_SIZE (word_mode);
       break;
     case unrolled_loop:
       need_zero_guard = true;
-      size_needed = GET_MODE_SIZE (Pmode) * 4;
+      size_needed = GET_MODE_SIZE (word_mode) * 4;
       break;
     case rep_prefix_8_byte:
       size_needed = 8;
@@ -22623,11 +22627,11 @@ ix86_expand_setmem (rtx dst, rtx count_exp, rtx val_exp, rtx align_exp,
       break;
     case loop:
       expand_set_or_movmem_via_loop (dst, NULL, destreg, NULL, promoted_val,
-                                     count_exp, Pmode, 1, expected_size);
+                                     count_exp, word_mode, 1, expected_size);
       break;
     case unrolled_loop:
       expand_set_or_movmem_via_loop (dst, NULL, destreg, NULL, promoted_val,
-                                     count_exp, Pmode, 4, expected_size);
+                                     count_exp, word_mode, 4, expected_size);
       break;
     case rep_prefix_8_byte:
       expand_setmem_via_rep_stos (dst, destreg, promoted_val, count_exp,
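
For readers following along, here is a rough standalone model of the
size_needed arithmetic that the movmem hunks change.  It is only a sketch,
not GCC code: struct target_modes and loop_size_needed are hypothetical
names, and the byte sizes stand in for GET_MODE_SIZE (Pmode) and
GET_MODE_SIZE (word_mode).  The 4-byte Pmode with 8-byte word_mode
configuration is an assumption about the kind of target the patch is aimed
at (32-bit pointers on 64-bit registers).

```c
#include <assert.h>

/* Hypothetical stand-ins for the mode sizes GCC would report;
   these are not GCC's real types or macros.  */
struct target_modes
{
  int pmode_size;      /* models GET_MODE_SIZE (Pmode), in bytes */
  int word_mode_size;  /* models GET_MODE_SIZE (word_mode), in bytes */
  int target_64bit;    /* models TARGET_64BIT */
};

/* Models the "loop" and "unrolled_loop" cases of ix86_expand_movmem.
   USE_WORD_MODE selects the behaviour after the patch (nonzero) or
   before it (zero).  Returns the chunk size in bytes per iteration.  */
static int
loop_size_needed (const struct target_modes *t, int unrolled,
                  int use_word_mode)
{
  int unit = use_word_mode ? t->word_mode_size : t->pmode_size;
  int factor = unrolled ? (t->target_64bit ? 4 : 2) : 1;

  assert (unit > 0);
  return unit * factor;
}
```

With an assumed configuration of { 4, 8, 1 }, the unrolled loop moves
16 bytes per iteration when sized by Pmode and 32 bytes when sized by
word_mode.  When both modes have the same size, as with { 8, 8, 1 } or
{ 4, 4, 0 }, the two choices give identical results, so the change only
matters where word_mode is wider than Pmode.  The setmem hunks differ only
in using a fixed unroll factor of 4.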