[Bug target/41505] GCC choosing poor code sequence for certain stores (x86)

law at redhat dot com Tue, 29 Sep 2009 14:56:11 -0700


------- Comment #4 from law at redhat dot com  2009-09-29 21:55 -------
Subject: Re:  GCC choosing poor code sequence for certain
 stores (x86)


On 09/29/09 15:18, rth at gcc dot gnu dot org wrote:
> ------- Comment #3 from rth at gcc dot gnu dot org  2009-09-29 21:18 -------
> There are already peepholes for this, though the condition appears to be
> slightly wrong for -Os.  See i386.md:21121 :
>
> (define_peephole2
>    [(match_scratch:SI 1 "r")
>     (set (match_operand:SI 0 "memory_operand" "")
>          (const_int 0))]
>    "optimize_insn_for_speed_p ()
>     &&  ! TARGET_USE_MOV0
>     &&  TARGET_SPLIT_LONG_MOVES
>     &&  get_attr_length (insn)>= ix86_cur_cost ()->large_insn
>     &&  peep2_regno_dead_p (0, FLAGS_REG)"
>
>    

Ah, yes, the flags register needs to be available.

As for the condition, after reading the optimization guides for the various
x86 chips, it looks like

     mov $0, <mem>

is generally going to be faster than

     xor  temp, temp
     mov temp, <mem>

So I was thinking we'd want something like this for the condition.

  (optimize_insn_for_size_p ()
   || (!TARGET_USE_MOV0
       && TARGET_SPLIT_LONG_MOVES
       && get_attr_length (insn) >= ix86_cur_cost ()->large_insn))
  && peep2_regno_dead_p (0, FLAGS_REG)

I think that should always give us the xor sequence (provided the flags
register is dead) when optimizing for size, or when optimizing for speed on
the odd x86 implementation where the xor sequence is faster.
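
To make the difference between the two conditions concrete, here is a minimal C sketch that models each predicate as a plain boolean. The struct and function names (peep_env, current_condition, proposed_condition) are hypothetical and are not part of GCC. It also assumes that optimize_insn_for_speed_p () is simply the negation of optimize_insn_for_size_p ().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GCC predicates in the peephole condition.
   These are plain booleans for illustration only, not GCC's real API.  */
struct peep_env {
  bool optimize_for_size;  /* optimize_insn_for_size_p () */
  bool use_mov0;           /* TARGET_USE_MOV0 */
  bool split_long_moves;   /* TARGET_SPLIT_LONG_MOVES */
  bool insn_is_large;      /* get_attr_length (insn) >= ix86_cur_cost ()->large_insn */
  bool flags_dead;         /* peep2_regno_dead_p (0, FLAGS_REG) */
};

/* The condition as quoted from i386.md in comment #3.
   Assumption: "for speed" is modeled as "not for size".  */
static bool current_condition (const struct peep_env *e)
{
  return !e->optimize_for_size
         && !e->use_mov0
         && e->split_long_moves
         && e->insn_is_large
         && e->flags_dead;
}

/* The condition suggested in this message: use the xor sequence when
   optimizing for size, or when the target tuning prefers it, and in
   either case only if the flags register is dead.  */
static bool proposed_condition (const struct peep_env *e)
{
  return (e->optimize_for_size
          || (!e->use_mov0
              && e->split_long_moves
              && e->insn_is_large))
         && e->flags_dead;
}

/* Small self-check: with -Os-style settings and dead flags, only the
   proposed condition lets the peephole fire.  */
static void check_examples (void)
{
  struct peep_env os_case = { true, false, false, false, true };
  assert (!current_condition (&os_case));
  assert (proposed_condition (&os_case));
}
```

In this model the two conditions agree whenever the insn is being optimized for speed. They differ only in the size case, where the proposed form fires regardless of the tuning flags as long as the flags register is dead.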


I can easily bundle that up as a patch if it looks right to you...

Jeff


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
