[Bug target/41505] GCC choosing poor code sequence for certain stores (x86)

2021-07-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505

Andrew Pinski  changed:

           What|Removed     |Added
---------------------------------------
         Status|UNCONFIRMED |RESOLVED
     Resolution|---         |DUPLICATE

--- Comment #8 from Andrew Pinski  ---
This is a dup of bug 11877 which is now fixed on the trunk.

*** This bug has been marked as a duplicate of bug 11877 ***

2009-09-30 Thread law at redhat dot com


--- Comment #7 from law at redhat dot com  2009-09-30 14:47 ---

On 09/30/09 03:22, jakub at gcc dot gnu dot org wrote:
> --- Comment #6 from jakub at gcc dot gnu dot org  2009-09-30 09:22 ---
> For x86-64 we perhaps want further checks for the size optimization: if the
> scratch register is %r8d through %r15d, the 3-byte xorl %r8d, %r8d plus e.g. the
> 3-byte movl %r8d, (%rdx) won't be shorter than movl $0, (%rdx), which is 6 bytes.
> And the 2 insns will likely be slower.
> But if the address already needs a rex prefix, it is still a win.
>
Do we have any good way to test if the address needs a rex prefix?  I
see the prefix_rex attribute in i386.md, but that's for testing an
entire insn, and based on my quick reading of i386.md it's not complete,
as many insns set the attribute explicitly.

Jeff



2009-09-30 Thread jakub at gcc dot gnu dot org


--- Comment #6 from jakub at gcc dot gnu dot org  2009-09-30 09:22 ---
For x86-64 we perhaps want further checks for the size optimization: if the
scratch register is %r8d through %r15d, the 3-byte xorl %r8d, %r8d plus e.g. the
3-byte movl %r8d, (%rdx) won't be shorter than movl $0, (%rdx), which is 6 bytes.
And the 2 insns will likely be slower.
But if the address already needs a rex prefix, it is still a win.
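
The byte counts above can be checked with a small size model. This is only a sketch in Python, not anything from GCC: the helper names are made up for illustration, and it assumes a 32-bit store through a plain (base) address with no displacement or index, with a base register other than %rsp, %rbp, %r12 or %r13 (those need a SIB byte or a displacement and would change the numbers).

```python
# Hypothetical size model for storing a 32-bit zero on x86-64.
# Assumption: the address is a bare (base) register, no displacement, no index.

# Registers that can only be encoded with a REX prefix.
EXTENDED = {"r%d" % n for n in range(8, 16)}


def is_extended(reg):
    """True for r8..r15 and their sub-registers such as r8d."""
    return reg.rstrip("dwb") in EXTENDED


def mov_imm0_len(base):
    # movl $0, (base): opcode C7 + ModRM + imm32 = 6 bytes,
    # plus one REX byte if the base register is extended.
    return 6 + (1 if is_extended(base) else 0)


def xor_then_mov_len(scratch, base):
    # xorl scratch, scratch: opcode 31 + ModRM = 2 bytes, plus REX if extended.
    xor_len = 2 + (1 if is_extended(scratch) else 0)
    # movl scratch, (base): opcode 89 + ModRM = 2 bytes, plus a single REX
    # byte if either the scratch or the base register is extended.
    mov_len = 2 + (1 if is_extended(scratch) or is_extended(base) else 0)
    return xor_len + mov_len


def xor_is_smaller(scratch, base):
    """True when the two-insn xor sequence is strictly shorter."""
    return xor_then_mov_len(scratch, base) < mov_imm0_len(base)
```

Under these assumptions, xor_is_smaller("r8d", "rdx") is false (6 bytes against 6), while xor_is_smaller("r8d", "r9") is true (6 bytes against 7), which matches the point that the split is still a win once the address itself needs the prefix.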



2009-09-29 Thread rth at gcc dot gnu dot org


--- Comment #5 from rth at gcc dot gnu dot org  2009-09-29 23:43 ---
Yeah, that looks right.



2009-09-29 Thread law at redhat dot com


--- Comment #4 from law at redhat dot com  2009-09-29 21:55 ---

On 09/29/09 15:18, rth at gcc dot gnu dot org wrote:
> --- Comment #3 from rth at gcc dot gnu dot org  2009-09-29 21:18 ---
> There are already peepholes for this, though the condition appears to be
> slightly wrong for -Os.  See i386.md:21121 :
>
> (define_peephole2
>   [(match_scratch:SI 1 "r")
>    (set (match_operand:SI 0 "memory_operand" "")
>         (const_int 0))]
>   "optimize_insn_for_speed_p ()
>    && ! TARGET_USE_MOV0
>    && TARGET_SPLIT_LONG_MOVES
>    && get_attr_length (insn) >= ix86_cur_cost ()->large_insn
>    && peep2_regno_dead_p (0, FLAGS_REG)"

Ah, yes, the flags register needs to be available.

As for the condition, after reading optimization guides for the various
x86 chips, it appears that

 mov $0, <mem>

is generally going to be faster than

 xor temp, temp
 mov temp, <mem>

So I was thinking we'd want something like this for the condition.

  (optimize_insn_for_size_p ()
   || (!TARGET_USE_MOV0
       && TARGET_SPLIT_LONG_MOVES
       && get_attr_length (insn) >= ix86_cur_cost ()->large_insn))
  && peep2_regno_dead_p (0, FLAGS_REG)

Which I think should always give us the xor sequence when optimizing for 
size or when optimizing for the odd x86 implementation where the xor 
sequence is faster.
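
To see exactly where the proposed condition differs from the existing one, the two can be modelled as boolean functions and compared over every input. This is only a sketch in Python: the parameter names are invented stand-ins for the GCC predicates, long_insn stands for get_attr_length (insn) >= ix86_cur_cost ()->large_insn, and it assumes optimize_insn_for_speed_p () is simply the negation of optimize_insn_for_size_p ().

```python
from itertools import product


def current_condition(for_size, use_mov0, split_long_moves, long_insn, flags_dead):
    # Models the existing peephole condition; "not for_size" stands in for
    # optimize_insn_for_speed_p () (an assumption of this sketch).
    return (not for_size
            and not use_mov0
            and split_long_moves
            and long_insn
            and flags_dead)


def proposed_condition(for_size, use_mov0, split_long_moves, long_insn, flags_dead):
    # Models the proposed condition: always split when optimizing for size,
    # otherwise only on targets where the xor sequence is preferred.
    return ((for_size or (not use_mov0 and split_long_moves and long_insn))
            and flags_dead)


def differing_cases():
    """Return every flag combination where the two conditions disagree."""
    return [flags for flags in product([False, True], repeat=5)
            if current_condition(*flags) != proposed_condition(*flags)]
```

In this model, every tuple returned by differing_cases() has for_size and flags_dead both true, so the proposal only changes behaviour when optimizing for size with the flags register dead, and leaves the speed case as it was.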


I can easily bundle that up as a patch if it looks right to you...

Jeff



2009-09-29 Thread rth at gcc dot gnu dot org


--- Comment #3 from rth at gcc dot gnu dot org  2009-09-29 21:18 ---
There are already peepholes for this, though the condition appears to be
slightly wrong for -Os.  See i386.md:21121 :

(define_peephole2
  [(match_scratch:SI 1 "r")
   (set (match_operand:SI 0 "memory_operand" "")
        (const_int 0))]
  "optimize_insn_for_speed_p ()
   && ! TARGET_USE_MOV0
   && TARGET_SPLIT_LONG_MOVES
   && get_attr_length (insn) >= ix86_cur_cost ()->large_insn
   && peep2_regno_dead_p (0, FLAGS_REG)"



2009-09-29 Thread law at redhat dot com


--- Comment #2 from law at redhat dot com  2009-09-29 17:12 ---
I don't understand your comment, Richard.  Isn't it just something like this?

(define_peephole2
  [(match_scratch:SI 2 "r")
   (set (match_operand:SI 0 "memory_operand" "")
        (match_operand:SI 1 "const_0_operand" ""))]
  ""
  [(set (match_dup 2) (match_dup 1))
   (set (match_dup 0) (match_dup 2))]
  "")



2009-09-29 Thread rguenth at gcc dot gnu dot org


--- Comment #1 from rguenth at gcc dot gnu dot org  2009-09-29 16:07 ---
difficult

