[Bug target/38682] [4.4 Regression] speed regression with sse intrinsics and -ffast-math

2009-01-31 Thread bonzini at gnu dot org


--- Comment #2 from bonzini at gnu dot org  2009-01-31 14:33 ---
??? Andrew, there are 11 vs. 12 instructions.


-- 

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-01-31 14:33:24
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38682



2009-01-31 Thread bonzini at gnu dot org


--- Comment #3 from bonzini at gnu dot org  2009-01-31 15:39 ---
In both versions there's some pessimization in the expansion of _mm_set_ps and
_mm_set_ps1.  It's probably easier to fix than the regression.
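For reference, a minimal sketch of the two intrinsics mentioned above. It is not taken from the reporter's test case; the helper names `make_ramp`, `make_splat` and `lanes` are hypothetical. `_mm_set_ps` takes its arguments from the highest lane down to the lowest, while `_mm_set_ps1` broadcasts one scalar to all four lanes.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_set_ps, _mm_set_ps1, _mm_storeu_ps */

/* Hypothetical helper: build {x, x+d, x+2d, x+3d} in lanes 0..3.
   _mm_set_ps lists the highest lane first, so x goes last. */
static __m128 make_ramp(float x, float d)
{
    return _mm_set_ps(x + 3.0f * d, x + 2.0f * d, x + d, x);
}

/* Hypothetical helper: broadcast one scalar to all four lanes. */
static __m128 make_splat(float s)
{
    return _mm_set_ps1(s);
}

/* Hypothetical helper: copy the four lanes to memory, lane 0 first. */
static void lanes(__m128 v, float out[4])
{
    _mm_storeu_ps(out, v);
}
```

Because every lane comes from a scalar, the compiler has to assemble the vector with shuffle or unpack instructions, which is where an expansion can end up longer than necessary.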


-- 





2009-01-31 Thread bonzini at gnu dot org


--- Comment #4 from bonzini at gnu dot org  2009-01-31 16:23 ---
I see optimal code with trunk:

.LFB8:
        movaps  %xmm1, %xmm4
        shrl    $2, %edx
        mov     %edx, %edx
        xorl    %eax, %eax
        addss   %xmm0, %xmm4
        movaps  %xmm4, %xmm3
        unpcklps        %xmm0, %xmm4
        addss   %xmm1, %xmm3
        movaps  %xmm3, %xmm2
        addss   %xmm1, %xmm2
        mulss   .LC0(%rip), %xmm1
        unpcklps        %xmm3, %xmm2
        shufps  $0, %xmm1, %xmm1
        movlhps %xmm4, %xmm2
        .align 16
.L2:
        movaps  (%rsi,%rax), %xmm0
        addps   %xmm2, %xmm0
        addps   %xmm1, %xmm2
        movaps  %xmm0, (%rdi,%rax)
        addq    $16, %rax
        subq    $1, %rdx
        jne     .L2
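
For readers without the test case at hand, here is a rough C sketch of a function that could plausibly compile to a listing of this shape. It is a hypothetical reconstruction, not the source attached to the report: the name `add_ramp`, the argument order, and the constant 4.0f (the value of .LC0 is not shown) are all assumptions.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_set_ps, _mm_set_ps1, _mm_add_ps */

/* Hypothetical reconstruction, not the reporter's test case.
   Adds a linear ramp to src four floats at a time; n is assumed to be
   a multiple of 4.  Unaligned loads/stores are used so the sketch works
   on any buffer; the listing uses movaps, which needs 16-byte alignment. */
void add_ramp(float *dst, const float *src, unsigned n, float x, float dx)
{
    /* Lane 0 holds x+3*dx and lane 3 holds x, following one reading of
       the unpcklps/movlhps sequence in the listing. */
    __m128 ramp = _mm_set_ps(x, x + dx, x + dx + dx, x + dx + dx + dx);
    /* The value of .LC0 is not shown in the listing; 4.0f is assumed. */
    __m128 step = _mm_set_ps1(dx * 4.0f);

    for (unsigned i = 0; i < n / 4; i++) {
        __m128 v = _mm_loadu_ps(src + 4 * i);
        _mm_storeu_ps(dst + 4 * i, _mm_add_ps(v, ramp));
        ramp = _mm_add_ps(ramp, step);
    }
}
```

The three scalar additions before the loop correspond to the addss chain in the listing, and the loop body maps onto the load, two addps, and store at .L2.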


-- 

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED





2009-01-06 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2





2009-01-05 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |normal
           Keywords|                            |missed-optimization





2008-12-31 Thread pinskia at gcc dot gnu dot org


--- Comment #1 from pinskia at gcc dot gnu dot org  2008-12-31 15:34 ---
This is a target issue really.  The number and type of instructions are the
same.  The difference is just a little reassociation in the addition.
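
To illustrate what reassociation means here, a minimal sketch in plain C (the function names are hypothetical). Without -ffast-math the compiler must keep the grouping written in the source, because floating-point addition is not associative; with -ffast-math it may pick either grouping, so two compiler versions can legitimately emit the additions in a different order.

```c
/* Hypothetical helpers: the same three-term sum with two groupings. */
static float sum_left(float a, float b, float c)
{
    return (a + b) + c;   /* grouping as written: add a and b first */
}

static float sum_right(float a, float b, float c)
{
    return a + (b + c);   /* reassociated: add b and c first */
}
```

With a = 1e20f, b = -1e20f and c = 1.0f the first form gives 1.0f and the second gives 0.0f, since 1.0f is lost when added to -1e20f in single precision. For a ramp such as x + dx + dx the two groupings usually agree to within rounding, but they change which additions depend on each other.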


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |target





2008-12-31 Thread pinskia at gcc dot gnu dot org


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.4.0
            Version|unknown                     |4.4.0

