[Bug target/38682] [4.4 Regression] speed regression with sse intrinsics and -ffast-math
--- Comment #2 from bonzini at gnu dot org 2009-01-31 14:33 ---
??? Andrew, there's 11 vs. 12 instructions.

--
bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2009-01-31 14:33:24
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38682
[Bug target/38682] [4.4 Regression] speed regression with sse intrinsics and -ffast-math
--- Comment #3 from bonzini at gnu dot org 2009-01-31 15:39 ---
In both versions there's some pessimization in the expansion of _mm_set_ps
and _mm_set_ps1. It's probably easier to fix than the regression.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38682
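For readers unfamiliar with the two intrinsics named in comment #3, the following is a minimal sketch of what they compute, not the reporter's test case. The helper names `lanes_of_set_ps` and `lanes_of_set_ps1` are hypothetical; only `_mm_set_ps`, `_mm_set_ps1`, and `_mm_storeu_ps` from `<xmmintrin.h>` are real.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_set_ps, _mm_set_ps1 */

/* Hypothetical helper: build a vector from four scalars and dump its lanes.
 * _mm_set_ps takes its arguments from the highest lane to the lowest,
 * so the LAST argument ends up in lane 0. */
void lanes_of_set_ps(float e3, float e2, float e1, float e0, float out[4])
{
    __m128 v = _mm_set_ps(e3, e2, e1, e0);
    _mm_storeu_ps(out, v);  /* out[0] = e0, out[1] = e1, out[2] = e2, out[3] = e3 */
}

/* Hypothetical helper: _mm_set_ps1 broadcasts one scalar to all four lanes. */
void lanes_of_set_ps1(float x, float out[4])
{
    __m128 v = _mm_set_ps1(x);
    _mm_storeu_ps(out, v);  /* out[0..3] = x */
}
```

Calling `lanes_of_set_ps(4.0f, 3.0f, 2.0f, 1.0f, out)` leaves `out` holding 1, 2, 3, 4 in that order. How the compiler expands these into shuffles and unpacks is exactly what the comment calls a pessimization; the sketch only pins down the semantics.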
[Bug target/38682] [4.4 Regression] speed regression with sse intrinsics and -ffast-math
--- Comment #4 from bonzini at gnu dot org 2009-01-31 16:23 ---
I see optimal code with trunk:

.LFB8:
        movaps   %xmm1, %xmm4
        shrl     $2, %edx
        mov      %edx, %edx
        xorl     %eax, %eax
        addss    %xmm0, %xmm4
        movaps   %xmm4, %xmm3
        unpcklps %xmm0, %xmm4
        addss    %xmm1, %xmm3
        movaps   %xmm3, %xmm2
        addss    %xmm1, %xmm2
        mulss    .LC0(%rip), %xmm1
        unpcklps %xmm3, %xmm2
        shufps   $0, %xmm1, %xmm1
        movlhps  %xmm4, %xmm2
        .align 16
.L2:
        movaps   (%rsi,%rax), %xmm0
        addps    %xmm2, %xmm0
        addps    %xmm1, %xmm2
        movaps   %xmm0, (%rdi,%rax)
        addq     $16, %rax
        subq     $1, %rdx
        jne      .L2

--
bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38682
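The original test case is not quoted in this thread, so the following is only a hypothetical reconstruction of the kind of loop that the `.L2` body in comment #4 corresponds to. The function name `add_ramp`, its parameter names, and the `n / 4` trip count are assumptions read off the assembly; the intrinsics themselves are the standard ones from `<xmmintrin.h>`.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical sketch: add a running vector accumulator to each group of four
 * floats, then advance the accumulator by a constant step.
 * dst and src are assumed to be 16-byte aligned (the assembly uses movaps). */
void add_ramp(float *dst, const float *src, unsigned n, __m128 acc, __m128 step)
{
    for (unsigned i = 0; i < n / 4; i++) {
        __m128 v = _mm_load_ps(src + 4 * i);           /* movaps (%rsi,%rax), %xmm0 */
        _mm_store_ps(dst + 4 * i, _mm_add_ps(v, acc)); /* addps, then movaps store   */
        acc = _mm_add_ps(acc, step);                   /* addps %xmm1, %xmm2         */
    }
}
```

In the assembly, the instructions before `.L2` build the initial accumulator and the broadcast step from two scalars; in this sketch the caller passes them in, for example `_mm_set_ps(3.0f, 2.0f, 1.0f, 0.0f)` and `_mm_set_ps1(4.0f)`. Note that the compiled loop has no zero-trip check, whereas the `for` loop here does.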
[Bug target/38682] [4.4 Regression] speed regression with sse intrinsics and -ffast-math
--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38682
[Bug target/38682] [4.4 Regression] speed regression with sse intrinsics and -ffast-math
--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |normal
           Keywords|                            |missed-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38682
[Bug target/38682] [4.4 Regression] speed regression with sse intrinsics and -ffast-math
--- Comment #1 from pinskia at gcc dot gnu dot org 2008-12-31 15:34 ---
This is a target issue really. The number and type of instructions is the
same. The difference is just a little reassociation in the addition.

--
pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |target

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38682
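As background to the "reassociation" mentioned in comment #1: floating-point addition is not associative, and `-ffast-math` lets GCC regroup sums anyway. The sketch below is a generic illustration, not code from the bug report; the helper names are hypothetical, and `volatile` is used only to pin the grouping so the compiler cannot regroup the two sums itself.

```c
/* Hypothetical helper: computes (a + b) + c.
 * The volatile temporary forces the inner sum to be rounded to float
 * and prevents the compiler from regrouping the additions. */
float sum_left_to_right(float a, float b, float c)
{
    volatile float t = a + b;
    return t + c;
}

/* Hypothetical helper: computes a + (b + c), the reassociated form. */
float sum_right_to_left(float a, float b, float c)
{
    volatile float t = b + c;
    return a + t;
}
```

With `a = 1e20f`, `b = -1e20f`, `c = 1.0f`, the first grouping yields 1 while the second yields 0, because `1.0f` is lost when added to `-1e20f`. In the bug itself the regrouping does not change the instruction count, only the dependency chain between the additions, which is why the comment treats it as a target (scheduling) matter.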
[Bug target/38682] [4.4 Regression] speed regression with sse intrinsics and -ffast-math
--
pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.4.0
            Version|unknown                     |4.4.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38682