[Bug target/54593] [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gabravier at gmail dot com

--- Comment #7 from Andrew Pinski ---
*** Bug 94837 has been marked as a duplicate of this bug. ***
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #6 from sgunderson at bigfoot dot com 2012-09-15 20:28:02 UTC ---
Ah. So basically it hurts AMD enough (the opposite doesn't hit Intel enough)
that the choice was made to make it that way generic too. Well, as long as it's
a deliberate choice, I assume it's a reasonable tradeoff, so thanks for the
enlightenment. :-)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #5 from H.J. Lu 2012-09-15 20:17:24 UTC ---
(In reply to comment #4)
> I'm not sure if I understand the comment very well; it talks about Pentium 4,
> but none of them run 64-bit code, do they?

Wrong quote. It should be

  /* X86_TUNE_INTER_UNIT_MOVES */
  ~(m_AMD_MULTIPLE | m_GENERIC),
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #4 from sgunderson at bigfoot dot com 2012-09-15 16:54:28 UTC ---
I'm not sure if I understand the comment very well; it talks about Pentium 4,
but none of them run 64-bit code, do they?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #3 from Andrew Pinski 2012-09-15 16:50:31 UTC ---
(In reply to comment #2)
> Interesting. So it's a conscious choice that “generic” does this?

Yes:

  /* X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY: In the Generic model we have a
     conflict here in between PPro/Pentium4 based chips that thread 128bit
     SSE registers as single units versus K8 based chips that divide SSE
     registers to two 64bit halves.  This knob promotes all store destinations
     to be 128bit to allow register renaming on 128bit SSE units, but usually
     results in one extra microop on 64bit SSE units.  Experimental results
     shows that disabling this option on P4 brings over 20% SPECfp regression,
     while enabling it on K8 brings roughly 2.4% regression that can be partly
     masked by careful scheduling of moves.  */
  m_PPRO | m_P4_NOCONA | m_CORE2I7 | m_ATOM | m_AMDFAM10 | m_BDVER
  | m_GENERIC,
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #2 from sgunderson at bigfoot dot com 2012-09-15 16:38:34 UTC ---
Interesting. So it's a conscious choice that “generic” does this?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #1 from Andrew Pinski 2012-09-15 16:35:43 UTC ---
This depends on the actual x86 processor. On AMD processors, it is better to
go through memory than to move directly.
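
For readers without the original report at hand, here is a minimal sketch of
the kind of code where the move discussed in this thread shows up. It is an
assumption for illustration, not the reporter's test case, and the function
name double_bits is hypothetical. On x86-64 the double argument arrives in an
SSE register and the result is returned in an integer register, so the
compiler has to move the value between the two register files, either
directly or with a store to the stack followed by a reload.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical example: return the raw bit pattern of a double.
   On x86-64 the argument is passed in an SSE register and the result
   is returned in an integer register, so an SSE-to-integer move is
   needed. memcpy is the portable way to reinterpret the bits; GCC
   normally optimizes the call away at -O2. */
uint64_t double_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}
```

One way to look at the difference is to compile this with gcc -O2 -S, once
with default tuning and once with -march=native, and compare the generated
assembly. Which form appears depends on the GCC version and on the tuning
model selected.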