[Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching

2007-12-13 Thread ubizjak at gmail dot com
--- Comment #11 from ubizjak at gmail dot com 2007-12-13 14:24 ---
C testcase:
--cut here--
extern void foo(void);
extern double *dpb;

double s05m_test(void)
{
  double result = 0.0;
  int n;

  for (n = 0; n < 2000; ++n)
    result += dpb[n];

#ifdef FOOBAR
  foo();
#endif

[Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching

2007-12-13 Thread ubizjak at gmail dot com
--- Comment #9 from ubizjak at gmail dot com 2007-12-13 14:10 ---
Reduced C++ testcase that is the cause of the runtime difference:
--cut here--
#include <iostream>

extern double *dpb;

void s05m_test(void)
{
  double result = 0.0;

  for (int n = 0; n < 2000; ++n)
    result +=

[Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching

2007-12-13 Thread ubizjak at gmail dot com
--- Comment #10 from ubizjak at gmail dot com 2007-12-13 14:12 --- BTW: the .p2align directives were removed manually from the first case for clarity; I just forgot to remove them from the second case before posting. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322

[Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching

2007-12-13 Thread rguenth at gcc dot gnu dot org
--- Comment #12 from rguenth at gcc dot gnu dot org 2007-12-13 14:36 --- This is still a register allocation problem. We somehow prefer xmm0, which is call-clobbered, and that causes reloads inside the loop. Micha? :) -- rguenth at gcc dot gnu dot org changed: What

[Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching

2007-12-13 Thread rguenth at gcc dot gnu dot org
--- Comment #14 from rguenth at gcc dot gnu dot org 2007-12-13 14:54 --- Does yara address this somehow? -- rguenth at gcc dot gnu dot org changed: What|Removed |Added

[Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching

2007-12-13 Thread rguenth at gcc dot gnu dot org
--- Comment #15 from rguenth at gcc dot gnu dot org 2007-12-13 15:00 --- Works with 2.95.4, fails at least starting with 3.3.6 (-m32). Also happens on x86_64, but there it's not a regression. Happens on all targets that have only call-clobbered registers that can hold 'result'. --

[Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching

2007-12-13 Thread rguenth at gcc dot gnu dot org
--- Comment #13 from rguenth at gcc dot gnu dot org 2007-12-13 14:43 --- I guess that if we split the live range of (reg:DF 64 [result]) so that it does not extend over the call, global would not reload all of its uses. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322
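
A source-level sketch of the live-range split suggested in comment #13 might look like the following. It is only an illustration, not the allocator change itself: the function names, the `data`/`count` parameters, and the counting body of foo() are assumptions made so the example is self-contained, and a compiler is free to merge `acc` and `saved` back into one pseudo, so no particular code generation is implied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the external foo() of the testcase; it only
   counts calls so that the example links and runs on its own. */
static int foo_calls = 0;
static void foo(void) { ++foo_calls; }

/* Shape of the testcase: 'result' is used in the loop and is also live
   across the call to foo(), so one live range covers both. */
double sum_then_call(const double *data, size_t count)
{
  double result = 0.0;
  size_t n;

  assert(data != NULL);
  for (n = 0; n < count; ++n)
    result += data[n];
  foo();
  return result;
}

/* Manual split: 'acc' is live only inside the loop, and 'saved' is the
   only value that has to survive the call. */
double sum_then_call_split(const double *data, size_t count)
{
  double acc = 0.0;
  double saved;
  size_t n;

  assert(data != NULL);
  for (n = 0; n < count; ++n)
    acc += data[n];
  saved = acc;   /* the live range of the accumulator ends here */
  foo();
  return saved;
}
```

Both functions compute the same sum; the second only makes explicit, at the source level, that the loop accumulator need not be the value kept across the call.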

[Bug target/23322] [4.1/4.2/4.3 regression] performance regression, possibly related to caching

2007-02-14 Thread mmitchel at gcc dot gnu dot org
-- mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.1.2                       |4.1.3

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23322