[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Priority|P3                          |P2

--- Comment #19 from Richard Biener ---
Ok, so this means it is coalescing related. We still don't know which
coalescing is good/bad though.
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #18 from Martin Liška ---
(In reply to Patrik Huber from comment #14)
> It even seems a few percent slower after the FDO stuff. But the
> `-fprofile-use` is a bit weird. If there is no .gcda file, it doesn't
> complain. If you give it a file that doesn't exist (e.g. -fprofile-use=foo),
> then it doesn't complain either. So how can I check whether it really ran
> the FDO?

Yep, maybe having an option that will cause failure would be a good idea.
Anyway, you can use -fdump-ipa-profile and check *.065i.profile file where you
should see something like:

...
Read edge from 0 to 2, count:1
1 edge counts read
...

Note that -fprofile-use=foo tells the compiler to search in *folder* foo for
corresponding gcda files.
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #17 from Martin Liška ---
Created attachment 43654
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43654&action=edit
optimized dump after the revision
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #16 from Martin Liška ---
Created attachment 43653
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43653&action=edit
optimized dump before the revision
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
                 CC|                            |aoliva at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org

--- Comment #15 from Martin Liška ---
I can confirm that on my Haswell machine:

model name : Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz

I see regression with -march=core2 -mtune=core2 -O3 starting from r226901
(first time in GCC 5.x). Time difference is:

0:00:14.975390 vs 0:00:11.889274
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |6.5
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #14 from Patrik Huber ---
It even seems a few percent slower after the FDO stuff. But the
`-fprofile-use` is a bit weird. If there is no .gcda file, it doesn't
complain. If you give it a file that doesn't exist (e.g. -fprofile-use=foo),
then it doesn't complain either. So how can I check whether it really ran the
FDO?
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #13 from Patrik Huber ---
>> Did you try with FDO? (-fprofile-generate, run, -fprofile-use)

I just tried this with g++-7. It didn't help, the final executable has the same
slower run time as in the attached log without the FDO.
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #12 from Richard Biener ---
Hmm, the preprocessed source(s) are hard to work with given the Eigen headers
seem to have conditional code on the enabled ISAs.

From a quick look it seems to be inlining related? My past experience says
that compute kernels in C++ should have the flatten attribute attached to
them...

Did you try with FDO? (-fprofile-generate, run, -fprofile-use)
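
A minimal sketch of what attaching the flatten attribute to a compute kernel could look like, for readers following the suggestion in comment #12. This is not code from the bug report or from Eigen: `gemm_naive` and `madd` are hypothetical names, and the kernel is a plain triple loop chosen only for illustration. The attribute asks GCC to inline, where possible, every call made inside the annotated function's body.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: a small function the inliner would normally
// judge on its own at each call site.
static inline double madd(double acc, double a, double b)
{
    return acc + a * b;
}

// Hypothetical naive kernel (not Eigen's GEMM): C = A * B for
// row-major n x n matrices stored in flat vectors.
// __attribute__((flatten)) requests that calls inside this body,
// such as madd(), be inlined into it if possible.
__attribute__((flatten))
void gemm_naive(const std::vector<double>& A,
                const std::vector<double>& B,
                std::vector<double>& C,
                std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                acc = madd(acc, A[i * n + k], B[k * n + j]);
            }
            C[i * n + j] = acc;
        }
    }
}
```

Whether this changes the timings reported in this bug is untested here; the sketch only shows where the attribute goes.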
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #10 from Patrik Huber ---
Created attachment 43367
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43367&action=edit
gcc5_gemm_test.ii
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #11 from Patrik Huber ---
Created attachment 43368
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43368&action=edit
gcc7_gemm_test.ii
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #9 from Marc Glisse ---
(In reply to Patrik Huber from comment #6)
> I could also upload you the .ii files but they are 5 MB, which the
> bugtracker doesn't allow (1 MB limit).

preprocessed sources are the .ii files (you can use compression).
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #8 from Patrik Huber ---
Created attachment 43366
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43366&action=edit
full_log.txt
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #7 from Patrik Huber ---
Created attachment 43365
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43365&action=edit
gemm_test.cpp
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #6 from Patrik Huber ---
I could also upload you the .ii files but they are 5 MB, which the bugtracker
doesn't allow (1 MB limit).
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #5 from Patrik Huber ---
Created attachment 43364
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43364&action=edit
gcc7_gemm_test.s
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #4 from Patrik Huber ---
Created attachment 43363
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43363&action=edit
gcc5_gemm_test.s
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

--- Comment #3 from Patrik Huber ---
@Richard: I'm not 100% sure what you mean with "preprocessed source" but I
googled and you probably mean the output of compiling with "-c -save-temps".
Please see attached.
[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280

Marc Glisse changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |5.5.0
            Summary|Performance regression in   |[6/7/8 Regression]
                   |g++-7 with Eigen for        |Performance regression in
                   |non-AVX2 CPUs               |g++-7 with Eigen for
                   |                            |non-AVX2 CPUs
      Known to fail|                            |6.4.0, 7.2.0

--- Comment #2 from Marc Glisse ---
The difference seems to be between gcc-5 and gcc-6.