I have a question about SPEC CPU 2017 and what GCC can and cannot do with -flto. As part of some SPEC analysis I am doing, I found that with -Ofast, ICC and GCC were not that far apart (especially on SPEC int rate; on SPEC fp rate the difference was slightly larger).
But when I added -ipo to the ICC command and -flto to the GCC command, the difference got larger. In particular, 519.lbm_r was more than twice as fast with ICC and -ipo, whereas -flto did not help GCC at all. Other tests also show this type of improvement with -ipo, such as 538.imagick_r, 544.nab_r, 525.x264_r, 531.deepsjeng_r, and 548.exchange2_r, but none are as dramatic as 519.lbm_r.

Does anyone have any idea what ICC is doing that GCC is missing? Is GCC just not aggressive enough with its inlining?

Steve Ellcey
sell...@marvell.com