http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46590

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2010.11.22 13:13:03
                 CC|                            |rguenth at gcc dot gnu.org
     Ever Confirmed|0                           |1

--- Comment #9 from Richard Guenther <rguenth at gcc dot gnu.org> 2010-11-22 13:13:03 UTC ---
It's a very large monolithic function.  And as usual we have a huge number
of local I/O state variables.  On trunk with release checking I see:

A quarter of the testcase:

 alias stmt walking    :  15.94 (59%) usr   0.03 ( 8%) sys  16.00 (57%) wall    1845 kB ( 2%) ggc
 TOTAL                 :  27.08             0.38            27.89              96306 kB

Half of the testcase:

 alias stmt walking    :  63.31 (68%) usr   0.51 (31%) sys  64.06 (67%) wall    3684 kB ( 2%) ggc
 TOTAL                 :  93.52             1.66            95.57             241871 kB

All of the testcase:

 alias stmt walking    : 259.19 (73%) usr   0.78 (26%) sys 261.79 (72%) wall    7023 kB ( 1%) ggc
 TOTAL                 : 356.27             2.98           361.57             690719 kB

so it's definitely nearly quadratic (but that's expected).  4.5.x for a
quarter of the testcase has:

 alias stmt walking    :  93.10 (88%) usr   0.03 ( 8%) sys  93.31 (87%) wall       0 kB ( 0%) ggc
 TOTAL                 : 106.11             0.40           106.93              87895 kB

so trunk is already a lot better.  Removing the alias stmt walk timevar gets
us to the following on trunk (quarter of the testcase again):

 tree PRE              :  12.93 (47%) usr   0.00 ( 0%) sys  12.98 (46%) wall    3607 kB ( 4%) ggc
 TOTAL                 :  27.57             0.34            27.99              96324 kB

What is costly is translating things through the loop bodies.  We can
improve this a lot by properly marking the I/O structs as dead once they
are no longer used and before they are first used.  The proposed virtual
kill stmts could be used for that.  We could also build this kind of
liveness information up-front and use it to limit the walking (but that
again is only trivial for non-address-taken variables, which the I/O
structs are not).

Anyway, confirmed.
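
The "nearly quadratic" reading of the trunk numbers above can be checked with a little arithmetic. The sketch below is only an illustration: the times are the "alias stmt walking" usr values quoted in the comment, while the helper name `scaling_exponent` and the idea of fitting a power law between neighbouring measurements are assumptions made here, not part of the original analysis.

```python
import math

# "alias stmt walking" usr times in seconds on trunk, copied from the
# reports above. Keys are the fraction of the testcase that was compiled.
walk_usr = {0.25: 15.94, 0.5: 63.31, 1.0: 259.19}


def scaling_exponent(size_a, time_a, size_b, time_b):
    """Return k such that time grows like size**k between two measurements."""
    return math.log(time_b / time_a) / math.log(size_b / size_a)


sizes = sorted(walk_usr)
# One exponent per pair of neighbouring measurements (quarter->half, half->all).
exponents = [
    scaling_exponent(a, walk_usr[a], b, walk_usr[b])
    for a, b in zip(sizes, sizes[1:])
]

for (a, b), k in zip(zip(sizes, sizes[1:]), exponents):
    print(f"{a:g} -> {b:g} of the testcase: exponent ~ {k:.2f}")
```

Each doubling of the input multiplies the walking time by roughly four, so both exponents come out close to 2, which is what a quadratic cost would give.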

Fortran I/O and array descriptor temporaries really need re-use (I proposed
a patch for the I/O ones once, but it was shot down because of async I/O).

Removing all prints from the testcase gives:

 alias stmt walking    : 132.44 (61%) usr   0.79 (30%) sys 133.47 (61%) wall    7023 kB ( 1%) ggc
 TOTAL                 : 216.55             2.67           219.78             645229 kB

As none of the arrays is address-taken, we really look for CSE opportunities
all the way up to the very start of the function.  PRE translates the in-loop
references from the any (a /= b) loop to the loop header using the constant
initial index and tries to CSE that, but it doesn't actually succeed - which
is another bug of course; it should be able to look it up from original or,
respectively, A.0.
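
To see why walks that reach the start of the function add up to quadratic cost, and why kill markers would help, here is a deliberately simplified toy model. It is not GCC's alias walker: the function `backward_walk_steps` and its `kill_distance` parameter are hypothetical names introduced for this sketch, and the model assumes every statement starts one backward walk that visits each earlier statement once.

```python
def backward_walk_steps(n_stmts, kill_distance=None):
    """Count statements visited when every statement walks backwards.

    Toy model only: statement i visits statements i-1, i-2, ... in turn.
    With kill_distance=None the walk runs to the start of the function;
    with kill_distance=d it stops after at most d statements, as if a
    kill marker for the variable were found there.
    """
    total = 0
    for i in range(n_stmts):
        reach = i if kill_distance is None else min(i, kill_distance)
        total += reach
    return total


for n in (1000, 2000, 4000):
    unbounded = backward_walk_steps(n)
    bounded = backward_walk_steps(n, kill_distance=50)
    print(f"n={n}: unbounded={unbounded} bounded={bounded}")
```

In this model the unbounded count is n*(n-1)/2, so doubling the function size roughly quadruples the work, while the bounded count grows about linearly. The real walker is more involved, so this only illustrates the shape of the problem.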