https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121936
--- Comment #26 from Jan Hubicka <hubicka at ucw dot cz> ---
> If the only case in which we need to back out of (or disable) the optimisation
> is when we determine that we are not going to take either of those actions,
> but the callee is DECL_DECLARED_INLINE_P - then how often would we be losing a
> "useful (albeit potentially invalid) optimisation" ?

As discussed in #23, being able to analyze side effects of common libstdc++ containers is important for real-world performance. (It showed up as a 47% performance difference on jpeg-XL compression relative to clang, where clang aggressively inlined the cold path of push_back.) Giving up on this is also going to prevent future optimizations. We are still quite limited in optimizing across function boundaries, but we are getting better.

The option to clone when useful propagation is done is difficult. When analyzing a function's side effects you do not know whether they will be useful or not. Cloning every function where we noticed something potentially useful about its behaviour would effectively disable comdats. Once the middle-end decides to do something useful (in this case, eliminate memory stores in the innermost loop of the algorithm) it is too late to clone, and it is also hard to track what needs to be cloned, since knowledge of multiple functions often needs to be taken into account...

This is problematic even with -flto, where we aggressively turn comdats into non-comdats (since we know they will not be optimized away from the DSO anyway). With the current organization of the compiler we collect a lot of data at compile time to propagate it to link time. Since at compile time we do not know whether the comdat will be linked to our implementation, we still have to assume the worst and disable all the propagations. We could redo early analysis at WPA, but that would mean adding another partitioning/streaming phase, effectively doubling compile time on machines with a high number of CPUs.

So while we can patch visibility to have an AVAIL_AVAILABLE_IF_CLONED value and disable IPA propagation across those, we would be giving up a considerable amount of current and future optimization opportunities.
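
For readers who want a concrete picture of the pattern under discussion, here is a minimal sketch. It is not the testcase from this PR: `Buf`, `grow`, `push`, `fill` and `counter` are hypothetical names standing in for a container with a cold growth path. Both helpers are declared inline, so they would typically be emitted as comdats. If the compiler may rely on its analysis of their bodies (they never touch `counter`), it can keep `counter` in a register across the loop in `fill`. Whether any given GCC version actually does so depends on flags and heuristics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a container with a cold growth path.
struct Buf {
    int *data;
    std::size_t size;
    std::size_t cap;
};

// Declared inline, so this is normally emitted as a comdat: the linker
// may keep a copy of the body that came from a different translation unit.
inline void grow(Buf &b) {
    std::size_t new_cap = b.cap ? b.cap * 2 : 4;
    int *p = static_cast<int *>(std::realloc(b.data, new_cap * sizeof(int)));
    if (!p) std::abort();
    b.data = p;
    b.cap = new_cap;
}

inline void push(Buf &b, int v) {
    if (b.size == b.cap) grow(b);  // cold path
    assert(b.size < b.cap);
    b.data[b.size++] = v;
}

// Global memory that neither push() nor grow() reads or writes.
int counter;

// If the side-effect summary of push() may be trusted, the load and
// store of counter can be moved out of the loop even when the cold
// path is not inlined. If the summary must be discarded because the
// body is a comdat, counter has to be reloaded and stored around
// every call.
int fill(Buf &b, int n) {
    for (int i = 0; i < n; ++i) {
        counter += i;
        push(b, i);
    }
    return counter;
}
```

The point of the sketch is only that the optimization in the caller depends on facts derived from the callee's body, which is what becomes questionable when the linker may pick a differently compiled copy of that body.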
