https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121936
Jan Hubicka <hubicka at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hubicka at gcc dot gnu.org
Status|UNCONFIRMED |NEW
Last reconfirmed| |2025-09-28
Ever confirmed|0 |1
--- Comment #23 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
We do a lot of inter-procedural optimisation across the comdat function
boundary, including
1) nothrow/const/pure and similar flags discovery
2) propagation of return values
3) mod/ref (summarising which memory locations cannot be changed by the
function)
4) information about what parameters may or may not escape for
points-to-analysis
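
As a rough illustration of items 1) to 3), here is a minimal sketch. The names
clamp_percent, lookup, hits, table and demo are hypothetical and do not come
from this bug report; whether GCC actually performs each folding depends on
the version and options used.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical inline (comdat) helper. From its body the compiler can
// discover that it never throws, reads and writes no global memory, and
// always returns a value in the range [0, 100].
inline int clamp_percent(int v)
{
  if (v < 0) return 0;
  if (v > 100) return 100;
  return v;
}

int hits;               // global that the helper never modifies
static int table[101];  // zero-initialised

int lookup(int v)
{
  hits = 1;
  int idx = clamp_percent(v);
  // With return value range propagation this check is provably false,
  // so the throwing path can be removed even if the call is not inlined.
  if (idx < 0 || idx > 100)
    throw std::out_of_range("lookup");
  // With mod/ref information the compiler may assume hits is still 1 here.
  return table[idx] + hits;
}

// Small usage check for the sketch above.
void demo()
{
  assert(lookup(-5) == 1);
  assert(lookup(250) == 1);
}
```

If clamp_percent had to be treated as replaceable by a differently behaving
definition from another translation unit, none of these conclusions could be
used at the call site in lookup.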
These optimisations may have a large performance impact and we are working on
additional transformations. In fact, the return value propagation was
introduced to optimise std::vector push_back reasonably, which in turn improved
jpeg-xl encoding speed by 47%.
Martin and I gave a presentation on this last year:
https://www.ucw.cz/~hubicka/slides/jhubicka-mjambor-gcc_cpp_std_vec.pdf
https://www.youtube.com/watch?v=bmzHE4F6gYk
Reduced testcase is:
#include <vector>
#include <cstdint>
typedef std::pair<uint32_t, uint32_t> pair_t;
pair_t pair;
void test()
{
  std::vector <pair_t> stack;
  stack.push_back (pair);
  while (!stack.empty())
    {
      pair_t cur = stack.back();
      stack.pop_back();
      if (!cur.first)
        {
          cur.second++;
          stack.push_back (cur);
        }
      if (cur.second > 10000)
        break;
    }
}
Here, the compiler needs to understand enough of stack.push_back semantics to
optimise the loop reasonably. It is critical to propagate that the return value
of check_len, which computes the size of the resized vector, will always fit
into the allowed range for memory allocation. That in turn removes two possible
exceptions from the reallocation (cold) path of push_back, which in turn makes
it a good inline candidate.
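
The dependency can be sketched with a simplified model. This is not the
libstdc++ source: grow_len stands in for check_len, and Elem, kMaxElems,
allocate_elems and demo_growth are hypothetical names chosen for the example.
It shows only one range check of the kind described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>
#include <stdexcept>

struct Elem { unsigned first, second; };  // stand-in for pair_t

// Assumed upper bound on the element count, modelled on a max_size() limit.
constexpr std::size_t kMaxElems =
  static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max())
  / sizeof(Elem);

// Simplified stand-in for check_len. Assumes size <= kMaxElems.
// Every value it returns is at most kMaxElems.
inline std::size_t grow_len(std::size_t size, std::size_t n)
{
  if (kMaxElems - size < n)
    throw std::length_error("grow_len");
  const std::size_t len = size + std::max(size, n);
  return (len < size || len > kMaxElems) ? kMaxElems : len;
}

// Simplified allocation step on the cold reallocation path.
inline Elem *allocate_elems(std::size_t count)
{
  // If the caller always passes grow_len's result and the compiler knows
  // that result is <= kMaxElems, this throw is dead code.
  if (count > kMaxElems)
    throw std::bad_array_new_length();
  return static_cast<Elem *>(::operator new(count * sizeof(Elem)));
}

// Small usage check for the sketch above.
void demo_growth()
{
  assert(grow_len(0, 1) == 1);
  assert(grow_len(4, 1) == 8);
  assert(grow_len(kMaxElems - 1, 1) == kMaxElems);
  Elem *p = allocate_elems(grow_len(4, 1));
  ::operator delete(p);
}
```

Removing the dead throw shrinks the cold path, which is the kind of size
reduction that can tip an inlining decision.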
So, if we decide to assume that check_len (like all comdat functions) may
behave differently unless it is cloned or inlined, this optimisation will
require cloning check_len into every translation unit that uses push_back.
This is not limited to std::vector; most std containers need IPA propagation
to generate sane code.
Now the problem is costing such a clone: GCC at the IPA level can determine
side effects but does not yet know if it will be able to use them. Once it
uses them, it is too late to clone, and it may also trigger cascaded changes
(such as constant propagation enabling loop vectorisation) where the first
optimisation in the sequence is not very beneficial by itself.