https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28831

--- Comment #45 from Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> ---
(Alright, let's ignore the "oversized" vector for now - my
std(::experimental)::simd implementation doesn't use them anyway. It uses
aggregates of vectors matching the target SIMD width. FWIW, I pass all these by
const-ref in my implementation and still see way too many copies. But I'd have
to dig deeper for a relevant test case.)

But... just as a low-hanging-fruit / partial-solution idea here: Given

void a(auto, auto); // i.e. by-value

a(b(), c())

the two calls to b() and c() will store their returned objects on the stack.
Subsequently, when 'a' is called, instead of unconditionally moving the stack
pointer and copying the arguments to the desired locations, check whether
  1. the objects returned from b() and c() can be modified by 'a' in-place
     without affecting following code (last use of object returned by b() and
     c()), and
  2. by chance, the arguments are already in the right place (i.e. if the stack
     pointer is left unmodified).
If both are true, elide the copies.
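
To make the scenario concrete, here is a minimal sketch. It assumes a
hypothetical 64-byte aggregate 'Wide' as a stand-in for an aggregate of SIMD
vectors; 'Wide', 'lane' and 'call_site' are illustrative names, not taken from
the report, and 'a' returns a value here only so the result can be checked. At
the language level no copy or move constructor call is needed to initialize the
parameters from b() and c(); the copies in question are the ones the compiler
may still emit when it places the arguments for the call.

```cpp
#include <cassert>

// Hypothetical stand-in for an aggregate of target-width SIMD vectors.
// At 64 bytes it is large enough that ABIs such as x86-64 System V pass
// and return it in memory rather than in registers.
struct Wide {
    float lane[16];
};

// Producers returning by value: lane[i] == i.
Wide b() {
    Wide w{};
    for (int i = 0; i < 16; ++i) w.lane[i] = static_cast<float>(i);
    return w;
}

// Producers returning by value: lane[i] == 1.
Wide c() {
    Wide w{};
    for (int i = 0; i < 16; ++i) w.lane[i] = 1.0f;
    return w;
}

// By-value consumer, written as a template in place of a(auto, auto).
// It modifies its first argument in place, which is only safe to do on the
// caller's object if the call is the last use of that object (condition 1).
template <class X, class Y>
float a(X x, Y y) {
    float sum = 0.0f;
    for (int i = 0; i < 16; ++i) {
        x.lane[i] += y.lane[i];
        sum += x.lane[i];
    }
    return sum;
}

// The call site under discussion: both arguments are temporaries whose only
// use is this call, so nothing after the call can observe them.
float call_site() {
    return a(b(), c());
}
```

To see whether argument copies are emitted, one could compile this with
optimization and read the assembly for call_site(). Inlining may remove the
calls entirely, so marking the functions with __attribute__((noinline)) may be
needed to keep the call boundary visible. What actually shows up depends on
compiler version, target and flags.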

Next step? Predict the best stack location for return objects so that the above
heuristic is true in more cases.
