https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ick - convoluted C++.  We end up with

void ff (struct MyClass & obj)
{
  vector(2) long unsigned int vect_SR.16;
  vector(2) long unsigned int vect_SR.15;
  vector(2) long unsigned int vect_SR.14;
  void * _6;

  <bb 2> [local count: 1073741824]:
  vect_SR.14_5 = MEM <vector(2) long unsigned int> [(struct MyClass &)obj_2(D)];
  vect_SR.15_28 = MEM <vector(2) long unsigned int> [(struct MyClass &)obj_2(D) + 16];
  vect_SR.16_30 = MEM <vector(2) long unsigned int> [(struct MyClass &)obj_2(D) + 32];
  _6 = operator new (48);
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6] = vect_SR.14_5;
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6 + 16B] = vect_SR.15_28;
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6 + 32B] = vect_SR.16_30;
  HandleMyClass2 (_6); [tail call]

and the issue is that 'operator new (48)' can alter what 'obj' points to,
so we cannot move the loads across the call.  The three vector values
therefore stay live across the call and we get spilling.

There is no inter-procedural analysis in GCC that would tell us that
'obj_2(D)' (the MyClass & obj argument of ff) points to an object that
did not escape.  In fact 'ff' has global visibility and it might have
other callers.

If you add -fwhole-program then the function is inlined into main and
you get

main:
.LFB652:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    $48, %edi
        call    _Znwm
        movq    $0, (%rax)
        movq    %rax, %rdi
        movq    $0, 8(%rax)
        movq    $0, 16(%rax)
        movq    $0, 24(%rax)
        movq    $0, 32(%rax)
        movq    $0, 40(%rax)
        call    _Z14HandleMyClass2Pv
        xorl    %eax, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret

(not using vectors because 'main' is considered cold).

Are you citing an inlined copy of ff() for clang?
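
To illustrate why the loads cannot be moved across the call, here is a
minimal sketch (not the reporter's testcase) of a program that replaces
the global 'operator new' so that it modifies an escaped object.  The
MyClass layout (six unsigned longs, 48 bytes on LP64, matching the size
in the dump), the global g_obj and the helper delta_across_new are all
assumptions made up for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical layout: six 8-byte fields, i.e. 48 bytes on LP64 targets.
struct MyClass { unsigned long a, b, c, d, e, f; };

// A global object has escaped: any external function may reach it.
MyClass g_obj{1, 2, 3, 4, 5, 6};

// Replacing the global allocation function is allowed, and the
// replacement may have side effects on any escaped memory.
void* operator new(std::size_t n)
{
    ++g_obj.a;                       // visible side effect on g_obj
    void* p = std::malloc(n);
    if (!p) throw std::bad_alloc();
    return p;
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Reads obj.a before and after one allocation and returns the change.
// The compiler cannot treat the two loads as the same value unless it
// can prove that obj does not refer to escaped memory.
unsigned long delta_across_new(MyClass& obj)
{
    unsigned long before = obj.a;
    void* p = ::operator new(sizeof(MyClass));
    unsigned long after = obj.a;
    ::operator delete(p);
    return after - before;
}

// Small self-check: exactly one allocation happens between the loads.
void demo()
{
    assert(delta_across_new(g_obj) == 1);
}
```

Calling delta_across_new on a local MyClass that never escapes would
return 0 instead, which is the case the compiler would have to prove
in order to reorder the loads in ff.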