https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114987
--- Comment #7 from Haochen Jiang <haochen.jiang at intel dot com> ---
Furthermore, when I build with GCC 11, the codegen is much better:
vaddps 0xc0(%rsp),%ymm5,%ymm2
vaddps 0xe0(%rsp),%ymm4,%ymm1
vmovaps %ymm2,0x80(%rsp)
vmovdqa 0x90(%rsp),%xmm6
vmovaps %ymm1,0xa0(%rsp)
vmovdqa 0xb0(%rsp),%xmm7
vmovdqa %xmm2,0xc0(%rsp)
vmovdqa %xmm6,0xd0(%rsp)
vmovdqa %xmm1,0xe0(%rsp)
vmovdqa %xmm7,0xf0(%rsp)
sub $0x1,%eax
jne 401e00 <stress_vecfp_float_add_16.avx.1+0x1e0>
It seems we might have two separate issues behind this regression.