https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116312
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>but we could implement it as a simple final assembly output template change
>for minimal invasion.

No, you can't, since ldp and ld2 mean two different things. ld2 is basically a
permute that unmixes (de-interleaves) the data into the two registers; that is,
it is a load-lanes operation.

Note that in the GCC case there is only one fadd, while in the LLVM case there
are two, though they are independent.

So the question becomes: is the ldp version better than the ld2 version here
overall, or only when comparing the ldp against the ld2 in isolation?
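
To make the ldp vs ld2 difference concrete, here is a rough scalar model in plain C. It is only a sketch: the names `vreg2d`, `model_ldp` and `model_ld2` are made up for illustration, and it assumes two 128-bit registers holding 64-bit lanes (the `.2d` arrangement), which may not match the element type in the actual testcase.

```c
#include <stddef.h>

/* Hypothetical model of one 128-bit vector register as two 64-bit lanes. */
typedef struct { double lane[2]; } vreg2d;

/* Rough model of `ldp q0, q1, [x0]`: two consecutive 16-byte loads.
   Memory order is preserved; no elements are rearranged. */
static void model_ldp(const double *mem, vreg2d *r0, vreg2d *r1)
{
    r0->lane[0] = mem[0];
    r0->lane[1] = mem[1];
    r1->lane[0] = mem[2];
    r1->lane[1] = mem[3];
}

/* Rough model of `ld2 {v0.2d, v1.2d}, [x0]`: a de-interleaving load.
   Even-indexed elements go to the first register and odd-indexed
   elements go to the second. */
static void model_ld2(const double *mem, vreg2d *r0, vreg2d *r1)
{
    for (size_t i = 0; i < 2; i++) {
        r0->lane[i] = mem[2 * i];
        r1->lane[i] = mem[2 * i + 1];
    }
}
```

With memory holding {a0, b0, a1, b1}, the ldp model yields {a0, b0} and {a1, b1}, while the ld2 model yields {a0, a1} and {b0, b1}. The same bytes are read, but they end up in different lanes, which is why swapping the mnemonic in the output template alone would not preserve the semantics.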