https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124904
--- Comment #3 from Drea Pinski <pinskia at gcc dot gnu.org> ---
And then `-O3 -march=armv8.5-a+sve -g0 -fno-trapping-math` can also vectorize
it (so there is a cost model issue with `-march=armv9-a` again).
.L3:
ld1d z30.d, p7/z, [x0, x3, lsl 3]
cmpne p7.d, p7/z, z30.d, #0
ld1d z31.d, p7/z, [x5, x3, lsl 3]
scvtf z31.d, p6/m, z31.d
st1d z31.d, p7, [x1, x3, lsl 3]
add x3, x3, x4
whilelo p7.d, w3, w2
b.any .L3
