https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125880
--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Ah, no, on Intel only vpbroadcast{d,q} are entirely handled by the load ports,
vpbroadcast{b,w} still have a port 5 uop in addition to a load uop.
But on AMD all four broadcast variants appear to be equally cheap.
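
For reference, a minimal sketch of the four memory-source broadcasts being compared, written as plain C loops. The function names (bcast_b, bcast_w, bcast_d, bcast_q) and the 32-byte destination size are assumptions made for illustration and do not come from the bug report. With something like -O2 -mavx2 a compiler may turn each loop into a single vpbroadcast{b,w,d,q} with a memory operand, but that is not guaranteed, so check the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers (not from the bug report).  Each one loads a single
   element from memory and replicates it across a 32-byte destination,
   i.e. the operation a memory-source vpbroadcast{b,w,d,q} performs on a
   256-bit register.  The value is read into a local first so that the
   store loop cannot be forced to reload it because of aliasing. */

void bcast_b(uint8_t dst[32], const uint8_t *src)
{
    uint8_t v = *src;               /* 8-bit element: candidate for vpbroadcastb */
    for (size_t i = 0; i < 32; i++)
        dst[i] = v;
}

void bcast_w(uint16_t dst[16], const uint16_t *src)
{
    uint16_t v = *src;              /* 16-bit element: candidate for vpbroadcastw */
    for (size_t i = 0; i < 16; i++)
        dst[i] = v;
}

void bcast_d(uint32_t dst[8], const uint32_t *src)
{
    uint32_t v = *src;              /* 32-bit element: candidate for vpbroadcastd */
    for (size_t i = 0; i < 8; i++)
        dst[i] = v;
}

void bcast_q(uint64_t dst[4], const uint64_t *src)
{
    uint64_t v = *src;              /* 64-bit element: candidate for vpbroadcastq */
    for (size_t i = 0; i < 4; i++)
        dst[i] = v;
}
```

Per the observation above, on Intel only the d and q forms would be expected to run without a port 5 uop when the source is in memory, while on AMD all four appear to cost the same.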
