https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66862

--- Comment #5 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> Now, it seems AVX512BW (and AVX512VL in some cases) has the needed
> instructions, in particular VMOVDQU{8,16}, but it is not reflected in
> maskload<mode> and maskstore<mode> expanders.  CCing Kyrill and Uros on this.

With -mavx512bw and -mavx512vl, the loop has been vectorized since GCC 8.1.
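
For illustration, here is a minimal sketch of the kind of loop this concerns: a conditional store on byte elements, which the vectorizer can only handle if the target provides a masked store for that element size. This is a hypothetical example, not the testcase attached to the PR; the function name cond_store and the threshold 127 are made up.

```c
#include <stddef.h>

/* Hypothetical example (not the PR's testcase): copy only the bytes
   that exceed a threshold.  The store to dst[i] is conditional, so
   vectorizing the loop needs a masked store on 8-bit elements.  */
void
cond_store (unsigned char *restrict dst, const unsigned char *restrict src,
            size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (src[i] > 127)
      dst[i] = src[i];
}
```

Compiling this with something like "gcc -O3 -mavx512bw -mavx512vl -fopt-info-vec" should show whether the loop gets vectorized; the exact instructions emitted may differ between GCC versions.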
