On Sun, Oct 21, 2018 at 6:04 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Sat, Oct 20, 2018 at 8:46 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > Many AVX512 vector operations can broadcast from a scalar memory source.
> > This patch enables memory broadcast for FMSUB operations.  In order to
> > support AVX512 memory broadcast for FMSUB, FMSUB builtin functions are
> > also added, instead of passing the negated value to FMA builtin functions.
> >
> > gcc/
> >
> >         PR target/72782
> >         * config/i386/avx512fintrin.h (_mm512_fmsub_round_pd): Use
> >         __builtin_ia32_vfmsubpd512_mask.
> >         (_mm512_mask_fmsub_round_pd): Likewise.
> >         (_mm512_fmsub_pd): Likewise.
> >         (_mm512_mask_fmsub_pd): Likewise.
> >         (_mm512_maskz_fmsub_round_pd): Use
> >         __builtin_ia32_vfmsubpd512_maskz.
> >         (_mm512_maskz_fmsub_pd): Likewise.
> >         (_mm512_fmsub_round_ps): Use __builtin_ia32_vfmsubps512_mask.
> >         (_mm512_mask_fmsub_round_ps): Likewise.
> >         (_mm512_fmsub_ps): Likewise.
> >         (_mm512_mask_fmsub_ps): Likewise.
> >         (_mm512_maskz_fmsub_round_ps): Use
> >         __builtin_ia32_vfmsubps512_maskz.
> >         (_mm512_maskz_fmsub_ps): Likewise.
> >         * config/i386/avx512vlintrin.h (_mm256_mask_fmsub_pd): Use
> >         __builtin_ia32_vfmsubpd256_mask.
> >         (_mm256_maskz_fmsub_pd): Use __builtin_ia32_vfmsubpd256_maskz.
> >         (_mm_mask_fmsub_pd): Use __builtin_ia32_vfmsubpd128_mask.
> >         (_mm_maskz_fmsub_pd): Use __builtin_ia32_vfmsubpd128_maskz.
> >         (_mm256_mask_fmsub_ps): Use __builtin_ia32_vfmsubps256_mask.
> >         (_mm256_maskz_fmsub_ps): Use __builtin_ia32_vfmsubps256_maskz.
> >         (_mm_mask_fmsub_ps): Use __builtin_ia32_vfmsubps128_mask.
> >         (_mm_maskz_fmsub_ps): Use __builtin_ia32_vfmsubps128_maskz.
> >         * config/i386/fmaintrin.h (_mm_fmsub_pd): Use
> >         __builtin_ia32_vfmsubpd.
> >         (_mm256_fmsub_pd): Use __builtin_ia32_vfmsubpd256.
> >         (_mm_fmsub_ps): Use __builtin_ia32_vfmsubps.
> >         (_mm256_fmsub_ps): Use __builtin_ia32_vfmsubps256.
> >         (_mm_fmsub_sd): Use __builtin_ia32_vfmsubsd3.
> >         (_mm_fmsub_ss): Use __builtin_ia32_vfmsubss3.
> >         * config/i386/i386-builtin.def: Add
> >         __builtin_ia32_vfmsubpd256_mask,
> >         __builtin_ia32_vfmsubpd256_maskz,
> >         __builtin_ia32_vfmsubpd128_mask,
> >         __builtin_ia32_vfmsubpd128_maskz,
> >         __builtin_ia32_vfmsubps256_mask,
> >         __builtin_ia32_vfmsubps256_maskz,
> >         __builtin_ia32_vfmsubps128_mask,
> >         __builtin_ia32_vfmsubps128_maskz,
> >         __builtin_ia32_vfmsubpd512_mask,
> >         __builtin_ia32_vfmsubpd512_maskz,
> >         __builtin_ia32_vfmsubps512_mask,
> >         __builtin_ia32_vfmsubps512_maskz, __builtin_ia32_vfmsubss3,
> >         __builtin_ia32_vfmsubsd3, __builtin_ia32_vfmsubps,
> >         __builtin_ia32_vfmsubpd, __builtin_ia32_vfmsubps256 and
> >         __builtin_ia32_vfmsubpd256.
> >         * config/i386/sse.md (fma4i_fmsub_<mode>): New.
> >         (<avx512>_fmsub_<mode>_maskz<round_expand_name>): Likewise.
> >         (*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_1):
> >         Likewise.
> >         (*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_2):
> >         Likewise.
> >         (*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_3):
> >         Likewise.
> >         (fmai_vmfmsub_<mode><round_name>): Likewise.
> >
> > gcc/testsuite/
> >
> >         PR target/72782
> >         * gcc.target/i386/avx512f-fmsub-df-zmm-1.c: New test.
> >         * gcc.target/i386/avx512f-fmsub-sf-zmm-1.c: Likewise.
> >         * gcc.target/i386/avx512f-fmsub-sf-zmm-2.c: Likewise.
> >         * gcc.target/i386/avx512f-fmsub-sf-zmm-3.c: Likewise.
> >         * gcc.target/i386/avx512f-fmsub-sf-zmm-4.c: Likewise.
> >         * gcc.target/i386/avx512f-fmsub-sf-zmm-5.c: Likewise.
> >         * gcc.target/i386/avx512f-fmsub-sf-zmm-6.c: Likewise.
> >         * gcc.target/i386/avx512f-fmsub-sf-zmm-7.c: Likewise.
> >         * gcc.target/i386/avx512f-fmsub-sf-zmm-8.c: Likewise.
> >         * gcc.target/i386/avx512vl-fmsub-sf-xmm-1.c: Likewise.
> >         * gcc.target/i386/avx512vl-fmsub-sf-ymm-1.c: Likewise.
>
> LGTM.

LGTM for the whole patch series (all patches implement the same approach).

Thanks,
Uros.
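
For readers of the archive, here is a minimal sketch of the kind of source the patch description refers to: a multiply-subtract whose subtrahend is a single scalar loaded from memory. It is only an illustration, not code from the patch or its testsuite. The function names (fmsub_bcst, fmadd_neg_bcst, fmsub_selfcheck) are hypothetical, and whether GCC turns the loop into a vfmsub instruction with a broadcast memory operand depends on the compiler version and flags (for example -O3 -mavx512f), which this sketch assumes rather than verifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: d[i] = a[i] * b[i] - *c, where *c is one scalar in
   memory that every element uses (the "broadcast" operand).  */
void
fmsub_bcst (double *restrict d, const double *restrict a,
            const double *restrict b, const double *restrict c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    d[i] = a[i] * b[i] - *c;
}

/* The same computation written as a multiply-add with a negated addend,
   which is how the FMSUB intrinsics were expressed before this patch
   according to the description above.  */
void
fmadd_neg_bcst (double *restrict d, const double *restrict a,
                const double *restrict b, const double *restrict c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    d[i] = a[i] * b[i] + (-*c);
}

/* Check that both forms agree on inputs whose results are exactly
   representable, so fused and unfused evaluation give the same value.  */
void
fmsub_selfcheck (void)
{
  double a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  double b[8] = { 2, 2, 2, 2, 2, 2, 2, 2 };
  double c = 0.5;
  double d1[8], d2[8];

  fmsub_bcst (d1, a, b, &c, 8);
  fmadd_neg_bcst (d2, a, b, &c, 8);
  for (size_t i = 0; i < 8; i++)
    assert (d1[i] == d2[i]);
}
```

Calling fmsub_selfcheck () from a small main is enough to exercise both forms. Compiling with -S and inspecting the assembly is one way to see whether a broadcast operand was actually generated on a given compiler.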
