RE: [PATCH] i386: Prefer remote atomic insn for atomic_fetch{add, and, or, xor}

2022-11-07 Thread Kong, Lingling via Gcc-patches
> On Sun, Nov 6, 2022 at 2:00 PM Kong, Lingling via Gcc-patches patc...@gcc.gnu.org> wrote: > > > > Hi > > > > The patch is to add flag -mprefer-remote-atomic to control whether to > generate raoint insn for atomic operations. > > Ok for trunk? > >

[PATCH] [committed] i386: Fix typo in sse-22.c pragma

2022-11-07 Thread Kong, Lingling via Gcc-patches
gcc/testsuite/ChangeLog: * gcc.target/i386/sse-22.c: Fix typo in pragma GCC target. Pushing as obvious. Thanks, Lingling --- gcc/testsuite/gcc.target/i386/sse-22.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c

[PATCH] i386: Prefer remote atomic insn for atomic_fetch{add, and, or, xor}

2022-11-06 Thread Kong, Lingling via Gcc-patches
Hi, This patch adds the flag -mprefer-remote-atomic to control whether to generate RAO-INT insns for atomic operations. Ok for trunk? BRs, Lingling gcc/ChangeLog: * config/i386/i386.opt: Add -mprefer-remote-atomic. * config/i386/sync.md (atomic_): New define_expand.
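For illustration only, here is a minimal C11 sketch of the kind of source-level atomic operations the subject line refers to. The names `hits`, `record_hit` and `record_hit_and_get_previous` are hypothetical and do not come from the patch. Which of these calls the flag would actually affect is an assumption: presumably only the form whose fetched value is discarded is a candidate for a remote atomic insn.

```c
#include <stdatomic.h>

/* Hypothetical shared counter, used only for this sketch. */
static atomic_int hits;

/* The fetched (old) value is discarded.  Assumption: this is the shape
   of call for which a remote atomic insn could be preferred.  */
static void record_hit(void)
{
    (void)atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

/* The fetched (old) value is used by the caller, so the operation has
   to produce a result.  */
static int record_hit_and_get_previous(void)
{
    return atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}
```

Both functions have the same effect on the counter; they differ only in whether the previous value is consumed.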

[PATCH] Support Intel RAO-INT

2022-11-06 Thread Kong, Lingling via Gcc-patches
Hi, These patches add support for Intel RAO-INT. The information is based on the newly released Intel Architecture Instruction Set Extensions and Future Features document. The document is available at:

RE: [wwwdocs] [GCC13] Mention Intel __bf16 support in AVX512BF16 intrinsics.

2022-11-03 Thread Kong, Lingling via Gcc-patches
> > > diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html > > > index 7c6bfa6e..cd0282f1 100644 > > > --- a/htdocs/gcc-13/changes.html > > > +++ b/htdocs/gcc-13/changes.html > > > @@ -230,6 +230,8 @@ a work-in-progress. > > >For both C and C++ the __bf16 type is supported on >

RE: [wwwdocs] [GCC13] Mention Intel __bf16 support in AVX512BF16 intrinsics.

2022-11-01 Thread Kong, Lingling via Gcc-patches
> > diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html > > index 7c6bfa6e..cd0282f1 100644 > > --- a/htdocs/gcc-13/changes.html > > +++ b/htdocs/gcc-13/changes.html > > @@ -230,6 +230,8 @@ a work-in-progress. > >For both C and C++ the __bf16 type is supported on > >

[wwwdocs] [GCC13] Mention Intel __bf16 support in AVX512BF16 intrinsics.

2022-10-31 Thread Kong, Lingling via Gcc-patches
Hi, This patch mentions Intel __bf16 support in the AVX512BF16 intrinsics. Ok for master? Thanks, Lingling --- htdocs/gcc-13/changes.html | 2 ++ 1 file changed, 2 insertions(+) diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html index 7c6bfa6e..cd0282f1 100644 ---

RE: [PATCH 4/6] Support Intel AVX-NE-CONVERT

2022-10-28 Thread Kong, Lingling via Gcc-patches
Hi, Because we switched the avx512bf16 intrinsics to the new type __bf16, we can now use m128/256bh for the vector bf16 type instead of m128/256bf16, and we unified the builtins for avx512bf16/avxneconvert. Thanks, Lingling > -Original Message- > From: Hongtao Liu > Sent: Tuesday, October 25,

[PATCH] i386: using __bf16 for AVX512BF16 intrinsics

2022-10-28 Thread Kong, Lingling via Gcc-patches
Hi, Previously we used unsigned short to represent bf16. That was not a good representation, but at the time the front end didn't support a bf16 type. Now that __bf16 has been introduced to the x86 psABI, we can switch the intrinsics to the new type. Ok for trunk? Thanks, Lingling gcc/ChangeLog: *

RE: [PATCH 4/6] Support Intel AVX-NE-CONVERT

2022-10-24 Thread Kong, Lingling via Gcc-patches
> From: Gcc-patches > On Behalf Of Hongtao Liu via Gcc-patches > Sent: Monday, October 17, 2022 1:47 PM > To: Jiang, Haochen > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH 4/6] Support Intel AVX-NE-CONVERT > > On Fri, Oct 14, 2022 at 3:58 PM Haochen Jiang via Gcc-patches >

RE: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]

2022-09-20 Thread Kong, Lingling via Gcc-patches
he compute_overall_effect_of_inner_loop) being chrec_dont_know? If > > that call produced something useful it will not be of either of the two > > special > forms. > > Thus like > > > > if (def != chrec_dont_know) > > /* Already OK. */ > >

RE: [PATCH] i386: Fixed vec_init_dup_v16bf [PR106887]

2022-09-16 Thread Kong, Lingling via Gcc-patches
bf [PR106887] > > On Thu, Sep 15, 2022 at 11:36 AM Kong, Lingling via Gcc-patches patc...@gcc.gnu.org> wrote: > > > > Hi > > > > The patch is to fix vec_init_dup_v16bf, add correct handle for v16bf mode in > ix86_expand_vector_init_duplicate. > >

RE: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]

2022-09-15 Thread Kong, Lingling via Gcc-patches
; > else if ((bitinv_def = ...) > .. > else if (tree_fits_uhwi_p (niter) > ... bitwise induction case...) > ... > > ? > > Otherwise looks OK now. > > Thanks, > Richard. > > > Do you agree with these changes? Thanks again for taking

[PATCH] i386: Fixed vec_init_dup_v16bf [PR106887]

2022-09-14 Thread Kong, Lingling via Gcc-patches
Hi, This patch fixes vec_init_dup_v16bf by adding correct handling for v16bf mode in ix86_expand_vector_init_duplicate, and adds a testcase with sse2 without avx2. OK for master? gcc/ChangeLog: PR target/106887 * config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate):

RE: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]

2022-09-13 Thread Kong, Lingling via Gcc-patches
age- > From: Richard Biener > Sent: Tuesday, August 23, 2022 3:27 PM > To: Kong, Lingling > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle bitop > with an invariant induction.[PR105735] > > On Th

RE: [PATCH] x86: Handle V8BF in expand_vec_perm_broadcast_1

2022-09-02 Thread Kong, Lingling via Gcc-patches
Hi, I fixed it in a new patch, and added the BF vector mode in SUBST_V and avx512fmaskhalfmode for @vec_interleave_high. Ok for trunk? > > Hi, > > > > Handle E_V8BFmode in expand_vec_perm_broadcast_1 and > ix86_expand_vector_init_duplicate. > > Ok for trunk? > > > > gcc/ChangeLog: > > > >

RE: [PATCH] middle-end: Add MULT_EXPR recognition for cond scalar reduction

2022-08-31 Thread Kong, Lingling via Gcc-patches
Hi Richard, could you please take a look at the patch? Ok for master? > Hi, > > The conditional mult reduction cannot be recognized with current GCC. The > following loop cannot be vectorized. > Now add MULT_EXPR recognition for conditional scalar reduction. > > float summa(int n, float

[PATCH] x86: Handle V8BF in expand_vec_perm_broadcast_1

2022-08-31 Thread Kong, Lingling via Gcc-patches
Hi, Handle E_V8BFmode in expand_vec_perm_broadcast_1 and ix86_expand_vector_init_duplicate. Ok for trunk? gcc/ChangeLog: PR target/106742 * config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate): Handle V8BF mode. (expand_vec_perm_broadcast_1): Ditto.

[PATCH] middle-end: Add MULT_EXPR recognition for cond scalar reduction

2022-08-25 Thread Kong, Lingling via Gcc-patches
Hi, The conditional mult reduction cannot be recognized with current GCC. The following loop cannot be vectorized. Now add MULT_EXPR recognition for conditional scalar reduction. float summa(int n, float *arg1, float *arg2) { int i;
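The test function in the message is cut off, so the following is not a reconstruction of it. It is only a minimal sketch of what a conditional multiplicative reduction looks like, with hypothetical names (`cond_product`, `cond_product_select`). The second function shows one way such a loop can be written without a branch, by multiplying by the neutral element 1.0f when the condition is false; whether GCC's if-conversion produces exactly this form is an assumption.

```c
/* Conditional scalar reduction with MULT_EXPR: the accumulator is only
   updated on iterations where the condition holds.  */
static float cond_product(int n, const float *cond, const float *val)
{
    float res = 1.0f;
    for (int i = 0; i < n; i++)
        if (cond[i] > 0.0f)
            res *= val[i];
    return res;
}

/* Branch-free form: every iteration multiplies, using 1.0f (the
   neutral element of multiplication) when the condition is false.  */
static float cond_product_select(int n, const float *cond, const float *val)
{
    float res = 1.0f;
    for (int i = 0; i < n; i++)
        res *= (cond[i] > 0.0f) ? val[i] : 1.0f;
    return res;
}
```

The branch-free form has a uniform loop body, which is the shape a vectorizer can handle as an ordinary reduction.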

RE: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]

2022-08-22 Thread Kong, Lingling via Gcc-patches
Hi Richard, could you please take a look at the patch? > Hi, > > This patch is for pr105735/pr101991. It will enable below optimization: > { > - long unsigned int bit; > - > - [local count: 32534376]: > - > - [local count: 1041207449]: > - # tmp_10 = PHI > - # bit_12 = PHI > -

[wwwdocs] [GCC13] Mention Intel __bf16 support.

2022-08-18 Thread Kong, Lingling via Gcc-patches
Hi, This patch mentions Intel __bf16 support in GCC 13. Ok for master? Thanks, Lingling htdocs/gcc-13/changes.html | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html index 57bd8724..7d98329c 100644 ---

[PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]

2022-08-18 Thread Kong, Lingling via Gcc-patches
Hi, This patch is for pr105735/pr101991. It will enable below optimization: { - long unsigned int bit; - - [local count: 32534376]: - - [local count: 1041207449]: - # tmp_10 = PHI - # bit_12 = PHI - tmp_7 = bit2_6(D) & tmp_10; - bit_8 = bit_12 + 1; - if (bit_8 != 32) -goto ;
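The GIMPLE dump above is truncated, so the following is only a sketch of the source-level pattern it appears to describe: a loop that repeatedly ANDs an accumulator with a loop-invariant value for a fixed trip count of 32. The function names are hypothetical. Because `x & x == x`, applying the same invariant AND once gives the same final value as applying it 32 times, which is what allows the loop to be replaced by a single statement.

```c
/* Loop form: tmp is ANDed with the invariant bit2 on every iteration,
   mirroring "tmp_7 = bit2_6(D) & tmp_10" and "if (bit_8 != 32)".  */
static unsigned loop_form(unsigned tmp, unsigned bit2)
{
    for (unsigned long bit = 0; bit != 32; bit++)
        tmp = bit2 & tmp;
    return tmp;
}

/* Final-value form: one AND is enough, since the loop runs at least
   once and ANDing with the same value again changes nothing.  */
static unsigned closed_form(unsigned tmp, unsigned bit2)
{
    return bit2 & tmp;
}
```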

[PATCH] x86: Support vector __bf16 type.

2022-08-16 Thread Kong, Lingling via Gcc-patches
Hi, This patch supports vector init/broadcast/set/extract for the __bf16 type. The __bf16 type is a storage type. OK for master? gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_sse_movcc): Handle vector BFmode. (ix86_expand_vector_init_duplicate): Support vector

RE: [PATCH] x86: Enable __bf16 type for TARGET_SSE2 and above

2022-08-03 Thread Kong, Lingling via Gcc-patches
Hi, The old patch had a mistake in `*movbf_internal`; this version disables the BFmode constant double move in `*movbf_internal`. Thanks, Lingling > -Original Message- > From: Kong, Lingling > Sent: Tuesday, July 26, 2022 9:31 AM > To: Liu, Hongtao ; gcc-patches@gcc.gnu.org > Cc: Kong, Lingling >

[PATCH] x86: Enable __bf16 type for TARGET_SSE2 and above

2022-07-25 Thread Kong, Lingling via Gcc-patches
Hi, This patch enables the __bf16 scalar type for target sse2 and above according to the psABI (https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/35/diffs). The __bf16 type is a storage type, as on Arm. OK for master? gcc/ChangeLog: * config/i386/i386-builtin-types.def (BFLOAT16): New

[PATCH] i386: Fix _mm_[u]comixx_{ss,sd} codegen and add PF result. [PR106113]

2022-07-14 Thread Kong, Lingling via Gcc-patches
Hi, This patch fixes the _mm_[u]comixx_{ss,sd} codegen and adds the PF result. These intrinsics have changed over time: for `_mm_comieq_ss`, the old operation is `RETURN ( a[31:0] == b[31:0] ) ? 1 : 0`, and the updated operation is `RETURN ( a[31:0] != NaN AND b[31:0] != NaN AND a[31:0] == b[31:0] ) ?
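To show why the PF result matters, here is a small scalar model in C. It assumes the usual COMISS flag convention (unordered sets ZF, PF and CF; equal sets only ZF; less-than sets only CF). The struct and function names are hypothetical and this is not the code from the patch. A test on ZF alone reports "equal" for a NaN operand, while also requiring PF to be clear matches the updated operation quoted above.

```c
#include <math.h>

/* Hypothetical model of the three flags written by a COMISS compare. */
struct cmp_flags { int zf, pf, cf; };

static struct cmp_flags comiss_flags_model(float a, float b)
{
    struct cmp_flags f = { 0, 0, 0 };
    if (isnan(a) || isnan(b))
        f.zf = f.pf = f.cf = 1;   /* unordered */
    else if (a == b)
        f.zf = 1;                 /* equal */
    else if (a < b)
        f.cf = 1;                 /* less than */
    return f;                     /* greater than: all flags clear */
}

/* Equality decided from ZF only: wrongly true when an operand is NaN. */
static int comieq_zf_only(float a, float b)
{
    return comiss_flags_model(a, b).zf;
}

/* Equality decided from ZF with PF clear: false for NaN operands. */
static int comieq_zf_and_pf(float a, float b)
{
    struct cmp_flags f = comiss_flags_model(a, b);
    return f.zf && !f.pf;
}
```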

RE: [PATCH] MAINTAINERS: Add myself for write after approval

2022-06-27 Thread Kong, Lingling via Gcc-patches
ans you should put your name between > > > > Dave Korn Julia Koval > > > > > > Kong, Lingling via Gcc-patches wrote on Mon, Jun 27, 2022 at > > 16:05: > > > > > > > > Hi, > > > > > > I want to add myself to MAINTAINERS for

[PATCH] MAINTAINERS: Add myself for write after approval

2022-06-27 Thread Kong, Lingling via Gcc-patches
Hi, I want to add myself to MAINTAINERS for write after approval. OK for master? ChangeLog: * MAINTAINERS (Write After Approval): Add myself. --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 54d8ad41a6f..49627e5d113 100644 ---

[PATCH] i386: Enable intrinsics that convert float and bf16 data to each other.

2021-12-21 Thread Kong, Lingling via Gcc-patches
Hi, This patch is to enable intrinsics that convert float and bf16 data to each other. Ok for master? gcc/ChangeLog: * config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Add new intrinsic. (_mm512_cvtpbh_ps): Likewise. (_mm512_maskz_cvtpbh_ps): Likewise.
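As a rough scalar model of the bf16-to-float direction, the sketch below relies on the fact that a bf16 value has the same layout as the upper 16 bits of an IEEE binary32 float, so widening is a 16-bit left shift. The function name `bf16_bits_to_float` is hypothetical, and the bf16 value is passed as raw `uint16_t` bits rather than as the intrinsic's own type. This is an illustration of the conversion, not the header's implementation.

```c
#include <stdint.h>
#include <string.h>

/* Widen raw bf16 bits to float: place them in the top half of a 32-bit
   word and leave the low 16 mantissa bits zero.  */
static float bf16_bits_to_float(uint16_t bits)
{
    uint32_t wide = (uint32_t)bits << 16;
    float result;
    memcpy(&result, &wide, sizeof result);  /* bit copy, no conversion */
    return result;
}
```

The conversion is exact because every bf16 value is representable as a float.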

RE: [PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-24 Thread Kong, Lingling via Gcc-patches
OK, This is the patch I prepare to check in. -Original Message- From: Uros Bizjak Sent: Wednesday, November 24, 2021 4:49 PM To: Kong, Lingling Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode

[PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-24 Thread Kong, Lingling via Gcc-patches
Hi, vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c. Cleared before conversion, updated movhi_internal and ix86_can_change_mode_class. And fixed some commit message. OK for master?

RE: [PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-24 Thread Kong, Lingling via Gcc-patches
t16 to SFmode with -mf16c [PR 102811] On Wed, Nov 24, 2021 at 7:25 AM Kong, Lingling via Gcc-patches wrote: > > Hi, > > vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with > -mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c. > And cle

RE: [PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-23 Thread Kong, Lingling via Gcc-patches
Hi, vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c. And cleared before conversion, updated movhi_internal and ix86_can_change_mode_class. OK for master? gcc/ChangeLog: PR target/102811

[PATCH] i386: add alias for f*mul_*ch intrinsics

2021-11-16 Thread Kong, Lingling via Gcc-patches
Hi, This patch is to add alias for f*mul_*ch intrinsics. Ok for master? gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm512_mul_pch): Add alias for _mm512_fmul_pch. (_mm512_mask_mul_pch): Likewise. (_mm512_maskz_mul_pch): Likewise. (_mm512_mul_round_pch):

[PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-16 Thread Kong, Lingling via Gcc-patches
Hi, vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c. OK for master? gcc/ChangeLog: PR target/102811 * config/i386/i386.md (extendhfsf2): Add extendhfsf2 for f16c.

[PATCH] i386: Optimization for mm512_set1_pch.

2021-11-05 Thread Kong, Lingling via Gcc-patches
Hi, This patch supports folding _mm512_fmadd_pch (a, _mm512_set1_pch(*(b)), c) into one instruction, vfmaddcph (%rsp){1to16}, %zmm1, %zmm2. OK for master? gcc/ChangeLog: * config/i386/sse.md (fma___pair): Add new define_insn. (fma__fmaddc_bcst): Add new

[PATCH] i386: Support complex fma/conj_fma for _Float16.

2021-11-05 Thread Kong, Lingling via Gcc-patches
Hi, This patch is to support cmla_optab, cmul_optab, cmla_conj_optab, cmul_conj_optab for vector _Float16. Ok for master? gcc/ChangeLog: * config/i386/sse.md (cmul3): add new define_expand. (cmla4): Likewise gcc/testsuite/ChangeLog: *

[PATCH] i386: Combine the FADD(A, FMA(B, C, 0)) to FMA(B, C, A) and combine FADD(A, FMUL(B, C)) to FMA(B, C, A).

2021-10-21 Thread Kong, Lingling via Gcc-patches
Hi, This patch supports transforming, under fast-math, something like _mm512_add_ph(x1, _mm512_fmadd_pch(a, b, _mm512_setzero_ph())) to _mm512_fmadd_pch(a, b, x1). It also supports transforming _mm512_add_ph(x1, _mm512_fmul_pch(a, b)) to _mm512_fmadd_pch(a, b, x1). Ok for master? gcc/ChangeLog:
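The algebra behind the transform can be sketched with a scalar complex model. Everything here is hypothetical (the `cplx` struct and the `c_add`, `c_mul`, `c_fma` helpers stand in for the packed complex _Float16 operations and use float for simplicity). The sketch only shows that x1 + (a*b + 0) and a*b + x1 agree mathematically; in floating point the two forms can differ in corner cases such as signed zeros, which is presumably why the message restricts the transform to fast-math.

```c
/* Hypothetical scalar stand-in for one complex lane. */
struct cplx { float re, im; };

static struct cplx c_add(struct cplx a, struct cplx b)
{
    struct cplx r = { a.re + b.re, a.im + b.im };
    return r;
}

static struct cplx c_mul(struct cplx a, struct cplx b)
{
    struct cplx r = { a.re * b.re - a.im * b.im,
                      a.re * b.im + a.im * b.re };
    return r;
}

/* Complex multiply-add: a * b + c. */
static struct cplx c_fma(struct cplx a, struct cplx b, struct cplx c)
{
    return c_add(c_mul(a, b), c);
}

/* Before: FADD(x1, FMA(a, b, 0)). */
static struct cplx add_of_fma_zero(struct cplx x1, struct cplx a, struct cplx b)
{
    struct cplx zero = { 0.0f, 0.0f };
    return c_add(x1, c_fma(a, b, zero));
}

/* After: FMA(a, b, x1). */
static struct cplx fused(struct cplx x1, struct cplx a, struct cplx b)
{
    return c_fma(a, b, x1);
}
```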

[PATCH] i386: Fix wrong optimization for consecutive masked scatters [PR 101472]

2021-08-26 Thread Kong, Lingling via Gcc-patches
Hi, For avx512f_scattersi, the mask operand only affects the set src; we need to refine the pattern to let GCC know that the mask register also affects the dest. So we put the mask operand into UNSPEC_VSIBADDR. Bootstrapped and regression tested on x86_64-linux-gnu{-m32,-m64}. Ok for master? gcc/ChangeLog:

[PATCH] i386: Fix wrong optimization for consecutive masked scatters [PR 101472]

2021-08-25 Thread Kong, Lingling via Gcc-patches
Hi, For avx512f_scattersi, the mask operand only affects the set src; we need to refine the pattern to let GCC know that the mask register also affects the dest. So we put the mask operand into UNSPEC_VSIBADDR. Bootstrapped and regression tested on x86_64-linux-gnu{-m32,-m64}. Ok for master? gcc/ChangeLog:

[PATCH] i386: Fix _mm512_fpclass_ps_mask in O0 [PR 101471]

2021-08-25 Thread Kong, Lingling via Gcc-patches
Hi, For _mm512_fpclass_ps_mask at -O0, the mask should be (__mmask16)-1 instead of (__mmask8)-1. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for master? gcc/ChangeLog: * gcc/config/i386/avx512dqintrin.h : fix _mm512_fpclass_ps_mask define in O0 gcc/testsuite/ChangeLog: *
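The effect of the wrong mask width can be seen with a small sketch in plain C. The typedefs `mask8_model` and `mask16_model` are hypothetical stand-ins for __mmask8 and __mmask16, and `enabled_lanes` is a helper invented for this example. A 512-bit vector of floats has 16 lanes, so an all-ones 8-bit mask widened to 16 bits enables only the low 8 lanes.

```c
#include <stdint.h>

typedef uint8_t  mask8_model;   /* stand-in for __mmask8  */
typedef uint16_t mask16_model;  /* stand-in for __mmask16 */

/* Count how many of the 16 float lanes a mask enables. */
static int enabled_lanes(mask16_model mask)
{
    int count = 0;
    for (int lane = 0; lane < 16; lane++)
        count += (mask >> lane) & 1;
    return count;
}
```

With these definitions, `enabled_lanes((mask16_model)(mask8_model)-1)` covers half of the lanes, while `enabled_lanes((mask16_model)-1)` covers all of them.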