Re: [PATCH] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-20 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 19, 2022 at 8:09 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For complex scalar intrinsics like _mm_mask_fcmadd_sch, the > mask should be ANDed with 1 to ensure the mask is bound to the lowest byte. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. > > Ok for master? > > gcc

Re: [PATCH] AVX512FP16: Fix masm=intel output for vfc?(madd|mul)csh [PR 104977]

2022-03-20 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 19, 2022 at 8:06 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > This patch fixes a typo in the subst for the scalar complex mask_round operand. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. > > Ok for master? > Ok. > gcc/ChangeLog: > > PR target/104977 > * c

Re: [PATCH] x86: Correct march=sapphirerapids to base on icelake server

2022-03-18 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote: > > Hi Hongtao, > > This patch is to correct march=sapphirerapids to base on icelake server. > and update sapphirerapids in the documentation. > > OK for master and backport to GCC 11? Ok. > > > gcc/Changelog: > > PR target/104963 >

Re: [PATCH] [i386] Add extra cost for unsigned_load which may have stall forward issue.

2022-03-17 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 16, 2022 at 5:54 PM Richard Biener via Gcc-patches wrote: > > On Wed, Mar 16, 2022 at 3:19 AM liuhongt wrote: > > > > This patch only handles pure-slp for by-value passed parameters, which > > has nothing to do with IPA but with the psABI. For by-reference passed > > parameters IPA is required. >

Re: [x86 PATCH] PR target/94680: Clear upper bits of V2DF using movq (like V2DI).

2022-03-15 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 15, 2022 at 10:52 PM Roger Sayle wrote: > > > This simple i386 patch unblocks a more significant change. The testcase > gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and > alas the fix for PR target/94680 doesn't (yet) handle V2DF mode. > > For the first test fro

Re: [PATCH v2] x86: Also check _SOFT_FLOAT in

2022-03-15 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 15, 2022 at 10:40 PM H.J. Lu wrote: > > On Mon, Mar 14, 2022 at 7:31 AM H.J. Lu wrote: > > > > Push target("general-regs-only") in if x87 is enabled. > > > > gcc/ > > > > PR target/104890 > > * config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before > > push

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-14 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 14, 2022 at 8:20 PM Hongtao Liu wrote: > > On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote: > > > > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote: > > > LGTM, thanks for handling this. > > > > Thanks, committed. > >

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-14 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote: > > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote: > > LGTM, thanks for handling this. > > Thanks, committed. > > > > Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2, > > > f

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-13 Thread Hongtao Liu via Gcc-patches
On Sun, Mar 13, 2022 at 3:28 AM Jakub Jelinek wrote: > > Hi! > > These intrinsics are supposed to do an unaligned may_alias load > of a 16-bit or 32-bit value and store it as the first element of > a 128-bit integer vector, with all other elements cleared. > > The current _mm_storeu_* implementati
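The load semantics described in the quoted message (read 32 bits from a possibly unaligned address, place them in the first element, clear the rest) can be sketched in portable C. This is only an illustration under assumptions: `vec128_sketch` and `loadu_si32_sketch` are hypothetical names, and `memcpy` stands in for the unaligned may_alias access; it is not the code from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit integer vector: four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } vec128_sketch;

/* Load 32 bits from p (any alignment, any effective type) into lane 0
   and clear the other lanes. memcpy avoids both alignment and
   strict-aliasing problems. */
vec128_sketch loadu_si32_sketch(const void *p)
{
    vec128_sketch v = { {0, 0, 0, 0} };
    uint32_t value;
    memcpy(&value, p, sizeof value);
    v.lane[0] = value;
    return v;
}
```

Calling it with a pointer such as `buf + 1` into a byte array exercises the unaligned case.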

Re: [PATCH] target/104762 - vectorization costs of CONSTRUCTORs

2022-03-11 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 11, 2022 at 8:43 PM Richard Biener via Gcc-patches wrote: > > After accounting for GPR -> XMM move cost for vec_construct the > base cost needs adjustments to not double-cost those. This also > lowers the cost when such move is not necessary. > > This fixes the observed 538.imagick_r

Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-03-10 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 8, 2022 at 9:30 AM Hongtao Liu wrote: > > ping^1 > > On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote: > > > > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote: > > > > > > The patch fixes ICE in ix86_gimple_fold_builtin. > > > >

Re: [PATCH] [i386] Prevent vectorization for load from parm_decl at O2 to avoid STF issue.

2022-03-07 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 7, 2022 at 5:37 PM Richard Biener via Gcc-patches wrote: > > On Fri, Mar 4, 2022 at 8:27 AM liuhongt wrote: > > > > For parameter passing through stack, vectorized load from parm_decl > > in callee may trigger serious STF issue. This is why GCC12 regresses > > 50% for cray at -O2 comp

Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-03-07 Thread Hongtao Liu via Gcc-patches
ping^1 On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote: > > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote: > > > > The patch fixes ICE in ix86_gimple_fold_builtin. > > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > Ok for main trunk? > > > g

Re: [PATCH] [i386] Prevent vectorization for load from parm_decl at O2 to avoid STF issue.

2022-03-07 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 4, 2022 at 3:28 PM liuhongt via Gcc-patches wrote: > > For parameter passing through stack, vectorized load from parm_decl > in callee may trigger serious STF issue. This is why GCC12 regresses > 50% for cray at -O2 compared to GCC11. > > The patch adds an extremely large number to stmt

Re: [PATCH V2] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-06 Thread Hongtao Liu via Gcc-patches
I ran into a problem with git send-email --cc=a,b,c, so I'm CCing manually. On Mon, Mar 7, 2022 at 1:11 PM liuhongt via Gcc-patches wrote: > > >What happens if you set preferred_for_speed to false for alternative 1? > It works, and I've removed the newly added splitter in this patch. > Also I tried to do simi

Re: [PATCH] i386: Fix up cond_{and,ior,xor,mul}* [PR104779]

2022-03-06 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 5, 2022 at 4:05 PM Jakub Jelinek wrote: > > Hi! > > The following testcase ICEs, because the cond_andv* expander > has vector_operand predicates in both of the commutative inputs > and calls gen_andv*_mask which calls ix86_binary_operator_ok > in its condition, but nothing calls ix86_f

Re: [PATCH] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-03 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches wrote: > > This is an incremental patch based on [1]; it enables the optimization below > > - vbroadcastss .LC1(%rip), %xmm0 > + movl $-45, %edx > + vmovd %edx, %xmm0 > + vpshufd $0, %xmm0, %xmm0 > > According to

Re: [PATCH] x86: Always return pseudo register in ix86_gen_scratch_sse_rtx

2022-03-03 Thread Hongtao Liu via Gcc-patches
On Thu, Mar 3, 2022 at 10:22 PM H.J. Lu via Gcc-patches wrote: > > ix86_gen_scratch_sse_rtx returns XMM7/XMM15/XMM31 as a scratch vector > register to prevent RTL optimizers from removing vector register. It > introduces a conflict with explicit XMM7/XMM15/XMM31 usage and when it > is called by R

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-03-01 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 2, 2022 at 6:49 AM H.J. Lu wrote: > > On Tue, Mar 1, 2022 at 7:06 AM H.J. Lu wrote: > > > > On Mon, Feb 28, 2022 at 9:36 PM Hongtao Liu wrote: > > > > > > On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches > > > wrote: > >

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches wrote: > > On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu wrote: > > > > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote: > > > > > > .. in ix86_expand_vector_move and > > > ix86_convert_const_wide_int_to_broadcast(called by the former). > > > > >

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 1, 2022 at 10:27 AM H.J. Lu via Gcc-patches wrote: > > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote: > > > > .. in ix86_expand_vector_move and > > ix86_convert_const_wide_int_to_broadcast(called by the former). > > > > ix86_expand_vector_move is called by emit_move_insn which is use

Re: [PATCH] AVX512F: Add helper enumeration for ternary logic intrinsics.

2022-02-27 Thread Hongtao Liu via Gcc-patches
On Fri, Feb 25, 2022 at 4:44 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > This patch intends to sync with llvm change in > https://reviews.llvm.org/D120307 to add enumeration and truncate This will be documented in intel intrinsic guide. > imm to unsigned char, so users could use ~ on immedia
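A small sketch of why truncating the immediate to unsigned char matters when `~` is applied to a selector. The names `SKETCH_A`, `SKETCH_B`, `SKETCH_C`, `ternlog_imm` and `ternlog_bit` are hypothetical; the enumerators the patch actually adds are not visible in this excerpt, and the values below are assumed to follow the usual 8-entry truth-table encoding.

```c
/* Hypothetical selector names for this sketch only. Each value is the
   truth table that returns the corresponding input unchanged. */
enum { SKETCH_A = 0xF0, SKETCH_B = 0xCC, SKETCH_C = 0xAA };

/* Truncate an int-typed immediate expression to the 8-bit truth table.
   Without this, ~SKETCH_A is a negative int with high bits set. */
unsigned char ternlog_imm(int expr)
{
    return (unsigned char)expr;
}

/* Look up the result bit for one input triple (a, b, c), each 0 or 1. */
int ternlog_bit(unsigned char imm, int a, int b, int c)
{
    int index = (a << 2) | (b << 1) | c;
    return (imm >> index) & 1;
}
```

With these definitions, `ternlog_imm(SKETCH_A & ~SKETCH_B)` is the truth table for "A and not B", which a plain `int` immediate would only represent after the upper bits introduced by `~` are discarded.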

Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-02-24 Thread Hongtao Liu via Gcc-patches
On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote: > > The patch fixes ICE in ix86_gimple_fold_builtin. > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for main trunk? > gcc/ChangeLog: > > PR target/104666 > * config/i386/i386-expand.cc > (ix86_check_builtin_isa_m

Re: [PATCH] [i386] Fix typo in v1ti3.

2022-02-23 Thread Hongtao Liu via Gcc-patches
On Wed, Feb 23, 2022 at 5:48 PM Jakub Jelinek via Gcc-patches wrote: > > On Wed, Feb 23, 2022 at 05:21:26PM +0800, liuhongt via Gcc-patches wrote: > > For evex encoding vp{xor,or,and}, suffix is needed. > > > > Or there would be an error for > > vpxor %ymm0, %ymm31, %ymm1 > > The insn is about V1T

Re: [PATCH 3/3] target/99881 - x86 vector cost of CTOR from integer regs

2022-02-21 Thread Hongtao Liu via Gcc-patches
On Mon, Feb 21, 2022 at 5:10 PM Richard Biener wrote: > > On Mon, 21 Feb 2022, Hongtao Liu wrote: > > > On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches > > wrote: > > > > > > This uses the now passed SLP node to the vectorizer costing h

Re: [PATCH] i386: Fix up copysign/xorsign expansion [PR104612]

2022-02-21 Thread Hongtao Liu via Gcc-patches
On Tue, Feb 22, 2022 at 12:46 AM Jakub Jelinek wrote: > > Hi! > > We ICE on the following testcase for -m32 since r12-3435. because > operands[2] is (subreg:SF (reg:DI ...) 0) and According to validate_subreg, (subreg:V4SF (reg:DI ...) 0) should be valid (but not sure if it really works) For -m64

Re: [PATCH v2] x86: Add TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO

2022-02-21 Thread Hongtao Liu via Gcc-patches
On Tue, Feb 22, 2022 at 2:35 AM H.J. Lu wrote: > > On Sun, Feb 20, 2022 at 6:01 PM Hongtao Liu wrote: > > > > On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote: > > > > > > On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote: > > > > On

Re: [PATCH v2] x86: Add TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO

2022-02-20 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote: > > On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote: > > On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches > > wrote: > > > > > > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches >

Re: [PATCH 3/3] target/99881 - x86 vector cost of CTOR from integer regs

2022-02-20 Thread Hongtao Liu via Gcc-patches
On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches wrote: > > This uses the now passed SLP node to the vectorizer costing hook > to adjust vector construction costs for the cost of moving an > integer component from a GPR to a vector register when that's > required for building a vect

Re: [PATCH] target/104581 - compile-time regression in mode-switching

2022-02-17 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 9:47 PM Richard Biener via Gcc-patches wrote: > > The x86 backend piggy-backs on mode-switching for insertion of > vzeroupper. A recent improvement there was implemented in a way > to walk possibly the whole basic-block for all DF reg def definitions > in its mode_needed h

Re: [PATCH] x86: Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches wrote: > > Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge, > Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX > transition penalty. Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to > generate vzero

Re: [PATCH] [i386] Clean up MPX-related bit_{MPX,BNDREGS,BNDCSR}.

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 12:00 PM liuhongt wrote: > > Bootstrap and regtest on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > * config/i386/cpuid.h (bit_MPX): Removed. > (bit_BNDREGS): Ditto. > (bit_BNDCSR): Ditto. > --- > gcc/config/i386/cpuid.h | 5

Re: [PATCH] Restrict the two sources of vect_recog_cond_expr_convert_pattern to be of the same type when convert is extension.

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Wed, Feb 16, 2022 at 10:17 PM Jakub Jelinek via Gcc-patches wrote: > > On Wed, Feb 16, 2022 at 05:03:09PM +0800, liuhongt via Gcc-patches wrote: > > > > +(match (cond_expr_convert_p @0 @2 @3 @6) > > > > + (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3)) > > > > + (if (types_mat

Re: [GCC 11 PATCH 0/5] x86: Backport straight-line-speculation mitigation

2022-02-15 Thread Hongtao Liu via Gcc-patches
On Tue, Feb 1, 2022 at 2:55 AM H.J. Lu via Gcc-patches wrote: > > Backport -mindirect-branch-cs-prefix: > > commit 48a4ae26c225eb018ecb59f131e2c4fd4f3cf89a > Author: H.J. Lu > Date: Wed Oct 27 06:27:15 2021 -0700 > > x86: Add -mindirect-branch-cs-prefix > > Add -mindirect-branch-cs-pref

Re: [GCC 11 PATCH 1/5] x86: Remove "%!" before ret

2022-02-15 Thread Hongtao Liu via Gcc-patches
On Tue, Feb 1, 2022 at 2:56 AM H.J. Lu via Gcc-patches wrote: > > Before MPX was removed, "%!" was mapped to > > case '!': > if (ix86_bnd_prefixed_insn_p (current_output_insn)) > fputs ("bnd ", file); > return; > > After CET was added and MPX was removed, "%

Re: [PATCH] x86: Update PR 35513 tests

2022-02-11 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 10, 2022 at 9:58 PM H.J. Lu via Gcc-patches wrote: > > 1. Require linker with GNU_PROPERTY_1_NEEDED support for PR 35513 > run-time tests. > 2. Compile pr35513-8.c to scan assembly code. > > PR testsuite/104481 > * g++.target/i386/pr35513-1.C: Require property_1_needed

Re: [PATCH] x86: Check each component of source operand for AVX_U128_DIRTY

2022-02-08 Thread Hongtao Liu via Gcc-patches
On Wed, Feb 9, 2022 at 10:53 AM H.J. Lu via Gcc-patches wrote: > > commit 9775e465c1fbfc32656de77c618c61acf5bd905d > Author: H.J. Lu > Date: Tue Jul 27 07:46:04 2021 -0700 > > x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register > > called ix86_check_avx_upper_register to check mode

Re: [PATCH v4] x86: Add -m[no-]direct-extern-access

2022-02-08 Thread Hongtao Liu via Gcc-patches
On Fri, Jan 28, 2022 at 5:53 AM H.J. Lu via Gcc-patches wrote: > > The v3 patch was posted at > > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574847.html > > There is no progress with repeated pings since then. Glibc 2.35 and > binutils 2.38 will support GNU_PROPERTY_1_NEEDED_INDIRECT_EXT

Re: [PATCH v2] x86: Also check mode of memory broadcast in bcst_mem_operand

2022-01-23 Thread Hongtao Liu via Gcc-patches
On Sun, Jan 23, 2022 at 8:28 PM H.J. Lu via Gcc-patches wrote: > > Return false for invalid mode on memory broadcast in bcst_mem_operand: > > (vec_duplicate:V16SF (mem/j:V4SF (reg/v/f:DI 109 [ b ]))) > Yes, thanks. > gcc/ > > PR target/104188 > * config/i386/predicates.md (bcst_mem

Re: [PATCH] i386: Fix *aesu8

2022-01-18 Thread Hongtao Liu via Gcc-patches
On Wed, Jan 19, 2022 at 9:40 AM Jakub Jelinek wrote: > > Hi! > > On Wed, Jan 19, 2022 at 09:09:41AM +0800, Hongtao Liu wrote: > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Yes, thanks. > > Thanks. Committed. > grep '{[^|}

Re: [PATCH] i386: Fix GLC tuning with -masm=intel [PR104104]

2022-01-18 Thread Hongtao Liu via Gcc-patches
On Wed, Jan 19, 2022 at 8:00 AM Jakub Jelinek wrote: > > On Sun, Jan 16, 2022 at 12:22:18PM +0800, Hongtao Liu via Gcc-patches wrote: > > On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches > > wrote: > > > > > > On Sat, Jan 15, 2022

Re: [PATCH] [i386]Adjust testcase for --target_board='unix{-m64\ -march=cascadelake}'

2022-01-17 Thread Hongtao Liu via Gcc-patches
On Tue, Jan 18, 2022 at 10:57 AM liuhongt via Gcc-patches wrote: > > Change scan-assembler from "\tucomisd" to "\t\[v\]?ucomisd". It's an obvious "fix", Pushed to trunk. > > refer to https://gcc.gnu.org/pipermail/gcc-regression/2022-January/076241.html > > gcc/testsuite/ChangeLog: > > * g+

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-15 Thread Hongtao Liu via Gcc-patches
On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches wrote: > > On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang wrote: > > > > Thanks for the suggestion, here is the updated patch that survived > > bootstrap/regtest. > > LGTM for me, but please get the final approval from Hongtao. > Ok, thank

Re: [PATCH] [i386] Fix ICE of unrecognizable insn. [PR target/104001]

2022-01-13 Thread Hongtao Liu via Gcc-patches
Here's the patch I'm going to check in, the patch is pre-approved in PR. On Thu, Jan 13, 2022 at 11:59 PM liuhongt wrote: > > For define_insn_and_split "*xor2andn": > > 1. Refine predicate of operands[0] from nonimmediate_operand to > register_operand. > 2. Remove TARGET_AVX512BW from condition t

Re: [PATCH] [i386] Remove register restriction on operands for andnot insn

2022-01-10 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 10, 2022 at 3:21 PM Jiang, Haochen wrote: > > Hi Hongtao, > > I have changed that message in this patch. Ok for trunk? Ok. > > Thx, > Haochen > > -Original Message- > From: Hongtao Liu > Sent: Monday, January 10, 2022 3:25 PM > To: Jia

Re: [PATCH] [i386] Remove register restriction on operands for andnot insn

2022-01-09 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 10, 2022 at 2:23 PM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch removes the register restriction on operands for andnot insn so > that it can be used from memory. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > BRs, > Haochen > > gcc/ChangeLog: > >

Re: [PATCH] [i386]Fix tdpbf16ps testcase

2021-12-27 Thread Hongtao Liu via Gcc-patches
On Fri, Dec 24, 2021 at 4:51 PM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch fixes the testcase amxbf16-dpbf16ps-2.c. Previously the type > conversion had some issues. > > Ok for trunk? Ok. > > BRs, > Haochen > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/amx-check.h (

Re: [PATCH] [i386] Add define_insn_and_split for vpcmp{b, w, d, q} vpcmp{ph, ps, pd}.

2021-12-22 Thread Hongtao Liu via Gcc-patches
On Tue, Dec 21, 2021 at 2:27 PM liuhongt wrote: > > The purpose of those define_insn_and_split: > 1. Combine vpcmpuw and zero_extend into vpcmpuw. > 2. Canonicalize vpcmpuw pattern so CSE can replace duplicate vpcmpuw to just > kmov > 3. Use DImode as dest of zero_extend so cprop_hardreg can elim

Re: [PATCH] i386: Enable intrinsics that convert float and bf16 data to each other.

2021-12-22 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 22, 2021 at 11:28 AM Kong, Lingling via Gcc-patches wrote: > > Hi, > > > This patch is to enable intrinsics that convert float and bf16 data to each > other. > Ok for master? > Ok. > gcc/ChangeLog: > > * config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Add new intrinsic. >

Re: [PATCH] [i386] Optimize bit_and op1 float_vector_all_ones_operands to op1.

2021-12-19 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 16, 2021 at 1:59 PM liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? Pushed to trunk. > > gcc/ChangeLog: > > PR target/98468 > * config/i386/sse.md (*bit_and_float_vector_all_ones): New > pre-reload splitter. > > gcc/

Re: [PATCH] [i386][avx512]Add combine splitter to transform vpternlogd/vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0

2021-12-16 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 15, 2021 at 9:26 AM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch fixes the regression previously reported on the combine splitter > under '-m32 -march=cascadelake' options. > > Regtested on x86_64-pc-linux-gnu. Ok. > > BRs, > Haochen > > gcc/ChangeLog: > > PR

Re: [PATCH] [i386] Add option -mvect-compare-costs

2021-12-16 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 16, 2021 at 5:00 PM Richard Sandiford via Gcc-patches wrote: > > Obviously I'm not in a position to comment on the target bits, but: > > liuhongt via Gcc-patches writes: > > Also with corresponding target attribute, option default disabled. > > > > Bootstrapped and regtested on x86_64

Re: [PATCH] [i386]Add combine splitter to transform vashr/vlshr/vashl_optab to ashr/lshr/ashl_optab for const vector duplicate operand.

2021-12-08 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 8, 2021 at 2:47 PM Haochen Jiang via Gcc-patches wrote: > > Hi, > > This patch adds a combine splitter to transform vashr/vlshr/vashl_optab to > ashr/lshr/ashl_optab for const vector duplicate operand. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? Ok. > > BRs, > Haochen > > gcc/Cha

Re: [PATCH] [i386]Add combine splitter to transform vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0

2021-12-07 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 8, 2021 at 11:13 AM Jiang, Haochen via Gcc-patches wrote: > > Hi Uros, > > I have fixed that in this patch attached for checking in. Is that ok for > trunk? > Uros already said it's ok with that change, let me check in the patch for you. > Regtested on x86_64-pc-linux-gnu. > > Thx, >

Re: [PATCH] [i386] Prefer INT_SSE_REGS for SSE_FLOAT_MODE_P in preferred_reload_class.

2021-12-05 Thread Hongtao Liu via Gcc-patches
care for 64-bit moves which are expensive on 32-bit > targets. I like your version, update patch. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} w/ and w/o -march=k8. On Mon, Dec 6, 2021 at 11:41 AM liuhongt wrote: > > When moves between integer and sse registers are cheap. > >

Re: [PATCH] [i386] Prefer INT_SSE_REGS for SSE_FLOAT_MODE_P in preferred_reload_class.

2021-12-02 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 2, 2021 at 4:27 PM liuhongt wrote: > > The patch helps reload to choose GENERAL_REGS alternatives for > SSE_FLOAT_MODE and enables optimization like > > - vmovd %xmm0, -4(%rsp) > - movl $1, %eax > - addl -4(%rsp), %eax > + movd %xmm0, %eax > +

Re: [PATCH] i386: Improve V8HI and V8HF inserts [PR102811]

2021-12-01 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 2, 2021 at 6:07 AM Uros Bizjak wrote: > > Introduce vec_set_0 pattern for V8HI and V8HF modes to implement scalar > element 0 inserts to from a GP register, SSE register or memory. Also > add V8HI and V8HF AVX2 (x,x,x) alternative to PINSR insn pattern, which is > split after reload t

Re: [PATCH] [i386] Fix ICE in ix86_attr_length_immediate_default.

2021-11-30 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 30, 2021 at 5:44 PM liuhongt via Gcc-patches wrote: > > ix86_attr_length_immediate_default assumes TYPE ishift has only 1 > constant operand, > but *x86_64_shld_1/*x86_shld_1/*x86_64_shrd_1/*x86_shrd_1 have 2, with > condition: INTVAL (operands[3]) == 32 - INTVAL (operands[2]) or > INTV

Re: [PATCH] Fix regression introduced by r12-5536.

2021-11-29 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 30, 2021 at 5:21 AM Uros Bizjak wrote: > > On Mon, Nov 29, 2021 at 10:48 AM Hongtao Liu wrote: > > > > On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak wrote: > > > > > > On Mon, Nov 29, 2021 at 2:32 AM liuhongt wrote: > > > > > > >

Re: [PATCH] Fix regression introduced by r12-5536.

2021-11-29 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak wrote: > > On Mon, Nov 29, 2021 at 2:32 AM liuhongt wrote: > > > > There're several failures reported in [1]: > > 1. unsupported instruction `pextrw` for "pextrw $0, %xmm31, 16(%rax)" > > %vpextrw should be used in output templates. > > 2. ICE in get_a

Re: [PATCH] pr103194-5.c: Replace long with int64_t

2021-11-24 Thread Hongtao Liu via Gcc-patches
On Thu, Nov 25, 2021 at 12:18 PM H.J. Lu via Gcc-patches wrote: > > Replace long with int64_t to work with -mx32. Thanks. > > * gcc.target/i386/pr103194-5.c: Replace long with int64_t. > --- > gcc/testsuite/gcc.target/i386/pr103194-5.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletio

Re: [PATCH] rs6000/test: Add emulated gather test case

2021-11-24 Thread Hongtao Liu via Gcc-patches
On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches wrote: > > Hi, > > This patch is to add a test case similar to the one in i386 > to add testing coverage for 510.parest_r hotspots. > > As evaluated, the emulated gather capability of vectorizer > (r12-2733) can help to speed up SPEC2017 5

Re: [PATCH] Don't allow mask/sse/mmx mov in TLS code sequences.

2021-11-21 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 19, 2021 at 3:53 PM Uros Bizjak via Gcc-patches wrote: > > On Fri, Nov 19, 2021 at 8:50 AM Uros Bizjak wrote: > > > > On Fri, Nov 19, 2021 at 2:14 AM liuhongt wrote: > > > > > > >Why is the above declared as a special memory constraint? Also the > > > Change to define_memory_constrai

Re: [PATCH] i386: add alias for f*mul_*ch intrinsics

2021-11-17 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 16, 2021 at 4:35 PM Hongtao Liu wrote: > > On Tue, Nov 16, 2021 at 4:23 PM Kong, Lingling via Gcc-patches > wrote: > > > > Hi, > > > > This patch is to add alias for f*mul_*ch intrinsics. > > > > Ok for master? > This patch j

Re: [PATCH] i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

2021-11-16 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 16, 2021 at 4:15 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with > -mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c. > > OK for master? > > gcc/ChangeLog: > > PR target/102811

Re: [PATCH] i386: add alias for f*mul_*ch intrinsics

2021-11-16 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 16, 2021 at 4:23 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > This patch is to add alias for f*mul_*ch intrinsics. > > Ok for master? This patch just adds some macro definitions (new aliases for intrinsic) to the header file, and I think this should be low risk. And considering

Re: [PATCH v2] Check optab before transforming atomic bit test and operations

2021-11-14 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 15, 2021 at 9:37 AM Hongtao Liu wrote: > > On Sat, Nov 13, 2021 at 2:21 AM H.J. Lu via Gcc-patches > wrote: > > > > On Fri, Nov 12, 2021 at 8:13 AM Jakub Jelinek wrote: > > > > > > On Fri, Nov 12, 2021 at 07:55:26AM -0800, H.J. Lu wrote: >

Re: [PATCH v2] Check optab before transforming atomic bit test and operations

2021-11-14 Thread Hongtao Liu via Gcc-patches
On Sat, Nov 13, 2021 at 2:21 AM H.J. Lu via Gcc-patches wrote: > > On Fri, Nov 12, 2021 at 8:13 AM Jakub Jelinek wrote: > > > > On Fri, Nov 12, 2021 at 07:55:26AM -0800, H.J. Lu wrote: > > > > I have following patch queued for testing for this... > > > > > > > > 2021-11-12 Jakub Jelinek > > >

Re: [PATCH] [pass_if_conversion] Extend is_cond_scalar_reduction to handle bit_and/bit_xor/bit_ior.

2021-11-09 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 9, 2021 at 6:22 PM Richard Biener via Gcc-patches wrote: > > On Tue, Nov 9, 2021 at 3:09 AM liuhongt wrote: > > > > This will enable transformation like > > > > - # sum1_50 = PHI > > - # sum2_52 = PHI > > + # sum1_50 = PHI <_87(13), 0(4)> > > + # sum2_52 = PHI <_89(13), 0(4)> >

Re: [PATCH 1/2] [Gimple] Simplify (trunc)fmax/fmin((extend)a, (extend)b) to MAX/MIN(a,b)

2021-11-09 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 9, 2021 at 6:21 PM Richard Biener wrote: > > On Tue, Nov 9, 2021 at 3:37 AM Hongtao Liu wrote: > > > > On Mon, Nov 8, 2021 at 4:59 PM Richard Biener > > wrote: > > > > > > On Mon, Nov 8, 2021 at 2:30 AM Hongtao Liu wrote: > > >

Re: [PATCH 1/2] [Gimple] Simplify (trunc)fmax/fmin((extend)a, (extend)b) to MAX/MIN(a,b)

2021-11-08 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 8, 2021 at 4:59 PM Richard Biener wrote: > > On Mon, Nov 8, 2021 at 2:30 AM Hongtao Liu wrote: > > > > On Fri, Nov 5, 2021 at 5:52 PM Richard Biener > > wrote: > > > > > > On Fri, Nov 5, 2021 at 6:38 AM liuhongt wrote: > > > >

Re: [PATCH] i386: Optimization for mm512_set1_pch.

2021-11-07 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 5, 2021 at 3:20 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > This patch supports folding _mm512_fmadd_pch (a, _mm512_set1_pch(*(b)), c) > into 1 instruction vfmaddcph (%rsp){1to16}, %zmm1, %zmm2. > OK for master? > LGTM. > gcc/ChangeLog: > > * config/i386/sse.md (fma__

Re: [PATCH] i386: Support complex fma/conj_fma for _Float16.

2021-11-07 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 5, 2021 at 3:09 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > This patch is to support cmla_optab, cmul_optab, cmla_conj_optab, > cmul_conj_optab for vector _Float16. > Ok for master? LGTM. > gcc/ChangeLog: > > * config/i386/sse.md (cmul3): add new define_expand. >

Re: [PATCH 1/2] [Gimple] Simplify (trunc)fmax/fmin((extend)a, (extend)b) to MAX/MIN(a,b)

2021-11-07 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 5, 2021 at 5:52 PM Richard Biener wrote: > > On Fri, Nov 5, 2021 at 6:38 AM liuhongt wrote: > > > > a and b have the same type as the trunc type and have less precision than > > the extend type; the transformation is guarded by flag_finite_math_only. > > > > Bootstrapped and regtested under x86_64-

Re: [PATCH] i386: Auto vectorize sdot_prod, usdot_prod with VNNI instruction.

2021-11-03 Thread Hongtao Liu via Gcc-patches
On Thu, Nov 4, 2021 at 9:50 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > AVX512VNNI/AVXVNNI has vpdpwssd for HImode, vpdpbusd for QImode, so > adjust HImode sdot_prod expander and add QImode usdot_prod expander > to enhance vectorization for dotprod. > > Bootstrapped/regtested on x86_64-pc-lin

Re: [PATCH] i386: Fix wrong result for AMX-TILE intrinsic when parsing expression.

2021-11-03 Thread Hongtao Liu via Gcc-patches
On Thu, Nov 4, 2021 at 11:25 AM Hongyu Wang wrote: > > > Could you add a testcase for that? > > Yes, updated patch. > > Hongtao Liu via Gcc-patches wrote on Thu, Nov 4, 2021 at 10:25 AM: > > > > On Thu, Nov 4, 2021 at 9:19 AM Hongyu Wang via Gcc-patches > > wrote:

Re: [PATCH] i386: Fix wrong result for AMX-TILE intrinsic when parsing expression.

2021-11-03 Thread Hongtao Liu via Gcc-patches
On Thu, Nov 4, 2021 at 9:19 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > _tile_loadd, _tile_stored, _tile_streamloadd intrinsics are defined by > macro, so the parameters should be wrapped in parentheses to accept > expressions. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
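The general hazard behind this fix can be shown with a minimal, portable sketch. The macros below are hypothetical and are not the AMX-TILE definitions from the patch; they only illustrate how an unparenthesized macro parameter changes the meaning of an expression argument.

```c
/* Hypothetical macros for illustration only; these are not the GCC
   AMX-TILE intrinsic definitions. */
#define ROW_OFFSET_UNSAFE(stride, row) (stride * row)
#define ROW_OFFSET_SAFE(stride, row) ((stride) * (row))

/* Passes the expression `base + 1` as the row argument. */
int offset_unsafe(int stride, int base)
{
    /* Expands to (stride * base + 1): the + 1 escapes the multiply. */
    return ROW_OFFSET_UNSAFE(stride, base + 1);
}

int offset_safe(int stride, int base)
{
    /* Expands to ((stride) * (base + 1)), as the caller intended. */
    return ROW_OFFSET_SAFE(stride, base + 1);
}
```

For `stride = 4` and `base = 2`, the unsafe form computes 4 * 2 + 1 while the safe form computes 4 * 3.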

Re: [PATCH] Adjust testcase for O2 vect.

2021-10-28 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 29, 2021 at 10:34 AM Martin Sebor wrote: > > On 10/28/21 7:47 PM, Hongtao Liu wrote: > > On Fri, Oct 29, 2021 at 12:20 AM Martin Sebor via Gcc-patches > > wrote: > >> > >> On 10/28/21 1:23 AM, liuhongt via Gcc-patches wrote: > >>> Adju

Re: [PATCH] Enable vectorization for _Float16 floor/ceil/trunc/nearbyint/rint operations.

2021-10-28 Thread Hongtao Liu via Gcc-patches
On Thu, Oct 28, 2021 at 10:26 AM Hongtao Liu wrote: > > On Mon, Oct 25, 2021 at 4:24 PM liuhongt wrote: > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for trunk? > > > I'm going to check in this patch if there's

Re: [PATCH] Adjust testcase for O2 vect.

2021-10-28 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 29, 2021 at 12:20 AM Martin Sebor via Gcc-patches wrote: > > On 10/28/21 1:23 AM, liuhongt via Gcc-patches wrote: > > Adjust code in check_vect_slp_aligned_store_usage to make it an exact > > pattern match of the corresponding testcases. > > These new target/xfail selectors are added a

Re: [PATCH] Enable vectorization for _Float16 floor/ceil/trunc/nearbyint/rint operations.

2021-10-27 Thread Hongtao Liu via Gcc-patches
On Mon, Oct 25, 2021 at 4:24 PM liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > I'm going to check in this patch if there's no objection. > gcc/ChangeLog: > > PR target/102464 > * config/i386/i386-builtin-types.def (V8HF_FTYPE_V8H

Re: [PATCH] AVX512FP16: Optimize _Float16 reciprocal for div and sqrt

2021-10-27 Thread Hongtao Liu via Gcc-patches
On Tue, Oct 26, 2021 at 5:51 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For _Float16 type, add insn and expanders to optimize x / y to > x * rcp (y), and x / sqrt (y) to x * rsqrt (y). > As half float has only a minor precision difference between div and > mul * rcp, there is no need for New
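
The x / y to x * rcp (y) rewrite described above can be sketched in portable C. This is only an illustration under assumptions: it uses float instead of _Float16 so that it builds without AVX512FP16 support, and it models rcp as an exact 1.0f / y, whereas a hardware reciprocal instruction returns an approximation. The function names are made up for the example.

```c
#include <assert.h>

/* Plain division: the form that appears in the source code. */
float div_exact(float x, float y) { return x / y; }

/* Rewritten form: multiply by a reciprocal. Here 1.0f / y stands in for
   the hardware reciprocal, which in reality is only approximate. */
float div_via_rcp(float x, float y) { return x * (1.0f / y); }

/* Absolute difference between the two forms for the given inputs. */
float rcp_error(float x, float y)
{
    float d = div_exact(x, y) - div_via_rcp(x, y);
    return d < 0.0f ? -d : d;
}

/* Self-check: exact when 1/y is representable, otherwise a small error. */
void check_rcp_rewrite(void)
{
    assert(div_via_rcp(6.0f, 2.0f) == 3.0f);
    assert(rcp_error(10.0f, 3.0f) < 1e-5f);
}
```

The two forms can differ by a rounding step because the reciprocal is rounded before the multiply, which is the kind of precision difference the message refers to.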

Re: [PATCH] testsuite: i386: Fix gcc.target/i386/avx512f-pr96891-3.c on Solaris [PR102834]

2021-10-25 Thread Hongtao Liu via Gcc-patches
On Mon, Oct 25, 2021 at 10:01 PM Rainer Orth wrote: > > gcc.target/i386/avx512f-pr96891-3.c currently FAILs on 32-bit Solaris/x86: > > FAIL: gcc.target/i386/avx512f-pr96891-3.c scan-assembler-times > (?n)vpcmp[bwdq][ t]*\$7 4 > > There are only 3 instances of the expected pattern because

Re: [PATCH] Canonicalize __atomic/sync_fetch_or/xor/and for constant mask.

2021-10-25 Thread Hongtao Liu via Gcc-patches
On Mon, Oct 25, 2021 at 1:59 PM liuhongt wrote: > > Canonicalize & and nop_convert order for > __atomic_fetch_or_*, __atomic_fetch_xor_*, > __atomic_xor_fetch_*,__sync_fetch_and_or_*, > __sync_fetch_and_xor_*,__sync_xor_and_fetch_*, > __atomic_fetch_and_*,__sync_fetch_and_and_* when mask is constan

Re: [PATCH] Simplify (_Float16) sqrtf((float) a) to .SQRT(a) when a is a _Float16 value.

2021-10-25 Thread Hongtao Liu via Gcc-patches
On Mon, Oct 25, 2021 at 1:59 PM liuhongt wrote: > > Similar for sqrt/sqrtl. > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} Ok for trunk? > gcc/ChangeLog: > > PR target/102464 > * match.pd: Simplify (_Float16) sqrtf((float) a) to .SQRT(a) > when direct_inter

Re: [PATCH] i386: Combine the FADD(A, FMA(B, C, 0)) to FMA(B, C, A) and combine FADD(A, FMUL(B, C)) to FMA(B, C, A).

2021-10-24 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 22, 2021 at 1:57 PM Kong, Lingling via Gcc-patches wrote: > > Hi, > > This patch is to support transform in fast-math something like > _mm512_add_ph(x1, _mm512_fmadd_pch(a, b, _mm512_setzero_ph())) to > _mm512_fmadd_pch(a, b, x1). > > And support transform _mm512_add_ph(x1, _mm512_f

Re: [PATCH v4] Improve integer bit test on __atomic_fetch_[or|and]_* returns

2021-10-21 Thread Hongtao Liu via Gcc-patches
On Wed, Oct 13, 2021 at 8:34 PM Richard Biener via Gcc-patches wrote: > > On Sun, Oct 10, 2021 at 3:49 PM H.J. Lu wrote: > > > > Changes in v4: > > > > 1. Bypass redundant check when inputs have been transformed to the > > equivalent canonical form with valid bit operation. > > > > Changes
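
For readers unfamiliar with the pattern named in the subject line, the following is a minimal sketch of an integer bit test on the value returned by __atomic_fetch_or and __atomic_fetch_and. The helper names are invented for the example. On x86 a compiler may be able to turn this shape into a single locked bit-test instruction, but whether it does depends on the compiler version and options; the sketch makes no claim about what this particular patch emits.

```c
#include <assert.h>
#include <stdbool.h>

/* Atomically set bit `bit` of *word and report whether it was already set.
   The old value is masked with the same single-bit mask that was OR-ed in. */
bool test_and_set_bit(unsigned int *word, unsigned int bit)
{
    unsigned int mask = 1u << bit;
    unsigned int old = __atomic_fetch_or(word, mask, __ATOMIC_SEQ_CST);
    return (old & mask) != 0;
}

/* Atomically clear bit `bit` of *word and report whether it was set before. */
bool test_and_clear_bit(unsigned int *word, unsigned int bit)
{
    unsigned int mask = 1u << bit;
    unsigned int old = __atomic_fetch_and(word, ~mask, __ATOMIC_SEQ_CST);
    return (old & mask) != 0;
}

/* Self-check on a single thread: set, observe, clear, observe. */
void check_bit_helpers(void)
{
    unsigned int w = 0;
    assert(!test_and_set_bit(&w, 3));
    assert(w == 8u);
    assert(test_and_clear_bit(&w, 3));
    assert(w == 0u);
}
```

Only the tested bit of the returned value is used, so the full old value never has to be materialized; that is what makes the pattern a candidate for a single read-modify-write instruction.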

Re: [PATCH] i386: Fix wrong codegen for V8HF move without TARGET_AVX512F

2021-10-21 Thread Hongtao Liu via Gcc-patches
de 100644 > index 000..bad4fa9394e > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr102812.c > @@ -0,0 +1,12 @@ > +/* PR target/102812 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -msse4 -mno-avx" } */ > +/* { dg-final { scan-assembler-no

Re: [PATCH] i386: Fix wrong codegen for V8HF move without TARGET_AVX512F

2021-10-20 Thread Hongtao Liu via Gcc-patches
On Wed, Oct 20, 2021 at 1:31 PM Hongyu Wang via Gcc-patches wrote: > > Since _Float16 type is enabled under sse2 target, returning > V8HFmode vector without AVX512F target would generate wrong > vmovdqa64 instruction. Adjust ix86_get_ssemov to avoid this. > > Bootstrapped/regtested on x86_64-pc-lin

Re: [PATCH] Adjust testcase for O2 vectorization.

2021-10-20 Thread Hongtao Liu via Gcc-patches
On Thu, Oct 21, 2021 at 10:06 AM Hongtao Liu wrote: > > On Thu, Oct 21, 2021 at 9:20 AM Hongtao Liu wrote: > > > > On Wed, Oct 20, 2021 at 7:34 PM Christophe Lyon via Gcc-patches > > wrote: > > > > > > Hi, > > > > > > > > >

Re: [PATCH] Adjust testcase for O2 vectorization.

2021-10-20 Thread Hongtao Liu via Gcc-patches
On Thu, Oct 21, 2021 at 9:20 AM Hongtao Liu wrote: > > On Wed, Oct 20, 2021 at 7:34 PM Christophe Lyon via Gcc-patches > wrote: > > > > Hi, > > > > > > On Tue, Oct 19, 2021 at 11:03 AM liuhongt via Gcc-patches < > > gcc-patches@gcc.gnu.org

Re: [PATCH] Adjust testcase for O2 vectorization.

2021-10-20 Thread Hongtao Liu via Gcc-patches
On Wed, Oct 20, 2021 at 7:34 PM Christophe Lyon via Gcc-patches wrote: > > Hi, > > > On Tue, Oct 19, 2021 at 11:03 AM liuhongt via Gcc-patches < > gcc-patches@gcc.gnu.org> wrote: > > > updated patch: > > 1. Add documents in doc/sourcebuild.texi (Effective-Target Keywords). > > 2. Reduce -novec

Re: [PATCH] Adjust testcase for O2 vectorization.

2021-10-17 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 15, 2021 at 3:11 PM Kewen.Lin via Gcc-patches wrote: > > on 2021/10/14 6:56 PM, Kewen.Lin via Gcc-patches wrote: > > Hi Hongtao, > > > > on 2021/10/14 3:11 PM, liuhongt wrote: > >> Hi Kewen: > >> Could you help verify whether this patch fixes those regressions > >> for rs6000 port. > >> >

Re: [PATCH] Adjust testcase for O2 vectorization.

2021-10-17 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 15, 2021 at 11:37 PM Martin Sebor wrote: > > On 10/14/21 1:11 AM, liuhongt wrote: > > Hi Kewen: > >Could you help verify whether this patch fixes those regressions > > for rs6000 port. > > > > As discussed in [1], this patch adds xfail/target selectors to those > > testcases, also make a

Re: [PATCH] AVX512FP16: Add *_set1_pch intrinsics.

2021-10-17 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 15, 2021 at 4:38 PM dianhong.xu--- via Gcc-patches wrote: > > From: dianhong xu > > Add *_set1_pch (_Float16 _Complex A) intrinsics. > > gcc/ChangeLog: > > * config/i386/avx512fp16intrin.h: > (_mm512_set1_pch): New intrinsic. > * config/i386/avx512fp16vlintrin.

Re: [PATCH] AVX512FP16: Support vector shuffle builtins

2021-10-14 Thread Hongtao Liu via Gcc-patches
checks for NULL seems reasonable according to documents, > > op0,op1,target may be NULL. > Thanks for pointing it out, didn't realize the difference between > these 2 functions. LGTM. > > Updated patch. > > Hongtao Liu wrote on Fri, Oct 15, 2021 at 1:54 PM: > > > > On Fri, O

Re: [PATCH] AVX512FP16: Support vector shuffle builtins

2021-10-14 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 15, 2021 at 1:37 PM Hongyu Wang wrote: > > > This part seems not related to vector shuffle. > Yes, have separated this part to another patch and checked-in. > > Updated patch. Ok for this one? > > Hongtao Liu via Gcc-patches wrote on Thu, Oct 14, 2021 at 2:33 PM: > >

Re: [PATCH] AVX512FP16: Fix ICE for 2 v4hf vector concat

2021-10-14 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 15, 2021 at 1:07 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For V4HFmode, doing vector concat like > > __builtin_shufflevector (a, b, {0, 1, 2, 3, 4, 5, 6, 7}) > > could trigger ICE since it is not handled in ix86_vector_init (). > > Handle HFmode like HImode to avoid such ICE.

Re: [PATCH] AVX512FP16: Support vector shuffle builtins

2021-10-13 Thread Hongtao Liu via Gcc-patches
On Thu, Oct 14, 2021 at 10:39 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > This patch supports HFmode vector shuffle by creating HImode subreg when > expanding permutation expr. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde{-m32,} > OK for master? > > gcc/ChangeLog: > >

Re: [PATCH] AVX512FP16: Adjust builtin for mask complex fma

2021-10-13 Thread Hongtao Liu via Gcc-patches
On Wed, Oct 13, 2021 at 5:07 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > Current mask/mask3 implementation for complex fma contains > duplicated parameter in macro, which may cause error at -O0. > Refactor macro implementation to builtins to avoid potential > error. > > For round intrinsic w
