Re: PING^1 [PATCH] x86: Skip ENDBR when emitting direct call/jmp to local function

2022-06-26 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 21, 2022 at 3:50 AM Uros Bizjak via Gcc-patches wrote: > > On Mon, Jun 20, 2022 at 8:14 PM H.J. Lu wrote: > > > > On Tue, May 10, 2022 at 9:25 AM H.J. Lu wrote: > > > > > > Mark a function with SYMBOL_FLAG_FUNCTION_ENDBR when inserting ENDBR at > > > function entry. Skip the 4-byte

Re: [PATCH] Add optional __Bfloat16 support

2022-06-13 Thread Hongtao Liu via Gcc-patches
On Sat, Jun 11, 2022 at 1:46 AM H.J. Lu wrote: > > On Fri, Jun 10, 2022 at 7:44 AM H.J. Lu wrote: > > > > On Fri, Jun 10, 2022 at 2:38 AM Florian Weimer wrote: > > > > > > * liuhongt via Libc-alpha: > > > > > > > +\subsubsection{Special Types} > > > > + > > > > +The \code{__Bfloat16} type uses

Re: [PATCH] testsuite: Add -mtune=generic to dg-options for two testcases.

2022-06-10 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 10, 2022 at 4:45 PM Cui,Lili via Gcc-patches wrote: > > This patch is to change dg-options for two testcases. > > Use -mtune=generic to limit these two testcases. Because configuring them with > -mtune=cascadelake or znver3 will vectorize them. > > regtested on

Re: [PATCH] Add optional __Bfloat16 support

2022-06-10 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 10, 2022 at 3:47 PM liuhongt via Libc-alpha wrote: > > Pass and return __Bfloat16 values in XMM registers. > > Background: > __Bfloat16 (BF16) is a new floating-point format that can accelerate machine > learning (deep learning training, in particular) algorithms. > It's first

Re: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit preference to vector store.

2022-06-09 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 8, 2022 at 11:44 AM Cui, Lili wrote: > > > -Original Message- > > From: Hongtao Liu > > Sent: Monday, June 6, 2022 1:25 PM > > To: H.J. Lu > > Cc: Cui, Lili ; Liu, Hongtao ; > > GCC > > Patches > > Subject: Re: [PATCH] U

Re: [PATCH] Disparages SSE_REGS alternatives slightly with ?v instead of *v in *mov{si, di}_internal.

2022-06-07 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 7, 2022 at 3:41 PM liuhongt via Gcc-patches wrote: > > So alternative v won't be ignored in record_reg_classess. > > Similar for *r alternatives in some vector patterns. > > It helps testcase in the PR, also RA now makes better decisions for > gcc.target/i386/extract-insert-combining.c

Re: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit preference to vector store.

2022-06-05 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 1, 2022 at 11:56 PM H.J. Lu via Gcc-patches wrote: > > On Tue, May 31, 2022 at 10:06 PM Cui,Lili wrote: > > > > This patch is to update {skylake,icelake,alderlake}_cost to add a bit > > preference to vector store. > > Since the integer vector construction cost has changed, we need

Re: [PATCH] x86: harmonize __builtin_ia32_psadbw*() types

2022-06-05 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 6, 2022 at 3:17 AM Uros Bizjak via Gcc-patches wrote: > > On Thu, Jun 2, 2022 at 5:04 PM Jan Beulich wrote: > > > > The 64-bit, 128-bit, and 512-bit variants have VDI return type, in > > line with instruction behavior. Make the 256-bit builtin match, thus > > also making it match the

Re: [x86_64 PATCH] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.

2022-06-02 Thread Hongtao Liu via Gcc-patches
On Thu, Jun 2, 2022 at 2:24 PM Roger Sayle wrote: > > > This patch resolves PR target/105791 which is a regression that was > accidentally introduced for my workaround to PR tree-optimization/10566. > (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it > shouldn't). The latest

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-31 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 1, 2022 at 12:40 AM Richard Sandiford wrote: > > Vladimir Makarov via Gcc-patches writes: > > On 2022-05-29 23:05, Hongtao Liu wrote: > >> On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches > >> wrote: > >>> > >>>

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-30 Thread Hongtao Liu via Gcc-patches
On Mon, May 30, 2022 at 3:44 PM Alexander Monakov wrote: > > On Mon, 30 May 2022, Hongtao Liu wrote: > > > On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches > > wrote: > > > > > > > > The spill is mainly decided by 3 insns related

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-30 Thread Hongtao Liu via Gcc-patches
On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches wrote: > > > > In the PR, the spill happens in the initial basic block of the function, > > > i.e. > > > the one with the highest frequency. > > > > > > Also as noted in the PR, swapping the 'unlikely' branch to 'likely' > > >

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-29 Thread Hongtao Liu via Gcc-patches
On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches wrote: > > > On 2022-05-24 23:39, liuhongt wrote: > > Right now, mem_cost for separate mem alternative is 1 * frequency which > > is pretty small and caused the unnecessary SSE spill in the PR, I've tried > > to rework backend cost

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-24 Thread Hongtao Liu via Gcc-patches
On Wed, May 25, 2022 at 11:39 AM liuhongt via Gcc-patches wrote: > > Right now, mem_cost for separate mem alternative is 1 * frequency which > is pretty small and caused the unnecessary SSE spill in the PR, I've tried > to rework backend cost model, but RA still not happy with that(regress >

Re: [PATCH] [i386] recognize bzhi pattern when there's zero_extendsidi.

2022-05-17 Thread Hongtao Liu via Gcc-patches
On Tue, May 17, 2022 at 6:07 PM Uros Bizjak via Gcc-patches wrote: > > On Tue, May 17, 2022 at 5:06 AM liuhongt wrote: > > > > backend has > > > > 16550(define_insn "*bmi2_bzhi_3_2" > > 16551 [(set (match_operand:SWI48 0 "register_operand" "=r") > > 16552(and:SWI48 > > 16553

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.

2022-05-17 Thread Hongtao Liu via Gcc-patches
On Tue, May 17, 2022 at 6:03 PM Uros Bizjak wrote: > > On Tue, May 17, 2022 at 3:33 AM Hongtao Liu wrote: > > > > On Mon, May 16, 2022 at 5:21 PM Uros Bizjak via Gcc-patches > > wrote: > > > > > > On Sat, May 7, 2022 at 7:05 AM liuhongt wrote:

Re: [PATCH] [Middle-end] Enhance final_value_replacement_loop to handle bitwise induction.

2022-05-17 Thread Hongtao Liu via Gcc-patches
On Fri, May 13, 2022 at 7:16 PM Richard Biener wrote: > > On Fri, May 13, 2022 at 5:37 AM Hongtao Liu wrote: > > > > On Wed, May 11, 2022 at 4:45 PM Richard Biener via Gcc-patches > > wrote: > > > > > > On Mon, May 9, 2022 at 7:19 AM liuhongt wrote:

Re: [committed] forwprop: Fix a typo and comment formatting

2022-05-17 Thread Hongtao Liu via Gcc-patches
thanks. On Tue, May 17, 2022 at 3:09 PM Jakub Jelinek via Gcc-patches wrote: > > Hi! > > When looking around the spot of the PR105591 fix, I've noticed a typo > and incorrectly formatted comment. > > Bootstrapped/regtested on x86_64-linux and i686-linux, committed to > trunk as obvious. > >

Re: [PATCH] [i386] recognize bzhi pattern when there's zero_extendsidi.

2022-05-16 Thread Hongtao Liu via Gcc-patches
On Tue, May 17, 2022 at 11:06 AM liuhongt via Gcc-patches wrote: > > backend has > > 16550(define_insn "*bmi2_bzhi_3_2" > 16551 [(set (match_operand:SWI48 0 "register_operand" "=r") > 16552(and:SWI48 > 16553 (plus:SWI48 > 16554(ashift:SWI48 (const_int 1) > 16555

Re: [PATCH v2] Optimize vpermtiw/b to vpunpcklqdq for certain cases.

2022-05-16 Thread Hongtao Liu via Gcc-patches
I've committed the patch. On Fri, May 13, 2022 at 5:22 PM liuhongt via Gcc-patches wrote: > > Here's updated patch which adds ix86_pre_reload_split () to those 2 > define_insn_and_splits. > > Assembly Optimization like: > - vmovq %xmm0, %xmm2 > - vmovdqa .LC0(%rip), %xmm0 >

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.

2022-05-16 Thread Hongtao Liu via Gcc-patches
On Mon, May 16, 2022 at 5:21 PM Uros Bizjak via Gcc-patches wrote: > > On Sat, May 7, 2022 at 7:05 AM liuhongt wrote: > > > > This is adjusted patch only for OImode. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for trunk? > > > > gcc/ChangeLog: > > > > PR

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.

2022-05-15 Thread Hongtao Liu via Gcc-patches
ping. On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches wrote: > > This is adjusted patch only for OImode. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/104610 > * config/i386/i386-expand.cc

Re: [PATCH] [Middle-end] Enhance final_value_replacement_loop to handle bitwise induction.

2022-05-12 Thread Hongtao Liu via Gcc-patches
On Wed, May 11, 2022 at 4:45 PM Richard Biener via Gcc-patches wrote: > > On Mon, May 9, 2022 at 7:19 AM liuhongt wrote: > > > > This patch will enable below optimization: > > > > { > > - int bit; > > - long long unsigned int _1; > > - long long unsigned int _2; > > - > > [local count:

Re: [PATCH v2] Strip of a vector load which is only used partially.

2022-05-11 Thread Hongtao Liu via Gcc-patches
On Tue, May 10, 2022 at 2:54 PM Richard Biener via Gcc-patches wrote: > > On Mon, May 9, 2022 at 7:11 AM liuhongt via Gcc-patches > wrote: > > > > Here's adjusted patch. > > Ok for trunk? > > > > Optimize > > > > _4 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>; > > _5 = BIT_FIELD_REF

Re: [PATCH] [i386] Optimize movzwl + vmovd/vmovq to vmovw.

2022-05-10 Thread Hongtao Liu via Gcc-patches
On Mon, May 9, 2022 at 4:28 PM Uros Bizjak wrote: > > On Mon, May 9, 2022 at 4:03 AM liuhongt wrote: > > > > Similarly optimize movl + vmovq to vmovd. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for trunk? > > > > gcc/ChangeLog: > > > > PR target/104915 > >

Re: [PATCH] [i386] Implement permutation with pslldq + psrldq + por when pshufb is not available.

2022-05-09 Thread Hongtao Liu via Gcc-patches
On Mon, May 9, 2022 at 4:19 PM Uros Bizjak wrote: > > On Mon, May 9, 2022 at 7:24 AM Hongtao Liu wrote: > > > > On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches > > wrote: > > > > > > pand/pandn may be used to clear upper/lower bits of the oper

Re: [PATCH] Optimize vec_setv8{hi,hf}_0 + pmovzxbq to pmovzxbq.

2022-05-09 Thread Hongtao Liu via Gcc-patches
On Mon, May 9, 2022 at 2:43 PM liuhongt via Gcc-patches wrote: > > Clean up of 16-bit uppers is not needed for pmovzxbq/pmovsxbq. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/105072 > * config/i386/sse.md

Re: [PATCH] [i386] Implement permutation with pslldq + psrldq + por when pshufb is not available.

2022-05-08 Thread Hongtao Liu via Gcc-patches
On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches wrote: > > pand/pandn may be used to clear upper/lower bits of the operands, in > that case there will be 4-5 instructions for permutation, and it's > still better than scalar codes. > > Bootstrapped and regtested on

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.

2022-05-08 Thread Hongtao Liu via Gcc-patches
On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches wrote: > > This is adjusted patch only for OImode. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/104610 > * config/i386/i386-expand.cc (ix86_expand_branch): Use

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OI/TImode.

2022-05-05 Thread Hongtao Liu via Gcc-patches
On Thu, May 5, 2022 at 4:09 PM Uros Bizjak via Gcc-patches wrote: > > On Thu, May 5, 2022 at 9:50 AM Richard Biener via Gcc-patches > wrote: > > > > On Thu, May 5, 2022 at 9:37 AM liuhongt via Gcc-patches > > wrote: > > > > > > Enable optimization for TImode only under 32-bit target, for 64-bit

Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OI/TImode.

2022-05-05 Thread Hongtao Liu via Gcc-patches
On Thu, May 5, 2022 at 3:37 PM liuhongt wrote: > > Enable optimization for TImode only under 32-bit target, for 64-bit > target there could be extra integer <-> sse move regarding psABI, > not efficient. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} > Ok for trunk? > >

Re: [PATCH] AVX512F: Add missing macro for mask(z?)_scalf_s[sd] [PR 105339]

2022-04-23 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 22, 2022 at 8:43 PM Hongyu Wang wrote: > > > Please add the corresponding intrinsic test in sse-14.c > > Sorry for forgetting this part. Updated patch. Thanks. > LGTM. > Hongtao Liu via Gcc-patches 于2022年4月22日周五 16:49写道: > > > > On Fri, Apr 22, 20

Re: [PATCH] AVX512F: Add missing macro for mask(z?)_scalf_s[sd] [PR 105339]

2022-04-22 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 22, 2022 at 4:12 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > Add missing macro under O0 and adjust macro format for scalf > intrinsics. > Please add the corresponding intrinsic test in sse-14.c. > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for master and

Re: [x86_64 PATCH] Support pandn for V1TI mode (i.e. *andnotv1ti3).

2022-04-05 Thread Hongtao Liu via Gcc-patches
On Wed, Apr 6, 2022 at 5:56 AM Roger Sayle wrote: > > > > This simple patch allows the i386 backend to generate pandn instructions > > for V1TI mode. Currently, the testcase: > > > > typedef unsigned __int128 v1ti __attribute__ ((__vector_size__ (16))); > > v1ti andnot1(v1ti x, v1ti y) { return

Re: [PATCH V3] Split vector load from parm_del to elemental loads to avoid STLF stalls.

2022-04-04 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 1, 2022 at 4:32 PM liuhongt via Gcc-patches wrote: > > Update in V3: > 1. Add -param=x86-stlf-window-ninsns= (default 64). > 2. Exclude call in the window. > > Since cfg is freed before machine_reorg, just do a rough calculation > of the window according to the layout. > Also

Re: [PATCH] Split vector load from parm_del to elemental loads to avoid STLF stalls.

2022-04-01 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 1, 2022 at 2:54 PM Richard Biener via Gcc-patches wrote: > > On Fri, Apr 1, 2022 at 8:47 AM liuhongt via Gcc-patches > wrote: > > > > Update in V2: > > 1. Use get_insns instead of FOR_EACH_BB_CFUN and FOR_BB_INSNS. > > 2. Return for any_uncondjump_p and ANY_RETURN_P. > > 3. Add dump

Re: [PATCH] Split vector load from parm_del to elemental loads to avoid STLF stalls.

2022-04-01 Thread Hongtao Liu via Gcc-patches
On Thu, Mar 31, 2022 at 6:45 PM Richard Biener via Gcc-patches wrote: > > On Thu, Mar 31, 2022 at 7:51 AM liuhongt wrote: > > > > Since cfg is freed before machine_reorg, just do a rough calculation > > of the window according to the layout. > > Also according to an experiment on CLX, set window

Re: [PATCH] x86: Use x constraint on KL patterns

2022-03-26 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 10:05 AM Hongyu Wang via Gcc-patches wrote: > > > > Is it possible to create a test case that gas would throw an error for > > > invalid operands? > > > > You can use -ffix-xmmN to disable XMM0-15. > > I mean can we create an intrinsic test for this PR that produces

Re: [PATCH] x86: Use x constraint on SSSE3 patterns with MMX operands

2022-03-26 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 1:27 AM H.J. Lu via Gcc-patches wrote: > > Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND > have no AVX512 version, replace the "Yv" register constraint with the > "x" register constraint. LGTM, please backport to GCC10/GCC11 branch. > > PR

Re: [PATCH] x86: Use x constraint on KL patterns

2022-03-26 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 26, 2022 at 4:50 AM H.J. Lu via Gcc-patches wrote: > > Since KL instructions have no AVX512 version, replace the "v" register > constraint with the "x" register constraint. > > PR target/105058 > * config/i386/sse.md (loadiwkey): Replace "v" with "x". >

Re: [PATCH][RFC] tree-optimization/101908 - avoid STLF fails when vectorizing

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 25, 2022 at 9:42 PM Richard Biener wrote: > > On Fri, 25 Mar 2022, Hongtao Liu wrote: > > > On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches > > wrote: > > > > > > Since we're now vectorizing by default at -O2 issues like PR1019

Re: [PATCH][RFC] tree-optimization/101908 - avoid STLF fails when vectorizing

2022-03-25 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches wrote: > > Since we're now vectorizing by default at -O2 issues like PR101908 > become more important where we apply basic-block vectorization to > parts of the function covering loads from function parameters passed > on the stack.

Re: [PATCH] Fix ICE caused by NULL_RTX returned by lowpart_subreg.

2022-03-23 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 23, 2022 at 2:05 PM liuhongt via Gcc-patches wrote: > > In validate_subreg, both (subreg:V2HF (reg:SI) 0) > and (subreg:V8HF (reg:V2HF) 0) are valid, but not > for (subreg:V8HF (reg:SI) 0) which causes ICE. > > Ideally it should be handled in validate_subreg to support > subreg for

Re: [PATCH] [i386] Extend splitter pattern to reversed condition by swapping then and else rtx. [PR target/104982]

2022-03-21 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 21, 2022 at 9:06 PM liuhongt wrote: > > Failed to match this instruction: > (set (reg/v:SI 88 [ z ]) > (if_then_else:SI (eq (zero_extract:SI (reg:SI 92) > (const_int 1 [0x1]) > (zero_extend:SI (subreg:QI (reg:SI 93) 0))) > (const_int 0

Re: [PATCH v2] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-21 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 21, 2022 at 7:52 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For complex scalar intrinsic like _mm_mask_fcmadd_sch, the > mask should be and by 1 to ensure the mask is bind to lowest byte. > Use masked vmovss to perform same operation which omits higher bits > of mask. > >

Re: [PATCH] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-20 Thread Hongtao Liu via Gcc-patches
m_mask_move_ss (__m128 src, __mmask8 k, __m128 a, __m128 b) https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=vmovss_expand=3807,3081,3082,3084,3083,4837,4838 > > LLVM generates mask & 1 for these intrinsics. > > Hongtao Liu via Gcc-patches 于2022年3月21日周一

Re: [PATCH] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]

2022-03-20 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 19, 2022 at 8:09 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > For complex scalar intrinsic like _mm_mask_fcmadd_sch, the > mask should be and by 1 to ensure the mask is bind to lowest byte. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. > > Ok for master? > >

Re: [PATCH] AVX512FP16: Fix masm=intel output for vfc?(madd|mul)csh [PR 104977]

2022-03-20 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 19, 2022 at 8:06 AM Hongyu Wang via Gcc-patches wrote: > > Hi, > > This patch fixes typo in subst for scalar complex mask_round operand. > > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. > > Ok for master? > Ok. > gcc/ChangeLog: > > PR target/104977 > *

Re: [PATCH] x86: Correct march=sapphirerapids to base on icelake server

2022-03-18 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote: > > Hi Hongtao, > > This patch is to correct march=sapphirerapids to base on icelake server. > and update sapphirerapids in the documentation. > > OK for master and backport to GCC 11? Ok. > > > gcc/Changelog: > > PR target/104963 >

Re: [PATCH] [i386] Add extra cost for unsigned_load which may have stall forward issue.

2022-03-17 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 16, 2022 at 5:54 PM Richard Biener via Gcc-patches wrote: > > On Wed, Mar 16, 2022 at 3:19 AM liuhongt wrote: > > > > This patch only handle pure-slp for by-value passed parameter which > > has nothing to do with IPA but psABI. For by-reference passed > > parameter IPA is required. >

Re: [x86 PATCH] PR target/94680: Clear upper bits of V2DF using movq (like V2DI).

2022-03-15 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 15, 2022 at 10:52 PM Roger Sayle wrote: > > > This simple i386 patch unblocks a more significant change. The testcase > gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and > alas the fix for PR target/94680 doesn't (yet) handle V2DF mode. > > For the first test

Re: [PATCH v2] x86: Also check _SOFT_FLOAT in

2022-03-15 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 15, 2022 at 10:40 PM H.J. Lu wrote: > > On Mon, Mar 14, 2022 at 7:31 AM H.J. Lu wrote: > > > > Push target("general-regs-only") in if x87 is enabled. > > > > gcc/ > > > > PR target/104890 > > * config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before > >

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-14 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 14, 2022 at 8:20 PM Hongtao Liu wrote: > > On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote: > > > > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote: > > > LGTM, thanks for handling this. > > > > Thanks, committed. >

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-14 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote: > > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote: > > LGTM, thanks for handling this. > > Thanks, committed. > > > > Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2, > > > f

Re: [PATCH] i386: Fix up _mm_loadu_si{16,32} [PR99754]

2022-03-13 Thread Hongtao Liu via Gcc-patches
On Sun, Mar 13, 2022 at 3:28 AM Jakub Jelinek wrote: > > Hi! > > These intrinsics are supposed to do an unaligned may_alias load > of a 16-bit or 32-bit value and store it as the first element of > a 128-bit integer vector, with all other elements cleared. > > The current _mm_storeu_*

Re: [PATCH] target/104762 - vectorization costs of CONSTRUCTORs

2022-03-11 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 11, 2022 at 8:43 PM Richard Biener via Gcc-patches wrote: > > After accounting for GPR -> XMM move cost for vec_construct the > base cost needs adjustments to not double-cost those. This also > lowers the cost when such move is not necessary. > > This fixes the observed 538.imagick_r

Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-03-10 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 8, 2022 at 9:30 AM Hongtao Liu wrote: > > ping^1 > > On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote: > > > > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote: > > > > > > The patch fixes ICE in ix86_gimple_fold_builtin. > > > >

Re: [PATCH] [i386] Prevent vectorization for load from parm_decl at O2 to avoid STF issue.

2022-03-07 Thread Hongtao Liu via Gcc-patches
On Mon, Mar 7, 2022 at 5:37 PM Richard Biener via Gcc-patches wrote: > > On Fri, Mar 4, 2022 at 8:27 AM liuhongt wrote: > > > > For parameter passing through stack, vectorized load from parm_decl > > in callee may trigger serious STF issue. This is why GCC12 regresses > > 50% for cray at -O2

Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-03-07 Thread Hongtao Liu via Gcc-patches
ping^1 On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote: > > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote: > > > > The patch fixes ICE in ix86_gimple_fold_builtin. > > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > Ok for main trunk? > > > g

Re: [PATCH] [i386] Prevent vectorization for load from parm_decl at O2 to avoid STF issue.

2022-03-07 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 4, 2022 at 3:28 PM liuhongt via Gcc-patches wrote: > > For parameter passing through stack, vectorized load from parm_decl > in callee may trigger serious STF issue. This is why GCC12 regresses > 50% for cray at -O2 compared to GCC11. > > The patch add an extremely large number to

Re: [PATCH V2] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-06 Thread Hongtao Liu via Gcc-patches
Met some problem in git send-email --cc=a,b,c, so manually CC. On Mon, Mar 7, 2022 at 1:11 PM liuhongt via Gcc-patches wrote: > > >What happens if you set preferred_for_speed to false for alternative 1? > It works, and I've removed the newly added splitter in this patch. > Also i tried to do

Re: [PATCH] i386: Fix up cond_{and,ior,xor,mul}* [PR104779]

2022-03-06 Thread Hongtao Liu via Gcc-patches
On Sat, Mar 5, 2022 at 4:05 PM Jakub Jelinek wrote: > > Hi! > > The following testcase ICEs, because the cond_andv* expander > has vector_operand predicates in both of the commutative inputs > and calls gen_andv*_mask which calls ix86_binary_operator_ok > in its condition, but nothing calls

Re: [PATCH] [i386] Optimize v4si broadcast for noavx512vl.

2022-03-03 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches wrote: > > This is incremental patch based on [1], it enables optimization as below > > - vbroadcastss.LC1(%rip), %xmm0 > + movl$-45, %edx > + vmovd %edx, %xmm0 > + vpshufd $0, %xmm0, %xmm0 > > According to

Re: [PATCH] x86: Always return pseudo register in ix86_gen_scratch_sse_rtx

2022-03-03 Thread Hongtao Liu via Gcc-patches
On Thu, Mar 3, 2022 at 10:22 PM H.J. Lu via Gcc-patches wrote: > > ix86_gen_scratch_sse_rtx returns XMM7/XMM15/XMM31 as a scratch vector > register to prevent RTL optimizers from removing vector register. It > introduces a conflict with explicit XMM7/XMM15/XMM31 usage and when it > is called by

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-03-01 Thread Hongtao Liu via Gcc-patches
On Wed, Mar 2, 2022 at 6:49 AM H.J. Lu wrote: > > On Tue, Mar 1, 2022 at 7:06 AM H.J. Lu wrote: > > > > On Mon, Feb 28, 2022 at 9:36 PM Hongtao Liu wrote: > > > > > > On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches > > > wrote: > >

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches wrote: > > On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu wrote: > > > > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote: > > > > > > .. in ix86_expand_vector_move and > > > ix86_convert_const_wide_int_to_broadcast(called by the former). > > > > >

Re: [PATCH] [i386] Replace ix86_gen_scratch_sse_rtx with gen_reg_rtx.

2022-02-28 Thread Hongtao Liu via Gcc-patches
On Tue, Mar 1, 2022 at 10:27 AM H.J. Lu via Gcc-patches wrote: > > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote: > > > > .. in ix86_expand_vector_move and > > ix86_convert_const_wide_int_to_broadcast(called by the former). > > > > ix86_expand_vector_move is called by emit_move_insn which is

Re: [PATCH] AVX512F: Add helper enumeration for ternary logic intrinsics.

2022-02-27 Thread Hongtao Liu via Gcc-patches
On Fri, Feb 25, 2022 at 4:44 PM Hongyu Wang via Gcc-patches wrote: > > Hi, > > This patch intends to sync with llvm change in > https://reviews.llvm.org/D120307 to add enumeration and truncate This will be documented in intel intrinsic guide. > imm to unsigned char, so users could use ~ on

Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-02-24 Thread Hongtao Liu via Gcc-patches
On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote: > > The patch fixes ICE in ix86_gimple_fold_builtin. > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for main trunk? > gcc/ChangeLog: > > PR target/104666 > * config/i386/i386-expand.cc >

Re: [PATCH] [i386] Fix typo in v1ti3.

2022-02-23 Thread Hongtao Liu via Gcc-patches
On Wed, Feb 23, 2022 at 5:48 PM Jakub Jelinek via Gcc-patches wrote: > > On Wed, Feb 23, 2022 at 05:21:26PM +0800, liuhongt via Gcc-patches wrote: > > For evex encoding vp{xor,or,and}, suffix is needed. > > > > Or there would be an error for > > vpxor %ymm0, %ymm31, %ymm1 > > The insn is about

Re: [PATCH 3/3] target/99881 - x86 vector cost of CTOR from integer regs

2022-02-21 Thread Hongtao Liu via Gcc-patches
On Mon, Feb 21, 2022 at 5:10 PM Richard Biener wrote: > > On Mon, 21 Feb 2022, Hongtao Liu wrote: > > > On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches > > wrote: > > > > > > This uses the now passed SLP node to the vectorizer costing h

Re: [PATCH] i386: Fix up copysign/xorsign expansion [PR104612]

2022-02-21 Thread Hongtao Liu via Gcc-patches
On Tue, Feb 22, 2022 at 12:46 AM Jakub Jelinek wrote: > > Hi! > > We ICE on the following testcase for -m32 since r12-3435. because > operands[2] is (subreg:SF (reg:DI ...) 0) and According to validate_subreg, (subreg:V4SF (reg:DI ...) 0) should be valid(but not sure if it really works ) For

Re: [PATCH v2] x86: Add TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO

2022-02-21 Thread Hongtao Liu via Gcc-patches
On Tue, Feb 22, 2022 at 2:35 AM H.J. Lu wrote: > > On Sun, Feb 20, 2022 at 6:01 PM Hongtao Liu wrote: > > > > On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote: > > > > > > On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote: > > > > On

Re: [PATCH v2] x86: Add TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO

2022-02-20 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote: > > On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote: > > On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches > > wrote: > > > > > > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches

Re: [PATCH 3/3] target/99881 - x86 vector cost of CTOR from integer regs

2022-02-20 Thread Hongtao Liu via Gcc-patches
On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches wrote: > > This uses the now passed SLP node to the vectorizer costing hook > to adjust vector construction costs for the cost of moving an > integer component from a GPR to a vector register when that's > required for building a

Re: [PATCH] target/104581 - compile-time regression in mode-switching

2022-02-17 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 9:47 PM Richard Biener via Gcc-patches wrote: > > The x86 backend piggy-backs on mode-switching for insertion of > vzeroupper. A recent improvement there was implemented in a way > to walk possibly the whole basic-block for all DF reg def definitions > in its mode_needed

Re: [PATCH] x86: Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches wrote: > > Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge, > Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX > transition penalty. Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to > generate

Re: [PATCH] [i386] Clean up MPX-related bit_{MPX,BNDREGS,BNDCSR}.

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 12:00 PM liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > * config/i386/cpuid.h (bit_MPX): Removed. > (bit_BNDREGS): Ditto. > (bit_BNDCSR): Ditto. > --- > gcc/config/i386/cpuid.h | 5

Re: [PATCH] Restrict the two sources of vect_recog_cond_expr_convert_pattern to be of the same type when convert is extension.

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Wed, Feb 16, 2022 at 10:17 PM Jakub Jelinek via Gcc-patches wrote: > > On Wed, Feb 16, 2022 at 05:03:09PM +0800, liuhongt via Gcc-patches wrote: > > > > +(match (cond_expr_convert_p @0 @2 @3 @6) > > > > + (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3)) > > > > + (if

Re: [GCC 11 PATCH 0/5] x86: Backport straight-line-speculation mitigation

2022-02-15 Thread Hongtao Liu via Gcc-patches
On Tue, Feb 1, 2022 at 2:55 AM H.J. Lu via Gcc-patches wrote: > > Backport -mindirect-branch-cs-prefix: > > commit 48a4ae26c225eb018ecb59f131e2c4fd4f3cf89a > Author: H.J. Lu > Date: Wed Oct 27 06:27:15 2021 -0700 > > x86: Add -mindirect-branch-cs-prefix > > Add

Re: [GCC 11 PATCH 1/5] x86: Remove "%!" before ret

2022-02-15 Thread Hongtao Liu via Gcc-patches
On Tue, Feb 1, 2022 at 2:56 AM H.J. Lu via Gcc-patches wrote: > > Before MPX was removed, "%!" was mapped to > > case '!': > if (ix86_bnd_prefixed_insn_p (current_output_insn)) > fputs ("bnd ", file); > return; > > After CET was added and MPX was removed,

Re: [PATCH] x86: Update PR 35513 tests

2022-02-11 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 10, 2022 at 9:58 PM H.J. Lu via Gcc-patches wrote: > > 1. Require linker with GNU_PROPERTY_1_NEEDED support for PR 35513 > run-time tests. > 2. Compile pr35513-8.c to scan assembly code. > > PR testsuite/104481 > * g++.target/i386/pr35513-1.C: Require property_1_needed

Re: [PATCH] x86: Check each component of source operand for AVX_U128_DIRTY

2022-02-08 Thread Hongtao Liu via Gcc-patches
On Wed, Feb 9, 2022 at 10:53 AM H.J. Lu via Gcc-patches wrote: > > commit 9775e465c1fbfc32656de77c618c61acf5bd905d > Author: H.J. Lu > Date: Tue Jul 27 07:46:04 2021 -0700 > > x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register > > called ix86_check_avx_upper_register to check mode

Re: [PATCH v4] x86: Add -m[no-]direct-extern-access

2022-02-08 Thread Hongtao Liu via Gcc-patches
On Fri, Jan 28, 2022 at 5:53 AM H.J. Lu via Gcc-patches wrote: > > The v3 patch was posted at > > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574847.html > > There is no progress with repeated pings since then. Glibc 2.35 and > binutils 2.38 will support

Re: [PATCH v2] x86: Also check mode of memory broadcast in bcst_mem_operand

2022-01-23 Thread Hongtao Liu via Gcc-patches
On Sun, Jan 23, 2022 at 8:28 PM H.J. Lu via Gcc-patches wrote: > > Return false for invalid mode on memory broadcast in bcst_mem_operand: > > (vec_duplicate:V16SF (mem/j:V4SF (reg/v/f:DI 109 [ b ]))) > Yes, thanks. > gcc/ > > PR target/104188 > * config/i386/predicates.md

Re: [PATCH] i386: Fix *aesu8

2022-01-18 Thread Hongtao Liu via Gcc-patches
On Wed, Jan 19, 2022 at 9:40 AM Jakub Jelinek wrote: > > Hi! > > On Wed, Jan 19, 2022 at 09:09:41AM +0800, Hongtao Liu wrote: > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Yes, thanks. > > Thanks. Committed. > grep '{[^|}]*}&q

Re: [PATCH] i386: Fix GLC tuning with -masm=intel [PR104104]

2022-01-18 Thread Hongtao Liu via Gcc-patches
On Wed, Jan 19, 2022 at 8:00 AM Jakub Jelinek wrote: > > On Sun, Jan 16, 2022 at 12:22:18PM +0800, Hongtao Liu via Gcc-patches wrote: > > On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches > > wrote: > > > > > > On Sat, Jan 15, 2022

Re: [PATCH] [i386]Adjust testcase for --target_board='unix{-m64\ -march=cascadelake}'

2022-01-17 Thread Hongtao Liu via Gcc-patches
On Tue, Jan 18, 2022 at 10:57 AM liuhongt via Gcc-patches wrote: > > Change scan-assembler from "\tucomisd" to "\t\[v\]?ucomisd". It's an obvious "fix", Pushed to trunk. > > refer to https://gcc.gnu.org/pipermail/gcc-regression/2022-January/076241.html > > gcc/testsuite/ChangeLog: > > *

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-15 Thread Hongtao Liu via Gcc-patches
On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches wrote: > > On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang wrote: > > > > Thanks for the suggestion, here is the updated patch that survived > > bootstrap/regtest. > > LGTM for me, but please get the final approval from Hongtao. > Ok,

Re: [PATCH] [i386] Fix ICE of unrecognizable insn. [PR target/104001]

2022-01-13 Thread Hongtao Liu via Gcc-patches
Here's the patch I'm going to check in, the patch is pre-approved in PR. On Thu, Jan 13, 2022 at 11:59 PM liuhongt wrote: > > For define_insn_and_split "*xor2andn": > > 1. Refine predicate of operands[0] from nonimmediate_operand to > register_operand. > 2. Remove TARGET_AVX512BW from condition

Re: [PATCH] [i386] Remove register restriction on operands for andnot insn

2022-01-10 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 10, 2022 at 3:21 PM Jiang, Haochen wrote: > > Hi Hongtao, > > I have changed that message in this patch. Ok for trunk? Ok. > > Thx, > Haochen > > -Original Message- > From: Hongtao Liu > Sent: Monday, January 10, 2022 3:25 PM > To: Jia

Re: [PATCH] [i386] Remove register restriction on operands for andnot insn

2022-01-09 Thread Hongtao Liu via Gcc-patches
On Mon, Jan 10, 2022 at 2:23 PM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch removes the register restriction on operands for andnot insn so > that it can be used from memory. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > BRs, > Haochen > > gcc/ChangeLog: > >

Re: [PATCH] [i386]Fix tdpbf16ps testcase

2021-12-27 Thread Hongtao Liu via Gcc-patches
On Fri, Dec 24, 2021 at 4:51 PM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch fixes the testcase amxbf16-dpbf16ps-2.c. Previously the type > conversion had some issues. > > Ok for trunk? Ok. > > BRs, > Haochen > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/amx-check.h

Re: [PATCH] [i386] Add define_insn_and_split for vpcmp{b, w, d, q} vpcmp{ph, ps, pd}.

2021-12-22 Thread Hongtao Liu via Gcc-patches
On Tue, Dec 21, 2021 at 2:27 PM liuhongt wrote: > > The purpose of those define_insn_and_split: > 1. Combine vpcmpuw and zero_extend into vpcmpuw. > 2. Canonicalize vpcmpuw pattern so CSE can replace duplicate vpcmpuw to just > kmov > 3. Use DImode as dest of zero_extend so cprop_hardreg can

Re: [PATCH] i386: Enable intrinsics that convert float and bf16 data to each other.

2021-12-22 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 22, 2021 at 11:28 AM Kong, Lingling via Gcc-patches wrote: > > Hi, > > > This patch is to enable intrinsics that convert float and bf16 data to each > other. > Ok for master? > Ok. > gcc/ChangeLog: > > * config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Add new intrinsic. >

Re: [PATCH] [i386] Optimize bit_and op1 float_vector_all_ones_operands to op1.

2021-12-19 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 16, 2021 at 1:59 PM liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? Pushed to trunk. > > gcc/ChangeLog: > > PR target/98468 > * config/i386/sse.md (*bit_and_float_vector_all_ones): New > pre-reload splitter. > >

Re: [PATCH] [i386][avx512]Add combine splitter to transform vpternlogd/vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0

2021-12-16 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 15, 2021 at 9:26 AM Haochen Jiang via Gcc-patches wrote: > > Hi all, > > This patch fixes the regression previously reported on the combine splitter > under the '-m32 -march=cascadelake' options. > > Regtested on x86_64-pc-linux-gnu. Ok. > > BRs, > Haochen > > gcc/ChangeLog: > > PR

Re: [PATCH] [i386] Add option -mvect-compare-costs

2021-12-16 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 16, 2021 at 5:00 PM Richard Sandiford via Gcc-patches wrote: > > Obviously I'm not in a position to comment on the target bits, but: > > liuhongt via Gcc-patches writes: > > Also with corresponding target attribute, option default disabled. > > > > Bootstrapped and regtested on

Re: [PATCH] [i386]Add combine splitter to transform vashr/vlshr/vashl_optab to ashr/lshr/ashl_optab for const vector duplicate operand.

2021-12-08 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 8, 2021 at 2:47 PM Haochen Jiang via Gcc-patches wrote: > > Hi, > > This patch adds a combine splitter to transform vashr/vlshr/vashl_optab to > ashr/lshr/ashl_optab for const vector duplicate operand. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? Ok. > > BRs, > Haochen > >

Re: [PATCH] [i386]Add combine splitter to transform vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0

2021-12-07 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 8, 2021 at 11:13 AM Jiang, Haochen via Gcc-patches wrote: > > Hi Uros, > > I have fixed that in this patch attached for checking in. Is that ok for > trunk? > Uros already said it's ok with that change, let me check in the patch for you. > Regtested on x86_64-pc-linux-gnu. > > Thx, >
