On Tue, Jun 21, 2022 at 3:50 AM Uros Bizjak via Gcc-patches
wrote:
>
> On Mon, Jun 20, 2022 at 8:14 PM H.J. Lu wrote:
> >
> > On Tue, May 10, 2022 at 9:25 AM H.J. Lu wrote:
> > >
> > > Mark a function with SYMBOL_FLAG_FUNCTION_ENDBR when inserting ENDBR at
> > > function entry. Skip the 4-byte
On Sat, Jun 11, 2022 at 1:46 AM H.J. Lu wrote:
>
> On Fri, Jun 10, 2022 at 7:44 AM H.J. Lu wrote:
> >
> > On Fri, Jun 10, 2022 at 2:38 AM Florian Weimer wrote:
> > >
> > > * liuhongt via Libc-alpha:
> > >
> > > > +\subsubsection{Special Types}
> > > > +
> > > > +The \code{__Bfloat16} type uses
On Fri, Jun 10, 2022 at 4:45 PM Cui,Lili via Gcc-patches
wrote:
>
> This patch is to change dg-options for two testcases.
>
> Use -mtune=generic to limit these two testcases, because configuring them with
> -mtune=cascadelake or znver3 will vectorize them.
>
> regtested on
On Fri, Jun 10, 2022 at 3:47 PM liuhongt via Libc-alpha
wrote:
>
> Pass and return __Bfloat16 values in XMM registers.
>
> Background:
> __Bfloat16 (BF16) is a new floating-point format that can accelerate machine
> learning (deep learning training, in particular) algorithms.
> It's first
On Wed, Jun 8, 2022 at 11:44 AM Cui, Lili wrote:
>
> > -Original Message-
> > From: Hongtao Liu
> > Sent: Monday, June 6, 2022 1:25 PM
> > To: H.J. Lu
> > Cc: Cui, Lili ; Liu, Hongtao ;
> > GCC
> > Patches
> > Subject: Re: [PATCH] U
On Tue, Jun 7, 2022 at 3:41 PM liuhongt via Gcc-patches
wrote:
>
> So alternative v won't be ignored in record_reg_classes.
>
> Similar for *r alternatives in some vector patterns.
>
> It helps testcase in the PR, also RA now makes better decisions for
> gcc.target/i386/extract-insert-combining.c
On Wed, Jun 1, 2022 at 11:56 PM H.J. Lu via Gcc-patches
wrote:
>
> On Tue, May 31, 2022 at 10:06 PM Cui,Lili wrote:
> >
> > This patch is to update {skylake,icelake,alderlake}_cost to add a bit
> > preference to vector store.
> > Since the integer vector construction cost has changed, we need
On Mon, Jun 6, 2022 at 3:17 AM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, Jun 2, 2022 at 5:04 PM Jan Beulich wrote:
> >
> > The 64-bit, 128-bit, and 512-bit variants have VDI return type, in
> > line with instruction behavior. Make the 256-bit builtin match, thus
> > also making it match the
On Thu, Jun 2, 2022 at 2:24 PM Roger Sayle wrote:
>
>
> This patch resolves PR target/105791 which is a regression that was
> accidentally introduced for my workaround to PR tree-optimization/10566.
> (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
> shouldn't). The latest
On Wed, Jun 1, 2022 at 12:40 AM Richard Sandiford
wrote:
>
> Vladimir Makarov via Gcc-patches writes:
> > On 2022-05-29 23:05, Hongtao Liu wrote:
> >> On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches
> >> wrote:
> >>>
> >>>
On Mon, May 30, 2022 at 3:44 PM Alexander Monakov wrote:
>
> On Mon, 30 May 2022, Hongtao Liu wrote:
>
> > On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches
> > wrote:
> > > >
> > > > The spill is mainly decided by 3 insns related
On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches
wrote:
>
> > > In the PR, the spill happens in the initial basic block of the function,
> > > i.e.
> > > the one with the highest frequency.
> > >
> > > Also as noted in the PR, swapping the 'unlikely' branch to 'likely'
> > >
On Fri, May 27, 2022 at 5:12 AM Vladimir Makarov via Gcc-patches
wrote:
>
>
> On 2022-05-24 23:39, liuhongt wrote:
> > Right now, mem_cost for separate mem alternative is 1 * frequency which
> > is pretty small and caused the unnecessary SSE spill in the PR, I've tried
> > to rework backend cost
On Wed, May 25, 2022 at 11:39 AM liuhongt via Gcc-patches
wrote:
>
> Right now, mem_cost for separate mem alternative is 1 * frequency which
> is pretty small and caused the unnecessary SSE spill in the PR, I've tried
> to rework backend cost model, but RA still not happy with that(regress
>
On Tue, May 17, 2022 at 6:07 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Tue, May 17, 2022 at 5:06 AM liuhongt wrote:
> >
> > backend has
> >
> > (define_insn "*bmi2_bzhi_3_2"
> >   [(set (match_operand:SWI48 0 "register_operand" "=r")
> >         (and:SWI48
> >
On Tue, May 17, 2022 at 6:03 PM Uros Bizjak wrote:
>
> On Tue, May 17, 2022 at 3:33 AM Hongtao Liu wrote:
> >
> > On Mon, May 16, 2022 at 5:21 PM Uros Bizjak via Gcc-patches
> > wrote:
> > >
> > > On Sat, May 7, 2022 at 7:05 AM liuhongt wrote:
>
On Fri, May 13, 2022 at 7:16 PM Richard Biener
wrote:
>
> On Fri, May 13, 2022 at 5:37 AM Hongtao Liu wrote:
> >
> > On Wed, May 11, 2022 at 4:45 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > On Mon, May 9, 2022 at 7:19 AM liuhongt wrote:
>
thanks.
On Tue, May 17, 2022 at 3:09 PM Jakub Jelinek via Gcc-patches
wrote:
>
> Hi!
>
> When looking around the spot of the PR105591 fix, I've noticed a typo
> and incorrectly formatted comment.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, committed to
> trunk as obvious.
>
>
On Tue, May 17, 2022 at 11:06 AM liuhongt via Gcc-patches
wrote:
>
> backend has
>
> (define_insn "*bmi2_bzhi_3_2"
>   [(set (match_operand:SWI48 0 "register_operand" "=r")
>         (and:SWI48
>           (plus:SWI48
>             (ashift:SWI48 (const_int 1)
>
I've committed the patch.
On Fri, May 13, 2022 at 5:22 PM liuhongt via Gcc-patches
wrote:
>
> Here's updated patch which adds ix86_pre_reload_split () to those 2
> define_insn_and_splits.
>
> Assembly Optimization like:
> - vmovq %xmm0, %xmm2
> - vmovdqa .LC0(%rip), %xmm0
>
On Mon, May 16, 2022 at 5:21 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Sat, May 7, 2022 at 7:05 AM liuhongt wrote:
> >
> > This is adjusted patch only for OImode.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR
ping.
On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches
wrote:
>
> This is adjusted patch only for OImode.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/104610
> * config/i386/i386-expand.cc
On Wed, May 11, 2022 at 4:45 PM Richard Biener via Gcc-patches
wrote:
>
> On Mon, May 9, 2022 at 7:19 AM liuhongt wrote:
> >
> > This patch will enable below optimization:
> >
> > {
> > - int bit;
> > - long long unsigned int _1;
> > - long long unsigned int _2;
> > -
> > [local count:
On Tue, May 10, 2022 at 2:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Mon, May 9, 2022 at 7:11 AM liuhongt via Gcc-patches
> wrote:
> >
> > Here's the adjusted patch.
> > Ok for trunk?
> >
> > Optimize
> >
> > _4 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>;
> > _5 = BIT_FIELD_REF
On Mon, May 9, 2022 at 4:28 PM Uros Bizjak wrote:
>
> On Mon, May 9, 2022 at 4:03 AM liuhongt wrote:
> >
> > Similarly optimize movl + vmovq to vmovd.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR target/104915
> >
On Mon, May 9, 2022 at 4:19 PM Uros Bizjak wrote:
>
> On Mon, May 9, 2022 at 7:24 AM Hongtao Liu wrote:
> >
> > On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches
> > wrote:
> > >
> > > pand/pandn may be used to clear upper/lower bits of the oper
On Mon, May 9, 2022 at 2:43 PM liuhongt via Gcc-patches
wrote:
>
> Clean up of 16-bit uppers is not needed for pmovzxbq/pmovsxbq.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/105072
> * config/i386/sse.md
On Mon, May 9, 2022 at 1:22 PM liuhongt via Gcc-patches
wrote:
>
> pand/pandn may be used to clear upper/lower bits of the operands, in
> that case there will be 4-5 instructions for permutation, and it's
> still better than scalar codes.
>
> Bootstrapped and regtested on
On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches
wrote:
>
> This is adjusted patch only for OImode.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/104610
> * config/i386/i386-expand.cc (ix86_expand_branch): Use
On Thu, May 5, 2022 at 4:09 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, May 5, 2022 at 9:50 AM Richard Biener via Gcc-patches
> wrote:
> >
> > On Thu, May 5, 2022 at 9:37 AM liuhongt via Gcc-patches
> > wrote:
> > >
> > > Enable optimization for TImode only under 32-bit target, for 64-bit
On Thu, May 5, 2022 at 3:37 PM liuhongt wrote:
>
> Enable optimization for TImode only under 32-bit target, for 64-bit
> target there could be extra integer <-> sse move regarding psABI,
> not efficient.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
>
On Fri, Apr 22, 2022 at 8:43 PM Hongyu Wang wrote:
>
> > Please add the corresponding intrinsic test in sse-14.c
>
> Sorry for forgetting this part. Updated patch. Thanks.
>
LGTM.
> On Fri, Apr 22, 2022 at 4:49 PM Hongtao Liu via Gcc-patches wrote:
> >
> > On Fri, Apr 22, 20
On Fri, Apr 22, 2022 at 4:12 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> Add missing macro under O0 and adjust macro format for scalf
> intrinsics.
>
Please add the corresponding intrinsic test in sse-14.c.
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.
>
> Ok for master and
On Wed, Apr 6, 2022 at 5:56 AM Roger Sayle wrote:
>
>
>
> This simple patch allows the i386 backend to generate pandn instructions
>
> for V1TI mode. Currently, the testcase:
>
>
>
> typedef unsigned __int128 v1ti __attribute__ ((__vector_size__ (16)));
>
> v1ti andnot1(v1ti x, v1ti y) { return
On Fri, Apr 1, 2022 at 4:32 PM liuhongt via Gcc-patches
wrote:
>
> Update in V3:
> 1. Add -param=x86-stlf-window-ninsns= (default 64).
> 2. Exclude call in the window.
>
> Since cfg is freed before machine_reorg, just do a rough calculation
> of the window according to the layout.
> Also
On Fri, Apr 1, 2022 at 2:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Apr 1, 2022 at 8:47 AM liuhongt via Gcc-patches
> wrote:
> >
> > Update in V2:
> > 1. Use get_insns instead of FOR_EACH_BB_CFUN and FOR_BB_INSNS.
> > 2. Return for any_uncondjump_p and ANY_RETURN_P.
> > 3. Add dump
On Thu, Mar 31, 2022 at 6:45 PM Richard Biener via Gcc-patches
wrote:
>
> On Thu, Mar 31, 2022 at 7:51 AM liuhongt wrote:
> >
> > Since cfg is freed before machine_reorg, just do a rough calculation
> > of the window according to the layout.
> > Also according to an experiment on CLX, set window
On Sat, Mar 26, 2022 at 10:05 AM Hongyu Wang via Gcc-patches
wrote:
>
> > > Is it possible to create a test case that gas would throw an error for
> > > invalid operands?
> >
> > You can use -ffix-xmmN to disable XMM0-15.
>
> I mean can we create an intrinsic test for this PR that produces
On Sat, Mar 26, 2022 at 1:27 AM H.J. Lu via Gcc-patches
wrote:
>
> Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND
> have no AVX512 version, replace the "Yv" register constraint with the
> "x" register constraint.
LGTM, please backport to GCC10/GCC11 branch.
>
> PR
On Sat, Mar 26, 2022 at 4:50 AM H.J. Lu via Gcc-patches
wrote:
>
> Since KL instructions have no AVX512 version, replace the "v" register
> constraint with the "x" register constraint.
>
> PR target/105058
> * config/i386/sse.md (loadiwkey): Replace "v" with "x".
>
On Fri, Mar 25, 2022 at 9:42 PM Richard Biener wrote:
>
> On Fri, 25 Mar 2022, Hongtao Liu wrote:
>
> > On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > Since we're now vectorizing by default at -O2 issues like PR1019
On Fri, Mar 25, 2022 at 8:11 PM Richard Biener via Gcc-patches
wrote:
>
> Since we're now vectorizing by default at -O2 issues like PR101908
> become more important where we apply basic-block vectorization to
> parts of the function covering loads from function parameters passed
> on the stack.
On Wed, Mar 23, 2022 at 2:05 PM liuhongt via Gcc-patches
wrote:
>
> In validate_subreg, both (subreg:V2HF (reg:SI) 0)
> and (subreg:V8HF (reg:V2HF) 0) are valid, but not
> for (subreg:V8HF (reg:SI) 0) which causes ICE.
>
> Ideally it should be handled in validate_subreg to support
> subreg for
On Mon, Mar 21, 2022 at 9:06 PM liuhongt wrote:
>
> Failed to match this instruction:
> (set (reg/v:SI 88 [ z ])
> (if_then_else:SI (eq (zero_extract:SI (reg:SI 92)
> (const_int 1 [0x1])
> (zero_extend:SI (subreg:QI (reg:SI 93) 0)))
> (const_int 0
On Mon, Mar 21, 2022 at 7:52 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For complex scalar intrinsic like _mm_mask_fcmadd_sch, the
> mask should be ANDed with 1 to ensure the mask is bound to the lowest byte.
> Use masked vmovss to perform same operation which omits higher bits
> of mask.
>
>
_mm_mask_move_ss (__m128 src, __mmask8 k, __m128 a, __m128 b)
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=vmovss_expand=3807,3081,3082,3084,3083,4837,4838
>
> LLVM generates mask & 1 for these intrinsics.
>
> Hongtao Liu via Gcc-patches wrote on Mon, Mar 21, 2022
On Sat, Mar 19, 2022 at 8:09 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For complex scalar intrinsic like _mm_mask_fcmadd_sch, the
> mask should be ANDed with 1 to ensure the mask is bound to the lowest byte.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> Ok for master?
>
>
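Editorial sketch, not part of the thread: the point about ANDing the mask with 1 can be illustrated with a plain C model. For a scalar masked operation only bit 0 of the 8-bit mask is meant to select the result, so any higher bits a caller passes must not matter. The function name masked_scalar_select is hypothetical and it models a generic masked scalar move, not the actual _mm_mask_fcmadd_sch implementation in GCC.

```c
#include <stdint.h>

/* Hypothetical model of a masked scalar move: the result is VAL when
   bit 0 of K is set and SRC otherwise.  Higher bits of K are ignored,
   which is the effect of "k & 1".  */
static float masked_scalar_select(float src, uint8_t k, float val)
{
    return (k & 1) ? val : src;
}
```

With this model, a mask of 0x02 behaves like 0x00 and a mask of 0x03 behaves like 0x01.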
On Sat, Mar 19, 2022 at 8:06 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch fixes typo in subst for scalar complex mask_round operand.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> Ok for master?
>
Ok.
> gcc/ChangeLog:
>
> PR target/104977
> *
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote:
>
> Hi Hongtao,
>
> This patch is to correct march=sapphirerapids to base on icelake server.
> and update sapphirerapids in the documentation.
>
> OK for master and backport to GCC 11?
Ok.
>
>
> gcc/Changelog:
>
> PR target/104963
>
On Wed, Mar 16, 2022 at 5:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Wed, Mar 16, 2022 at 3:19 AM liuhongt wrote:
> >
> > This patch only handles pure-slp for by-value passed parameter which
> > has nothing to do with IPA but psABI. For by-reference passed
> > parameter IPA is required.
>
On Tue, Mar 15, 2022 at 10:52 PM Roger Sayle wrote:
>
>
> This simple i386 patch unblocks a more significant change. The testcase
> gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and
> alas the fix for PR target/94680 doesn't (yet) handle V2DF mode.
>
> For the first test
On Tue, Mar 15, 2022 at 10:40 PM H.J. Lu wrote:
>
> On Mon, Mar 14, 2022 at 7:31 AM H.J. Lu wrote:
> >
> > Push target("general-regs-only") in if x87 is enabled.
> >
> > gcc/
> >
> > PR target/104890
> > * config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before
> >
On Mon, Mar 14, 2022 at 8:20 PM Hongtao Liu wrote:
>
> On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote:
> >
> > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote:
> > > LGTM, thanks for handling this.
> >
> > Thanks, committed.
>
On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote:
>
> On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote:
> > LGTM, thanks for handling this.
>
> Thanks, committed.
>
> > > Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2,
> > > f
On Sun, Mar 13, 2022 at 3:28 AM Jakub Jelinek wrote:
>
> Hi!
>
> These intrinsics are supposed to do an unaligned may_alias load
> of a 16-bit or 32-bit value and store it as the first element of
> a 128-bit integer vector, with all other elements cleared.
>
> The current _mm_storeu_*
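Editorial sketch, not part of the thread: the load semantics described above (an unaligned, aliasing-safe load of a 32-bit value into the first element of a 128-bit integer vector, with all other elements cleared) can be modeled in portable C. The type vec128_model and the function loadu_si32_model are hypothetical names for illustration. This is not the code from GCC's intrinsic headers, and memcpy is used here only as one standard way to express an unaligned load that is safe with respect to aliasing.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit integer vector: four 32-bit lanes.  */
typedef struct {
    uint32_t lane[4];
} vec128_model;

/* Model of the described behavior: read 32 bits from P, which may be
   unaligned and may point at an object of any type, place them in
   lane 0, and clear lanes 1..3.  */
static vec128_model loadu_si32_model(const void *p)
{
    vec128_model v = { { 0, 0, 0, 0 } };
    uint32_t tmp;

    memcpy(&tmp, p, sizeof tmp);   /* unaligned, alias-safe read */
    v.lane[0] = tmp;
    return v;
}
```

A 16-bit variant would follow the same shape with a uint16_t temporary widened into lane 0.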
On Fri, Mar 11, 2022 at 8:43 PM Richard Biener via Gcc-patches
wrote:
>
> After accounting for GPR -> XMM move cost for vec_construct the
> base cost needs adjustments to not double-cost those. This also
> lowers the cost when such move is not necessary.
>
> This fixes the observed 538.imagick_r
On Tue, Mar 8, 2022 at 9:30 AM Hongtao Liu wrote:
>
> ping^1
>
> On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote:
> >
> > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
> > >
> > > The patch fixes ICE in ix86_gimple_fold_builtin.
> > >
>
On Mon, Mar 7, 2022 at 5:37 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Mar 4, 2022 at 8:27 AM liuhongt wrote:
> >
> > For parameter passing through stack, vectorized load from parm_decl
> > in callee may trigger serious STF issue. This is why GCC12 regresses
> > 50% for cray at -O2
ping^1
On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote:
>
> On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
> >
> > The patch fixes ICE in ix86_gimple_fold_builtin.
> >
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> Ok for main trunk?
>
> > g
On Fri, Mar 4, 2022 at 3:28 PM liuhongt via Gcc-patches
wrote:
>
> For parameter passing through stack, vectorized load from parm_decl
> in callee may trigger serious STF issue. This is why GCC12 regresses
> 50% for cray at -O2 compared to GCC11.
>
> The patch adds an extremely large number to
Met some problem in git send-email --cc=a,b,c, so manually CC.
On Mon, Mar 7, 2022 at 1:11 PM liuhongt via Gcc-patches
wrote:
>
> >What happens if you set preferred_for_speed to false for alternative 1?
> It works, and I've removed the newly added splitter in this patch.
> Also I tried to do
On Sat, Mar 5, 2022 at 4:05 PM Jakub Jelinek wrote:
>
> Hi!
>
> The following testcase ICEs, because the cond_andv* expander
> has vector_operand predicates in both of the commutative inputs
> and calls gen_andv*_mask which calls ix86_binary_operator_ok
> in its condition, but nothing calls
On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches
wrote:
>
> This is incremental patch based on [1], it enables optimization as below
>
> - vbroadcastss .LC1(%rip), %xmm0
> + movl $-45, %edx
> + vmovd %edx, %xmm0
> + vpshufd $0, %xmm0, %xmm0
>
> According to
On Thu, Mar 3, 2022 at 10:22 PM H.J. Lu via Gcc-patches
wrote:
>
> ix86_gen_scratch_sse_rtx returns XMM7/XMM15/XMM31 as a scratch vector
> register to prevent RTL optimizers from removing vector register. It
> introduces a conflict with explicit XMM7/XMM15/XMM31 usage and when it
> is called by
On Wed, Mar 2, 2022 at 6:49 AM H.J. Lu wrote:
>
> On Tue, Mar 1, 2022 at 7:06 AM H.J. Lu wrote:
> >
> > On Mon, Feb 28, 2022 at 9:36 PM Hongtao Liu wrote:
> > >
> > > On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches
> > > wrote:
> >
On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches
wrote:
>
> On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu wrote:
> >
> > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote:
> > >
> > > .. in ix86_expand_vector_move and
> > > ix86_convert_const_wide_int_to_broadcast(called by the former).
> > >
> >
On Tue, Mar 1, 2022 at 10:27 AM H.J. Lu via Gcc-patches
wrote:
>
> On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote:
> >
> > .. in ix86_expand_vector_move and
> > ix86_convert_const_wide_int_to_broadcast(called by the former).
> >
> > ix86_expand_vector_move is called by emit_move_insn which is
On Fri, Feb 25, 2022 at 4:44 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch intends to sync with llvm change in
> https://reviews.llvm.org/D120307 to add enumeration and truncate
This will be documented in intel intrinsic guide.
> imm to unsigned char, so users could use ~ on
On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
>
> The patch fixes ICE in ix86_gimple_fold_builtin.
>
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
Ok for main trunk?
> gcc/ChangeLog:
>
> PR target/104666
> * config/i386/i386-expand.cc
>
On Wed, Feb 23, 2022 at 5:48 PM Jakub Jelinek via Gcc-patches
wrote:
>
> On Wed, Feb 23, 2022 at 05:21:26PM +0800, liuhongt via Gcc-patches wrote:
> > For evex encoding vp{xor,or,and}, suffix is needed.
> >
> > Or there would be an error for
> > vpxor %ymm0, %ymm31, %ymm1
>
> The insn is about
On Mon, Feb 21, 2022 at 5:10 PM Richard Biener wrote:
>
> On Mon, 21 Feb 2022, Hongtao Liu wrote:
>
> > On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > This uses the now passed SLP node to the vectorizer costing h
On Tue, Feb 22, 2022 at 12:46 AM Jakub Jelinek wrote:
>
> Hi!
>
> We ICE on the following testcase for -m32 since r12-3435, because
> operands[2] is (subreg:SF (reg:DI ...) 0) and
According to validate_subreg, (subreg:V4SF (reg:DI ...) 0) should be
valid (but not sure if it really works).
For
On Tue, Feb 22, 2022 at 2:35 AM H.J. Lu wrote:
>
> On Sun, Feb 20, 2022 at 6:01 PM Hongtao Liu wrote:
> >
> > On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote:
> > >
> > > On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote:
> > > > On
On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote:
>
> On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote:
> > On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
> > wrote:
> > >
> > > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
>
On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches
wrote:
>
> This uses the now passed SLP node to the vectorizer costing hook
> to adjust vector construction costs for the cost of moving an
> integer component from a GPR to a vector register when that's
> required for building a
On Thu, Feb 17, 2022 at 9:47 PM Richard Biener via Gcc-patches
wrote:
>
> The x86 backend piggy-backs on mode-switching for insertion of
> vzeroupper. A recent improvement there was implemented in a way
> to walk possibly the whole basic-block for all DF reg def definitions
> in its mode_needed
On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
wrote:
>
> Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
> Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> transition penalty. Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> generate
On Thu, Feb 17, 2022 at 12:00 PM liuhongt wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/cpuid.h (bit_MPX): Removed.
> (bit_BNDREGS): Ditto.
> (bit_BNDCSR): Ditto.
> ---
> gcc/config/i386/cpuid.h | 5
On Wed, Feb 16, 2022 at 10:17 PM Jakub Jelinek via Gcc-patches
wrote:
>
> On Wed, Feb 16, 2022 at 05:03:09PM +0800, liuhongt via Gcc-patches wrote:
> > > > +(match (cond_expr_convert_p @0 @2 @3 @6)
> > > > + (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3))
> > > > + (if
On Tue, Feb 1, 2022 at 2:55 AM H.J. Lu via Gcc-patches
wrote:
>
> Backport -mindirect-branch-cs-prefix:
>
> commit 48a4ae26c225eb018ecb59f131e2c4fd4f3cf89a
> Author: H.J. Lu
> Date: Wed Oct 27 06:27:15 2021 -0700
>
> x86: Add -mindirect-branch-cs-prefix
>
> Add
On Tue, Feb 1, 2022 at 2:56 AM H.J. Lu via Gcc-patches
wrote:
>
> Before MPX was removed, "%!" was mapped to
>
> case '!':
> if (ix86_bnd_prefixed_insn_p (current_output_insn))
> fputs ("bnd ", file);
> return;
>
> After CET was added and MPX was removed,
On Thu, Feb 10, 2022 at 9:58 PM H.J. Lu via Gcc-patches
wrote:
>
> 1. Require linker with GNU_PROPERTY_1_NEEDED support for PR 35513
> run-time tests.
> 2. Compile pr35513-8.c to scan assembly code.
>
> PR testsuite/104481
> * g++.target/i386/pr35513-1.C: Require property_1_needed
On Wed, Feb 9, 2022 at 10:53 AM H.J. Lu via Gcc-patches
wrote:
>
> commit 9775e465c1fbfc32656de77c618c61acf5bd905d
> Author: H.J. Lu
> Date: Tue Jul 27 07:46:04 2021 -0700
>
> x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register
>
> called ix86_check_avx_upper_register to check mode
On Fri, Jan 28, 2022 at 5:53 AM H.J. Lu via Gcc-patches
wrote:
>
> The v3 patch was posted at
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574847.html
>
> There is no progress with repeated pings since then. Glibc 2.35 and
> binutils 2.38 will support
On Sun, Jan 23, 2022 at 8:28 PM H.J. Lu via Gcc-patches
wrote:
>
> Return false for invalid mode on memory broadcast in bcst_mem_operand:
>
> (vec_duplicate:V16SF (mem/j:V4SF (reg/v/f:DI 109 [ b ])))
>
Yes, thanks.
> gcc/
>
> PR target/104188
> * config/i386/predicates.md
On Wed, Jan 19, 2022 at 9:40 AM Jakub Jelinek wrote:
>
> Hi!
>
> On Wed, Jan 19, 2022 at 09:09:41AM +0800, Hongtao Liu wrote:
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > Yes, thanks.
>
> Thanks. Committed.
> grep '{[^|}]*}"
On Wed, Jan 19, 2022 at 8:00 AM Jakub Jelinek wrote:
>
> On Sun, Jan 16, 2022 at 12:22:18PM +0800, Hongtao Liu via Gcc-patches wrote:
> > On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches
> > wrote:
> > >
> > > On Sat, Jan 15, 2022
On Tue, Jan 18, 2022 at 10:57 AM liuhongt via Gcc-patches
wrote:
>
> Change scan-assembler from "\tucomisd" to "\t\[v\]?ucomisd".
It's an obvious "fix", Pushed to trunk.
>
> refer to https://gcc.gnu.org/pipermail/gcc-regression/2022-January/076241.html
>
> gcc/testsuite/ChangeLog:
>
> *
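Editorial sketch, not part of the thread: the effect of changing the scan pattern from "\tucomisd" to "\t\[v\]?ucomisd" can be checked with a small C helper. DejaGnu patterns are Tcl regular expressions, so the version below is only an approximation using POSIX extended regular expressions from <regex.h>. The helper name line_matches is hypothetical, and the assembly lines used with it are made-up examples.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE contains a match for the POSIX extended regular
   expression PATTERN, 0 otherwise (including on a compile error).  */
static int line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int found;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    found = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return found;
}
```

In this model the old pattern (a tab followed by "ucomisd") does not match a line whose mnemonic is "vucomisd", because the tab is followed by "v". The new pattern, with its optional "v", accepts both the SSE and the VEX-encoded spelling.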
On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches
wrote:
>
> On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang wrote:
> >
> > Thanks for the suggestion, here is the updated patch that survived
> > bootstrap/regtest.
>
> LGTM for me, but please get the final approval from Hongtao.
>
Ok,
Here's the patch I'm going to check in, the patch is pre-approved in PR.
On Thu, Jan 13, 2022 at 11:59 PM liuhongt wrote:
>
> For define_insn_and_split "*xor2andn":
>
> 1. Refine predicate of operands[0] from nonimmediate_operand to
> register_operand.
> 2. Remove TARGET_AVX512BW from condition
On Mon, Jan 10, 2022 at 3:21 PM Jiang, Haochen wrote:
>
> Hi Hongtao,
>
> I have changed that message in this patch. Ok for trunk?
Ok.
>
> Thx,
> Haochen
>
> -Original Message-
> From: Hongtao Liu
> Sent: Monday, January 10, 2022 3:25 PM
> To: Jia
On Mon, Jan 10, 2022 at 2:23 PM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> This patch removes the register restriction on operands for andnot insn so
> that it can be used from memory.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
>
On Fri, Dec 24, 2021 at 4:51 PM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> This patch fixes the testcase amxbf16-dpbf16ps-2.c. Previously the type
> conversion had an issue.
>
> Ok for trunk?
Ok.
>
> BRs,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/amx-check.h
On Tue, Dec 21, 2021 at 2:27 PM liuhongt wrote:
>
> The purpose of those define_insn_and_split:
> 1. Combine vpcmpuw and zero_extend into vpcmpuw.
> 2. Canonicalize vpcmpuw pattern so CSE can replace duplicate vpcmpuw to just
> kmov
> 3. Use DImode as dest of zero_extend so cprop_hardreg can
On Wed, Dec 22, 2021 at 11:28 AM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
>
> This patch is to enable intrinsics that convert float and bf16 data to each
> other.
> Ok for master?
>
Ok.
> gcc/ChangeLog:
>
> * config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Add new intrinsic.
>
On Thu, Dec 16, 2021 at 1:59 PM liuhongt wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
Pushed to trunk.
>
> gcc/ChangeLog:
>
> PR target/98468
> * config/i386/sse.md (*bit_and_float_vector_all_ones): New
> pre-reload splitter.
>
>
On Wed, Dec 15, 2021 at 9:26 AM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> This patch fixes the regression previously reported on the combine splitter
> under '-m32 -march=cascadelake' options.
>
> Regtested on x86_64-pc-linux-gnu.
Ok.
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
> PR
On Thu, Dec 16, 2021 at 5:00 PM Richard Sandiford via Gcc-patches
wrote:
>
> Obviously I'm not in a position to comment on the target bits, but:
>
> liuhongt via Gcc-patches writes:
> > Also with corresponding target attribute, option default disabled.
> >
> > Bootstrapped and regtested on
On Wed, Dec 8, 2021 at 2:47 PM Haochen Jiang via Gcc-patches
wrote:
>
> Hi,
>
> This patch adds a combine splitter to transform vashr/vlshr/vashl_optab to
> ashr/lshr/ashl_optab for const vector duplicate operand.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
Ok.
>
> BRs,
> Haochen
>
>
On Wed, Dec 8, 2021 at 11:13 AM Jiang, Haochen via Gcc-patches
wrote:
>
> Hi Uros,
>
> I have fixed that in this patch attached for checking in. Is that ok for
> trunk?
>
Uros already said it's ok with that change, let me check in the patch for you.
> Regtested on x86_64-pc-linux-gnu.
>
> Thx,
>