On Sat, Mar 19, 2022 at 8:09 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For complex scalar intrinsics like _mm_mask_fcmadd_sch, the
> mask should be ANDed with 1 to ensure the mask is bound to the lowest byte.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> Ok for master?
>
> gcc
On Sat, Mar 19, 2022 at 8:06 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch fixes a typo in subst for the scalar complex mask_round operand.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> Ok for master?
>
Ok.
> gcc/ChangeLog:
>
> PR target/104977
> * c
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote:
>
> Hi Hongtao,
>
> This patch corrects march=sapphirerapids to be based on icelake server
> and updates sapphirerapids in the documentation.
>
> OK for master and backport to GCC 11?
Ok.
>
>
> gcc/Changelog:
>
> PR target/104963
>
On Wed, Mar 16, 2022 at 5:54 PM Richard Biener via Gcc-patches
wrote:
>
> On Wed, Mar 16, 2022 at 3:19 AM liuhongt wrote:
> >
> > This patch only handles pure SLP for parameters passed by value, which
> > has nothing to do with IPA but with the psABI. For parameters passed
> > by reference, IPA is required.
>
On Tue, Mar 15, 2022 at 10:52 PM Roger Sayle wrote:
>
>
> This simple i386 patch unblocks a more significant change. The testcase
> gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and
> alas the fix for PR target/94680 doesn't (yet) handle V2DF mode.
>
> For the first test fro
On Tue, Mar 15, 2022 at 10:40 PM H.J. Lu wrote:
>
> On Mon, Mar 14, 2022 at 7:31 AM H.J. Lu wrote:
> >
> > Push target("general-regs-only") in if x87 is enabled.
> >
> > gcc/
> >
> > PR target/104890
> > * config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before
> > push
On Mon, Mar 14, 2022 at 8:20 PM Hongtao Liu wrote:
>
> On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote:
> >
> > On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote:
> > > LGTM, thanks for handling this.
> >
> > Thanks, committed.
> >
On Mon, Mar 14, 2022 at 7:25 PM Jakub Jelinek wrote:
>
> On Sun, Mar 13, 2022 at 09:34:10PM +0800, Hongtao Liu wrote:
> > LGTM, thanks for handling this.
>
> Thanks, committed.
>
> > > Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2,
> > > f
On Sun, Mar 13, 2022 at 3:28 AM Jakub Jelinek wrote:
>
> Hi!
>
> These intrinsics are supposed to do an unaligned may_alias load
> of a 16-bit or 32-bit value and store it as the first element of
> a 128-bit integer vector, with all other elements cleared.
>
> The current _mm_storeu_* implementati
On Fri, Mar 11, 2022 at 8:43 PM Richard Biener via Gcc-patches
wrote:
>
> After accounting for GPR -> XMM move cost for vec_construct the
> base cost needs adjustments to not double-cost those. This also
> lowers the cost when such move is not necessary.
>
> This fixes the observed 538.imagick_r
On Tue, Mar 8, 2022 at 9:30 AM Hongtao Liu wrote:
>
> ping^1
>
> On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote:
> >
> > On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
> > >
> > > The patch fixes ICE in ix86_gimple_fold_builtin.
> > >
>
On Mon, Mar 7, 2022 at 5:37 PM Richard Biener via Gcc-patches
wrote:
>
> On Fri, Mar 4, 2022 at 8:27 AM liuhongt wrote:
> >
> > For parameter passing through stack, vectorized load from parm_decl
> > in callee may trigger serious STF issue. This is why GCC12 regresses
> > 50% for cray at -O2 comp
ping^1
On Fri, Feb 25, 2022 at 1:51 PM Hongtao Liu wrote:
>
> On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
> >
> > The patch fixes ICE in ix86_gimple_fold_builtin.
> >
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> Ok for main trunk?
>
> > g
On Fri, Mar 4, 2022 at 3:28 PM liuhongt via Gcc-patches
wrote:
>
> For parameter passing through stack, vectorized load from parm_decl
> in callee may trigger serious STF issue. This is why GCC12 regresses
> 50% for cray at -O2 compared to GCC11.
>
> The patch adds an extremely large number to stmt
Ran into a problem with git send-email --cc=a,b,c, so CCing manually.
On Mon, Mar 7, 2022 at 1:11 PM liuhongt via Gcc-patches
wrote:
>
> >What happens if you set preferred_for_speed to false for alternative 1?
> It works, and I've removed the newly added splitter in this patch.
> Also i tried to do simi
On Sat, Mar 5, 2022 at 4:05 PM Jakub Jelinek wrote:
>
> Hi!
>
> The following testcase ICEs, because the cond_andv* expander
> has vector_operand predicates in both of the commutative inputs
> and calls gen_andv*_mask which calls ix86_binary_operator_ok
> in its condition, but nothing calls ix86_f
On Fri, Mar 4, 2022 at 10:29 AM liuhongt via Gcc-patches
wrote:
>
> This is an incremental patch based on [1]; it enables the optimization below:
>
> - vbroadcastss .LC1(%rip), %xmm0
> + movl $-45, %edx
> + vmovd %edx, %xmm0
> + vpshufd $0, %xmm0, %xmm0
>
> According to
On Thu, Mar 3, 2022 at 10:22 PM H.J. Lu via Gcc-patches
wrote:
>
> ix86_gen_scratch_sse_rtx returns XMM7/XMM15/XMM31 as a scratch vector
> register to prevent RTL optimizers from removing vector register. It
> introduces a conflict with explicit XMM7/XMM15/XMM31 usage and when it
> is called by R
On Wed, Mar 2, 2022 at 6:49 AM H.J. Lu wrote:
>
> On Tue, Mar 1, 2022 at 7:06 AM H.J. Lu wrote:
> >
> > On Mon, Feb 28, 2022 at 9:36 PM Hongtao Liu wrote:
> > >
> > > On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches
> > > wrote:
> >
On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches
wrote:
>
> On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu wrote:
> >
> > On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote:
> > >
> > > .. in ix86_expand_vector_move and
> > > ix86_convert_const_wide_int_to_broadcast(called by the former).
> > >
> >
On Tue, Mar 1, 2022 at 10:27 AM H.J. Lu via Gcc-patches
wrote:
>
> On Mon, Feb 28, 2022 at 6:03 PM liuhongt wrote:
> >
> > .. in ix86_expand_vector_move and
> > ix86_convert_const_wide_int_to_broadcast(called by the former).
> >
> > ix86_expand_vector_move is called by emit_move_insn which is use
On Fri, Feb 25, 2022 at 4:44 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch intends to sync with llvm change in
> https://reviews.llvm.org/D120307 to add enumeration and truncate
This will be documented in the Intel Intrinsics Guide.
> imm to unsigned char, so users could use ~ on immedia
On Fri, Feb 25, 2022 at 1:50 PM liuhongt wrote:
>
> The patch fixes ICE in ix86_gimple_fold_builtin.
>
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
Ok for main trunk?
> gcc/ChangeLog:
>
> PR target/104666
> * config/i386/i386-expand.cc
> (ix86_check_builtin_isa_m
On Wed, Feb 23, 2022 at 5:48 PM Jakub Jelinek via Gcc-patches
wrote:
>
> On Wed, Feb 23, 2022 at 05:21:26PM +0800, liuhongt via Gcc-patches wrote:
> > For evex encoding vp{xor,or,and}, suffix is needed.
> >
> > Or there would be an error for
> > vpxor %ymm0, %ymm31, %ymm1
>
> The insn is about V1T
On Mon, Feb 21, 2022 at 5:10 PM Richard Biener wrote:
>
> On Mon, 21 Feb 2022, Hongtao Liu wrote:
>
> > On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > This uses the now passed SLP node to the vectorizer costing h
On Tue, Feb 22, 2022 at 12:46 AM Jakub Jelinek wrote:
>
> Hi!
>
> We ICE on the following testcase for -m32 since r12-3435, because
> operands[2] is (subreg:SF (reg:DI ...) 0) and
According to validate_subreg, (subreg:V4SF (reg:DI ...) 0) should be
valid (but not sure if it really works).
For -m64
On Tue, Feb 22, 2022 at 2:35 AM H.J. Lu wrote:
>
> On Sun, Feb 20, 2022 at 6:01 PM Hongtao Liu wrote:
> >
> > On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote:
> > >
> > > On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote:
> > > > On
On Thu, Feb 17, 2022 at 9:56 PM H.J. Lu wrote:
>
> On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote:
> > On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
> > wrote:
> > >
> > > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
>
On Fri, Feb 18, 2022 at 10:01 PM Richard Biener via Gcc-patches
wrote:
>
> This uses the now passed SLP node to the vectorizer costing hook
> to adjust vector construction costs for the cost of moving an
> integer component from a GPR to a vector register when that's
> required for building a vect
On Thu, Feb 17, 2022 at 9:47 PM Richard Biener via Gcc-patches
wrote:
>
> The x86 backend piggy-backs on mode-switching for insertion of
> vzeroupper. A recent improvement there was implemented in a way
> to walk possibly the whole basic-block for all DF reg def definitions
> in its mode_needed h
On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
wrote:
>
> Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
> Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> transition penalty. Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> generate vzero
On Thu, Feb 17, 2022 at 12:00 PM liuhongt wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/cpuid.h (bit_MPX): Removed.
> (bit_BNDREGS): Ditto.
> (bit_BNDCSR): Ditto.
> ---
> gcc/config/i386/cpuid.h | 5
On Wed, Feb 16, 2022 at 10:17 PM Jakub Jelinek via Gcc-patches
wrote:
>
> On Wed, Feb 16, 2022 at 05:03:09PM +0800, liuhongt via Gcc-patches wrote:
> > > > +(match (cond_expr_convert_p @0 @2 @3 @6)
> > > > + (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3))
> > > > + (if (types_mat
On Tue, Feb 1, 2022 at 2:55 AM H.J. Lu via Gcc-patches
wrote:
>
> Backport -mindirect-branch-cs-prefix:
>
> commit 48a4ae26c225eb018ecb59f131e2c4fd4f3cf89a
> Author: H.J. Lu
> Date: Wed Oct 27 06:27:15 2021 -0700
>
> x86: Add -mindirect-branch-cs-prefix
>
> Add -mindirect-branch-cs-pref
On Tue, Feb 1, 2022 at 2:56 AM H.J. Lu via Gcc-patches
wrote:
>
> Before MPX was removed, "%!" was mapped to
>
> case '!':
> if (ix86_bnd_prefixed_insn_p (current_output_insn))
> fputs ("bnd ", file);
> return;
>
> After CET was added and MPX was removed, "%
On Thu, Feb 10, 2022 at 9:58 PM H.J. Lu via Gcc-patches
wrote:
>
> 1. Require linker with GNU_PROPERTY_1_NEEDED support for PR 35513
> run-time tests.
> 2. Compile pr35513-8.c to scan assembly code.
>
> PR testsuite/104481
> * g++.target/i386/pr35513-1.C: Require property_1_needed
On Wed, Feb 9, 2022 at 10:53 AM H.J. Lu via Gcc-patches
wrote:
>
> commit 9775e465c1fbfc32656de77c618c61acf5bd905d
> Author: H.J. Lu
> Date: Tue Jul 27 07:46:04 2021 -0700
>
> x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register
>
> called ix86_check_avx_upper_register to check mode
On Fri, Jan 28, 2022 at 5:53 AM H.J. Lu via Gcc-patches
wrote:
>
> The v3 patch was posted at
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574847.html
>
> There is no progress with repeated pings since then. Glibc 2.35 and
> binutils 2.38 will support GNU_PROPERTY_1_NEEDED_INDIRECT_EXT
On Sun, Jan 23, 2022 at 8:28 PM H.J. Lu via Gcc-patches
wrote:
>
> Return false for invalid mode on memory broadcast in bcst_mem_operand:
>
> (vec_duplicate:V16SF (mem/j:V4SF (reg/v/f:DI 109 [ b ])))
>
Yes, thanks.
> gcc/
>
> PR target/104188
> * config/i386/predicates.md (bcst_mem
On Wed, Jan 19, 2022 at 9:40 AM Jakub Jelinek wrote:
>
> Hi!
>
> On Wed, Jan 19, 2022 at 09:09:41AM +0800, Hongtao Liu wrote:
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > Yes, thanks.
>
> Thanks. Committed.
> grep '{[^|}
On Wed, Jan 19, 2022 at 8:00 AM Jakub Jelinek wrote:
>
> On Sun, Jan 16, 2022 at 12:22:18PM +0800, Hongtao Liu via Gcc-patches wrote:
> > On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches
> > wrote:
> > >
> > > On Sat, Jan 15, 2022
On Tue, Jan 18, 2022 at 10:57 AM liuhongt via Gcc-patches
wrote:
>
> Change scan-assembler from "\tucomisd" to "\t\[v\]?ucomisd".
It's an obvious "fix", Pushed to trunk.
>
> refer to https://gcc.gnu.org/pipermail/gcc-regression/2022-January/076241.html
>
> gcc/testsuite/ChangeLog:
>
> * g+
On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches
wrote:
>
> On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang wrote:
> >
> > Thanks for the suggestion, here is the updated patch that survived
> > bootstrap/regtest.
>
> LGTM for me, but please get the final approval from Hongtao.
>
Ok, thank
Here's the patch I'm going to check in, the patch is pre-approved in PR.
On Thu, Jan 13, 2022 at 11:59 PM liuhongt wrote:
>
> For define_insn_and_split "*xor2andn":
>
> 1. Refine predicate of operands[0] from nonimmediate_operand to
> register_operand.
> 2. Remove TARGET_AVX512BW from condition t
On Mon, Jan 10, 2022 at 3:21 PM Jiang, Haochen wrote:
>
> Hi Hongtao,
>
> I have changed that message in this patch. Ok for trunk?
Ok.
>
> Thx,
> Haochen
>
> -Original Message-
> From: Hongtao Liu
> Sent: Monday, January 10, 2022 3:25 PM
> To: Jia
On Mon, Jan 10, 2022 at 2:23 PM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> This patch removes the register restriction on operands for andnot insn so
> that it can be used from memory.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
>
On Fri, Dec 24, 2021 at 4:51 PM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> This patch fixes the testcase amxbf16-dpbf16ps-2.c. Previously the type
> conversion had an issue.
>
> Ok for trunk?
Ok.
>
> BRs,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/amx-check.h (
On Tue, Dec 21, 2021 at 2:27 PM liuhongt wrote:
>
> The purpose of those define_insn_and_split:
> 1. Combine vpcmpuw and zero_extend into vpcmpuw.
> 2. Canonicalize vpcmpuw pattern so CSE can replace duplicate vpcmpuw to just
> kmov
> 3. Use DImode as dest of zero_extend so cprop_hardreg can elim
On Wed, Dec 22, 2021 at 11:28 AM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
>
> This patch is to enable intrinsics that convert float and bf16 data to each
> other.
> Ok for master?
>
Ok.
> gcc/ChangeLog:
>
> * config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Add new intrinsic.
>
On Thu, Dec 16, 2021 at 1:59 PM liuhongt wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
Pushed to trunk.
>
> gcc/ChangeLog:
>
> PR target/98468
> * config/i386/sse.md (*bit_and_float_vector_all_ones): New
> pre-reload splitter.
>
> gcc/
On Wed, Dec 15, 2021 at 9:26 AM Haochen Jiang via Gcc-patches
wrote:
>
> Hi all,
>
> This patch fixes the regression previously reported on the combine splitter
> under '-m32 -march=cascadelake' options.
>
> Regtested on x86_64-pc-linux-gnu.
Ok.
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
> PR
On Thu, Dec 16, 2021 at 5:00 PM Richard Sandiford via Gcc-patches
wrote:
>
> Obviously I'm not in a position to comment on the target bits, but:
>
> liuhongt via Gcc-patches writes:
> > Also with corresponding target attribute, option default disabled.
> >
> > Bootstrapped and regtested on x86_64
On Wed, Dec 8, 2021 at 2:47 PM Haochen Jiang via Gcc-patches
wrote:
>
> Hi,
>
> This patch adds a combine splitter to transform vashr/vlshr/vashl_optab to
> ashr/lshr/ashl_optab for const vector duplicate operand.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
Ok.
>
> BRs,
> Haochen
>
> gcc/Cha
On Wed, Dec 8, 2021 at 11:13 AM Jiang, Haochen via Gcc-patches
wrote:
>
> Hi Uros,
>
> I have fixed that in this patch attached for checking in. Is that ok for
> trunk?
>
Uros already said it's ok with that change, let me check in the patch for you.
> Regtested on x86_64-pc-linux-gnu.
>
> Thx,
>
care for 64-bit moves which are expensive on 32-bit
> targets.
I like your version; updated the patch.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} w/ and w/o -march=k8.
On Mon, Dec 6, 2021 at 11:41 AM liuhongt wrote:
>
> When moves between integer and sse registers are cheap.
>
>
On Thu, Dec 2, 2021 at 4:27 PM liuhongt wrote:
>
> The patch helps reload to choose GENERAL_REGS alternatives for
> SSE_FLOAT_MODE and enables optimizations like
>
> - vmovd %xmm0, -4(%rsp)
> - movl $1, %eax
> - addl -4(%rsp), %eax
> + movd %xmm0, %eax
> +
On Thu, Dec 2, 2021 at 6:07 AM Uros Bizjak wrote:
>
> Introduce vec_set_0 pattern for V8HI and V8HF modes to implement scalar
> element 0 inserts from a GP register, SSE register or memory. Also
> add V8HI and V8HF AVX2 (x,x,x) alternative to PINSR insn pattern, which is
> split after reload t
On Tue, Nov 30, 2021 at 5:44 PM liuhongt via Gcc-patches
wrote:
>
> ix86_attr_length_immediate_default assumes TYPE ishift has only 1
> constant operand,
> but *x86_64_shld_1/*x86_shld_1/*x86_64_shrd_1/*x86_shrd_1 has 2, with
> condition: INTVAL (operands[3]) == 32 - INTVAL (operands[2]) or
> INTV
On Tue, Nov 30, 2021 at 5:21 AM Uros Bizjak wrote:
>
> On Mon, Nov 29, 2021 at 10:48 AM Hongtao Liu wrote:
> >
> > On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak wrote:
> > >
> > > On Mon, Nov 29, 2021 at 2:32 AM liuhongt wrote:
> > > >
> > >
On Mon, Nov 29, 2021 at 3:53 PM Uros Bizjak wrote:
>
> On Mon, Nov 29, 2021 at 2:32 AM liuhongt wrote:
> >
> > There're several failures reported in [1]:
> > 1. unsupported instruction `pextrw` for "pextrw $0, %xmm31, 16(%rax)"
> > %vpextrw should be used in output templates.
> > 2. ICE in get_a
On Thu, Nov 25, 2021 at 12:18 PM H.J. Lu via Gcc-patches
wrote:
>
> Replace long with int64_t to work with -mx32.
Thanks.
>
> * gcc.target/i386/pr103194-5.c: Replace long with int64_t.
> ---
> gcc/testsuite/gcc.target/i386/pr103194-5.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletio
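For readers unfamiliar with the -mx32 issue mentioned above: under the x32 ABI, long is 32 bits wide, so a test that relies on a 64-bit long no longer exercises 64-bit operations, while int64_t is 64 bits on every ABI. A minimal sketch of the distinction follows; the helper names are hypothetical and are not taken from pr103194-5.c.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not from the testcase.  */

/* int64_t is exactly 64 bits wide regardless of the ABI in use,
   unlike long, which is 32 bits under -mx32 and -m32.  */
static unsigned
int64_bits (void)
{
  return (unsigned) (sizeof (int64_t) * CHAR_BIT);
}

/* A value that needs more than 32 bits; it would not fit in a
   32-bit long.  */
static int64_t
bit40 (void)
{
  return INT64_C (1) << 40;
}
```

Calling int64_bits () gives the same answer for -m64, -mx32 and -m32, which is the property the replacement relies on.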
On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches
wrote:
>
> Hi,
>
> This patch is to add a test case similar to the one in i386
> to add testing coverage for 510.parest_r hotspots.
>
> As evaluated, the emulated gather capability of vectorizer
> (r12-2733) can help to speed up SPEC2017 5
On Fri, Nov 19, 2021 at 3:53 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Fri, Nov 19, 2021 at 8:50 AM Uros Bizjak wrote:
> >
> > On Fri, Nov 19, 2021 at 2:14 AM liuhongt wrote:
> > >
> > > >Why is the above declared as a special memory constraint? Also the
> > > Change to define_memory_constrai
On Tue, Nov 16, 2021 at 4:35 PM Hongtao Liu wrote:
>
> On Tue, Nov 16, 2021 at 4:23 PM Kong, Lingling via Gcc-patches
> wrote:
> >
> > Hi,
> >
> > This patch is to add aliases for f*mul_*ch intrinsics.
> >
> > Ok for master?
> This patch j
On Tue, Nov 16, 2021 at 4:15 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with
> -mf16c. So added define_insn extendhfsf2 and truncsfhf2 for target_f16c.
>
> OK for master?
>
> gcc/ChangeLog:
>
> PR target/102811
On Tue, Nov 16, 2021 at 4:23 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> This patch is to add aliases for f*mul_*ch intrinsics.
>
> Ok for master?
This patch just adds some macro definitions (new aliases for
intrinsic) to the header file, and I think this should be low risk.
And considering
On Mon, Nov 15, 2021 at 9:37 AM Hongtao Liu wrote:
>
> On Sat, Nov 13, 2021 at 2:21 AM H.J. Lu via Gcc-patches
> wrote:
> >
> > On Fri, Nov 12, 2021 at 8:13 AM Jakub Jelinek wrote:
> > >
> > > On Fri, Nov 12, 2021 at 07:55:26AM -0800, H.J. Lu wrote:
>
On Sat, Nov 13, 2021 at 2:21 AM H.J. Lu via Gcc-patches
wrote:
>
> On Fri, Nov 12, 2021 at 8:13 AM Jakub Jelinek wrote:
> >
> > On Fri, Nov 12, 2021 at 07:55:26AM -0800, H.J. Lu wrote:
> > > > I have following patch queued for testing for this...
> > > >
> > > > 2021-11-12 Jakub Jelinek
> > >
On Tue, Nov 9, 2021 at 6:22 PM Richard Biener via Gcc-patches
wrote:
>
> On Tue, Nov 9, 2021 at 3:09 AM liuhongt wrote:
> >
> > This will enable transformation like
> >
> > - # sum1_50 = PHI
> > - # sum2_52 = PHI
> > + # sum1_50 = PHI <_87(13), 0(4)>
> > + # sum2_52 = PHI <_89(13), 0(4)>
>
On Tue, Nov 9, 2021 at 6:21 PM Richard Biener
wrote:
>
> On Tue, Nov 9, 2021 at 3:37 AM Hongtao Liu wrote:
> >
> > On Mon, Nov 8, 2021 at 4:59 PM Richard Biener
> > wrote:
> > >
> > > On Mon, Nov 8, 2021 at 2:30 AM Hongtao Liu wrote:
> > >
On Mon, Nov 8, 2021 at 4:59 PM Richard Biener
wrote:
>
> On Mon, Nov 8, 2021 at 2:30 AM Hongtao Liu wrote:
> >
> > On Fri, Nov 5, 2021 at 5:52 PM Richard Biener
> > wrote:
> > >
> > > On Fri, Nov 5, 2021 at 6:38 AM liuhongt wrote:
> > > >
On Fri, Nov 5, 2021 at 3:20 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> This patch is to support folding _mm512_fmadd_pch (a, _mm512_set1_pch(*(b)), c)
> into 1 instruction vfmaddcph (%rsp){1to16}, %zmm1, %zmm2.
> OK for master?
>
LGTM.
> gcc/ChangeLog:
>
> * config/i386/sse.md (fma__
On Fri, Nov 5, 2021 at 3:09 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> This patch is to support cmla_optab, cmul_optab, cmla_conj_optab,
> cmul_conj_optab for vector _Float16.
> Ok for master?
LGTM.
> gcc/ChangeLog:
>
> * config/i386/sse.md (cmul3): add new define_expand.
>
On Fri, Nov 5, 2021 at 5:52 PM Richard Biener
wrote:
>
> On Fri, Nov 5, 2021 at 6:38 AM liuhongt wrote:
> >
> > a and b have the same type as the trunc type and have less precision than
> > the extend type; the transformation is guarded by flag_finite_math_only.
> >
> > Bootstrapped and regtested under x86_64-
On Thu, Nov 4, 2021 at 9:50 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> AVX512VNNI/AVXVNNI has vpdpwssd for HImode, vpdpbusd for QImode, so
> adjust the HImode sdot_prod expander and add a QImode usdot_prod expander
> to enhance vectorization for dotprod.
>
> Bootstrapped/regtested on x86_64-pc-lin
On Thu, Nov 4, 2021 at 11:25 AM Hongyu Wang wrote:
>
> > Could you add a testcase for that?
>
> Yes, updated patch.
>
> Hongtao Liu via Gcc-patches wrote on Thu, Nov 4, 2021 at 10:25 AM:
> >
> > On Thu, Nov 4, 2021 at 9:19 AM Hongyu Wang via Gcc-patches
> > wrote:
On Thu, Nov 4, 2021 at 9:19 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> _tile_loadd, _tile_stored, _tile_streamloadd intrinsics are defined by
> macro, so the parameters should be wrapped by parentheses to accept
> expressions.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
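To illustrate why macro parameters need parentheses when callers pass expressions, here is a minimal sketch. The macros and helper functions below are hypothetical and are not the definitions from GCC's AMX headers; they only show the precedence problem the message describes.

```c
/* Hypothetical macros for illustration only; these are not the
   definitions of _tile_loadd and friends.  */
#define STRIDE_BYTES_UNSAFE(rows, stride) (rows * stride)
#define STRIDE_BYTES_SAFE(rows, stride)   ((rows) * (stride))

/* With an expression argument, the unparenthesized macro expands to
   (base + extra * stride), so operator precedence changes the result.  */
static int
bytes_unsafe (int base, int extra, int stride)
{
  return STRIDE_BYTES_UNSAFE (base + extra, stride);
}

/* The parenthesized macro expands to ((base + extra) * (stride)),
   which is what the caller meant.  */
static int
bytes_safe (int base, int extra, int stride)
{
  return STRIDE_BYTES_SAFE (base + extra, stride);
}
```

For base = 2, extra = 1, stride = 4 the unsafe form evaluates 2 + 1 * 4, while the safe form evaluates (2 + 1) * 4.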
On Fri, Oct 29, 2021 at 10:34 AM Martin Sebor wrote:
>
> On 10/28/21 7:47 PM, Hongtao Liu wrote:
> > On Fri, Oct 29, 2021 at 12:20 AM Martin Sebor via Gcc-patches
> > wrote:
> >>
> >> On 10/28/21 1:23 AM, liuhongt via Gcc-patches wrote:
> >>> Adju
On Thu, Oct 28, 2021 at 10:26 AM Hongtao Liu wrote:
>
> On Mon, Oct 25, 2021 at 4:24 PM liuhongt wrote:
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> I'm going to check in this patch if there's
On Fri, Oct 29, 2021 at 12:20 AM Martin Sebor via Gcc-patches
wrote:
>
> On 10/28/21 1:23 AM, liuhongt via Gcc-patches wrote:
> > Adjust code in check_vect_slp_aligned_store_usage to make it an exact
> > pattern match of the corresponding testcases.
> > These new target/xfail selectors are added a
On Mon, Oct 25, 2021 at 4:24 PM liuhongt wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
I'm going to check in this patch if there's no objection.
> gcc/ChangeLog:
>
> PR target/102464
> * config/i386/i386-builtin-types.def (V8HF_FTYPE_V8H
On Tue, Oct 26, 2021 at 5:51 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For _Float16 type, add insn and expanders to optimize x / y to
> x * rcp (y), and x / sqrt (y) to x * rsqrt (y).
> As half float has only a minor precision difference between div and
> mul * rcp, there is no need for New
On Mon, Oct 25, 2021 at 10:01 PM Rainer Orth
wrote:
>
> gcc.target/i386/avx512f-pr96891-3.c currently FAILs on 32-bit Solaris/x86:
>
> FAIL: gcc.target/i386/avx512f-pr96891-3.c scan-assembler-times
> (?n)vpcmp[bwdq][ t]*\$7 4
>
> There are only 3 instances of the expected pattern because
On Mon, Oct 25, 2021 at 1:59 PM liuhongt wrote:
>
> Canonicalize & and nop_convert order for
> __atomic_fetch_or_*, __atomic_fetch_xor_*,
> __atomic_xor_fetch_*,__sync_fetch_and_or_*,
> __sync_fetch_and_xor_*,__sync_xor_and_fetch_*,
> __atomic_fetch_and_*,__sync_fetch_and_and_* when mask is constan
On Mon, Oct 25, 2021 at 1:59 PM liuhongt wrote:
>
> Similar for sqrt/sqrtl.
>
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
Ok for trunk?
> gcc/ChangeLog:
>
> PR target/102464
> * match.pd: Simplify (_Float16) sqrtf((float) a) to .SQRT(a)
> when direct_inter
On Fri, Oct 22, 2021 at 1:57 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi,
>
> This patch is to support transforming, under fast-math, something like
> _mm512_add_ph(x1, _mm512_fmadd_pch(a, b, _mm512_setzero_ph())) to
> _mm512_fmadd_pch(a, b, x1).
>
> And support transform _mm512_add_ph(x1, _mm512_f
i is
On Wed, Oct 13, 2021 at 8:34 PM Richard Biener via Gcc-patches
wrote:
>
> On Sun, Oct 10, 2021 at 3:49 PM H.J. Lu wrote:
> >
> > Changes in v4:
> >
> > 1. Bypass redundant check when inputs have been transformed to the
> > equivalent canonical form with valid bit operation.
> >
> > Changes
de 100644
> index 000..bad4fa9394e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr102812.c
> @@ -0,0 +1,12 @@
> +/* PR target/102812 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse4 -mno-avx" } */
> +/* { dg-final { scan-assembler-no
On Wed, Oct 20, 2021 at 1:31 PM Hongyu Wang via Gcc-patches
wrote:
>
> Since _Float16 type is enabled under sse2 target, returning
> V8HFmode vector without AVX512F target would generate a wrong
> vmovdqa64 instruction. Adjust ix86_get_ssemov to avoid this.
>
> Bootstrapped/regtested on x86_64-pc-lin
On Thu, Oct 21, 2021 at 10:06 AM Hongtao Liu wrote:
>
> On Thu, Oct 21, 2021 at 9:20 AM Hongtao Liu wrote:
> >
> > On Wed, Oct 20, 2021 at 7:34 PM Christophe Lyon via Gcc-patches
> > wrote:
> > >
> > > Hi,
> > >
> > >
> > >
On Thu, Oct 21, 2021 at 9:20 AM Hongtao Liu wrote:
>
> On Wed, Oct 20, 2021 at 7:34 PM Christophe Lyon via Gcc-patches
> wrote:
> >
> > Hi,
> >
> >
> > On Tue, Oct 19, 2021 at 11:03 AM liuhongt via Gcc-patches <
> > gcc-patches@gcc.gnu.org
On Wed, Oct 20, 2021 at 7:34 PM Christophe Lyon via Gcc-patches
wrote:
>
> Hi,
>
>
> On Tue, Oct 19, 2021 at 11:03 AM liuhongt via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
> > updated patch:
> > 1. Add documents in doc/sourcebuild.texi (Effective-Target Keywords).
> > 2. Reduce -novec
On Fri, Oct 15, 2021 at 3:11 PM Kewen.Lin via Gcc-patches
wrote:
>
> on 2021/10/14 6:56 PM, Kewen.Lin via Gcc-patches wrote:
> > Hi Hongtao,
> >
> > on 2021/10/14 3:11 PM, liuhongt wrote:
> >> Hi Kewen:
> >> Could you help to verify if this patch fixes those regressions
> >> for rs6000 port.
> >>
>
On Fri, Oct 15, 2021 at 11:37 PM Martin Sebor wrote:
>
> On 10/14/21 1:11 AM, liuhongt wrote:
> > Hi Kewen:
> >Could you help to verify if this patch fixes those regressions
> > for rs6000 port.
> >
> > As discussed in [1], this patch adds xfail/target selectors to those
> > testcases, also make a
On Fri, Oct 15, 2021 at 4:38 PM dianhong.xu--- via Gcc-patches
wrote:
>
> From: dianhong xu
>
> Add *_set1_pch (_Float16 _Complex A) intrinsics.
>
> gcc/ChangeLog:
>
> * config/i386/avx512fp16intrin.h:
> (_mm512_set1_pch): New intrinsic.
> * config/i386/avx512fp16vlintrin.
checks for NULL seems reasonable according to documents,
> > op0,op1,target maybe NULL.
> Thanks for pointing it out, didn't realize the difference between
> these 2 functions.
LGTM.
>
> Updated patch.
>
> Hongtao Liu wrote on Fri, Oct 15, 2021 at 1:54 PM:
> >
> > On Fri, O
On Fri, Oct 15, 2021 at 1:37 PM Hongyu Wang wrote:
>
> > This part seems not related to vector shuffle.
> Yes, have separated this part to another patch and checked-in.
>
> Updated patch. Ok for this one?
>
> Hongtao Liu via Gcc-patches wrote on Thu, Oct 14, 2021 at 2:33 PM:
> >
On Fri, Oct 15, 2021 at 1:07 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For V4HFmode, doing vector concat like
>
> __builtin_shufflevector (a, b, {0, 1, 2, 3, 4, 5, 6, 7})
>
> could trigger ICE since it is not handled in ix86_vector_init ().
>
> Handle HFmode like HImode to avoid such ICE.
On Thu, Oct 14, 2021 at 10:39 AM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> This patch supports HFmode vector shuffle by creating HImode subreg when
> expanding permutation expr.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde{-m32,}
> OK for master?
>
> gcc/ChangeLog:
>
>
On Wed, Oct 13, 2021 at 5:07 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> The current mask/mask3 implementation for complex fma contains a
> duplicated parameter in the macro, which may cause an error at -O0.
> Refactor the macro implementation to builtins to avoid the potential
> error.
>
> For round intrinsic w