Re: [PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-08-10 Thread Hongtao Liu via Gcc-patches
Ping^3 On Tue, Aug 4, 2020 at 4:21 PM Hongtao Liu wrote: > > ping ^2 > > On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > > > ping > > > > On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote: > > > > > > Those two define_insn

Re: [PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-08-10 Thread Hongtao Liu via Gcc-patches
Ping^3 On Tue, Aug 4, 2020 at 4:21 PM Hongtao Liu wrote: > > ping ^2 > > On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > > > ping > > > > On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu wrote: > > > > > > Bootstrap is ok, r

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-09 Thread Hongtao Liu via Gcc-patches
On Fri, Aug 7, 2020 at 11:02 PM Kirill Yukhin wrote: > > Hello, > > On 05 Aug 09:29, Hongtao Liu wrote: > > On Tue, Aug 4, 2020 at 6:28 PM Kirill Yukhin > > wrote: > > > > > > On 04 Aug 13:26, Kirill Yukhin wrote: > > > > Could you please

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-04 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 4, 2020 at 6:28 PM Kirill Yukhin wrote: > > On 04 Aug 13:26, Kirill Yukhin wrote: > > Could you please clarify how your patch is related to [1]? > > I see from the bug that it describes a perf issue w.r.t. scalar > > operations. > Sorry for the typo, it's pr96243.

Re: [PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-08-04 Thread Hongtao Liu via Gcc-patches
ping ^2 On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > ping > > On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu wrote: > > > > Bootstrap is ok, regression test is ok for i386 backend. > > > > gcc/ > > PR target/962

Re: [PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-08-04 Thread Hongtao Liu via Gcc-patches
ping ^2 On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > ping > > On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote: > > > > Those two define_insns have same pattern, and > > _load_mask would always be matched since it show up > > earlier

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-04 Thread Hongtao Liu via Gcc-patches
ping^2 On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote: > > ping > > On Mon, Jul 20, 2020 at 4:40 PM Hongtao Liu wrote: > > > > Correct PR number in ChangeLog > > it's pr96243. > > > > On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote: > >

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-08-04 Thread Hongtao Liu via Gcc-patches
Update patch. There are a lot of avx512 define_insns which lack corresponding memory broadcast version, i only add *avx512f_mul3_bcst and *avx512dq_mul3_bcst in this patch. On Fri, Jul 24, 2020 at 10:37 AM Hongtao Liu wrote: > > On Thu, Jul 23, 2020 at 9:53 PM Hongtao Liu wrote: > >

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-07-27 Thread Hongtao Liu via Gcc-patches
ping On Mon, Jul 20, 2020 at 4:40 PM Hongtao Liu wrote: > > Correct PR number in ChangeLog > it's pr96243. > > On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote: > > > > Hi: > > For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a > > boolea

Re: [PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-07-27 Thread Hongtao Liu via Gcc-patches
ping On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote: > > Those two define_insns have same pattern, and > _load_mask would always be matched since it show up > earlier in the md file, and it may lose some opportunity in > pass_reload since _load_mask only have constraint "

Re: [PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-07-27 Thread Hongtao Liu via Gcc-patches
ping On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu wrote: > > Bootstrap is ok, regression test is ok for i386 backend. > > gcc/ > PR target/96262 > * config/i386/i386-expand.c > (ix86_expand_vec_shift_qihi_constant): Refine. > > gcc/testsuite/

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-23 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 23, 2020 at 9:53 PM Hongtao Liu wrote: > > On Thu, Jul 23, 2020 at 4:39 PM Jan Hubicka wrote: > > > > Hello, > > sorry for taking so long to get to this. > > > diff --git a/gcc/config/i386/i386-features.c > > > b/gcc/config/i386/i386-featu

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-23 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 23, 2020 at 4:39 PM Jan Hubicka wrote: > > Hello, > sorry for taking so long to get to this. > > diff --git a/gcc/config/i386/i386-features.c > > b/gcc/config/i386/i386-features.c > > index 535fc7e981d..8f81d101382 100644 > > --- a/gcc/config/i386/i386-features.c > > +++

[PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-07-22 Thread Hongtao Liu via Gcc-patches
Bootstrap is ok, regression test is ok for i386 backend. gcc/ PR target/96262 * config/i386/i386-expand.c (ix86_expand_vec_shift_qihi_constant): Refine. gcc/testsuite/ * gcc.target/i386/pr96262-1.c: New test. --- gcc/config/i386/i386-expand.c | 6

[PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-07-21 Thread Hongtao Liu via Gcc-patches
matched. 2020-07-21 Hongtao Liu gcc/ PR target/96246 * config/i386/sse.md (_load_mask, _load_mask): Extend to generate blendm instructions. (_blendm, _blendm): Change define_insn to define_expand. gcc/testsuite/ * gcc.target/i386/avx512bw-p

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-07-20 Thread Hongtao Liu via Gcc-patches
Correct PR number in ChangeLog it's pr96243. On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote: > > Hi: > For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a > boolean value and try to do some optimization. But it is not true for > vector compare, also other place

[PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-07-19 Thread Hongtao Liu via Gcc-patches
Hi: For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a boolean value and try to do some optimization. But it is not true for vector compare, also other places in rtl passes hold the same assumption. Bootstrap is ok, regression test is ok for i386 backend. 2020-07-20 Hongtao Liu
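The point about vector compares not being boolean values can be illustrated with a minimal scalar sketch in C. It assumes an 8-lane vector of 32-bit integers (V8SI) compared into a mask with one bit per lane; the function name `cmp_eq_mask_v8si` is hypothetical and only models the semantics, it is not GCC code.

```c
#include <stdint.h>

/* Hypothetical scalar model of an AVX512 vector compare into a mask
   register: bit i of the result is set when lane i of A equals lane i
   of B.  The result is a per-lane bitmask, not a single 0/1 boolean,
   which is why treating the RTL comparison as a boolean is unsafe.  */
static uint8_t
cmp_eq_mask_v8si (const int32_t a[8], const int32_t b[8])
{
  uint8_t mask = 0;
  for (int i = 0; i < 8; i++)
    if (a[i] == b[i])
      mask |= (uint8_t) (1u << i);
  return mask;
}
```

With lanes 0, 2, 4 and 6 equal, the model returns 0x55, a value that a pass assuming a boolean result would not expect.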

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-17 Thread Hongtao Liu via Gcc-patches
ping! On Fri, Jul 10, 2020 at 5:24 PM Hongtao Liu wrote: > > + maintainer. > cc H.J > > On Thu, Jul 9, 2020 at 4:33 PM Hongtao Liu wrote: > > > > Hi: > > For a constant vector having one duplicated value, there's no need > > to put the whole vect

Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-10 Thread Hongtao Liu via Gcc-patches
+ maintainer. cc H.J On Thu, Jul 9, 2020 at 4:33 PM Hongtao Liu wrote: > > Hi: > For a constant vector having one duplicated value, there's no need > to put the whole vector in the constant pool, using embedded broadcast > instead. > > Bootstrap test is Ok, regression

[PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-09 Thread Hongtao Liu via Gcc-patches
broadcast instead. 2020-07-09 Hongtao Liu gcc/ChangeLog: PR target/87767 * config/i386/i386-features.c (replace_constant_pool_with_broadcast): New function. (constant_pool_broadcast): Ditto. (class pass_constant_pool_broadcast): New pass. (make_pass_constant_pool_broadcast): Ditto. * config

[PATCH] Optimize V*QImode shift by constant using same operation on V*HImode [PR95524]

2020-06-16 Thread Hongtao Liu via Gcc-patches
Sorry, I mistakenly deleted the local mail for https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548174.html, so I am sending another email. > What I mean is that op2 is a CONST_INT, which in theory can have any > HOST_WIDE_INT values. > By assigning that to unsigned int variable, you are effectively >

Re: [PATCH] Optimize V*QImode shift by constant using same operation on V*HImode [PR95524]

2020-06-15 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 15, 2020 at 9:48 PM Jakub Jelinek wrote: > > On Mon, Jun 15, 2020 at 09:29:29PM +0800, Hongtao Liu via Gcc-patches wrote: > > Basically i "copy" this optimization from clang i386 backend, Refer > > to pr95524 for details. > > Bootstrap is ok, regr

[PATCH] Optimize V*QImode shift by constant using same operation on V*HImode [PR95524]

2020-06-15 Thread Hongtao Liu via Gcc-patches
Hi: Basically i "copy" this optimization from clang i386 backend, Refer to pr95524 for details. Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: PR target/95524 * gcc/config/i386/i386-expand.c (ix86_expand_vec_shift_qihi_constant): New
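As a rough illustration of the idea in the subject line, here is a scalar sketch in C of doing a byte shift with a 16-bit shift plus a mask. It is an assumption about the general technique, not a copy of the patch: the helper names are hypothetical, only logical shifts are modelled, and arithmetic right shifts would need extra steps that are not shown.

```c
#include <stdint.h>

/* Shift both bytes packed in one 16-bit lane left by N (0 <= N <= 7)
   with a single 16-bit shift, then mask off the bits that crossed from
   the low byte into the high byte.  */
static uint16_t
shl_bytes_in_word (uint16_t w, unsigned n)
{
  uint8_t byte_mask = (uint8_t) (0xffu << n);
  uint16_t mask = (uint16_t) (byte_mask * 0x0101u); /* replicate per byte */
  return (uint16_t) ((uint16_t) (w << n) & mask);
}

/* Logical right shift of both bytes by N (0 <= N <= 7): the mask clears
   the bits that crossed from the high byte into the low byte.  */
static uint16_t
lshr_bytes_in_word (uint16_t w, unsigned n)
{
  uint8_t byte_mask = (uint8_t) (0xffu >> n);
  uint16_t mask = (uint16_t) (byte_mask * 0x0101u); /* replicate per byte */
  return (uint16_t) ((w >> n) & mask);
}
```

For the word 0x81C3 (bytes 0x81 and 0xC3), a left shift by 1 gives 0x0286 and a logical right shift by 1 gives 0x4061, matching what independent per-byte shifts would produce.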

Re: [PATCH] Optimize multiplication for V8QI,V16QI,V32QI under TARGET_AVX512BW [target/95488]

2020-06-12 Thread Hongtao Liu via Gcc-patches
Thanks for the review. On Fri, Jun 12, 2020 at 11:28 AM Jeff Law wrote: > > On Fri, 2020-06-05 at 13:46 +0800, Hongtao Liu via Gcc-patches wrote: > > Hi: > > > > +/* Optimize vector MUL generation for V8QI, V16QI and V32QI > > + under TARGET_AVX512BW

[PATCH] Optimize multiplication for V8QI,V16QI,V32QI under TARGET_AVX512BW [target/95488]

2020-06-04 Thread Hongtao Liu via Gcc-patches
Hi:
+/* Optimize vector MUL generation for V8QI, V16QI and V32QI
+   under TARGET_AVX512BW. i.e. for v16qi a * b, it has
+
+	vpmovzxbw ymm2, xmm0
+	vpmovzxbw ymm3, xmm1
+	vpmullw ymm4, ymm2, ymm3
+	vpmovwb xmm0, ymm4
+
+   it would take less instructions than ix86_expand_vecop_qihi.
+
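The instruction sequence quoted in that comment can be modelled lane by lane in plain C. This is only a sketch of the arithmetic, with a hypothetical function name; it says nothing about how the expander itself is written.

```c
#include <stdint.h>

/* Scalar model of the quoted sequence for 16 byte lanes: zero-extend
   each byte to 16 bits (vpmovzxbw), multiply in 16 bits (vpmullw), then
   keep the low byte of each product (vpmovwb).  */
static void
mul_v16qi_model (const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
  for (int i = 0; i < 16; i++)
    {
      uint16_t wa = a[i];                   /* widen operand a */
      uint16_t wb = b[i];                   /* widen operand b */
      uint16_t prod = (uint16_t) (wa * wb); /* low 16 bits of the product */
      out[i] = (uint8_t) prod;              /* truncate back to a byte */
    }
}
```

The low byte of the 16-bit product equals the byte product modulo 256, so the widened multiply gives the same result as a true byte multiply would.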

Re: [PATCH] Fix zero-masking for vcvtps2ph when dest operand is memory.

2020-06-04 Thread Hongtao Liu via Gcc-patches
On Thu, Jun 4, 2020 at 2:43 PM Richard Biener wrote: > > On Thu, 4 Jun 2020, Hongtao Liu wrote: > > > Hi Richard: > > Could you help review this patch. > > uros said he wouldn't review patches related to x86 vector ISA anymore. > > I can't spot anything wr

[PATCH] Fix typo in expander trunc2 [AVX512]

2020-06-04 Thread Hongtao Liu via Gcc-patches
This patch is to fix the uppercase mode in trunc2; it should be lowercase for a standard pattern name. Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: * config/i386/sse.md (pmov_dst_3_lower): New mode attribute. (trunc2): Refine from trunc2.

Re: [PATCH] Fix zero-masking for vcvtps2ph when dest operand is memory.

2020-06-03 Thread Hongtao Liu via Gcc-patches
Hi Richard: Could you help review this patch. uros said he wouldn't review patches related to x86 vector ISA anymore. On Wed, Jun 3, 2020 at 10:26 AM Hongtao Liu wrote: > > Hi: > When dest is memory, zero-masking is not valid, only merging-masking > is available, > >

[PATCH] Fix zero-masking for vcvtps2ph when dest operand is memory.

2020-06-02 Thread Hongtao Liu via Gcc-patches
Hi: When dest is memory, zero-masking is not valid, only merging-masking is available, Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: * gcc/config/i386/sse.md (*vcvtps2ph_store): Refine from *vcvtps2ph_store. (vcvtps2ph256): Refine
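The difference between the two masking modes mentioned here can be sketched in C. This is a simplified model with hypothetical names and a fixed lane count of 4; it only shows the semantics of merging-masking and zero-masking, not the vcvtps2ph conversion itself.

```c
#include <stdint.h>

enum { LANES = 4 };

/* Merging-masking: lanes whose mask bit is clear keep the old
   destination value.  A masked store to memory behaves this way,
   since unselected elements are simply not written.  */
static void
masked_merge (uint16_t dst[LANES], const uint16_t src[LANES], uint8_t mask)
{
  for (int i = 0; i < LANES; i++)
    if (mask & (1u << i))
      dst[i] = src[i];
}

/* Zero-masking: lanes whose mask bit is clear are set to zero.  This
   form applies to register destinations only.  */
static void
masked_zero (uint16_t dst[LANES], const uint16_t src[LANES], uint8_t mask)
{
  for (int i = 0; i < LANES; i++)
    dst[i] = (mask & (1u << i)) ? src[i] : 0;
}
```

With mask 0b0101, merging leaves lanes 1 and 3 untouched while zeroing overwrites them with 0, so the two forms are not interchangeable when the destination is memory.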

Re: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-28 Thread Hongtao Liu via Gcc-patches
On Thu, May 28, 2020 at 11:37 PM H.J. Lu wrote: > > On Thu, May 28, 2020 at 8:00 AM Richard Sandiford > wrote: > > > > "Yangfei (Felix)" writes: > > > Thanks for reviewing this. > > > Attached please find the v5 patch. > > > Note: we also need to modify local variable "mode" once we catch one

Re: [PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-27 Thread Hongtao Liu via Gcc-patches
On Wed, May 27, 2020 at 8:01 PM Uros Bizjak wrote: > > On Wed, May 27, 2020 at 8:02 AM Hongtao Liu wrote: > > > > On Mon, May 25, 2020 at 8:41 PM Uros Bizjak wrote: > > > > > > On Mon, May 25, 2020 at 2:21 PM Hongtao Liu wrote: > > > > > >

Re: [PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-27 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 8:41 PM Uros Bizjak wrote: > > On Mon, May 25, 2020 at 2:21 PM Hongtao Liu wrote: > > > > According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit > > memory_operand instead of 128-bit one which exists in current > > implem

Re: [IMPORTANT] ChangeLog related changes

2020-05-26 Thread Hongtao Liu via Gcc-patches
Great, thanks! On Tue, May 26, 2020 at 2:08 PM Martin Liška wrote: > > On 5/26/20 7:22 AM, Hongtao Liu via Gcc wrote: > > i commit a separate patch alone only for ChangeLog files, should i revert > > it? > > Hello. > > I've just done it. > > Martin -- BR, Hongtao

Re: [IMPORTANT] ChangeLog related changes

2020-05-25 Thread Hongtao Liu via Gcc-patches
On Tue, May 26, 2020 at 6:49 AM Jakub Jelinek via Gcc-patches wrote: > > Hi! > > I've turned the strict mode of Martin Liška's hook changes, > which means that from now on no commits to the trunk or release branches > should be changing any ChangeLog files together with the other files, >

[PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-25 Thread Hongtao Liu via Gcc-patches
implementation. Also for other vpmov instructions which have memory_operand narrower than 128bits. 2020-05-25 Hongtao Liu gcc/ChangeLog * config/i386/sse.md (*avx512vl_v2div2qi2_store): Refine size of memory_operand according to Intel SDM. (avx512vl_v2div2qi2_mask_store): Ditto

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 8:00 PM Uros Bizjak wrote: > > On Mon, May 25, 2020 at 1:56 PM Hongtao Liu wrote: > > > > On Mon, May 25, 2020 at 7:36 PM Richard Biener wrote: > > > > > > On Mon, 25 May 2020, Uros Bizjak wrote: > > > > >

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 7:36 PM Richard Biener wrote: > > On Mon, 25 May 2020, Uros Bizjak wrote: > > > On Mon, May 25, 2020 at 8:27 AM Richard Biener wrote: > > > > > > On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak > > > wrote: > > >

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-25 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 1:55 AM Uros Bizjak wrote: > > On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote: > > > > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote: > > > > > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > > > >

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 1:55 AM Uros Bizjak wrote: > > On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote: > > > > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote: > > > > > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > > > >

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Hongtao Liu via Gcc-patches
On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote: > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > > > Hi: > > This patch fix non-conforming expander for > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di, > > refer to

[PATCH] Add missing expander for vector float_extend and float_truncate [PR target/95125]

2020-05-24 Thread Hongtao Liu via Gcc-patches
Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog PR target/95125 * config/i386/sse.md (sf2dfmode_lower): New mode attribute. (trunc2) New expander. (extend2): Ditto. gcc/testsuite/ChangeLog * gcc.target/i386/pr95125-avx.c: New

[PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-23 Thread Hongtao Liu via Gcc-patches
Hi: This patch fixes the non-conforming expanders for floatv2div2sf2, floatunsv2div2sf2, fix_truncv2sfv2di and fixuns_truncv2sfv2di; refer to PR95211 and PR95256. Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: PR target/95211 PR target/95256 * config/i386/sse.md

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-22 Thread Hongtao Liu via Gcc-patches
On Fri, May 22, 2020 at 2:41 PM Uros Bizjak wrote: > > On Fri, May 22, 2020 at 6:55 AM Hongtao Liu wrote: > > > > On Thu, May 21, 2020 at 7:18 PM Uros Bizjak wrote: > > > > > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu wrote: > > > > >

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-21 Thread Hongtao Liu via Gcc-patches
On Thu, May 21, 2020 at 7:18 PM Uros Bizjak wrote: > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu wrote: > > > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak wrote: > > > > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote: > > > > > >

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-20 Thread Hongtao Liu via Gcc-patches
On Wed, May 20, 2020 at 11:43 PM Uros Bizjak wrote: > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote: > > > > Hi: > > Bootstrap is ok, regression test on i386/x86-64 backend is ok. > > > > gcc/ChangeLog: > > PR target/92658 > >

[PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-20 Thread Hongtao Liu via Gcc-patches
Hi: Bootstrap is ok, regression test on i386/x86-64 backend is ok. gcc/ChangeLog: PR target/92658 * config/i386/sse.md (trunc2, truncv32hiv32qi2, trunc2): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/pr92658-avx512f.c: New test. *

[PATCH] [PR94118]] Update documentation for x86 operand modifier.

2020-05-11 Thread Hongtao Liu via Gcc-patches
Documents operand modifiers which are available in asm stmt but missing in document.

| Modifier | Description | Available in asm stmt | Existed in documentation |
| --- | --- | --- | --- |
| L,W,B,Q,S,T | print the opcode suffix for specified size of operand. | Available | Not |
| C |

[PATCH] Add enqcmd,avx512bf16,avx512vp2intersect to funcspec-56.inc

2020-05-06 Thread Hongtao Liu via Gcc-patches
Hi: Test is ok for funcspec-5.c, funcspec-6.c. gcc/testuite/ChangeLog * gcc.target/i386/funcspec-56.inc: Add enqcmd, avx512bf16, avx512vp2intersect. gcc/testsuite/gcc.target/i386/funcspec-56.inc | 6 ++ 1 file changed, 6 insertions(+) diff --git

Re: [PATH] Enable GCC support for SERIALIZE

2020-05-05 Thread Hongtao Liu via Gcc-patches
On Mon, May 4, 2020 at 1:17 AM Uros Bizjak wrote: > > On Wed, Apr 1, 2020 at 9:23 AM Hongtao Liu wrote: > > > > Hi: > > This patch is about to enable GCC support for SERIALIZE which would > > be in GLC. There's only 1 instruction: SERIALIZE, more details

Re: [PATCH] Enable GCC support for TSXLDTRK

2020-05-05 Thread Hongtao Liu via Gcc-patches
On Mon, May 4, 2020 at 12:58 AM Uros Bizjak wrote: > > The part above is OK, but you are missing support for > __attribute__((__target__("..."))). Please see how for example -msgx > is handled in isa2_opts in i386-options.c and in > gcc.target/i386/funcspec-56.h test source. > > Please repost the

Re: [PATCH] Enable GCC support for TSXLDTRK

2020-04-01 Thread Hongtao Liu via Gcc-patches
On Wed, Apr 1, 2020 at 3:32 PM Hongtao Liu wrote: > > Hi: > This patch is about to enable GCC support for TSXLDTRK which would > be in GLC. There's only 2 instructions: XRESLDTRK, XSUSLDTRK, more > details please > refer to > https://software.intel.com/sites/defaul

[PATCH] Enable GCC support for TSXLDTRK

2020-04-01 Thread Hongtao Liu via Gcc-patches
Hi: This patch is about to enable GCC support for TSXLDTRK which would be in GLC. There's only 2 instructions: XRESLDTRK, XSUSLDTRK, more details please refer to https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf I

[PATH] Enable GCC support for SERIALIZE

2020-04-01 Thread Hongtao Liu via Gcc-patches
Date: Wed, 4 Mar 2020 14:08:40 +0800 Subject: [PATCH] Enable GCC support for SERIALIZE 2020-03-04 Hongtao Liu 2020-03-04 Wei Xiao gcc/Changelog: * gcc/common/config/i386/i386-common.c (OPTION_MASK_ISA2_SERIALIZE_SET, OPTION_MASK_ISA2_SERIALIZE_UNSET): New macros. (ix86_handle_option): Handle

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-18 Thread Hongtao Liu
On Tue, Feb 18, 2020 at 7:00 PM Hongtao Liu wrote: > > On Tue, Feb 18, 2020 at 4:24 PM Uros Bizjak wrote: > > > > > > > > On Thu, Feb 13, 2020 at 9:39 AM Uros Bizjak wrote: > >> > >> > Changelog > >> > gcc/ > >> >

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-18 Thread Hongtao Liu
On Tue, Feb 18, 2020 at 4:24 PM Uros Bizjak wrote: > > > > On Thu, Feb 13, 2020 at 9:39 AM Uros Bizjak wrote: >> >> > Changelog >> > gcc/ >> >* config/i386/avx512vbmi2intrin.h >> >(_mm512_[,mask_,maskz_]shrdi_epi16, >> >_mm512_[,mask_,maskz_]shrdi_epi32, >> >

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-14 Thread Hongtao Liu
Done. On Fri, Feb 14, 2020 at 7:16 PM Uros Bizjak wrote: > > On Fri, Feb 14, 2020 at 8:06 AM Uros Bizjak wrote: > > > > On Fri, Feb 14, 2020 at 7:03 AM Hongtao Liu wrote: > > > > > > On Thu, Feb 13, 2020 at 5:31 PM Hongtao Liu wrote: > > > > >

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-13 Thread Hongtao Liu
On Thu, Feb 13, 2020 at 5:31 PM Hongtao Liu wrote: > > On Thu, Feb 13, 2020 at 5:12 PM Uros Bizjak wrote: > > > > On Thu, Feb 13, 2020 at 9:53 AM Jakub Jelinek wrote: > > > > > > On Thu, Feb 13, 2020 at 09:39:05AM +0100, Uros Bizjak wrot

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-13 Thread Hongtao Liu
On Thu, Feb 13, 2020 at 5:12 PM Uros Bizjak wrote: > > On Thu, Feb 13, 2020 at 9:53 AM Jakub Jelinek wrote: > > > > On Thu, Feb 13, 2020 at 09:39:05AM +0100, Uros Bizjak wrote: > > > > Changelog > > > > gcc/ > > > >* config/i386/avx512vbmi2intrin.h > > > >

[PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-12 Thread Hongtao Liu
Hi As mentioned in PR93274, several intrinsic macros lack a closing parenthesis. These macros are only used with the -O0 option, and currently the unit tests use -O2, so they are not covered. Bootstrap ok, regression tests on i386/x86_64 are ok. Ok for trunk? Changelog gcc/ *

Re: [PATCH] Fix redundant load missed by fre [tree-optimization 92980]

2019-12-18 Thread Hongtao Liu
On Wed, Dec 18, 2019 at 4:26 PM Segher Boessenkool wrote: > > On Wed, Dec 18, 2019 at 10:37:11AM +0800, Hongtao Liu wrote: > > Hi: > > This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a > > power of 2 and D mod C == 0. > > bootstrap an

Re: [PATCH] Fix redundant load missed by fre [tree-optimization 92980]

2019-12-17 Thread Hongtao Liu
On Wed, Dec 18, 2019 at 10:50 AM Andrew Pinski wrote: > > On Tue, Dec 17, 2019 at 6:33 PM Hongtao Liu wrote: > > > > Hi: > > This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a > > power of 2 and D mod C == 0. > > bootstrap and make

[PATCH] Fix redundant load missed by fre [tree-optimization 92980]

2019-12-17 Thread Hongtao Liu
Hi: This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a power of 2 and D mod C == 0. bootstrap and make check is ok. changelog gcc/ * gcc/match.pd (A * C + (-D) = (A - D/C) * C. when C is a power of 2 and D mod C == 0): Add new simplification. gcc/testsuite
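The identity behind this simplification can be checked with a small C sketch. It assumes unsigned 32-bit wraparound arithmetic and uses hypothetical helper names; the exact types and conditions in the match.pd pattern are not visible in this snippet.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when C is a power of 2.  */
static bool
is_pow2 (uint32_t c)
{
  return c != 0 && (c & (c - 1)) == 0;
}

/* The stated precondition: C is a power of 2 and D mod C == 0.  */
static bool
rewrite_applies (uint32_t c, uint32_t d)
{
  return is_pow2 (c) && d % c == 0;
}

/* Original form: A * C + (-D), with unsigned wraparound.  */
static uint32_t
original_form (uint32_t a, uint32_t c, uint32_t d)
{
  return a * c + (0u - d);
}

/* Rewritten form: (A - D/C) * C.  Equal to the original whenever D is
   an exact multiple of C, because (D/C) * C == D.  */
static uint32_t
rewritten_form (uint32_t a, uint32_t c, uint32_t d)
{
  return (a - d / c) * c;
}
```

For example, with A = 10, C = 8 and D = 16 both forms give 64; the equality also holds when the subtraction wraps, since both sides are computed modulo 2^32.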

[PATCH]Add tune option for integer mask cmov, enable this tune for m_CORE_AVX512

2019-12-11 Thread Hongtao Liu
Hi: This patch adds a tune option for integer mask cmov. For targets that have both integer mask registers and SSE mask registers, this tune indicates that the integer one should be used. Currently it is on by default for m_CORE_AVX512. Bootstrap is ok, regression test on i386/x86_64 backends is ok. ok

Re: [PATCH] Fix unrecognizable insn of pr92865

2019-12-10 Thread Hongtao Liu
On Wed, Dec 11, 2019 at 3:54 PM Jakub Jelinek wrote: > > On Wed, Dec 11, 2019 at 09:55:24AM +0800, Hongtao Liu wrote: > > Changelog > > gcc/ > > PR target/92865 > > * config/i386/i386-expand.c (ix86_valid_mask_cmp_mode): Enable > > integer mask cmov

Re: [PATCH] Fix unrecognizable insn of pr92865

2019-12-10 Thread Hongtao Liu
On Tue, Dec 10, 2019 at 4:11 PM Jakub Jelinek wrote: > > On Tue, Dec 10, 2019 at 01:47:50PM +0800, Hongtao Liu wrote: > > This patch is to enable integer mask cmp/cmov under AVX512F even > > with TARGET_XOP . > > Bootstrap and regression test on i386/x86_64 backend

[PATCH] Fix unrecognizable insn of pr92865

2019-12-09 Thread Hongtao Liu
Hi jakub: This patch is to enable integer mask cmp/cmov under AVX512F even with TARGET_XOP . Bootstrap and regression test on i386/x86_64 backend is ok. Changelog: PR target/92865 * gcc/config/i386/i386-expand.c (ix86_valid_mask_cmp_mode): Enable integer mask cmov when available

[PATCH] Use OPTION_MASK_ISA2_$target_[SET, UNSET, ] to indicate those for x_ix86_isa_flags2

2019-12-09 Thread Hongtao Liu
Hi uros: This patch renames OPTION_MASK_ISA_$target_[SET,UNSET, ] to OPTION_MASK_ISA2_$target_[SET,UNSET, ] for those targets setting x_ix86_isa_flags2. The target list is as below: static struct ix86_target_opts isa2_opts[] = { { "-mcx16",

Re: [PATCH] Enable mask operation for 128/256-bit vector VCOND_EXPR under avx512f (PR92686)

2019-12-08 Thread Hongtao Liu
On Thu, Dec 5, 2019 at 4:03 PM Jakub Jelinek wrote: > > On Thu, Dec 05, 2019 at 09:56:46AM +0800, Hongtao Liu wrote: > > --- a/gcc/config/i386/i386-expand.c > > +++ b/gcc/config/i386/i386-expand.c > > + /* Using vector move with mask register. */ > > +

Re: [PATCH] Enable mask operation for 128/256-bit vector VCOND_EXPR under avx512f (PR92686)

2019-12-04 Thread Hongtao Liu
On Wed, Dec 4, 2019 at 4:22 PM Jakub Jelinek wrote: > > On Wed, Dec 04, 2019 at 10:07:05AM +0800, Hongtao Liu wrote: > > Changelog > > gcc/ > > PR target/92686 > > * config/i386/sse.md > > (*_cmp3, > > *_cmp3, > > *_uc

[PATCH] Enable mask operation for 128/256-bit vector VCOND_EXPR under avx512f (PR92686)

2019-12-03 Thread Hongtao Liu
Hi: Currently for VCOND_EXPR, integer mask operation is only available for 512-bit vector, but since mask register is related to isa not vector size, under avx512f we can also have 128/256-bit vector condition move. My local tests show there's no boost frequency penalty for using integer mask

[PATCH] Fix TYPO of avx512f_maskcmp3.

2019-11-26 Thread Hongtao Liu
hi jakub: VF is used for differentiating AVX512F/AVX/SSE, but there's condition TARGET_AVX512F in avx512f_maskcmp3, it must be a TYPO and should be VF_AVX512VL instead. Bootstrap and regression test on i386/x86_64 backend is ok. OK for trunk? diff --git a/gcc/config/i386/sse.md

Re: [PATCH] Split X86_TUNE_AVX128_OPTIMAL into X86_TUNE_AVX256_SPLIT_REGS and X86_TUNE_AVX128_OPTIMAL

2019-11-17 Thread Hongtao Liu
On Sat, Nov 16, 2019 at 7:27 AM Jeff Law wrote: > > On 11/14/19 5:21 AM, Richard Biener wrote: > > On Tue, Nov 12, 2019 at 11:35 AM Hongtao Liu wrote: > >> > >> Hi: > >> As mentioned in https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00832.html > >

Re: [PATCH] Set AVX128_OPTIMAL for all avx targets.

2019-11-12 Thread Hongtao Liu
On Tue, Nov 12, 2019 at 4:41 PM Richard Biener wrote: > > On Tue, Nov 12, 2019 at 9:29 AM Hongtao Liu wrote: > > > > On Tue, Nov 12, 2019 at 4:19 PM Richard Biener > > wrote: > > > > > > On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote: > > >

[PATCH] Split X86_TUNE_AVX128_OPTIMAL into X86_TUNE_AVX256_SPLIT_REGS and X86_TUNE_AVX128_OPTIMAL

2019-11-12 Thread Hongtao Liu
Hi: As mentioned in https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00832.html > So yes, it's poorly named. A preparatory patch to clean this up > (and maybe split it into TARGET_AVX256_SPLIT_REGS and TARGET_AVX128_OPTIMAL) > would be nice. Bootstrap and regression test for i386 backend is ok.

Re: [PATCH] Set AVX128_OPTIMAL for all avx targets.

2019-11-12 Thread Hongtao Liu
On Tue, Nov 12, 2019 at 4:29 PM Richard Biener wrote: > > On Tue, Nov 12, 2019 at 9:19 AM Richard Biener > wrote: > > > > On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote: > > > > > > Hi: > > > This patch is about to set X86_TUNE_AVX128_OPTIMA

Re: [PATCH] Set AVX128_OPTIMAL for all avx targets.

2019-11-12 Thread Hongtao Liu
On Tue, Nov 12, 2019 at 4:19 PM Richard Biener wrote: > > On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote: > > > > Hi: > > This patch is about to set X86_TUNE_AVX128_OPTIMAL as default for > > all AVX target because we found there's still performanc

[PATCH] Set AVX128_OPTIMAL for all avx targets.

2019-11-11 Thread Hongtao Liu
Hi: This patch sets X86_TUNE_AVX128_OPTIMAL as the default for all AVX targets because we found there's still a performance gap between 128-bit auto-vectorization and 256-bit auto-vectorization even with epilog vectorized. The performance influence of setting avx128_optimal as default on

Re: [PATCH target/92295] Fix inefficient vector constructor

2019-11-06 Thread Hongtao Liu
Ping! On Sat, Nov 2, 2019 at 9:38 PM Hongtao Liu wrote: > > Hi Jakub: > Could you help review this patch? > > PS: This patch is related to vectors (avx512f), and Uros > mentioned before that he has no intention to maintain avx512f. > > On Fri, Nov 1, 2019 at 9:

Re: [PATCH target/92295] Fix inefficient vector constructor

2019-11-02 Thread Hongtao Liu
Hi Jakub: Could you help review this patch? PS: This patch is related to vectors (avx512f), and Uros mentioned before that he has no intention to maintain avx512f. On Fri, Nov 1, 2019 at 9:12 AM Hongtao Liu wrote: > > Hi uros: > This patch is about to fix inefficie

[PATCH target/92295] Fix inefficient vector constructor

2019-10-31 Thread Hongtao Liu
Hi uros: This patch fixes an inefficient vector constructor. Currently in ix86_expand_vector_init_concat, vectors are initialized two elements at a time, which can miss optimization opportunities such as pr92295. Bootstrap and i386 regression test is ok. Ok for trunk? Changelog gcc/

[PATCH] Remove redudant iptr when operand already has a scalar mode.

2019-10-26 Thread Hongtao Liu
> BTW: Please also note that there is no need to use or operand > mode override in scalar insn templates for intel asm dialect when > operand already has a scalar mode. https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01868.html This patch is to remove redundant when operand already has a scalar

[PATCH] Adjust predicates and constraints of scalar insns

2019-10-25 Thread Hongtao Liu
> Looking into sse.md, there is a lot of inconsistencies in existing *vm > patterns w.r.t. operand constraints. Unfortunately, these were copied > into proposed patterns. One example is existing > > (define_insn "_vmsqrt2" > [(set (match_operand:VF_128 0 "register_operand" "=x,v") >

Re: [PATCH target/89071] Fix false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale

2019-10-25 Thread Hongtao Liu
Update patch. On Fri, Oct 25, 2019 at 4:01 PM Uros Bizjak wrote: > > On Fri, Oct 25, 2019 at 7:55 AM Hongtao Liu wrote: > > > > On Fri, Oct 25, 2019 at 1:23 PM Hongtao Liu wrote: > > > > > > On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote: > > >

Re: [PATCH target/89071] Fix false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale

2019-10-24 Thread Hongtao Liu
On Fri, Oct 25, 2019 at 1:23 PM Hongtao Liu wrote: > > On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote: > > > > On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu wrote: > > > > > > Update patch: > > > Add m constraint to define_insn (sse_1_round > >

Re: [PATCH target/89071] Fix false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale

2019-10-24 Thread Hongtao Liu
On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote: > > On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu wrote: > > > > Update patch: > > Add m constraint to define_insn (sse_1_round > *sse_1_round > when under sse4 but not avx512f. > > It looks to me that the o

Re: [PATCH target/89071] Fix false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale

2019-10-22 Thread Hongtao Liu
Update patch: Add m constraint to define_insn (sse_1_round): Change constraint x to xm since vround supports a memory operand. * (*sse4_1_round): Ditto. Bootstrap and regression test ok. On Wed, Oct 23, 2019 at 9:56 AM Hongtao Liu wrote: > > Hi uros: > This patch fi

[PATCH target/89071] Fix false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale

2019-10-22 Thread Hongtao Liu
Hi Uros: This patch fixes false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale. Bootstrap ok, regression test on i386/x86 ok. It does something like this: - For scalar instructions with both xmm operands: op %xmmN,%xmmQ,%xmmQ -> op %xmmN,%xmmN,%xmmQ for scalar

Re: [wwwdocs] Update gcc-10/changes.html re Intel ISA (was: gcc-wwwdocs branch master updated. 63fbcfeaf27d9dd2083ccbd34bdff8fccb63949c)

2019-10-20 Thread Hongtao Liu
On Mon, Oct 21, 2019 at 1:15 AM Gerald Pfeifer wrote: > > On Fri, 11 Oct 2019, liuho...@gcc.gnu.org wrote: > > commit 63fbcfeaf27d9dd2083ccbd34bdff8fccb63949c > > Author: liuhongt > > Date: Fri Oct 11 14:27:47 2019 +0800 > > > > Update gcc10 changes with new intel ISA. > > I just applied

Re: [PATCH target/92035] Add missing avx512f intrinsics

2019-10-12 Thread Hongtao Liu
On Sat, Oct 12, 2019 at 4:15 PM Jakub Jelinek wrote: > > Hi! > > > gcc/ > > * config/i386/avx512fintrin.h (_mm_mask_roundscale_ss, > > _mm_maskz_roundscale_ss, _mm_maskz_roundscale_round_ss, > > _mm_maskz_roundscale_round_ss, _mm_mask_roundscale_sd, > >

[PATCH target/92035] Add missing avx512f intrinsics

2019-10-12 Thread Hongtao Liu
Hi: This patch enables the missing avx512f intrinsics listed below: _mm_mask_roundscale_sd _mm_mask_roundscale_round_sd _mm_maskz_roundscale_sd _mm_maskz_roundscale_round_sd _mm_mask_roundscale_ss _mm_mask_roundscale_round_ss _mm_maskz_roundscale_ss _mm_maskz_roundscale_round_ss Bootstrap ok,

[PATCH target/87007]Extend rpad to handle AVX512F vcvtusi2ss/vcvtusi2sd

2019-09-17 Thread Hongtao Liu
Hi Uros: This patch extends the rpad pass to handle AVX512F vcvtusi2ss/vcvtusi2sd. 538.imagick_r improves by 4% in a single-copy run on a Skylake workstation. Bootstrap ok; regression test for the i386/x86 backend ok. Ok for trunk? Changelog gcc/ * config/i386/i386.md

Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-09-04 Thread Hongtao Liu
On Wed, Sep 4, 2019 at 9:44 AM Hongtao Liu wrote: > > On Wed, Sep 4, 2019 at 12:50 AM Uros Bizjak wrote: > > > > On Tue, Sep 3, 2019 at 1:33 PM Richard Biener > > wrote: > > > > > > > Note: > > > > > Removing limit of cost would in

Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-09-03 Thread Hongtao Liu
On Wed, Sep 4, 2019 at 12:50 AM Uros Bizjak wrote: > > On Tue, Sep 3, 2019 at 1:33 PM Richard Biener > wrote: > > > > > Note: > > > > Removing limit of cost would introduce lots of regressions in SPEC2017 > > > > as follow > > > > > > > > 531.deepsjeng_r -7.18%

Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-09-03 Thread Hongtao Liu
On Mon, Sep 2, 2019 at 4:41 PM Uros Bizjak wrote: > > On Mon, Sep 2, 2019 at 10:13 AM Hongtao Liu wrote: > > > > > which is not the case with core_cost (and similar with skylake_cost): > > > > > > 2, 2, 4,/* cost of moving XMM,YMM,

Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-09-02 Thread Hongtao Liu
On Mon, Sep 2, 2019 at 6:23 PM Richard Biener wrote: > > On Mon, Sep 2, 2019 at 10:13 AM Hongtao Liu wrote: > > > > > which is not the case with core_cost (and similar with skylake_cost): > > > > > > 2, 2, 4,/* cost of moving XMM,YMM,

Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-09-02 Thread Hongtao Liu
> which is not the case with core_cost (and similar with skylake_cost): > > 2, 2, 4,/* cost of moving XMM,YMM,ZMM register */ > {6, 6, 6, 6, 12},/* cost of loading SSE registers >in 32,64,128,256 and 512-bit */ > {6, 6, 6, 6, 12},

Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-30 Thread Hongtao Liu
On Fri, Aug 30, 2019 at 2:18 PM Uros Bizjak wrote: > > On Fri, Aug 30, 2019 at 2:08 AM Hongtao Liu wrote: > > > > On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote: > > > > > > 2019-08-28 Uroš Bizjak > > > > > > * config/i386/i3

Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-29 Thread Hongtao Liu
On Fri, Aug 30, 2019 at 8:10 AM Hongtao Liu wrote: > > On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote: > > > > 2019-08-28 Uroš Bizjak > > > > * config/i386/i386.c (ix86_register_move_cost): Do not > > limit the cost of moves to/from XMM registe

Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-29 Thread Hongtao Liu
On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote: > > 2019-08-28 Uroš Bizjak > > * config/i386/i386.c (ix86_register_move_cost): Do not > limit the cost of moves to/from XMM register to minimum 8. > > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. > > Actually

Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-08-06 Thread Hongtao Liu
his or attach the patch instead. > > >> >> > > >> >> > Index: ChangeLog > > >> >> > === > > >> >> > --- ChangeLog (revision 272668) > > >> >>
