Ping^3
On Tue, Aug 4, 2020 at 4:21 PM Hongtao Liu wrote:
>
> ping ^2
>
> On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote:
> >
> > ping
> >
> > On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote:
> > >
> > > Those two define_insn
Ping^3
On Tue, Aug 4, 2020 at 4:21 PM Hongtao Liu wrote:
>
> ping ^2
>
> On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote:
> >
> > ping
> >
> > On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu wrote:
> > >
> > > Bootstrap is ok, r
On Fri, Aug 7, 2020 at 11:02 PM Kirill Yukhin wrote:
>
> Hello,
>
> On 05 Aug 09:29, Hongtao Liu wrote:
> > On Tue, Aug 4, 2020 at 6:28 PM Kirill Yukhin
> > wrote:
> > >
> > > On 04 Aug 13:26, Kirill Yukhin wrote:
> > > > Could you please
On Tue, Aug 4, 2020 at 6:28 PM Kirill Yukhin wrote:
>
> On 04 Aug 13:26, Kirill Yukhin wrote:
> > Could you please clarify how your patch is related to [1]?
> > I see from the bug that it describes perf issue w.r.t. scalar
> > operations.
>
Sorry for the typo, it's pr96243.
ping ^2
On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote:
>
> ping
>
> On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu wrote:
> >
> > Bootstrap is ok, regression test is ok for i386 backend.
> >
> > gcc/
> > PR target/962
ping ^2
On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote:
>
> ping
>
> On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote:
> >
> > Those two define_insns have the same pattern, and
> > _load_mask would always be matched since it shows up
> > earlier
ping^2
On Mon, Jul 27, 2020 at 5:31 PM Hongtao Liu wrote:
>
> ping
>
> On Mon, Jul 20, 2020 at 4:40 PM Hongtao Liu wrote:
> >
> > Correct PR number in ChangeLog
> > it's pr96243.
> >
> > On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote:
> >
Update patch.
Many avx512 define_insns lack a corresponding memory
broadcast version; I only add *avx512f_mul3_bcst and
*avx512dq_mul3_bcst in this patch.
On Fri, Jul 24, 2020 at 10:37 AM Hongtao Liu wrote:
>
> On Thu, Jul 23, 2020 at 9:53 PM Hongtao Liu wrote:
> >
ping
On Mon, Jul 20, 2020 at 4:40 PM Hongtao Liu wrote:
>
> Correct PR number in ChangeLog
> it's pr96243.
>
> On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote:
> >
> > Hi:
> > For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a
> > boolea
ping
On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu wrote:
>
> Those two define_insns have the same pattern, and
> _load_mask would always be matched since it shows up
> earlier in the md file, and it may lose some opportunity in
> pass_reload since _load_mask only has constraint "
ping
On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu wrote:
>
> Bootstrap is ok, regression test is ok for i386 backend.
>
> gcc/
> PR target/96262
> * config/i386/i386-expand.c
> (ix86_expand_vec_shift_qihi_constant): Refine.
>
> gcc/testsuite/
On Thu, Jul 23, 2020 at 9:53 PM Hongtao Liu wrote:
>
> On Thu, Jul 23, 2020 at 4:39 PM Jan Hubicka wrote:
> >
> > Hello,
> > sorry for taking so long to get to this.
> > > diff --git a/gcc/config/i386/i386-features.c
> > > b/gcc/config/i386/i386-featu
On Thu, Jul 23, 2020 at 4:39 PM Jan Hubicka wrote:
>
> Hello,
> sorry for taking so long to get to this.
> > diff --git a/gcc/config/i386/i386-features.c
> > b/gcc/config/i386/i386-features.c
> > index 535fc7e981d..8f81d101382 100644
> > --- a/gcc/config/i386/i386-features.c
> > +++
Bootstrap is ok, regression test is ok for i386 backend.
gcc/
PR target/96262
* config/i386/i386-expand.c
(ix86_expand_vec_shift_qihi_constant): Refine.
gcc/testsuite/
* gcc.target/i386/pr96262-1.c: New test.
---
gcc/config/i386/i386-expand.c | 6
matched.
2020-07-21 Hongtao Liu
gcc/
PR target/96246
* config/i386/sse.md (_load_mask,
_load_mask): Extend to generate blendm
instructions.
(_blendm, _blendm): Change
define_insn to define_expand.
gcc/testsuite/
* gcc.target/i386/avx512bw-p
Correct PR number in ChangeLog
it's pr96243.
On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu wrote:
>
> Hi:
> For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a
> boolean value and try to do some optimization. But it is not true for
> vector compare, also other place
Hi:
For an rtx like (eq:HI (V8SI 90) (V8SI 91)), cse treats it as a
boolean value and tries to optimize it. That is not true for a
vector compare; other places in the RTL passes hold the same
assumption.
Bootstrap is ok, regression test is ok for i386 backend.
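To illustrate why such a compare is not a boolean, here is a minimal scalar sketch in C. The function name cmpeq_mask_v8si and the check function are hypothetical and are not part of the patch; the model only assumes that each result bit reports the comparison of one element pair.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of (eq:HI (V8SI a) (V8SI b)):
   bit i of the result is set when a[i] == b[i].  */
uint16_t cmpeq_mask_v8si(const int32_t a[8], const int32_t b[8])
{
  uint16_t mask = 0;
  for (int i = 0; i < 8; i++)
    if (a[i] == b[i])
      mask |= (uint16_t)(1u << i);
  return mask;
}

/* The result is a per-element bit mask, so it can take values
   other than 0 and 1.  */
void check_cmpeq_mask(void)
{
  const int32_t a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
  const int32_t b[8] = {1, 0, 3, 0, 5, 0, 7, 0};
  assert(cmpeq_mask_v8si(a, b) == 0x55); /* elements 0, 2, 4, 6 match */
  assert(cmpeq_mask_v8si(a, a) == 0xFF); /* all equal, yet not 1 */
}
```

An optimization that assumes the result is only ever 0 or 1 would be wrong for both cases checked above.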
2020-07-20 Hongtao Liu
ping!
On Fri, Jul 10, 2020 at 5:24 PM Hongtao Liu wrote:
>
> + maintainer.
> cc H.J
>
> On Thu, Jul 9, 2020 at 4:33 PM Hongtao Liu wrote:
> >
> > Hi:
> > For a constant vector having one duplicated value, there's no need
> > to put the whole vect
+ maintainer.
cc H.J
On Thu, Jul 9, 2020 at 4:33 PM Hongtao Liu wrote:
>
> Hi:
> For a constant vector having one duplicated value, there's no need
> to put the whole vector in the constant pool, using embedded broadcast
> instead.
>
> Bootstrap test is Ok, regression
broadcast instead.
2020-07-09 Hongtao Liu
gcc/ChangeLog:
PR target/87767
* config/i386/i386-features.c
(replace_constant_pool_with_broadcast): New function.
(constant_pool_broadcast): Ditto.
(class pass_constant_pool_broadcast): New pass.
(make_pass_constant_pool_broadcast): Ditto.
* config
Sorry, I mistakenly deleted my local copy of
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548174.html, so I am
sending another email.
> What I mean is that op2 is a CONST_INT, which in theory can have any
> HOST_WIDE_INT values.
> By assigning that to unsigned int variable, you are effectively
>
On Mon, Jun 15, 2020 at 9:48 PM Jakub Jelinek wrote:
>
> On Mon, Jun 15, 2020 at 09:29:29PM +0800, Hongtao Liu via Gcc-patches wrote:
> > Basically i "copy" this optimization from clang i386 backend, Refer
> > to pr95524 for details.
> > Bootstrap is ok, regr
Hi:
Basically I "copied" this optimization from the clang i386 backend; refer
to pr95524 for details.
Bootstrap is ok, regression test on i386/x86-64 backend is ok.
gcc/ChangeLog:
PR target/95524
* gcc/config/i386/i386-expand.c
(ix86_expand_vec_shift_qihi_constant): New
Thanks for the review.
On Fri, Jun 12, 2020 at 11:28 AM Jeff Law wrote:
>
> On Fri, 2020-06-05 at 13:46 +0800, Hongtao Liu via Gcc-patches wrote:
> > Hi:
> >
> > +/* Optimize vector MUL generation for V8QI, V16QI and V32QI
> > + under TARGET_AVX512BW
Hi:
+/* Optimize vector MUL generation for V8QI, V16QI and V32QI
+ under TARGET_AVX512BW. i.e. for v16qi a * b, it has
+
+ vpmovzxbw ymm2, xmm0
+ vpmovzxbw ymm3, xmm1
+ vpmullw ymm4, ymm2, ymm3
+ vpmovwb xmm0, ymm4
+
+ it would take less instructions than ix86_expand_vecop_qihi.
+
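The instruction sequence in the comment above can be read as "widen, multiply, narrow". A minimal scalar sketch in C follows; the function names are hypothetical, and the per-instruction comments are a model of the semantics rather than the expander's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the v16qi a * b sequence.  */
void mul_v16qi_model(uint8_t dst[16], const uint8_t a[16],
                     const uint8_t b[16])
{
  for (int i = 0; i < 16; i++)
    {
      uint16_t wa = a[i];                            /* vpmovzxbw: zero-extend */
      uint16_t wb = b[i];                            /* vpmovzxbw: zero-extend */
      uint16_t prod = (uint16_t)((uint32_t)wa * wb); /* vpmullw: low 16 bits */
      dst[i] = (uint8_t)prod;                        /* vpmovwb: truncate */
    }
}

/* The low byte of the 16-bit product equals the byte-wise product.  */
void check_mul_v16qi(void)
{
  uint8_t a[16], b[16], dst[16];
  for (int i = 0; i < 16; i++)
    {
      a[i] = (uint8_t)(200 + i);
      b[i] = 3;
    }
  mul_v16qi_model(dst, a, b);
  for (int i = 0; i < 16; i++)
    assert(dst[i] == (uint8_t)(a[i] * b[i]));
}
```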
On Thu, Jun 4, 2020 at 2:43 PM Richard Biener wrote:
>
> On Thu, 4 Jun 2020, Hongtao Liu wrote:
>
> > Hi Richard:
> > Could you help review this patch.
> > uros said he wouldn't review patches related to x86 vector ISA anymore.
>
> I can't spot anything wr
This patch is to fix the uppercase mode in trunc2; it
should be lowercase for a standard pattern name.
Bootstrap is ok, regression test on i386/x86-64 backend is ok.
gcc/ChangeLog:
* config/i386/sse.md (pmov_dst_3_lower): New mode attribute.
(trunc2): Refine from
trunc2.
Hi Richard:
Could you help review this patch?
Uros said he wouldn't review patches related to x86 vector ISA anymore.
On Wed, Jun 3, 2020 at 10:26 AM Hongtao Liu wrote:
>
> Hi:
> When dest is memory, zero-masking is not valid, only merging-masking
> is available,
>
>
Hi:
When dest is memory, zero-masking is not valid; only merging-masking
is available.
Bootstrap is ok, regression test on i386/x86-64 backend is ok.
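The difference between the two masking forms can be sketched with a scalar model in C. The names and the 8-element width are assumptions chosen for illustration and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

enum { NELTS = 8 }; /* hypothetical element count */

/* Merging-masking: elements whose mask bit is clear keep their old
   value.  This is the only form available for a memory destination.  */
void masked_store_merge(uint32_t *dst, const uint32_t *src, uint8_t mask)
{
  for (int i = 0; i < NELTS; i++)
    if (mask & (1u << i))
      dst[i] = src[i];
}

/* Zero-masking: elements whose mask bit is clear are set to zero.
   Modeled here only for a register-like destination.  */
void masked_move_zero(uint32_t *dst, const uint32_t *src, uint8_t mask)
{
  for (int i = 0; i < NELTS; i++)
    dst[i] = (mask & (1u << i)) ? src[i] : 0;
}

void check_masking(void)
{
  const uint32_t src[NELTS] = {1, 2, 3, 4, 5, 6, 7, 8};
  uint32_t m[NELTS] = {9, 9, 9, 9, 9, 9, 9, 9};
  uint32_t z[NELTS] = {9, 9, 9, 9, 9, 9, 9, 9};
  masked_store_merge(m, src, 0x05);
  masked_move_zero(z, src, 0x05);
  assert(m[0] == 1 && m[1] == 9 && m[2] == 3); /* element 1 preserved */
  assert(z[0] == 1 && z[1] == 0 && z[2] == 3); /* element 1 zeroed */
}
```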
gcc/ChangeLog:
* gcc/config/i386/sse.md (*vcvtps2ph_store):
Refine from *vcvtps2ph_store.
(vcvtps2ph256): Refine
On Thu, May 28, 2020 at 11:37 PM H.J. Lu wrote:
>
> On Thu, May 28, 2020 at 8:00 AM Richard Sandiford
> wrote:
> >
> > "Yangfei (Felix)" writes:
> > > Thanks for reviewing this.
> > > Attached please find the v5 patch.
> > > Note: we also need to modify local variable "mode" once we catch one
On Wed, May 27, 2020 at 8:01 PM Uros Bizjak wrote:
>
> On Wed, May 27, 2020 at 8:02 AM Hongtao Liu wrote:
> >
> > On Mon, May 25, 2020 at 8:41 PM Uros Bizjak wrote:
> > >
> > > On Mon, May 25, 2020 at 2:21 PM Hongtao Liu wrote:
> > > >
> >
On Mon, May 25, 2020 at 8:41 PM Uros Bizjak wrote:
>
> On Mon, May 25, 2020 at 2:21 PM Hongtao Liu wrote:
> >
> > According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit
> > memory_operand instead of 128-bit one which exists in current
> > implem
Great, thanks!
On Tue, May 26, 2020 at 2:08 PM Martin Liška wrote:
>
> On 5/26/20 7:22 AM, Hongtao Liu via Gcc wrote:
> > I committed a separate patch only for ChangeLog files; should I revert
> > it?
>
> Hello.
>
> I've just done it.
>
> Martin
--
BR,
Hongtao
On Tue, May 26, 2020 at 6:49 AM Jakub Jelinek via Gcc-patches
wrote:
>
> Hi!
>
> I've turned the strict mode of Martin Liška's hook changes,
> which means that from now on no commits to the trunk or release branches
> should be changing any ChangeLog files together with the other files,
>
implementation. Also for other vpmov instructions which have
memory_operand narrower than 128bits.
2020-05-25 Hongtao Liu
gcc/ChangeLog
* config/i386/sse.md (*avx512vl_v2div2qi2_store): Refine
size of memory_operand according to Intel SDM.
(avx512vl_v2div2qi2_mask_store): Ditto
On Mon, May 25, 2020 at 8:00 PM Uros Bizjak wrote:
>
> On Mon, May 25, 2020 at 1:56 PM Hongtao Liu wrote:
> >
> > On Mon, May 25, 2020 at 7:36 PM Richard Biener wrote:
> > >
> > > On Mon, 25 May 2020, Uros Bizjak wrote:
> > >
> >
On Mon, May 25, 2020 at 7:36 PM Richard Biener wrote:
>
> On Mon, 25 May 2020, Uros Bizjak wrote:
>
> > On Mon, May 25, 2020 at 8:27 AM Richard Biener wrote:
> > >
> > > On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak
> > > wrote:
> > >
On Mon, May 25, 2020 at 1:55 AM Uros Bizjak wrote:
>
> On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote:
> >
> > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote:
> > >
> > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote:
> > > >
>
On Mon, May 25, 2020 at 1:55 AM Uros Bizjak wrote:
>
> On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote:
> >
> > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote:
> > >
> > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote:
> > > >
>
On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote:
>
> On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote:
> >
> > Hi:
> > This patch fix non-conforming expander for
> > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
> > refer to
Bootstrap is ok, regression test on i386/x86-64 backend is ok.
gcc/ChangeLog
PR target/95125
* config/i386/sse.md (sf2dfmode_lower): New mode attribute.
(trunc2) New expander.
(extend2): Ditto.
gcc/testsuite/ChangeLog
* gcc.target/i386/pr95125-avx.c: New
Hi:
This patch fixes the non-conforming expanders for
floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
refer to PR95211, PR95256.
bootstrap ok, regression test on i386/x86-64 backend is ok.
gcc/ChangeLog:
PR target/95211 PR target/95256
* config/i386/sse.md
On Fri, May 22, 2020 at 2:41 PM Uros Bizjak wrote:
>
> On Fri, May 22, 2020 at 6:55 AM Hongtao Liu wrote:
> >
> > On Thu, May 21, 2020 at 7:18 PM Uros Bizjak wrote:
> > >
> > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu wrote:
> > > >
>
On Thu, May 21, 2020 at 7:18 PM Uros Bizjak wrote:
>
> On Thu, May 21, 2020 at 7:35 AM Hongtao Liu wrote:
> >
> > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak wrote:
> > >
> > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote:
> > > >
> &
On Wed, May 20, 2020 at 11:43 PM Uros Bizjak wrote:
>
> On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote:
> >
> > Hi:
> > Bootstrap is ok, regression test on i386/x86-64 backend is ok.
> >
> > gcc/ChangeLog:
> > PR target/92658
> >
Hi:
Bootstrap is ok, regression test on i386/x86-64 backend is ok.
gcc/ChangeLog:
PR target/92658
* config/i386/sse.md
(trunc2, truncv32hiv32qi2,
trunc2): New expander.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr92658-avx512f.c: New test.
*
Document operand modifiers that are available in asm statements but
missing from the documentation.
| Modifier | Description | Available in asm stmt | Present in documentation |
| --- | --- | --- | --- |
| L,W,B,Q,S,T | Print the opcode suffix for the specified operand size. | Available | No |
| C |
Hi:
Test is ok for funcspec-5.c, funcspec-6.c.
gcc/testsuite/ChangeLog
* gcc.target/i386/funcspec-56.inc: Add enqcmd, avx512bf16,
avx512vp2intersect.
gcc/testsuite/gcc.target/i386/funcspec-56.inc | 6 ++
1 file changed, 6 insertions(+)
diff --git
On Mon, May 4, 2020 at 1:17 AM Uros Bizjak wrote:
>
> On Wed, Apr 1, 2020 at 9:23 AM Hongtao Liu wrote:
> >
> > Hi:
> > This patch is about to enable GCC support for SERIALIZE which would
> > be in GLC. There's only 1 instruction: SERIALIZE, more details
On Mon, May 4, 2020 at 12:58 AM Uros Bizjak wrote:
>
> The part above is OK, but you are missing support for
> __attribute__((__target__("..."))). Please see how for example -msgx
> is handled in isa2_opts in i386-options.c and in
> gcc.target/i386/funcspec-56.h test source.
>
> Please repost the
On Wed, Apr 1, 2020 at 3:32 PM Hongtao Liu wrote:
>
> Hi:
> This patch is about to enable GCC support for TSXLDTRK which would
> be in GLC. There's only 2 instructions: XRESLDTRK, XSUSLDTRK, more
> details please
> refer to
> https://software.intel.com/sites/defaul
Hi:
This patch enables GCC support for TSXLDTRK, which will
be in GLC. There are only 2 instructions: XRESLDTRK and XSUSLDTRK. For more
details, please
refer to
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
I
Date: Wed, 4 Mar 2020 14:08:40 +0800
Subject: [PATCH] Enable GCC support for SERIALIZE
2020-03-04 Hongtao Liu
2020-03-04 Wei Xiao
gcc/Changelog:
* gcc/common/config/i386/i386-common.c (OPTION_MASK_ISA2_SERIALIZE_SET,
OPTION_MASK_ISA2_SERIALIZE_UNSET): New macros.
(ix86_handle_option): Handle
On Tue, Feb 18, 2020 at 7:00 PM Hongtao Liu wrote:
>
> On Tue, Feb 18, 2020 at 4:24 PM Uros Bizjak wrote:
> >
> >
> >
> > On Thu, Feb 13, 2020 at 9:39 AM Uros Bizjak wrote:
> >>
> >> > Changelog
> >> > gcc/
> >> >
On Tue, Feb 18, 2020 at 4:24 PM Uros Bizjak wrote:
>
>
>
> On Thu, Feb 13, 2020 at 9:39 AM Uros Bizjak wrote:
>>
>> > Changelog
>> > gcc/
>> >* config/i386/avx512vbmi2intrin.h
>> >(_mm512_[,mask_,maskz_]shrdi_epi16,
>> >_mm512_[,mask_,maskz_]shrdi_epi32,
>> >
Done.
On Fri, Feb 14, 2020 at 7:16 PM Uros Bizjak wrote:
>
> On Fri, Feb 14, 2020 at 8:06 AM Uros Bizjak wrote:
> >
> > On Fri, Feb 14, 2020 at 7:03 AM Hongtao Liu wrote:
> > >
> > > On Thu, Feb 13, 2020 at 5:31 PM Hongtao Liu wrote:
> > > >
>
On Thu, Feb 13, 2020 at 5:31 PM Hongtao Liu wrote:
>
> On Thu, Feb 13, 2020 at 5:12 PM Uros Bizjak wrote:
> >
> > On Thu, Feb 13, 2020 at 9:53 AM Jakub Jelinek wrote:
> > >
> > > On Thu, Feb 13, 2020 at 09:39:05AM +0100, Uros Bizjak wrot
On Thu, Feb 13, 2020 at 5:12 PM Uros Bizjak wrote:
>
> On Thu, Feb 13, 2020 at 9:53 AM Jakub Jelinek wrote:
> >
> > On Thu, Feb 13, 2020 at 09:39:05AM +0100, Uros Bizjak wrote:
> > > > Changelog
> > > > gcc/
> > > >* config/i386/avx512vbmi2intrin.h
> > > >
Hi
As mentioned in PR93724, several intrinsic macros lack a closing
parenthesis. These macros are only used with the -O0 option, and the
unit tests currently use -O2, so they are not covered.
Bootstrap ok, regression tests on i386/x86_64 are ok.
Ok for trunk?
Changelog
gcc/
*
On Wed, Dec 18, 2019 at 4:26 PM Segher Boessenkool
wrote:
>
> On Wed, Dec 18, 2019 at 10:37:11AM +0800, Hongtao Liu wrote:
> > Hi:
> > This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
> > power of 2 and D mod C == 0.
> > bootstrap an
On Wed, Dec 18, 2019 at 10:50 AM Andrew Pinski wrote:
>
> On Tue, Dec 17, 2019 at 6:33 PM Hongtao Liu wrote:
> >
> > Hi:
> > This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
> > power of 2 and D mod C == 0.
> > bootstrap and make
Hi:
This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
power of 2 and D mod C == 0.
bootstrap and make check is ok.
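The simplification stated above can be checked with a small sketch in C. The helper names are hypothetical, and wrapping unsigned 32-bit arithmetic is an assumption made here so that the two forms can be compared without overflow concerns; the snippet does not say which types the match.pd pattern covers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when c is a nonzero power of two.  */
bool is_pow2(uint32_t c)
{
  return c != 0 && (c & (c - 1)) == 0;
}

/* The rewrite is only claimed for C a power of 2 and D mod C == 0.  */
bool rewrite_applies(uint32_t c, uint32_t d)
{
  return is_pow2(c) && d % c == 0;
}

/* Original form: A * C + (-D), in wrapping unsigned arithmetic.  */
uint32_t mul_then_sub(uint32_t a, uint32_t c, uint32_t d)
{
  return a * c + (0u - d);
}

/* Rewritten form: (A - D/C) * C.  */
uint32_t sub_then_mul(uint32_t a, uint32_t c, uint32_t d)
{
  return (a - d / c) * c;
}

/* Compare both forms on inputs where the precondition holds.  */
void check_rewrite(void)
{
  assert(rewrite_applies(8u, 16u));
  assert(mul_then_sub(10u, 8u, 16u) == sub_then_mul(10u, 8u, 16u));
  assert(mul_then_sub(1u, 4u, 8u) == sub_then_mul(1u, 4u, 8u));
  assert(!rewrite_applies(8u, 12u)); /* 12 mod 8 != 0 */
}
```

With A = 10, C = 8, D = 16, both forms give 64: 10 * 8 - 16 on one side and (10 - 2) * 8 on the other.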
changelog
gcc/
* gcc/match.pd (A * C + (-D) = (A - D/C) * C. when C is a
power of 2 and D mod C == 0): Add new simplification.
gcc/testsuite
Hi:
This patch adds a tune option for integer mask cmov. Some
targets have both integer mask registers and SSE mask registers;
this tune indicates that the integer one should be used. Currently it is on by default for
m_CORE_AVX512.
Bootstrap is ok, regression test on i386/x86_64 backends is ok.
ok
On Wed, Dec 11, 2019 at 3:54 PM Jakub Jelinek wrote:
>
> On Wed, Dec 11, 2019 at 09:55:24AM +0800, Hongtao Liu wrote:
> > Changelog
> > gcc/
> > PR target/92865
> > * config/i386/i386-expand.c (ix86_valid_mask_cmp_mode): Enable
> > integer mask cmov
On Tue, Dec 10, 2019 at 4:11 PM Jakub Jelinek wrote:
>
> On Tue, Dec 10, 2019 at 01:47:50PM +0800, Hongtao Liu wrote:
> > This patch is to enable integer mask cmp/cmov under AVX512F even
> > with TARGET_XOP .
> > Bootstrap and regression test on i386/x86_64 backend
Hi Jakub:
This patch is to enable integer mask cmp/cmov under AVX512F even
with TARGET_XOP.
Bootstrap and regression test on i386/x86_64 backend is ok.
Changelog:
PR target/92865
* gcc/config/i386/i386-expand.c (ix86_valid_mask_cmp_mode): Enable
integer mask cmov when available
Hi Uros:
This patch renames OPTION_MASK_ISA_$target_[SET,UNSET, ]
to OPTION_MASK_ISA2_$target_[SET,UNSET, ] for those targets setting
x_ix86_isa_flags2.
The target list is as below:
-
static struct ix86_target_opts isa2_opts[] =
{
  { "-mcx16",
On Thu, Dec 5, 2019 at 4:03 PM Jakub Jelinek wrote:
>
> On Thu, Dec 05, 2019 at 09:56:46AM +0800, Hongtao Liu wrote:
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > + /* Using vector move with mask register. */
> > +
On Wed, Dec 4, 2019 at 4:22 PM Jakub Jelinek wrote:
>
> On Wed, Dec 04, 2019 at 10:07:05AM +0800, Hongtao Liu wrote:
> > Changelog
> > gcc/
> > PR target/92686
> > * config/i386/sse.md
> > (*_cmp3,
> > *_cmp3,
> > *_uc
Hi:
Currently for VCOND_EXPR, integer mask operations are only available
for 512-bit vectors, but since the mask register is tied to the ISA, not
the vector size, under avx512f we can also have 128/256-bit vector
conditional moves. My local tests show there's no boost frequency penalty
for using integer mask
Hi Jakub:
VF is used to differentiate AVX512F/AVX/SSE, but there is a
TARGET_AVX512F condition in avx512f_maskcmp3, so it must be a typo
and should be VF_AVX512VL instead.
Bootstrap and regression test on i386/x86_64 backend is ok.
OK for trunk?
diff --git a/gcc/config/i386/sse.md
On Sat, Nov 16, 2019 at 7:27 AM Jeff Law wrote:
>
> On 11/14/19 5:21 AM, Richard Biener wrote:
> > On Tue, Nov 12, 2019 at 11:35 AM Hongtao Liu wrote:
> >>
> >> Hi:
> >> As mentioned in https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00832.html
> >
On Tue, Nov 12, 2019 at 4:41 PM Richard Biener
wrote:
>
> On Tue, Nov 12, 2019 at 9:29 AM Hongtao Liu wrote:
> >
> > On Tue, Nov 12, 2019 at 4:19 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote:
> > >
Hi:
As mentioned in https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00832.html
> So yes, it's poorly named. A preparatory patch to clean this up
> (and maybe split it into TARGET_AVX256_SPLIT_REGS and TARGET_AVX128_OPTIMAL)
> would be nice.
Bootstrap and regression test for i386 backend is ok.
On Tue, Nov 12, 2019 at 4:29 PM Richard Biener
wrote:
>
> On Tue, Nov 12, 2019 at 9:19 AM Richard Biener
> wrote:
> >
> > On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote:
> > >
> > > Hi:
> > > This patch is about to set X86_TUNE_AVX128_OPTIMA
On Tue, Nov 12, 2019 at 4:19 PM Richard Biener
wrote:
>
> On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote:
> >
> > Hi:
> > This patch is about to set X86_TUNE_AVX128_OPTIMAL as default for
> > all AVX target because we found there's still performanc
Hi:
This patch sets X86_TUNE_AVX128_OPTIMAL as the default for
all AVX targets, because we found there is still a performance gap between
128-bit auto-vectorization and 256-bit auto-vectorization even with
the epilogue vectorized.
The performance influence of setting avx128_optimal as default on
Ping!
On Sat, Nov 2, 2019 at 9:38 PM Hongtao Liu wrote:
>
> Hi Jakub:
> Could you help review this patch?
>
> PS: This patch is related to vectors (avx512f), and Uros
> mentioned before that he has no intention to maintain avx512f.
>
> On Fri, Nov 1, 2019 at 9:
Hi Jakub:
Could you help review this patch?
PS: This patch is related to vectors (avx512f), and Uros
mentioned before that he has no intention to maintain avx512f.
On Fri, Nov 1, 2019 at 9:12 AM Hongtao Liu wrote:
>
> Hi uros:
> This patch is about to fix inefficie
Hi Uros:
This patch fixes an inefficient vector constructor.
Currently in ix86_expand_vector_init_concat, vectors are initialized
2 elements at a time, which can miss optimization opportunities like
pr92295.
Bootstrap and i386 regression tests are ok.
Ok for trunk?
Changelog
gcc/
> BTW: Please also note that there is no need to use or operand
> mode override in scalar insn templates for intel asm dialect when
> operand already has a scalar mode.
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01868.html
This patch is to remove redundant when operand already has a scalar
> Looking into sse.md, there is a lot of inconsistencies in existing *vm
> patterns w.r.t. operand constraints. Unfortunately, these were copied
> into proposed patterns. One example is existing
>
> (define_insn "_vmsqrt2"
> [(set (match_operand:VF_128 0 "register_operand" "=x,v")
>
Update patch.
On Fri, Oct 25, 2019 at 4:01 PM Uros Bizjak wrote:
>
> On Fri, Oct 25, 2019 at 7:55 AM Hongtao Liu wrote:
> >
> > On Fri, Oct 25, 2019 at 1:23 PM Hongtao Liu wrote:
> > >
> > > On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote:
> > &
On Fri, Oct 25, 2019 at 1:23 PM Hongtao Liu wrote:
>
> On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote:
> >
> > On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu wrote:
> > >
> > > Update patch:
> > > Add m constraint to define_insn (sse_1_round > &g
On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote:
>
> On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu wrote:
> >
> > Update patch:
> > Add m constraint to define_insn (sse_1_round > *sse_1_round > when under sse4 but not avx512f.
>
> It looks to me that the o
Update patch:
Add m constraint to define_insn (sse_1_round):
Change constraint x to xm
since vround support memory operand.
* (*sse4_1_round): Ditto.
Bootstrap and regression test ok.
On Wed, Oct 23, 2019 at 9:56 AM Hongtao Liu wrote:
>
> Hi uros:
> This patch fi
Hi uros:
This patch fixes false dependence of scalar operations
vrcp/vsqrt/vrsqrt/vrndscale.
Bootstrap ok, regression test on i386/x86 ok.
It does something like this:
-
For scalar instructions with both xmm operands:
op %xmmN,%xmmQ,%xmmQ -> op %xmmN, %xmmN, %xmmQ
for scalar
On Mon, Oct 21, 2019 at 1:15 AM Gerald Pfeifer wrote:
>
> On Fri, 11 Oct 2019, liuho...@gcc.gnu.org wrote:
> > commit 63fbcfeaf27d9dd2083ccbd34bdff8fccb63949c
> > Author: liuhongt
> > Date: Fri Oct 11 14:27:47 2019 +0800
> >
> > Update gcc10 changes with new intel ISA.
>
> I just applied
On Sat, Oct 12, 2019 at 4:15 PM Jakub Jelinek wrote:
>
> Hi!
>
> > gcc/
> > * config/i386/avx512fintrin.h (_mm_mask_roundscale_ss,
> > _mm_maskz_roundscale_ss, _mm_maskz_roundscale_round_ss,
> > _mm_maskz_roundscale_round_ss, _mm_mask_roundscale_sd,
> >
Hi:
This patch enables the following missing avx512f intrinsics:
_mm_mask_roundscale_sd
_mm_mask_roundscale_round_sd
_mm_maskz_roundscale_sd
_mm_maskz_roundscale_round_sd
_mm_mask_roundscale_ss
_mm_mask_roundscale_round_ss
_mm_maskz_roundscale_ss
_mm_maskz_roundscale_round_ss
Bootstrap ok,
Hi Uros:
This patch extends the rpad pass to handle AVX512F vcvtusi2ss/vcvtusi2sd.
538.image_r would be improved by 4% in a single-copy run on a Skylake
workstation.
Bootstrap ok. Regression test for i386/x86 backend ok.
Ok for trunk?
Changelog
gcc/
* config/i386/i386.md
On Wed, Sep 4, 2019 at 9:44 AM Hongtao Liu wrote:
>
> On Wed, Sep 4, 2019 at 12:50 AM Uros Bizjak wrote:
> >
> > On Tue, Sep 3, 2019 at 1:33 PM Richard Biener
> > wrote:
> >
> > > > > Note:
> > > > > Removing limit of cost would in
On Wed, Sep 4, 2019 at 12:50 AM Uros Bizjak wrote:
>
> On Tue, Sep 3, 2019 at 1:33 PM Richard Biener
> wrote:
>
> > > > Note:
> > > > Removing limit of cost would introduce lots of regressions in SPEC2017
> > > > as follow
> > > >
> > > > 531.deepsjeng_r -7.18%
On Mon, Sep 2, 2019 at 4:41 PM Uros Bizjak wrote:
>
> On Mon, Sep 2, 2019 at 10:13 AM Hongtao Liu wrote:
> >
> > > which is not the case with core_cost (and similar with skylake_cost):
> > >
> > > 2, 2, 4,/* cost of moving XMM,YMM,
On Mon, Sep 2, 2019 at 6:23 PM Richard Biener
wrote:
>
> On Mon, Sep 2, 2019 at 10:13 AM Hongtao Liu wrote:
> >
> > > which is not the case with core_cost (and similar with skylake_cost):
> > >
> > > 2, 2, 4,/* cost of moving XMM,YMM,
> which is not the case with core_cost (and similar with skylake_cost):
>
> 2, 2, 4,/* cost of moving XMM,YMM,ZMM register */
> {6, 6, 6, 6, 12},/* cost of loading SSE registers
>in 32,64,128,256 and 512-bit */
> {6, 6, 6, 6, 12},
On Fri, Aug 30, 2019 at 2:18 PM Uros Bizjak wrote:
>
> On Fri, Aug 30, 2019 at 2:08 AM Hongtao Liu wrote:
> >
> > On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote:
> > >
> > > 2019-08-28 Uroš Bizjak
> > >
> > > * config/i386/i3
On Fri, Aug 30, 2019 at 8:10 AM Hongtao Liu wrote:
>
> On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote:
> >
> > 2019-08-28 Uroš Bizjak
> >
> > * config/i386/i386.c (ix86_register_move_cost): Do not
> > limit the cost of moves to/from XMM registe
On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak wrote:
>
> 2019-08-28 Uroš Bizjak
>
> * config/i386/i386.c (ix86_register_move_cost): Do not
> limit the cost of moves to/from XMM register to minimum 8.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> Actually
his or attach the patch instead.
> > >> >>
> > >> >> > Index: ChangeLog
> > >> >> > ===
> > >> >> > --- ChangeLog (revision 272668)
> > >> >>