Hi all,
While backporting my doc patch from GCC trunk today, I found that the
wwwdocs entry for -march=gracemont, added in GCC 14, is missing. This patch
adds it.
Ok for wwwdocs trunk?
Thx,
Haochen
---
htdocs/gcc-14/changes.html | 4
1 file changed, 4 insertions(+)
diff --git
Hi all,
The AVX10.2 convert tests were all missing mask tests; this patch adds
them.
Tested on SDE with an assembler that supports the corresponding insts. Ok for trunk?
Thx,
Haochen
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Enhance mas
Hi all,
The AVX10.2 convert tests were all missing mask tests; this patch adds
them.
Tested on SDE with an assembler that supports these insts. Ok for trunk?
Thx,
Haochen
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Enhance mask test.
Hi all,
Commit r15-3594 fixed the MASK_TYPE bugs in the AVX10.2 testcases,
but we missed the following four.
The tests do not FAIL because the binutils part has not been merged
yet, so they are reported as UNSUPPORTED. But avx512f-mask-type.h
needs to be included, otherwise, it will be c
Hi all,
Since r15-3539, requests have been coming in to document the other alias
options. This patch adds all of them, including corei7, corei7-avx,
core-avx-i, core-avx2, atom, slm, gracemont and emeraldrapids.
Also in the patch, I reordered that part of documentation, currently all
the
Hi all,
This patch adds the recently aliased CPU names to the documentation
for clarity.
Ready to push for trunk and backport to GCC14 and part of the patch to
GCC13 as an obvious fix if there is no objection.
Thx,
Haochen
gcc/ChangeLog:
PR target/116617
* doc/invoke.texi: Add meteo
Hi all,
In avx512f-mask-type.h, SIZE must be defined for MASK_TYPE to be
defined correctly. Fix the testcases where SIZE is not defined
before avx512f-mask-type.h is included.
Note that for convert intrins in AVX10.2, they will need more
modifications due to the current tests did not in
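
The include-order dependency can be illustrated with a minimal sketch. It
does not reproduce avx512f-mask-type.h: the typedef names, the thresholds and
the SIZE value are hypothetical. It only relies on the standard preprocessor
rule that an identifier that is not defined evaluates to 0 inside #if, so a
SIZE-based selection silently takes the smallest branch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a header that picks a mask type from SIZE.
   SIZE is deliberately NOT defined yet, so the preprocessor treats it
   as 0 and the first branch is taken.  */
#if SIZE <= 8
typedef uint8_t demo_mask_before;
#elif SIZE <= 16
typedef uint16_t demo_mask_before;
#elif SIZE <= 32
typedef uint32_t demo_mask_before;
#else
typedef uint64_t demo_mask_before;
#endif

/* Define SIZE first (64 is an assumed value for illustration) and
   repeat exactly the same selection.  */
#define SIZE 64
#if SIZE <= 8
typedef uint8_t demo_mask_after;
#elif SIZE <= 16
typedef uint16_t demo_mask_after;
#elif SIZE <= 32
typedef uint32_t demo_mask_after;
#else
typedef uint64_t demo_mask_after;
#endif

/* Report the width in bytes of the type each selection produced.  */
size_t demo_mask_before_bytes(void) { return sizeof(demo_mask_before); }
size_t demo_mask_after_bytes(void) { return sizeof(demo_mask_after); }
```

Compiling with -Wundef makes the first selection visible as a warning, which
is one way to catch this class of mistake in a testcase.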
Hi all,
The non-optimized version of the intrin has a typo in its mask type, which
causes the high bits of __mmask32 to be unexpectedly zeroed.
The test does not fail under O0 with current 1b since the testcase is
wrong. We need to include avx512f-mask-type.h after SIZE is defined, or
it will always be __mma
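
As a rough illustration of the symptom only: the sketch below uses plain
fixed-width integers as hypothetical stand-ins for the mask types (it is not
the intrin from the patch) and shows how a cast through a narrower type
clears the upper bits of a 32-bit mask.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a 16-bit and a 32-bit mask type.  */
typedef uint16_t demo_mask16;
typedef uint32_t demo_mask32;

/* Wrong: the cast uses the narrower type, so bits 16..31 are dropped.  */
uint32_t demo_pass_mask_wrong(uint32_t m) { return (demo_mask16) m; }

/* Right: the cast matches the width of the mask being passed.  */
uint32_t demo_pass_mask_right(uint32_t m) { return (demo_mask32) m; }
```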
Hi all,
Sorry for the noise: I mistyped gcc-patches as gcc-patchs, so I am resending
the patch.
This patch adds documentation for recent updates in the x86-64 backend.
Ok for wwwdocs trunk?
Thx,
Haochen
---
Mention AVX10.2 support and Xeon Phi removal in GCC 15.
---
htdocs/gcc-15/changes.html
From: Levy Hsu
AVX10.2 introduces several non-exception instructions for BF16 vectors.
Enable vectorized BF add/sub/mul/div operations by supporting the standard
optabs for them.
gcc/ChangeLog:
* config/i386/sse.md (div3): New expander for BFmode div.
(VF_BHSD): New mode iterator with
From: Levy Hsu
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_use_mask_cmp_p): Add BFmode
for int mask cmp.
* config/i386/sse.md (vec_cmp): New
vec_cmp expand for VBF modes.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-bf-vector-cmpp-1.c:
From: Levy Hsu
gcc/ChangeLog:
* config/i386/sse.md
(3): New define expand pattern for BF smaxmin.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-bf-vector-smaxmin-1.c: New test.
* gcc.target/i386/avx10_2-bf-vector-smaxmin-1.c: New test.
---
gcc/config/i3
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_fp_compare): Add UNSPEC to
support the optimization.
* config/i386/i386.cc (ix86_fp_compare_code_to_integer): Add NE/EQ.
* config/i386/i386.md (*cmpx): New define_insn.
(*cmpxhf): Di
From: Levy Hsu
gcc/ChangeLog:
* config/i386/sse.md: Expand VF2H to VF2HB with VBF modes.
---
gcc/config/i386/sse.md | 13 -
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index b374783429c..2de592a9c8f 100644
---
From: Levy Hsu
gcc/ChangeLog:
* config/i386/sse.md: Add V8BF/V16BF/V32BF to mode iterator FMAMODEM.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-bf-vector-fma-1.c: New test.
* gcc.target/i386/avx10_2-bf-vector-fma-1.c: New test.
---
gcc/config/i386/sse.md
Hi all,
I committed the AVX10.2 new instruction patches to trunk a few hours
ago. The next and final part of the AVX10.2 upstreaming is to optimize code
with the new AVX10.2 instructions.
This patch series contains the following optimizations:
- VNNI instruction auto-vectorization (PATCH 1).
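
For context, the kind of loop that dot-product (VNNI) auto-vectorization
targets is a reduction that multiplies narrow integers and accumulates into a
wider one. A plain C sketch follows; the function name is hypothetical, and
whether a given compiler actually vectorizes it depends on the target options
and optimization level.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of unsigned 8-bit and signed 8-bit elements accumulated
   into a 32-bit sum.  Each product is widened before the addition.  */
int32_t dot_u8s8(const uint8_t *a, const int8_t *b, size_t n)
{
  int32_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += (int32_t) a[i] * (int32_t) b[i];
  return sum;
}
```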
From: "Hu, Lin1"
Currently, for !__builtin_isunordered (a, b) && (a != b), GCC emits:
        ucomiss %xmm1, %xmm0
        movl    $1, %ecx
        setp    %dl
        setnp   %al
        cmovne  %ecx, %edx
        andl    %edx, %eax
        movzbl  %al, %eax
However, the following sequence is better:
        xorl    %eax, %eax
        ucomiss %xmm1, %xmm0
        setne   %al
gcc/
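
For reference, a minimal C function containing the expression discussed
above. The function name is hypothetical; the assembly a compiler produces
for it depends on its version and options.

```c
#include <math.h>

/* Returns 1 only when a and b are ordered (neither is NaN) and differ.  */
int ordered_and_not_equal(float a, float b)
{
  return !__builtin_isunordered (a, b) && (a != b);
}
```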
gcc/ChangeLog:
* config/i386/sse.md (VI1_AVX512VNNIBW): New.
(VI2_AVX10_2): Ditto.
(sdot_prod): Add AVX10.2
to auto vectorize and combine 512 bit part.
(udot_prod): Ditto.
(sdot_prodv64qi): Removed.
(udot_prodv64qi): Ditto.
(usdot_pro
in): Ditto.
(ix86_expand_builtin): Change function call.
* config/i386/i386.md (UNSPEC_COMX): New unspec.
* config/i386/sse.md
(avx10_2_vcomx): New.
(_comi): Add HFmode.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-compare-1.c: New test.
Co-authored-by: Hao
Since BF8 and FP16 have the same number of exponent bits, the type conversion
between them is just a cast of the fraction part. We will use a sequence
of instructions instead of new instructions to do that. For convenience,
intrins are also provided.
gcc/ChangeLog:
* config/i386/avx10_2-512convertintri
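
A scalar sketch of why the widening direction needs no new instruction. It
assumes BF8 is laid out as 1 sign bit, 5 exponent bits and 2 fraction bits,
which matches the "same exponent bits" remark above; FP16 has 1 sign, 5
exponent and 10 fraction bits. The function name is hypothetical and this is
not the instruction sequence used by the patch.

```c
#include <stdint.h>

/* Widen a BF8 bit pattern to an FP16 bit pattern.  Sign and exponent keep
   their positions relative to the top bit, and the 2 fraction bits become
   the top of the 10-bit fraction, so the whole conversion is a shift.  */
uint16_t bf8_bits_to_fp16_bits(uint8_t x)
{
  return (uint16_t) ((uint16_t) x << 8);
}
```

For example, 0x3C (1.0 under the assumed BF8 layout) becomes 0x3C00, which is
1.0 in FP16.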
From: "Zhang, Jun"
gcc/ChangeLog:
* config/config.gcc: Add avx10_2copyintrin.h.
* config/i386/i386.md (avx10_2): New isa attribute.
* config/i386/immintrin.h: Include avx10_2copyintrin.h.
* config/i386/sse.md
(sse_movss_): Add new constraints to handle AVX
gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto.
Co-authored-by: Lin Hu
Co-authored-by: Haochen Jiang
---
gcc/config.gcc|3 +-
gcc/config/i3
From: konglin1
gcc/ChangeLog:
* config/i386/avx10_2-512bf16intrin.h: Add new intrinsics.
* config/i386/avx10_2bf16intrin.h: Ditto.
* config/i386/i386-builtin-types.def : Add new DEF_FUNCTION_TYPE
for new type.
* config/i386/i386-builtin.def (BDESC): Add ne
From: konglin1
gcc/ChangeLog:
* config.gcc: Add avx10_2-512bf16intrin.h and avx10_2bf16intrin.h.
* config/i386/i386-builtin-types.def : Add new
DEF_FUNCTION_TYPE for V32BF_FTYPE_V32BF_V32BF,
V16BF_FTYPE_V16BF_V16BF, V8BF_FTYPE_V8BF_V8BF,
V8BF_FTYPE_V8BF_V8
From: "Hu, Lin1"
gcc/ChangeLog:
* config.gcc: Add avx10_2satcvtintrin.h and
avx10_2-512satcvtintrin.h.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (V8HI, V8BF, V8HI, UQI),
(V16HI, V16BF, V16HI, UHI), (V32HI, V32BF, V32HI, USI),
(V16
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/sse.md
(avx10_2_vcvttpd2dqs):
New.
(avx10_2_vcvttpd2qqs):
Ditto.
(avx10_2_vcvttps2dqs):
Ditto.
(avx10_2_vcvttps2qqs):
: Ditto.
* gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto.
Co-authored-by: Haochen Jiang
---
gcc/config.gcc| 3 +-
gcc/config/i386/avx10_2-512mediaintrin.h | 234 +++
gcc/config/i386
gcc/ChangeLog:
* config/i386/avx10_2-512mediaintrin.h: Add new intrins.
* config/i386/avx10_2mediaintrin.h: Ditto.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-builtins.cc (def_builtin): Handle shared
builtins between AVXVNNIINT16 and
Hi all,
The AVX10.2 ymm rounding patches were merged into trunk about
6 hours ago. As mentioned before, the next step is AVX10.2 new
instruction support.
This patch series can be divided into three parts.
The first patch will refactor m512-check.h under testsuite to reuse
AVX-512 helper fun
After the introduction of AVX10, we still want to use the AVX512 helper
functions to avoid duplicate code. To reuse them, we need some refactoring
to make sure each function definition happens under the correct ISA, which
avoids ABI warnings.
gcc/testsuite/ChangeLog:
* gcc.target/i386/m512-check.h: Wr
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add new builtin test.
* gcc.target/i386/sse-13.c: Ditto.
* g
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin):
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/sse.md:
(reducep):
Add condition check.
(_rndscale): Ditto.
gcc/testsuite/ChangeLog:
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/sse.md:
(_scalef): Add condition check.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c:
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/sse.md:
(_fmsub__mask): Add conditi
From: "Hu, Lin1"
gcc/ChangeLog:
* config.gcc: Add avx10_2roundingintrin.h.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add new builtin test.
* gcc.target/i386/sse-13.c: Ditto.
* g
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin):
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin):
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add new builtin test.
* gcc.target/i386/sse-13.c: Ditto.
* g
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin):
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/sse.md:
(_fnmsub__mask3): Add condition check.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c:
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/sse.md
(unspec_fix_truncv8sfv8si2):
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin):
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: Add new intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_built
Hi all,
The initial patch for AVX10.2 has been merged this week.
For the upcoming patches, we will first upstream the ymm rounding control part.
In this part, ALL the AVX512 instructions with 512-bit rounding
control will also have 256-bit rounding control in AVX10.2.
For clarity, the
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/sse.md:
(_fmaddsub__mask): Add cond
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin):
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/sse.md
(avx512fp16_fix_trunc2):
Ex
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin):
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: Add new intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_built
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/sse.md:
(_fmadd__mask3): Add condition check.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c:
From: "Hu, Lin1"
gcc/ChangeLog:
* config/i386/avx10_2roundingintrin.h: New intrins.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin):
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features): Handle
avx10.2.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVX10_2_256_SET): New.
(OPTION_MASK_ISA2_AVX10_2_512_SET): Ditto.
(OPTION_MASK_ISA2_AVX10_1_256_UNSET):
Hi all,
The AVX10.2 technical details were published on July 31st at the
following link:
https://cdrdv2.intel.com/v1/dl/getContent/828965
The new features and instructions can be divided into two parts:
one is ymm rounding control, the other is the new instructions.
In the following week
Hi all,
GCC 13 and GCC 12 have no _mm_avx512_setzero_ps/d, since they were introduced
in GCC 14.
Fix the backport issue as obvious; see:
https://gcc.gnu.org/pipermail/gcc-regression/2024-July/080385.html
Thx,
Haochen
gcc/ChangeLog:
* config/i386/avx512dqintrin.h (_mm_reduce_round_sd): Use
Hi all,
I added related O0 testcase in this patch.
Ok for trunk and backport to GCC 14 and GCC 13?
Thx,
Haochen
---
Changes in v2: Add testcases.
---
Under -O0, with the "newly" introduced intrins, the variable will be
turned into a mem instead of the original symbol_ref. The compiler will
th
Hi all,
I have added related testcases into the patch.
Ok for trunk and backport to GCC 14, GCC 13 and GCC 12?
Thx,
Haochen
---
Changes in v2: Add related testcases
---
There are several typos in the AVX512 intrin macro definitions. Correct them
to fix errors when compiling with -O0.
gcc/ChangeLog
Hi all,
Under -O0, with the "newly" introduced intrins, the variable will be
turned into a mem instead of the original symbol_ref. The compiler will
then treat the operand as invalid and turn the operation into a nop, which
is not expected. Use a macro for the non-optimized case to keep the variable as a
symbol_re
Hi all,
There are several typos in the AVX512 intrin macro definitions. They
eventually result in errors with -O0. This patch fixes that.
Bootstrapped on x86-64-pc-linux-gnu. Ok for trunk and backport to GCC14,
GCC 13 and GCC 12?
Thx,
Haochen
gcc/ChangeLog:
* config/i386/avx512dqintrin.
Hi all,
I tested with %a and it works. Therefore I suppose it is a better solution.
Bootstrapped and regtested on x86-64-pc-linux-gnu. Ok for trunk and backport
to GCC 13 and 14?
Thx,
Haochen
---
Changes in v2: Use %a in pattern
---
For prefetchi instructions, RIP-relative address is explici
Hi all,
For prefetchi instructions, a RIP-relative address is explicitly specified
for the operand, and the assembler enforces that rule strictly. This makes
an instruction like:
        prefetchit0 bar
illegal for the assembler, although it should be a common usage of prefetchi.
Explicitly add (%rip) after funct
Hi all,
I revised the patch according to the comment.
Ok for trunk?
Thx,
Haochen
---
Changes in v2: Add suffix for mov to make the test more robust.
---
Compile tests should generate valid asm except for special purposes.
Fix the compile test that generates invalid asm.
gcc/testsuite
Hi all,
Compile tests should generate valid asm except for special purposes.
Fix the compile test that generates invalid asm.
Regtested on x86-64-pc-linux-gnu. Ok for trunk?
Thx,
Haochen
gcc/testsuite/ChangeLog:
* gcc.target/i386/apx-egprs-names.c: Use ax for short and
a
Hi all,
For AMX instructions that access memory, we will treat the memory
size as not specified, since there are no different sizes that could
cause confusion for the memory operand.
This changes the output under Intel mode, which is currently broken when
used with the assembler, and aligns with current binutils behavio
Hi all,
The AVX10 documentation specifies an ecx value of 0 for reading the AVX10
version and vector size under CPUID leaf 0x24. Although all bits for ecx=1
are reserved for now, we still need to set ecx to 0 explicitly to avoid a
dirty value in ecx.
Bootstrapped on x86-64-pc-linux-gnu. Ok for trunk and backport to GCC
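
A hedged sketch of such a query, not the code from the patch. The struct and
function names are hypothetical, and the EBX field positions (version in bits
7:0, 256-bit support in bit 17, 512-bit support in bit 18) follow my reading
of the AVX10 specification. __get_cpuid_count comes from the compiler's
cpuid.h and takes the subleaf (ecx) as its second argument.

```c
#include <stdint.h>

/* Decoded view of EBX from CPUID leaf 0x24, subleaf 0 (assumed layout).  */
struct avx10_info {
  unsigned version;  /* EBX[7:0]  */
  int has_256;       /* EBX[17]   */
  int has_512;       /* EBX[18]   */
};

/* Pure decoding step, kept separate so it can be checked on any host.  */
struct avx10_info decode_avx10_ebx(uint32_t ebx)
{
  struct avx10_info info;
  info.version = ebx & 0xffu;
  info.has_256 = (ebx >> 17) & 1;
  info.has_512 = (ebx >> 18) & 1;
  return info;
}

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Query leaf 0x24 with the subleaf passed explicitly as 0, so no stale
   ecx value is used.  Returns 0 when the leaf is not available.  */
int query_avx10(struct avx10_info *out)
{
  unsigned int eax, ebx, ecx, edx;
  if (!__get_cpuid_count(0x24, 0, &eax, &ebx, &ecx, &edx))
    return 0;
  *out = decode_avx10_ebx(ebx);
  return 1;
}
#endif
```

A real detection routine would first check the AVX10 feature bit before
trusting this leaf; the sketch omits that step.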
Hi all,
Since AVX10 is the first major ISA introduced after AVX-512, we propose
to add target_clones support for it.
Although AVX10.1-256 does not cover the 512-bit part of AVX512F, this is
not an issue, since it only affects priority, not implication.
Bootstrapped and regtested on x86_64-pc-
Hi all,
This is the v3 patch to fix PR115069. The new testcase has passed.
Changes in v3:
- Simplify the testcase.
Changes in v2:
- Add a testcase.
- Change the comment for the early exit.
Thx,
Haochen
Since vpermq is really slow, we should avoid using it for permutation
when vpmovwb is
Hi all,
This is the v2 patch to fix PR115069. The new testcase has passed.
Changes in v2:
- Added a testcase.
- Change the comment for the early exit.
Thx,
Haochen
Since vpermq is really slow, we should avoid using it for permutation
when vpmovwb is not available (needs AVX512BW) for ix86_e
Hi all,
Since vpermq is really slow, we should avoid using it when it is
the only instruction that could be used for ix86_expand_vecop_qihi2.
Bootstrapped and regtested on x86_64-pc-linux-gnu. Ok for trunk?
Thx,
Haochen
gcc/ChangeLog:
PR target/115069
* config/i386/i386-expand.cc (i
From: liuhongt
When a hot loop is small enough to fit into one cache line, we should align
the loop with ceil_log2 (loop_size) without considering the maximum
skip bytes. This helps code prefetch.
gcc/ChangeLog:
* config/i386/i386.cc (ix86_avoid_jump_mispredicts): Change
gen_pad to
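
The arithmetic behind the alignment choice can be sketched as follows. Both
functions are hypothetical illustrations, not the code in i386.cc: the first
computes the smallest n with 2^n >= x, the second turns a loop size into an
alignment in bytes when the loop fits into one cache line.

```c
/* Smallest n such that (1u << n) >= x; returns 0 for x <= 1.
   The n < 31 bound keeps the shift well defined.  */
unsigned demo_ceil_log2(unsigned x)
{
  unsigned n = 0;
  while (n < 31 && (1u << n) < x)
    n++;
  return n;
}

/* Alignment in bytes for a loop of loop_size bytes, or 0 when the loop
   does not fit into one cache line and this sketch makes no decision.  */
unsigned demo_loop_alignment(unsigned loop_size, unsigned cacheline)
{
  if (loop_size == 0 || loop_size > cacheline)
    return 0;
  return 1u << demo_ceil_log2(loop_size);
}
```

For example, a 24-byte loop gets ceil_log2 (24) = 5, that is 32-byte
alignment, so the whole loop sits inside a single 64-byte cache line.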
Previously, we used 16:11:8 in generic tune for Intel processors, which
led to cross-cache-line issues and resulted in random performance
penalties from commit to commit in benchmarks with small loops.
Always aligning to 16 bytes mitigates the issue.
gcc/ChangeLog:
n. We planned to backport it to GCC14.2.
Thx,
Haochen
Haochen Jiang (1):
Adjust generic loop alignment from 16:11:8 to 16 for Intel processors
liuhongt (1):
Align tight&hot loop without considering max skipping bytes.
gcc/config/i386/i386.cc | 148 ++-
g
Hi all,
The array index must stay below 8 for v8hi, or the test will fail
under -O0 or with -fstack-protector.
This patch fixes that, as mentioned in PR110621.
Commit as obvious and backport to GCC13.
Thx,
Haochen
gcc/testsuite/ChangeLog:
PR target/110621
* gcc.targe
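
A small sketch of the bound in question, using the GNU vector extension. The
typedef mirrors the usual v8hi definition (eight 16-bit elements in 16
bytes); the function name is hypothetical. Valid indices are 0 through 7.

```c
/* Eight 16-bit elements in a 16-byte vector (GNU vector extension).  */
typedef short v8hi __attribute__ ((vector_size (16)));

/* Sum all elements; the loop bound matches the element count, so every
   access stays inside the vector.  */
int sum_v8hi(v8hi v)
{
  int sum = 0;
  for (int i = 0; i < 8; i++)
    sum += v[i];
  return sum;
}
```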
Hi all,
When -mavx10.1-256 on the command line is used together with avx10.1-256 in
a target attribute, zmm should never be generated. But current
GCC generates zmm since it wrongly enables EVEX512 for non-explicitly
set AVX512. This patch fixes that issue.
Regtested on x86_64-pc-linux-gnu.
Hi all,
This patch fixes a bug in the mapping that caused auto dispatch to fail.
Sierra Forest is in the processor_types enum, but not in processor_subtypes.
Committed as obvious and backport to GCC13.
Thx,
Haochen
gcc/ChangeLog:
* common/config/i386/i386-common.cc (processor_alias_table):
Hi all,
While checking the GCC14 documentation, I found that MCore forgot to uncomment
the title for their part, which caused their documentation to be mixed with x86.
Uncomment that and commit as obvious.
Thx,
Haochen
---
htdocs/gcc-14/changes.html | 2 +-
1 file changed, 1 insertion(+), 1 deletio
Hi all,
Recently, I happened to run i386.exp under -DDEBUG and found some failures.
This patch aims to fix them. Ok for trunk?
Thx,
Haochen
gcc/testsuite/ChangeLog:
* gcc.target/i386/adx-check.h: Include stdio.h when DEBUG
is defined.
* gcc.target/i386/avx512fp16-vscalefph-
Hi all,
This patch removes all redundant SETs in the VNNI patterns.
Ok for trunk?
Thx,
Haochen
gcc/ChangeLog:
* config/i386/sse.md (sdot_prod): Remove redundant SET.
(usdot_prod): Ditto.
(sdot_prod): Ditto.
(udot_prod): Ditto.
---
gcc/config/i386/sse.md | 4
1
Hi all,
This patch aims to add AVX10.1 related macros, as requested for libgomp. The
request can be found at:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642025.html
Ok for trunk?
Thx,
Haochen
gcc/ChangeLog:
PR target/113288
* config/i386/i386-c.cc (ix86_target_macros_i
Hi Richard,
It seems that I sent out an outdated patch. This is the patch
I meant to send.
Thx,
Haochen
gcc/ChangeLog:
* doc/invoke.texi: Add -mevex512.
---
gcc/doc/invoke.texi | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/gcc/doc/invoke.texi b/gcc/doc
Hi all,
In invoke.texi, -mevex512 is missing. This patch adds that.
Ok for trunk?
Thx,
Haochen
gcc/ChangeLog:
* doc/invoke.texi: Add -mevex512.
---
gcc/doc/invoke.texi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6
After commit 01f4251b8775c832a92d55e2df57c9ac72eaceef, early break
vectorization is supported. The two testcases need to be fixed.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-xorsign-1.c: Fix testcase.
* gcc.target/i386/part-vect-absneghf.c: Ditto.
---
gcc/testsuite/gcc
Hi all,
This is the v2 patch for the wwwdocs change, updated according to the review.
If there is no objection, I will push this change next Tuesday.
Changes in v2:
- Remove RAO-INT from Grand Ridge
- Remove the mask register restriction for -mno-evex512
- Arrange the options alphabetically
- Other
Hi all,
A recent change in the AVX10 documentation allows 64-bit mask
register instructions in AVX10-256. The documentation is available at:
Intel Advanced Vector Extensions 10 (Intel AVX10) Architecture Specification
https://cdrdv2.intel.com/v1/dl/getContent/784267
The Converged Vecto
Hi all,
According to ISE050, published at the end of September, RAO-INT will no
longer be in Grand Ridge. This patch removes it.
The documentation is available at:
https://cdrdv2.intel.com/v1/dl/getContent/671368
Regtested on x86_64-pc-linux-gnu. Ok for trunk and backport to GCC13?
Thx
Hi all,
This patch fixes the previously introduced testcase failure.
Approved by another thread:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640288.html
Pushed to trunk.
Thx,
Haochen
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110790-2.c: Change scan-assembler from shrq
Hi all,
This patch will mention the following changes in wwwdocs for x86_64 backend:
- AVX10.1 support
- APX EGPR, PUSH2POP2, PPX and NDD support
- Xeon Phi ISAs deprecated
I also adjusted the wording in the x86_64 part for GCC 13. Ok for gcc-wwwdocs?
Thx,
Haochen
Mention AVX10.1 support, APX su
Since the Knights Landing and Knights Mill microarchitectures are EOL, we
would like to remove their support in GCC 15. In GCC 14, we will first
emit a warning on use.
gcc/ChangeLog:
* config/i386/driver-i386.cc (host_detect_local_cpu):
Do not append "-mno-" for Xeon Phi ISAs.
Hi all,
Since the Knights Landing and Knights Mill microarchitectures reached EOL in
2019, and ICC and ICX have already removed the support and emit errors, we
would also like to remove the support in GCC to reduce maintenance effort.
The deprecated Xeon Phi ISAs are AVX512PF, AVX512ER, AVX5124VNNIW,
Hi all,
This patch should be able to fix the current issue mentioned in PR112643.
Also, I fixed some legacy issues in code related to AVX512/AVX10.
Ok for trunk?
Thx,
Haochen
gcc/ChangeLog:
PR target/112643
* config/i386/driver-i386.cc (check_avx10_avx512_features):
Re
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Add avx10_set and version and detect avx10.1.
(cpu_indicator_init): Handle avx10.1-512.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVX10_1_256_SET): New.
(OPTION_MASK_
Hi all,
This RFC patch aims to add AVX10.1 options. Now that -m[no-]evex512
support has been added, adding them is much easier than in the August version.
Details for AVX10 are shown below:
Intel Advanced Vector Extensions 10 (Intel AVX10) Architecture Specification
It describes the Intel Advan
Hi all,
This patch aims to fix the wrong isa attribute which caused the regression
in PR111907.
Regtested on x86_64-pc-linux-gnu. Ok for trunk?
Thx,
Haochen
gcc/ChangeLog:
PR target/111907
* config/i386/i386.md (avx_noavx512vl): Add missing definition.
* config/i386/sse.md
gcc/ChangeLog:
* config/i386/avx512bf16vlintrin.h
(_mm_avx512_castsi128_ps): New.
(_mm256_avx512_castsi256_ps): Ditto.
(_mm_avx512_slli_epi32): Ditto.
(_mm256_avx512_slli_epi32): Ditto.
(_mm_avx512_cvtepi16_epi32): Ditto.
(_mm256_avx512_cvtep
gcc/ChangeLog:
* config/i386/avx512bf16vlintrin.h: Change intrin call.
* config/i386/avx512fintrin.h
(_mm_avx512_undefined_ps): New.
(_mm_avx512_undefined_pd): Ditto.
(__attribute__): Change intrin call.
* config/i386/avx512vbmivlintrin.h: Ditto.
gcc/ChangeLog:
PR target/111889
* config/i386/avx512bf16intrin.h: Push no-evex512 target.
* config/i386/avx512bf16vlintrin.h: Ditto.
* config/i386/avx512bitalgvlintrin.h: Ditto.
* config/i386/avx512bwintrin.h: Ditto.
* config/i386/avx512dqintrin.h: D
Hi all,
These four patches fix the no-evex512 function attribute. The details
of the issue are available at:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111889
My proposal for this problem is to also push "no-evex512" when defining
128/256 intrins in AVX512.
Besides, I added some new in
Hi all,
This patch fixes two obvious bugs in the current evex512 implementation.
Also, I moved the AVX512CD+AVX512VL part out of AVX512VL to avoid
accidentally missing avx512cd handling in the future.
Ok for trunk?
BRs,
Haochen
gcc/ChangeLog:
* config/i386/avx512cdintrin.h (target): Push evex5
Hi all,
This patch mentions recent updates for the x86-64 backend, including updates
to the ISAs enabled on previously introduced CPUs and newly introduced options/ISAs/CPUs.
Ok for wwwdocs?
Thx,
Haochen
---
htdocs/gcc-13/changes.html | 8
htdocs/gcc-14/changes.html | 19 +++
2 files