from:"Jan Beulich"

[PATCH] gcc/doc: adjust __builtin_choose_expr() description

2024-06-19 Thread Jan Beulich

Present wording has misled people to believe the ?: operator would be evaluating all three of the involved expressions. gcc/ * doc/extend.texi: Clarify __builtin_choose_expr() similarity to the ?: operator. --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -14962,9 +14962,9
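The distinction this patch is about can be sketched in a few lines of C. This is only an illustration, not text from the patch: the helper names (`bump`, `size_builtin`, `size_ternary`, `eval_ternary`, `eval_builtin`) are hypothetical, and `__builtin_choose_expr()` is a GNU C extension (not available in C++). Neither construct evaluates the operand that is not selected; what differs is that the built-in keeps the selected operand's type, while `?:` converts both arms to a common type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts how often it has been called. */
static int calls;
static int bump(void) { return ++calls; }

/* The built-in yields the chosen operand with its type unaltered: char here. */
size_t size_builtin(void)
{
  return sizeof(__builtin_choose_expr(1, (char)0, 0.0));
}

/* The conditional operator converts both arms to a common type: double here. */
size_t size_ternary(void)
{
  return sizeof(1 ? (char)0 : 0.0);
}

/* Returns how many times bump() ran; ?: evaluates only the selected arm. */
int eval_ternary(int flag)
{
  calls = 0;
  int r = flag ? 10 : bump();
  (void)r;
  return calls;
}

/* Same count for the built-in, whose first argument must be a constant. */
int eval_builtin(void)
{
  calls = 0;
  int r = __builtin_choose_expr(1, 10, bump());
  (void)r;
  return calls;
}
```

With a true condition neither variant calls `bump()`, so the evaluation behaviour is the same; only the resulting type differs.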

[gcc r15-1331] configure: adjustments for building with in-tree binutils

2024-06-14 Thread Jan Beulich via Gcc-cvs

https://gcc.gnu.org/g:4b1f486fefb3969f35ff6d49f544eb0ac9f49f1f commit r15-1331-g4b1f486fefb3969f35ff6d49f544eb0ac9f49f1f Author: Jan Beulich Date: Fri Jun 14 13:28:40 2024 +0200 configure: adjustments for building with in-tree binutils For one setting ld_ver in a conditional

[PATCH] configure: adjustments for building with in-tree binutils

2024-06-12 Thread Jan Beulich

For one, setting ld_ver in a conditional (no in-tree ld) when it's used, for x86 at least, in unconditional ways can't be quite right. And then prefixing relative paths to binaries with ${objdir}/, when ${objdir} nowadays resolves to just .libs, can at best be a leftover that wasn't properly

[gcc r14-10297] libgcc/aarch64: also provide AT_HWCAP2 fallback

2024-06-10 Thread Jan Beulich via Gcc-cvs

https://gcc.gnu.org/g:6bd8a3a7a8943184b16321f626d98045316c commit r14-10297-g6bd8a3a7a8943184b16321f626d98045316c Author: Jan Beulich Date: Mon Jun 10 08:47:58 2024 +0200 libgcc/aarch64: also provide AT_HWCAP2 fallback Much like AT_HWCAP is already provided in case

[gcc r15-1128] libgcc/aarch64: also provide AT_HWCAP2 fallback

2024-06-10 Thread Jan Beulich via Gcc-cvs

https://gcc.gnu.org/g:48d6d8c9e91018a625a797d50ac4def88376a515 commit r15-1128-g48d6d8c9e91018a625a797d50ac4def88376a515 Author: Jan Beulich Date: Mon Jun 10 08:47:58 2024 +0200 libgcc/aarch64: also provide AT_HWCAP2 fallback Much like AT_HWCAP is already provided in case

[PATCH] libgcc/aarch64: also provide AT_HWCAP2 fallback

2024-05-29 Thread Jan Beulich

Much like AT_HWCAP is already provided in case the platform headers don't have the value (yet). libgcc/ * config/aarch64/cpuinfo.c: Provide AT_HWCAP2. --- Observed as build failure with 14.1.0, so may want backporting there. --- a/libgcc/config/aarch64/cpuinfo.c +++
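A fallback of this kind usually looks like the sketch below. The diff itself is cut off in this listing, so this is not the patch's actual text; the numeric values are the ones the Linux UAPI header `auxvec.h` assigns (16 for AT_HWCAP, 26 for AT_HWCAP2), and `read_hwcap2` is a hypothetical helper added only to show the constant in use.

```c
#include <sys/auxv.h>   /* getauxval(); AT_* constants may or may not come with it */

/* Provide the auxiliary-vector tags only when the platform headers lack them.
   Values follow the Linux UAPI <linux/auxvec.h>. */
#ifndef AT_HWCAP
# define AT_HWCAP 16
#endif
#ifndef AT_HWCAP2
# define AT_HWCAP2 26
#endif

/* Hypothetical helper: returns the second hardware-capability word,
   or 0 if the kernel did not supply one. */
unsigned long read_hwcap2(void)
{
  return getauxval(AT_HWCAP2);
}
```

Guarding each definition with `#ifndef` keeps the fallback harmless on systems whose headers already define the constant.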

Re: Patches submission policy change

2024-04-04 Thread Jan Beulich via Gcc

On 03.04.2024 15:11, Christophe Lyon wrote: > On Wed, 3 Apr 2024 at 10:30, Jan Beulich wrote: >> >> On 03.04.2024 10:22, Christophe Lyon wrote: >>> Dear release managers and developers, >>> >>> TL;DR: For the sake of improving precommit CI covera

Re: Patches submission policy change

2024-04-03 Thread Jan Beulich via Gcc

On 03.04.2024 10:57, Richard Biener wrote: > On Wed, 3 Apr 2024, Jan Beulich wrote: >> On 03.04.2024 10:45, Jakub Jelinek wrote: >>> On Wed, Apr 03, 2024 at 10:22:24AM +0200, Christophe Lyon wrote: >>>> Any concerns/objections? >>> >>> I'm all for i

Re: Patches submission policy change

2024-04-03 Thread Jan Beulich via Gcc

On 03.04.2024 10:45, Jakub Jelinek wrote: > On Wed, Apr 03, 2024 at 10:22:24AM +0200, Christophe Lyon wrote: >> Any concerns/objections? > > I'm all for it, in fact I've been sending it like that myself for years > even when the policy said not to. In most cases, the diff for the > regenerated

Re: Patches submission policy change

2024-04-03 Thread Jan Beulich via Gcc

On 03.04.2024 10:22, Christophe Lyon wrote: > Dear release managers and developers, > > TL;DR: For the sake of improving precommit CI coverage and simplifying > workflows, I’d like to request a patch submission policy change, so > that we now include regenerated files. This was discussed during

Re: CREL relocation format for ELF

2024-03-28 Thread Jan Beulich via Gcc

On 28.03.2024 08:43, Fangrui Song wrote: > On Fri, Mar 22, 2024 at 6:51 PM Fangrui Song wrote: >> >> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song wrote: >>> >>> The relocation formats REL and RELA for ELF are inefficient. In a >>> release build of Clang for x86-64, .rela.* sections consume a >>>

Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-23 Thread Jan Beulich via Gcc

On 23.01.2024 10:21, LIU Hao wrote: > On 2024-01-23 17:03, Jan Beulich wrote: >> Hmm, that would suggest to me that the Dwarf code abuses the interface. >> A "name" certainly shouldn't be an expression. And hence the result of >> the example ought to be >> >>

Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-23 Thread Jan Beulich via Gcc

On 23.01.2024 10:00, LIU Hao wrote: > On 2024-01-23 16:38, Jan Beulich wrote: >> Right, but this is very "draft". You can't blindly assume the gas you use >> actually can deal with quotation. > > Let's assume that for the time being, but there's something else;

Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-23 Thread Jan Beulich via Gcc

On 23.01.2024 02:27, LIU Hao wrote: > On 2024-01-22 16:39, Jan Beulich wrote: >> Right, I did some work in that direction a while ago. But iirc there are >> still cases left to be addressed. > > Attached is a draft patch for GCC, bootstrapped on {i686,x86_64}-w64-min

Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-22 Thread Jan Beulich via Gcc

On 20.01.2024 13:40, LIU Hao wrote: > On 2024-01-19 17:13, Jan Beulich wrote: >> But I see a severe issue with your aim at confining strict mode to >> compiler generated code only: In inline assembly (see your mentioning of >> APP / NO_APP above) you still potentially re

Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-19 Thread Jan Beulich via Gcc

On 18.01.2024 17:40, LIU Hao wrote: > On 2024-01-18 20:54, Jan Beulich wrote: >> I'm sorry, but most of your proposal may even be considered for being >> acceptable only if you would gain buy-off from the MASM guys. Anything >> MASM treats as valid ought to be permitted b

Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-18 Thread Jan Beulich via Gcc

On 19.01.2024 02:42, LIU Hao wrote: > In addition, `as -msyntax=intel -mnaked-reg` doesn't seem to be equivalent to > `.intel_syntax noprefix`: > > $ as -msyntax=intel -mnaked-reg <<< 'mov eax, DWORD PTR gs:0x48' -o a.o > {standard input}: Assembler messages: > {standard input}:1:

Re: RFC: Formalization of the Intel assembly syntax (PR53929)

2024-01-18 Thread Jan Beulich via Gcc

On 18.01.2024 06:34, LIU Hao wrote: > My complete proposal can be found at > . > Some ideas actually > reflect the AT&T syntax. I hope it helps. I'm sorry, but most of your proposal may even be considered for being

Re: [PATCH] binutils: v2: experimental use of libdiagnostics in gas

2023-11-21 Thread Jan Beulich

On 21.11.2023 23:20, David Malcolm wrote: > @@ -101,6 +109,29 @@ had_warnings (void) >return warning_count; > } > > +#if USE_LIBDIAGNOSTICS > +static diagnostic_manager *diag_mgr; > +#endif > + > +void messages_init (void) > +{ > +#if USE_LIBDIAGNOSTICS > + diag_mgr =

Re: [PATCH] binutils: experimental use of libdiagnostics in gas

2023-11-07 Thread Jan Beulich

On 07.11.2023 15:32, David Malcolm wrote: > On Tue, 2023-11-07 at 11:03 +0100, Jan Beulich wrote: >> On 06.11.2023 23:29, David Malcolm wrote: >>> All of the locations are just lines; does gas do column numbers at >>> all? >>> (or ranges?) >> >>

Re: [PATCH] binutils: experimental use of libdiagnostics in gas

2023-11-07 Thread Jan Beulich

On 06.11.2023 23:29, David Malcolm wrote: > Here's a patch for gas in binutils that makes it use libdiagnostics > (with some nasty hardcoded paths to specific places on my hard drive > to make it easier to develop the API). > > For now this hardcodes adding two sinks: a text sink on stderr, and >

Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-11-06 Thread Jan Beulich

On 25.06.2023 08:41, Hongtao Liu wrote: > On Sun, Jun 25, 2023 at 2:35 PM Hongtao Liu wrote: >> >> On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich wrote: >>> >>> On 25.06.2023 07:12, Hongtao Liu wrote: >>>> On Wed, Jun 21, 2023 at 2:

Re: Intel AVX10.1 Compiler Design and Support

2023-08-10 Thread Jan Beulich via Gcc-patches

On 10.08.2023 15:12, Phoebe Wang wrote: >> The psABI should have some simple rule covering all of the above I think. > > psABI has a rule for the case doesn't mean the rule is a well defined ABI > in practice. A well defined ABI should guarantee 1) interlinkable across > different compile

Re: Intel AVX10.1 Compiler Design and Support

2023-08-09 Thread Jan Beulich via Gcc-patches

On 09.08.2023 09:38, Hongtao Liu wrote: > On Wed, Aug 9, 2023 at 3:17 PM Jan Beulich wrote: >> >> On 09.08.2023 04:14, Hongtao Liu wrote: >>> On Wed, Aug 9, 2023 at 9:21 AM Hongtao Liu wrote: >>>> >>>> On Wed, Aug 9, 2023 at 3:55 AM Joseph Myers

Re: Intel AVX10.1 Compiler Design and Support

2023-08-09 Thread Jan Beulich via Gcc-patches

On 09.08.2023 04:14, Hongtao Liu wrote: > On Wed, Aug 9, 2023 at 9:21 AM Hongtao Liu wrote: >> >> On Wed, Aug 9, 2023 at 3:55 AM Joseph Myers wrote: >>> >>> Do you have any comments on the interaction of AVX10 with the >>> micro-architecture levels defined in the ABI (and supported with >>>

[PATCH 10/10] x86: drop redundant "prefix_data16" attributes

2023-08-03 Thread Jan Beulich via Gcc-patches

The attribute defaults to 1 for TI-mode insns of type sselog, sselog1, sseiadd, sseimul, and sseishft. In *v8hi3 [smaxmin] and *v16qi3 [umaxmin] also drop the similarly stray "prefix_extra" at this occasion. These two max/min flavors are encoded in 0f space. gcc/ * config/i386/mmx.md

[PATCH 08/10] x86: add missing "prefix" attribute to VF{,C}MULC

2023-08-03 Thread Jan Beulich via Gcc-patches

gcc/ * config/i386/sse.md (__): Add "prefix" attribute. (avx512fp16_sh_v8hf): Likewise. --- Talking of "prefix": Shouldn't at least V32HF and V32BF have it also default to "evex"? (It won't matter right here, but it may matter elsewhere.) ---

[PATCH 06/10] x86: drop stray "prefix_extra"

2023-08-03 Thread Jan Beulich via Gcc-patches

While the attribute is relevant for legacy- and VEX-encoded insns, it is of no relevance for EVEX-encoded ones. While there in avx512dq_broadcast_1 add the missing "length_immediate". gcc/ * config/i386/sse.md (*_eq3_1): Drop "prefix_extra".

[PATCH 05/10] x86: replace/correct bogus "prefix_extra"

2023-08-03 Thread Jan Beulich via Gcc-patches

In the rdrand and rdseed cases "prefix_0f" is meant instead. For mmx_floatv2siv2sf2 1 is correct only for the first alternative. For the integer min/max cases 1 uniformly applies to legacy and VEX encodings (the UB and SW variants are dealt with separately anyway). Same for {,V}MOVNTDQA. Unlike

[PATCH 09/10] x86: correct "length_immediate" in a few cases

2023-08-03 Thread Jan Beulich via Gcc-patches

When first added explicitly in 3ddffba914b2 ("i386.md (sse4_1_round2): Add avx512f alternative"), "*" should not have been used for the pre-existing alternative. The attribute was plain missing. Subsequent changes adding more alternatives then generously extended the bogus pattern. Apparently

[PATCH 07/10] x86: add (adjust) XOP insn attributes

2023-08-03 Thread Jan Beulich via Gcc-patches

Many were lacking "prefix" and "prefix_extra", some had a bogus value of 2 for "prefix_extra" (presumably inherited from their SSE5 counterparts, which are long gone) and a meaningless "prefix_data16" one. Where missing, "mode" attributes are also added. (Note that "sse4arg" and "ssemuladd" ones

[PATCH 03/10] x86: "ssemuladd" adjustments

2023-08-03 Thread Jan Beulich via Gcc-patches

They're all VEX3- (also covering XOP) or EVEX-encoded. Express that in the default calculation of "prefix". FMA4 insns also all have a 1-byte immediate operand. Where the default calculation is not sufficient / applicable, add explicit "prefix" attributes. While there also add a "mode" attribute

[PATCH 04/10] x86: "prefix_extra" can't really be "2"

2023-08-03 Thread Jan Beulich via Gcc-patches

In the three remaining instances separate "prefix_0f" and "prefix_rep" are what is wanted instead. gcc/ * config/i386/i386.md (rdbase): Add "prefix_0f" and "prefix_rep". Drop "prefix_extra". (wrbase): Likewise. (ptwrite): Likewise. --- a/gcc/config/i386/i386.md

[PATCH 02/10] x86: "sse4arg" adjustments

2023-08-03 Thread Jan Beulich via Gcc-patches

Record common properties in other attributes' default calculations: There's always a 1-byte immediate, and they're always encoded in a VEX3-like manner (note that "prefix_extra" already evaluates to 1 in this case). Then drop now (or already previously) redundant explicit attributes, adding "mode"

[PATCH 01/10] x86: "prefix_extra" tidying

2023-08-03 Thread Jan Beulich via Gcc-patches

Drop SSE5 leftovers from both its comment and its default calculation. A value of 2 simply cannot occur anymore. Instead extend the comment to mention the use of the attribute in "length_vex", clarifying why "prefix_extra" can actually be meaningful on VEX-encoded insns despite those not having

[PATCH 00/10] x86: (mainly) "prefix_extra" adjustments

2023-08-03 Thread Jan Beulich via Gcc-patches

Having noticed various bogus uses, I thought I'd go through and audit them all. This is the result, with some other attributes also adjusted as noticed in the process. (I think this tidying also is a good thing to have ahead of APX further complicating insn length calculations.) 01:

[PATCH] MAINTAINERS: correct my email address

2023-08-01 Thread Jan Beulich via Gcc-patches

-Jan Beulich +Jan Beulich David Billinghurst Tomas Bily Laurynas Biveinis

[PATCH RESEND] libatomic: drop redundant all-multi command

2023-07-31 Thread Jan Beulich via Gcc-patches

./multilib.am already specifies this same command, and make warns about the earlier one being ignored when seeing the later one. All that needs retaining to still satisfy the preceding comment is the extra dependency. libatomic/ * Makefile.am (all-multi): Drop commands. *

[PATCH] x86: fold two of vec_dupv2df's alternatives

2023-07-31 Thread Jan Beulich via Gcc-patches

By using Yvm in the source, both can be expressed in one. gcc/ * sse.md (vec_dupv2df): Fold the middle two of the alternatives. --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -13784,21 +13784,20 @@ (set_attr "mode" "DF,DF,V1DF,V1DF,V1DF,V2DF,V1DF,V1DF,V1DF")])

Re: [PATCH] x86: slightly enhance "vec_dupv2df"

2023-07-17 Thread Jan Beulich via Gcc-patches

On 17.07.2023 08:09, Hongtao Liu wrote: > On Fri, Jul 14, 2023 at 5:40 PM Jan Beulich via Gcc-patches > wrote: >> >> Introduce a new alternative permitting all 32 registers to be used as >> source without AVX512VL, by broadcasting to the full 512 bits in that >> cas

Re: [PATCH] x86: replace "extendhfdf2" expander

2023-07-14 Thread Jan Beulich via Gcc-patches

On 14.07.2023 12:10, Uros Bizjak wrote: > On Fri, Jul 14, 2023 at 11:44 AM Jan Beulich wrote: >> >> The corresponding insn serves this purpose quite fine, and leads to >> slightly less (generated) code. All we need is the insn to not have a >> leading * i

[PATCH] x86: replace "extendhfdf2" expander

2023-07-14 Thread Jan Beulich via Gcc-patches

The corresponding insn serves this purpose quite fine, and leads to slightly less (generated) code. All we need is the insn to not have a leading * in its name, while retaining that * for "extendhfsf2". Introduce a mode attribute in exchange to achieve that. gcc/ * config/i386/i386.md

[PATCH] x86: avoid maybe_gen_...()

2023-07-14 Thread Jan Beulich via Gcc-patches

In the (however unlikely) event that no insn can be found for the requested mode, using maybe_gen_...() without (really) checking its result for being a null rtx would lead to silent bad code generation. gcc/ * config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate): Use

[PATCH] x86: slightly enhance "vec_dupv2df"

2023-07-14 Thread Jan Beulich via Gcc-patches

Introduce a new alternative permitting all 32 registers to be used as source without AVX512VL, by broadcasting to the full 512 bits in that case. (The insn would also permit all registers to be used as destination, but V2DFmode doesn't.) gcc/ * config/i386/sse.md (vec_dupv2df): Add new

Re: [PATCH] x86: improve fast bfloat->float conversion

2023-07-11 Thread Jan Beulich via Gcc-patches

On 11.07.2023 08:45, Liu, Hongtao wrote: >> -Original Message- >> From: Jan Beulich >> Sent: Tuesday, July 11, 2023 2:08 PM >> >> There's nothing AVX512BW-ish in here, so no reason to use Yw as the >> constraints for the AVX alternative. Furthermore by

[PATCH] x86: improve fast bfloat->float conversion

2023-07-11 Thread Jan Beulich via Gcc-patches

There's nothing AVX512BW-ish in here, so no reason to use Yw as the constraints for the AVX alternative. Furthermore by using the 512-bit form of VPSLLD (in a new alternative) all 32 registers can be used directly by the insn without AVX512VL needing to be enabled. Also adjust the originally last

[PATCH v3] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-07-11 Thread Jan Beulich via Gcc-patches

... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are never longer (yet sometimes shorter) than the corresponding VSHUFPS / VPSHUFD, due to the immediate operand of the shuffle insns balancing the (uniform) need for VEX3 in the broadcast ones. When EVEX encoding is required the

Re: [r14-2314 Regression] FAIL: gcc.target/i386/pr100711-2.c scan-assembler-times vpandn 8 on Linux/x86_64

2023-07-07 Thread Jan Beulich via Gcc-patches

On 07.07.2023 09:46, Hongtao Liu wrote: > On Fri, Jul 7, 2023 at 3:18 PM Jan Beulich via Gcc-regression > wrote: >> >> On 06.07.2023 13:57, haochen.jiang wrote: >>> On Linux/x86_64, >>> >>> e007369c8b67bcabd57c4fed8cff2a

Re: [r14-2310 Regression] FAIL: gcc.target/i386/pr53652-1.c scan-assembler-times pandn[ \\t] 2 on Linux/x86_64

2023-07-07 Thread Jan Beulich via Gcc-patches

On 07.07.2023 09:30, Hongtao Liu wrote: > On Fri, Jul 7, 2023 at 3:13 PM Jan Beulich via Gcc-regression > wrote: >> >> On 06.07.2023 13:57, haochen.jiang wrote: >>> On Linux/x86_64, >>> >>> 2d11c99dfca3cc603dbbfafb3afc41

Re: [r14-2314 Regression] FAIL: gcc.target/i386/pr100711-2.c scan-assembler-times vpandn 8 on Linux/x86_64

2023-07-07 Thread Jan Beulich via Gcc-patches

On 06.07.2023 13:57, haochen.jiang wrote: > On Linux/x86_64, > > e007369c8b67bcabd57c4fed8cff2a6db82e78e6 is the first bad commit > commit e007369c8b67bcabd57c4fed8cff2a6db82e78e6 > Author: Jan Beulich > Date: Wed Jul 5 09:49:16 2023 +0200 > > x86: yet more PR targ

Re: [r14-2310 Regression] FAIL: gcc.target/i386/pr53652-1.c scan-assembler-times pandn[ \\t] 2 on Linux/x86_64

2023-07-07 Thread Jan Beulich via Gcc-patches

On 06.07.2023 13:57, haochen.jiang wrote: > On Linux/x86_64, > > 2d11c99dfca3cc603dbbfafb3afc41689a68e40f is the first bad commit > commit 2d11c99dfca3cc603dbbfafb3afc41689a68e40f > Author: Jan Beulich > Date: Wed Jul 5 09:41:09 2023 +0200 > > x86: use VPTERNLOG

Re: [PATCH 2/2] x86: slightly correct / simplify *vec_extractv2ti

2023-07-05 Thread Jan Beulich via Gcc-patches

On 05.07.2023 10:47, Hongtao Liu wrote: > On Wed, Jul 5, 2023 at 4:01 PM Jan Beulich via Gcc-patches > wrote: >> >> V2TImode values cannot appear in the upper 16 YMM registers without >> AVX512VL being enabled. Therefore forcing 512-bit mode (also not >> ref

Re: [PATCH 1/2] x86: correct / simplify @vec_extract_hi_ and vec_extract_hi_v32qi

2023-07-05 Thread Jan Beulich via Gcc-patches

On 05.07.2023 10:40, Hongtao Liu wrote: > On Wed, Jul 5, 2023 at 4:00 PM Jan Beulich via Gcc-patches > wrote: >> >> The middle alternative each was unusable without enabling AVX512DQ (in >> addition to AVX512VL), which is entirely unrelated here. The last >> alter

[PATCH 2/2] x86: slightly correct / simplify *vec_extractv2ti

2023-07-05 Thread Jan Beulich via Gcc-patches

V2TImode values cannot appear in the upper 16 YMM registers without AVX512VL being enabled. Therefore forcing 512-bit mode (also not reflected in the "mode" attribute) is pointless. gcc/ * config/i386/sse.md (*vec_extractv2ti): Drop g modifiers. --- a/gcc/config/i386/sse.md +++

[PATCH 1/2] x86: correct / simplify @vec_extract_hi_ and vec_extract_hi_v32qi

2023-07-05 Thread Jan Beulich via Gcc-patches

The middle alternative each was unusable without enabling AVX512DQ (in addition to AVX512VL), which is entirely unrelated here. The last alternative is usable with AVX512VL only (due to type restrictions on what may be put in the upper 16 YMM registers), and hence is pointlessly forcing 512-bit

[PATCH 0/2] x86: vec_extract_* adjustments

2023-07-05 Thread Jan Beulich via Gcc-patches

1: correct / simplify @vec_extract_hi_ and vec_extract_hi_v32qi 2: slightly correct / simplify *vec_extractv2ti Jan

[PATCH] x86: suppress avx512f-copysign.c testcase for 32-bit

2023-07-05 Thread Jan Beulich via Gcc-patches

The test installed by "x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F" won't succeed on 32-bit, for floating point operations being done there (by default) without using SIMD insns. gcc/testsuite/ * gcc.target/i386/avx512f-copysign.c: Suppress for 32-bit. ---

Re: [PATCH v3] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-07-04 Thread Jan Beulich via Gcc-patches

On 27.06.2023 07:11, Hongtao Liu wrote: > On Tue, Jun 20, 2023 at 5:34 PM Hongtao Liu wrote: >> >> On Tue, Jun 20, 2023 at 5:03 PM Jan Beulich wrote: >>> >>> On 20.06.2023 10:33, Hongtao Liu wrote: >>>> On Tue, Jun 20, 2023 at 3:07 PM Jan Beulich via

Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations

2023-06-25 Thread Jan Beulich via Gcc-patches

On 25.06.2023 09:30, Hongtao Liu wrote: > On Sun, Jun 25, 2023 at 3:23 PM Hongtao Liu wrote: >> >> On Sun, Jun 25, 2023 at 3:13 PM Hongtao Liu wrote: >>> >>> On Sun, Jun 25, 2023 at 1:52 PM Jan Beulich wrote: >>>> >>>> On 25.06.2023 06:

Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-06-25 Thread Jan Beulich via Gcc-patches

On 25.06.2023 07:12, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches > wrote: >> >> --- >> For the purpose here (and elsewhere) bcst_vector_operand() (really: >> bcst_mem_operand()) isn't permissive enough: We'd want it to allo

Re: [PATCH 4/5] x86: further PR target/100711-like splitting

2023-06-25 Thread Jan Beulich via Gcc-patches

On 25.06.2023 07:06, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches > wrote: >> >> With respective two-operand bitwise operations now expressible by a >> single VPTERNLOG, add splitters to also deal with ior and xor >> counterparts

Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations

2023-06-24 Thread Jan Beulich via Gcc-patches

On 25.06.2023 06:42, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches > wrote: >> >> +(define_code_iterator andor [and ior]) >> +(define_code_attr nlogic [(and "nor") (ior "nand")]) >> +(define

Re: [PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-21 Thread Jan Beulich via Gcc-patches

On 21.06.2023 09:44, Jan Beulich wrote: > On 21.06.2023 09:37, Hongtao Liu wrote: >> On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches >> wrote: >>> >>> Isn't prefix_extra use bogus here? What extra prefix does vbroadcastss >> According to c

Re: [PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-21 Thread Jan Beulich via Gcc-patches

On 21.06.2023 09:37, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches > wrote: >> >> Is there a reason why vec_dupv4sf uses sseshuf1 for its shuffle >> alternatives, but *vec_dupv4si uses sselog1? I'd be happy to correct >> this i

[PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-06-21 Thread Jan Beulich via Gcc-patches

Following two-operand bitwise operations, add another splitter to also deal with not followed by broadcast all on its own, which can be expressed as simple embedded broadcast instead once a broadcast operand is actually permitted in the respective insn. While there also permit a broadcast operand

[PATCH 4/5] x86: further PR target/100711-like splitting

2023-06-21 Thread Jan Beulich via Gcc-patches

With respective two-operand bitwise operations now expressible by a single VPTERNLOG, add splitters to also deal with ior and xor counterparts of the original and-only case. Note that the splitters need to be separate, as the placement of "not" differs in the final insns (*iornot3, *xnor3) which

[PATCH 3/5] x86: allow memory operand for AVX2 splitter for PR target/100711

2023-06-21 Thread Jan Beulich via Gcc-patches

The intended broadcast (with AVX512) can very well be done right from memory. gcc/ * config/i386/sse.md: Permit non-immediate operand 1 in AVX2 form of splitter for PR target/100711. --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -17356,7 +17356,7 @@

[PATCH 2/5] x86: use VPTERNLOG also for certain andnot forms

2023-06-21 Thread Jan Beulich via Gcc-patches

When it's the memory operand which is to be inverted, using VPANDN* requires a further load instruction. The same can be achieved by a single VPTERNLOG*. Add two new alternatives (for plain memory and embedded broadcast), adjusting the predicate for the first operand accordingly. Two pre-existing

[PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations

2023-06-21 Thread Jan Beulich via Gcc-patches

All combinations of and, ior, xor, and not involving two operands can be expressed that way in a single insn. gcc/ PR target/93768 * config/i386/i386.cc (ix86_rtx_costs): Further special-case bitwise vector operations. * config/i386/sse.md (*iornot3): New insn.
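Why a single VPTERNLOG suffices can be seen from a scalar model of the instruction, sketched below. This is an illustration only and not code from the patch: `ternlog32`, the `TL_*` constants and the `IMM_*` macros are hypothetical names. The 8-bit immediate is a truth table indexed by the three input bits; writing the inputs as the constants 0xF0, 0xCC and 0xAA lets the immediate for any expression be computed by evaluating that expression on the constants.

```c
#include <stdint.h>

/* Scalar model of VPTERNLOGD on one 32-bit lane: each result bit is looked
   up in the 8-bit truth table imm, indexed by the bits of a, b and c. */
uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++) {
    unsigned idx = (((a >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((c >> i) & 1);
    r |= (uint32_t)((imm >> idx) & 1) << i;
  }
  return r;
}

/* Truth-table columns for the three inputs. */
enum { TL_A = 0xF0, TL_B = 0xCC, TL_C = 0xAA };

/* Immediates for two-operand operations; the third input is ignored. */
#define IMM_NOR    ((uint8_t)~(TL_A | TL_B))   /* 0x03: ~(a | b) */
#define IMM_NAND   ((uint8_t)~(TL_A & TL_B))   /* 0x3F: ~(a & b) */
#define IMM_XNOR   ((uint8_t)~(TL_A ^ TL_B))   /* 0xC3: ~(a ^ b) */
#define IMM_IORNOT ((uint8_t)(~TL_A | TL_B))   /* 0xCF: ~a | b   */
```

Because the immediates above never depend on `TL_C`, the third operand can be anything, which is what makes the instruction usable for plain two-vector logic.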

[PATCH 0/5] x86: make better use of VPTERNLOG{D,Q}

2023-06-21 Thread Jan Beulich via Gcc-patches

While there are some quite sophisticated 4-operand expanders, 2-operand binary logic which can't be expressed by just VPAND, VPANDN, VPOR, or VPXOR doesn't utilize this insn to carry out such operations in a single insn. Therefore the first two patches address one of the sub-aspects of PR

[PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-21 Thread Jan Beulich via Gcc-patches

... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are never longer (yet sometimes shorter) than the corresponding VSHUFPS / VPSHUFD, due to the immediate operand of the shuffle insns balancing the possible need for VEX3 in the broadcast ones. When EVEX encoding is required the

[PATCH] x86: add -mprefer-vector-width=512 to new avx512f-dupv2di.c testcase

2023-06-21 Thread Jan Beulich via Gcc-patches

This is to cover testing also being done with -march=cascadelake. --- Committing as obvious. --- a/gcc/testsuite/gcc.target/i386/avx512f-dupv2di.c +++ b/gcc/testsuite/gcc.target/i386/avx512f-dupv2di.c @@ -1,5 +1,5 @@ /* { dg-do compile { target { ! ia32 } } } */ -/* { dg-options "-mavx512f

Re: [PATCH v3] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-20 Thread Jan Beulich via Gcc-patches

On 20.06.2023 10:33, Hongtao Liu wrote: > On Tue, Jun 20, 2023 at 3:07 PM Jan Beulich via Gcc-patches > wrote: >> >> I guess the underlying pattern, going along the lines of what >> one_cmpl2 uses, can be applied elsewhere >> as well. > That should be guarded

[PATCH v3] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-20 Thread Jan Beulich via Gcc-patches

There's no reason to constrain this to AVX512VL, unless instructed so by -mprefer-vector-width=, as the wider operation is unusable for more narrow operands only when the possible memory source is a non-broadcast one. This way even the scalar copysign3 can benefit from the operation being a

Re: [PATCH v2] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-19 Thread Jan Beulich via Gcc-patches

On 19.06.2023 04:07, Liu, Hongtao wrote: >> -Original Message- >> From: Jan Beulich >> Sent: Friday, June 16, 2023 2:22 PM >> >> --- a/gcc/config/i386/sse.md >> +++ b/gcc/config/i386/sse.md >> @@ -12597,11 +12597,11 @@ >> (set_attr

[PATCH v2] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-16 Thread Jan Beulich via Gcc-patches

There's no reason to constrain this to AVX512VL, unless instructed so by -mprefer-vector-width=, as the wider operation is unusable for more narrow operands only when the possible memory source is a non-broadcast one. This way even the scalar copysign3 can benefit from the operation being a

[PATCH v2] x86: correct and improve "*vec_dupv2di"

2023-06-16 Thread Jan Beulich via Gcc-patches

The input constraint for the %vmovddup alternative was wrong, as the upper 16 XMM registers require AVX512VL to be used with this insn. To compensate, introduce a new alternative permitting all 32 registers, by broadcasting to the full 512 bits in that case if AVX512VL is not available. gcc/

Re: [PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Jan Beulich via Gcc-patches

On 15.06.2023 09:45, Hongtao Liu wrote: > On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches > wrote: >> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches >> wrote: >>> +case 3: >>> + return "%vmovddup\t{%1, %0|%0, %1}";

Re: [PATCH] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-15 Thread Jan Beulich via Gcc-patches

On 15.06.2023 07:23, Hongtao Liu wrote: > On Wed, Jun 14, 2023 at 5:03 PM Jan Beulich wrote: >> >> On 14.06.2023 09:41, Hongtao Liu wrote: >>> On Wed, Jun 14, 2023 at 1:58 PM Jan Beulich via Gcc-patches >>> wrote: >>>> >>>> ... in v

[PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Jan Beulich via Gcc-patches

The input constraint for the %vmovddup alternative was wrong, as the upper 16 XMM registers require AVX512VL to be used with this insn. To compensate, introduce a new alternative permitting all 32 registers, by broadcasting to the full 512 bits in that case if AVX512VL is not available. gcc/

Re: [PATCH] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-14 Thread Jan Beulich via Gcc-patches

On 14.06.2023 10:10, Hongtao Liu wrote: > On Wed, Jun 14, 2023 at 1:59 PM Jan Beulich via Gcc-patches > wrote: >> >> There's no reason to constrain this to AVX512VL, as the wider operation >> is not usable for more narrow operands only when the possible memory >

Re: [PATCH] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-14 Thread Jan Beulich via Gcc-patches

On 14.06.2023 09:41, Hongtao Liu wrote: > On Wed, Jun 14, 2023 at 1:58 PM Jan Beulich via Gcc-patches > wrote: >> >> ... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are >> never longer (yet sometimes shorter) than the corresponding VSHUFPS / >>

[PATCH] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-13 Thread Jan Beulich via Gcc-patches

There's no reason to constrain this to AVX512VL, as the wider operation is not usable for more narrow operands only when the possible memory source is a non-broadcast one. This way even the scalar copysign3 can benefit from the operation being a single-insn one (leaving aside moves which the

[PATCH] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-13 Thread Jan Beulich via Gcc-patches

... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are never longer (yet sometimes shorter) than the corresponding VSHUFPS / VPSHUFD, due to the immediate operand of the shuffle insns balancing the need for VEX3 in the broadcast ones. When EVEX encoding is required the broadcast

[PATCH] x86: add Bk and Br to comment list B's sub-chars

2023-06-13 Thread Jan Beulich via Gcc-patches

gcc/ * config/i386/constraints.md: Mention k and r for B. --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -162,7 +162,9 @@ ;; g GOT memory operand. ;; m Vector memory operand ;; c Constant memory operand +;; k TLS address that allows insn using

[PATCH] x86/AVX512: use VMOVDDUP for broadcast to V2DF

2023-06-13 Thread Jan Beulich via Gcc-patches

Like is already the case for the AVX/AVX2 form, VMOVDDUP - acting on double precision floating values - is more appropriate to use here, and it can also result in shorter insn encodings when source is memory or %xmm0...%xmm7, and no masking is applied (in allowing a 2-byte VEX prefix then instead

Re: [PATCH v3] i386: Allow -mlarge-data-threshold with -mcmodel=large

2023-06-13 Thread Jan Beulich via Gcc-patches

On 13.06.2023 05:28, Fangrui Song wrote: > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/large-data.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target lp64 } */ > +/* { dg-options "-O2 -mcmodel=large -mlarge-data-threshold=4" } */ > +/* { dg-final {

Re: [PATCH v2] i386: Allow -mlarge-data-threshold with -mcmodel=large

2023-05-26 Thread Jan Beulich via Gcc-patches

On 25.05.2023 18:11, Fangrui Song wrote: > On 2023-05-25, Jan Beulich wrote: >> On 25.05.2023 17:16, Fangrui Song wrote: >>> --- a/gcc/doc/invoke.texi >>> +++ b/gcc/doc/invoke.texi >>> @@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the de

Re: [PATCH v2] i386: Allow -mlarge-data-threshold with -mcmodel=large

2023-05-25 Thread Jan Beulich via Gcc-patches

On 25.05.2023 17:16, Fangrui Song wrote: > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the default. > > @opindex mlarge-data-threshold > @item -mlarge-data-threshold=@var{threshold} > -When @option{-mcmodel=medium} is

Re: x86: making better use of vpternlog{d,q}

2023-05-25 Thread Jan Beulich via Gcc

On 24.05.2023 11:01, Hongtao Liu wrote: > On Wed, May 24, 2023 at 3:58 PM Jan Beulich via Gcc wrote: >> >> Hello, >> >> for a couple of years I was meaning to extend the use of these AVX512F >> insns beyond the pretty minimalistic ones there are so far. Now th

x86: making better use of vpternlog{d,q}

2023-05-24 Thread Jan Beulich via Gcc

Hello, for a couple of years I was meaning to extend the use of these AVX512F insns beyond the pretty minimalistic ones there are so far. Now that I've got around to at least draft something, I ran into a couple of issues I cannot explain. I'd like to start with understanding the unexpected

Re: Ping: [PATCH] testsuite/C++: suppress filename canonicalization in module tests

2023-04-28 Thread Jan Beulich via Gcc-patches

On 28.04.2023 00:24, Nathan Sidwell wrote: > On 4/25/23 11:04, Jan Beulich wrote: >> On 28.06.2022 16:06, Jan Beulich wrote: >>> The pathname underneath gcm.cache/ is determined from the effective name >>> used for the main input file of a particular module.

Re: [PATCH] testsuite: adjust NOP expectations for RISC-V

2023-04-27 Thread Jan Beulich via Gcc-patches

On 26.04.2023 17:45, Palmer Dabbelt wrote: > On Wed, 26 Apr 2023 08:26:26 PDT (-0700), gcc-patches@gcc.gnu.org wrote: >> >> >> On 4/25/23 08:50, Jan Beulich via Gcc-patches wrote: >>> RISC-V will emit ".option nopic" when -fno-pie is in effect, which >>

Ping: [PATCH] testsuite/C++: suppress filename canonicalization in module tests

2023-04-25 Thread Jan Beulich via Gcc-patches

On 28.06.2022 16:06, Jan Beulich wrote: > The pathname underneath gcm.cache/ is determined from the effective name > used for the main input file of a particular module. When modules are > built, no canonicalization occurs for the main input file. Hence the > module file would

[PATCH v2] testsuite/C++: cope with IPv6 being unavailable

2023-04-25 Thread Jan Beulich via Gcc-patches

When IPv6 is disabled in the kernel, the error message coming back from Cody::OpenInet6() is different from the sole so far expected one. --- v2: Re-base. --- a/gcc/testsuite/g++.dg/modules/bad-mapper-3.C +++ b/gcc/testsuite/g++.dg/modules/bad-mapper-3.C @@ -1,6 +1,6 @@ // {

[PATCH] testsuite: adjust NOP expectations for RISC-V

2023-04-25 Thread Jan Beulich via Gcc-patches

RISC-V will emit ".option nopic" when -fno-pie is in effect, which matches the generic pattern. Just like done for Alpha, special-case RISC-V. --- A couple more targets look to be affected as well, simply because their "no-operation" insn doesn't match the expectation. With the apparently

Re: Problems when building NT kernel drivers with GCC / LD

2022-11-28 Thread Jan Beulich via Gcc

On 28.11.2022 09:40, Jonathan Wakely wrote: > On Mon, 28 Nov 2022, 08:08 Jan Beulich via Gcc, wrote: > >> On 26.11.2022 20:04, Pali Rohár wrote: >>> On Monday 21 November 2022 08:24:36 Jan Beulich wrote: >>>> But then, with you replying to >>>> me

Re: Problems when building NT kernel drivers with GCC / LD

2022-11-28 Thread Jan Beulich via Gcc

On 26.11.2022 20:04, Pali Rohár wrote: > On Monday 21 November 2022 08:24:36 Jan Beulich wrote: >> But then, with you replying to >> me specifically, perhaps you're wrongly assuming that I would be >> planning to look into addressing any or all of these? My earlier reply >&

Re: Problems when building NT kernel drivers with GCC / LD

2022-11-20 Thread Jan Beulich via Gcc

On 20.11.2022 14:10, Pali Rohár wrote: > On Saturday 05 November 2022 02:26:52 Pali Rohár wrote: >> On Saturday 05 November 2022 01:57:49 Pali Rohár wrote: >>> On Monday 31 October 2022 10:55:59 Jan Beulich wrote: >>>> On 30.10.2022 02:06, Pali Rohár via Binutils wrot

Re: Problems when building NT kernel drivers with GCC / LD

2022-10-31 Thread Jan Beulich via Gcc

On 30.10.2022 02:06, Pali Rohár via Binutils wrote: > * GCC or LD (not sure who) sets memory alignment characteristics > (IMAGE_SCN_ALIGN_MASK) into the sections of PE executable binary. > These characteristics should be only in COFF object files, not > executable binaries. Specially they
