[PATCH] LoongArch: Fix soft-float builds of libffi

2024-01-26 Thread Yang Yujie
This patch corresponds to the upstream PR: https://github.com/libffi/libffi/pull/817 libffi/ChangeLog: * src/loongarch64/ffi.c: Avoid defining floats in struct call_context if the ABI is soft-float. --- libffi/src/loongarch64/ffi.c | 2 ++ 1 file changed, 2 insertions(+) diff

Re: [PATCH v4] RISC-V: Implement TLS Descriptors.

2024-01-26 Thread Fangrui Song
On Mon, Dec 4, 2023 at 11:02 PM Tatsuyuki Ishi wrote: > > This implements TLS Descriptors (TLSDESC) as specified in [1]. > > The 4-instruction sequence is implemented as a single RTX insn for > simplicity, but this can be revisited later if instruction scheduling or > more flexible RA is desired.

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread chenglulu
On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote: On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote: On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: v3 -> v4: 1. Add macro support for TLS symbols 2. Added support for loading __get_tls_addr symbol address

[PATCH] RISC-V: Add require-effective-target to pr113429 testcase

2024-01-26 Thread Patrick O'Neill
The pr113429 testcase fails with newlib spike runs. Adding require-effective-target rv64 and riscv_v fixes the issue. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr113429.c: Add require-effective-target rv64 and riscv_v Signed-off-by: Patrick O'Neill --- Tested using

Re: Re: [Committed] RISC-V: Add regression test for vsetvl bug pr113429

2024-01-26 Thread 钟居哲
newlib rv32gcv juzhe.zh...@rivai.ai From: Patrick O'Neill Date: 2024-01-27 08:38 To: juzhe.zh...@rivai.ai; gcc-patches CC: kito.cheng; law; rdapp; vineetg Subject: Re: [Committed] RISC-V: Add regression test for vsetvl bug pr113429 What target/config are these failures on? I tried rv64gcv,

Re: [Committed] RISC-V: Add regression test for vsetvl bug pr113429

2024-01-26 Thread Patrick O'Neill
What target/config are these failures on? I tried rv64gcv, rv64gc, rv32gcv, and rv32gc with RUNTESTFLAGS="rvv.exp" and don't see these failures. Thanks, Patrick On 1/25/24 23:20, juzhe.zh...@rivai.ai wrote: This patch causes the following regression: FAIL:

Re: [PATCH] testsuite: Fix vect_long_mult on Power [PR109705]

2024-01-26 Thread Andrew Pinski
On Mon, Jan 15, 2024 at 6:43 PM Kewen.Lin wrote: > > Hi, > > As pointed out by the discussion in PR109705, the current > vect_long_mult effective target check on Power is broken. > This patch is to fix it accordingly. > > With additional change by adding a guard vect_long_mult > in

Re: [PATCH] c++: problematic assert in reference_binding [PR113141]

2024-01-26 Thread Jason Merrill
On 1/26/24 17:11, Jason Merrill wrote: On 1/26/24 16:52, Jason Merrill wrote: On 1/25/24 14:18, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk/13?  This isn't a very satisfactory fix, but at least it safely fixes these testcases I guess. 

Re: [PATCH] c++: problematic assert in reference_binding [PR113141]

2024-01-26 Thread Jason Merrill
On 1/26/24 16:52, Jason Merrill wrote: On 1/25/24 14:18, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk/13?  This isn't a very satisfactory fix, but at least it safely fixes these testcases I guess.  Note that there's implementation

Re: [PATCH] c++: problematic assert in reference_binding [PR113141]

2024-01-26 Thread Jason Merrill
On 1/25/24 14:18, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk/13? This isn't a very satisfactory fix, but at least it safely fixes these testcases I guess. Note that there's implementation disagreement about the second testcase, GCC

Re: [PATCH 2/2] RISC-V/testsuite: Also verify if-conversion runs for pr105314.c

2024-01-26 Thread Maciej W. Rozycki
On Wed, 24 Jan 2024, Jeff Law wrote: > > Do we have consensus now to move forward with this change as posted? I'd > > like to get these patches ticked off ASAP. > I think it should move forward. I think having the RTL tests deals with > Andrew's concern and the testcase adjustment has value

Re: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2024-01-26 Thread Jason Merrill
On 12/5/23 20:52, Lewis Hyatt wrote: Hello- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608 There are two related issues here really, a regression since GCC 11 where we can ICE after restoring a PCH, and a deeper issue with bogus locations assigned to macros that were defined prior to

Re: [PATCH] c++: #pragma doesn't disable -Wunused-label [PR113582]

2024-01-26 Thread Jason Merrill
On 1/25/24 20:38, Marek Polacek wrote: Low prio and not a regression. Feel free to ignore till GCC 15. OK for stage 1. Bootstrapped/regtested on x86_64-pc-linux-gnu. -- >8 -- The PR complains that void do_something(){ #pragma GCC diagnostic push #pragma GCC diagnostic ignored

Re: [PATCH] c++: implement [[gnu::non_owning]] [PR110358]

2024-01-26 Thread Jason Merrill
On 1/25/24 20:37, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Since -Wdangling-reference has false positives that can't be prevented, we should offer an easy way to suppress the warning. Currently, that is only possible by using a #pragma, either

Re: [PATCH] c++/modules: Stream additional fields for DECL_STRUCT_FUNCTION [PR113580]

2024-01-26 Thread Jason Merrill
On 1/26/24 10:49, Patrick Palka wrote: On Fri, 26 Jan 2024, Nathaniel Shead wrote: This patch just adds enough of the fields from 'function' to fix the ICE in the linked PR. I suppose there might be more fields from this type that should be propagated, but I don't know enough to find out which

Re: [PATCH V3 4/4] RISC-V: Enable assert for insn_has_dfa_reservation

2024-01-26 Thread Edwin Lu
On 1/25/2024 9:06 AM, Robin Dapp wrote: /* If we ever encounter an insn without an insn reservation, trip an assert so we can find and fix this problem. */ -#if 0 + if (! insn_has_dfa_reservation_p (insn)) { +print_rtl(stderr, insn); +fprintf(stderr, "%d", get_attr_type

Re: [PATCH V3 3/4] RISC-V: Use default cost model for insn scheduling

2024-01-26 Thread Edwin Lu
On 1/25/2024 9:06 AM, Robin Dapp wrote: 39 additional unique testsuite failures (scan dumps) will still be present. I don't know how optimal the new output is compared to the old. Should I update the testcase expected output to match the new scan dumps? Currently, without vector op latency,

[middle-end PATCH] Constant fold {-1,-1} << 1 in simplify-rtx.cc

2024-01-26 Thread Roger Sayle
This patch addresses a missed optimization opportunity in the RTL optimization passes. The function simplify_const_binary_operation will constant fold binary operators with two CONST_INT operands, and those with two CONST_VECTOR operands, but is missing compile-time evaluation of binary
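As a rough illustration of the folding named in the subject line, the sketch below shifts every lane of a constant vector by a constant scalar amount. It is a standalone example, not code from simplify-rtx.cc: the helper name `fold_vector_shl` and the use of `int64_t` lanes are assumptions made for illustration only, and the shift amount is assumed to be less than 64.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC API: shift each constant lane left by a
   constant scalar amount (assumed < 64), the way a compiler could
   evaluate the operation at compile time.  */
static void fold_vector_shl(const int64_t *in, size_t n, unsigned amount,
                            int64_t *out)
{
    for (size_t i = 0; i < n; i++) {
        /* Shift in unsigned arithmetic, because left-shifting a negative
           signed value is undefined in C.  Converting the result back to
           int64_t is implementation-defined before C23 but wraps on
           common compilers.  */
        out[i] = (int64_t) ((uint64_t) in[i] << amount);
    }
}
```

Under these assumptions, folding the two-lane vector {-1,-1} with a shift amount of 1 yields {-2,-2}.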

Re: [PATCH V3 2/4] RISC-V: Add vector related pipelines

2024-01-26 Thread Edwin Lu
On 1/25/2024 9:06 AM, Robin Dapp wrote: Thanks, that looks better IMHO. +;; Copyright (C) 2011-2024 Free Software Foundation, Inc. +;; Contributed by Andrew Waterman (and...@sifive.com). +;; Based on MIPS target for GNU compiler. You might want to change that, as well as the date. While at

RE: [PATCH] libstdc++: Make PSTL algorithms accept C++20 iterators [PR110512]

2024-01-26 Thread Dvorskiy, Mikhail
Hi everyone, Let me explain the cause of the issue with the _PSTL_USAGE_WARNINGS macro in GCC 14. First, GCC 13.2.0 has no such issue, because _PSTL_PRAGMA_MESSAGE is defined in #pragma message only if _PSTL_USAGE_WARNINGS > 0, please have a look:

[wwwdocs][patch] gcc-14/changes.html (amdgcn): Update for gfx1030/gfx1100

2024-01-26 Thread Tobias Burnus
Mention that gfx1030/gfx1100 are now supported. As noted in another thread, LLVM 15's assembler is now required; previously, LLVM 13.0.1 would do. (Alternatively, disabling gfx1100 support would do.) Hence, the added link to the install documentation. Comments, suggestions? Tobias

[patch] install.texi: For gcn, recommend LLVM 15, unless gfx1100 is disabled (was: [patch] amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs)

2024-01-26 Thread Tobias Burnus
Hi, Thomas Schwinge wrote: amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs ... Further down in that file, we state: @anchor{amdgcn-x-amdhsa} @heading amdgcn-*-amdhsa AMD GCN GPU target. Instead of GNU Binutils, you will need to install

Re: [patch] amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs

2024-01-26 Thread Richard Biener
> On 26.01.2024 at 17:22, Thomas Schwinge wrote: > > Hi! > > Great progress that you've made! :-) > >> On 2024-01-26T13:32:02+0100, Tobias Burnus wrote: >> Tobias Burnus wrote: >>> On 24.01.24 at 17:01, Tobias Burnus wrote: Okay to enable gfx1100 multilib building and to document

Re: [patch] amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs

2024-01-26 Thread Thomas Schwinge
Hi! Great progress that you've made! :-) On 2024-01-26T13:32:02+0100, Tobias Burnus wrote: > Tobias Burnus wrote: >> On 24.01.24 at 17:01, Tobias Burnus wrote: >>> Okay to enable gfx1100 multilib building and to document gfx1100 in >>> the manual? >> >> and, with this patch, additionally

Re: [PATCH] c++/modules: Stream additional fields for DECL_STRUCT_FUNCTION [PR113580]

2024-01-26 Thread Patrick Palka
On Fri, 26 Jan 2024, Nathaniel Shead wrote: > This patch just adds enough of the fields from 'function' to fix the ICE > in the linked PR. I suppose there might be more fields from this type > that should be propagated, but I don't know enough to find out which > they might be yet, since a lot of

[PATCH v2] arm: Fix missing bti instruction for virtual thunks

2024-01-26 Thread Richard Ball
v2: Formatting and test options fix. Adds missing bti instruction at the beginning of a virtual thunk, when bti is enabled. gcc/ChangeLog: * config/arm/arm.cc (arm_output_mi_thunk): Emit insn for bti_c when bti is enabled. gcc/testsuite/ChangeLog: *

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-26 Thread Qing Zhao
> On Jan 26, 2024, at 3:04 AM, Martin Uecker wrote: > > > I haven't looked at the patch, but it sounds you give the result > the wrong type. Then patching up all use cases instead of the > type seems wrong. Yes, this is for resolving a very early gimplification issue as I reported last Nov:

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 14:21, Richard Biener wrote: On Fri, 26 Jan 2024, Jakub Jelinek wrote: On Fri, Jan 26, 2024 at 03:04:11PM +0100, Richard Biener wrote: Otherwise it looks reasoanble to me, but let's see what Andrew thinks. 'n' before 'a', please. ;-) ?! I've misspelled a word. @@ -1443,6

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 14:04, Richard Biener wrote: On Fri, 26 Jan 2024, Andrew Stubbs wrote: On 26/01/2024 12:06, Jakub Jelinek wrote: On Fri, Jan 26, 2024 at 01:00:28PM +0100, Richard Biener wrote: The following avoids registering unsupported GCN offload devices when iterating over available ones.

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Richard Biener
On Fri, 26 Jan 2024, Jakub Jelinek wrote: > On Fri, Jan 26, 2024 at 03:04:11PM +0100, Richard Biener wrote: > > > > Otherwise it looks reasoanble to me, but let's see what Andrew thinks. > > > > > > 'n' before 'a', please. ;-) > > > > ?! > > I've misspelled a word. > > > @@ -1443,6 +1445,16

Re: [PATCH] debug/103047 - argument order of inlined functions

2024-01-26 Thread Jakub Jelinek
On Fri, Jan 26, 2024 at 03:16:15PM +0100, Richard Biener wrote: > The inliner puts variables for parameters of the inlined functions > in the inline scope in reverse order. The following reverses them > again so that we get consistent ordering between the > DW_TAG_subprogram

[PATCH] debug/103047 - argument order of inlined functions

2024-01-26 Thread Richard Biener
The inliner puts variables for parameters of the inlined functions in the inline scope in reverse order. The following reverses them again so that we get consistent ordering between the DW_TAG_subprogram DW_TAG_formal_parameter and the DW_TAG_inlined_subroutine DW_TAG_formal_parameter set. I

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Jakub Jelinek
On Fri, Jan 26, 2024 at 03:04:11PM +0100, Richard Biener wrote: > > > Otherwise it looks reasoanble to me, but let's see what Andrew thinks. > > > > 'n' before 'a', please. ;-) > > ?! I've misspelled a word. > @@ -1443,6 +1445,16 @@ suitable_hsa_agent_p (hsa_agent_t agent) >switch

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Richard Biener
On Fri, 26 Jan 2024, Andrew Stubbs wrote: > On 26/01/2024 12:06, Jakub Jelinek wrote: > > On Fri, Jan 26, 2024 at 01:00:28PM +0100, Richard Biener wrote: > >> The following avoids registering unsupported GCN offload devices > >> when iterating over available ones. With a Zen4 desktop CPU > >>

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 12:06, Jakub Jelinek wrote: On Fri, Jan 26, 2024 at 01:00:28PM +0100, Richard Biener wrote: The following avoids registering unsupported GCN offload devices when iterating over available ones. With a Zen4 desktop CPU you will have an IGPU (unsupported) which will otherwise be made

Re: [patch] amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs

2024-01-26 Thread Tobias Burnus
Hi Richard, Richard Biener wrote: Looks good to me. Thanks - I will commit it after lunch to see whether someone else has additional comments. +@item gfx1030 +Compile for RDNA2 gfx1030 devices (GFX10 series). + +@item gfx1100 +Compile for RDNA3 gfx1100 devices (GFX11 series). Btw, "GFX10"

Re: [patch] amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs

2024-01-26 Thread Richard Biener
On Fri, 26 Jan 2024, Tobias Burnus wrote: > Now with patch ... > > Tobias Burnus wrote: > > Hi all, hi Richard & Andrew, > > > > On 24.01.24 at 17:01, Tobias Burnus wrote: > >> This patch obviously depends on Andrew's; he wrote in the previous email of > >> this thread regarding his patch: > >>

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Richard Biener
On Fri, 26 Jan 2024, Tobias Burnus wrote: > Jakub Jelinek wrote: > > On Fri, Jan 26, 2024 at 01:00:28PM +0100, Richard Biener wrote: > >> libgomp/ > >> * plugin/plugin-gcn.c (suitable_hsa_agent_p): Filter out > >> agents with unsupported ISA. > ... > >> @@ -1443,6 +1445,13 @@

Re: [patch] amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs

2024-01-26 Thread Tobias Burnus
Now with patch ... Tobias Burnus wrote: Hi all, hi Richard & Andrew, On 24.01.24 at 17:01, Tobias Burnus wrote: This patch obviously depends on Andrew's; he wrote in the previous email of this thread regarding his patch: Andrew Stubbs wrote: This is enough to get gfx1100 working for most

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Tobias Burnus
Jakub Jelinek wrote: On Fri, Jan 26, 2024 at 01:00:28PM +0100, Richard Biener wrote: libgomp/ * plugin/plugin-gcn.c (suitable_hsa_agent_p): Filter out agents with unsupported ISA. ... @@ -1443,6 +1445,13 @@ suitable_hsa_agent_p (hsa_agent_t agent) switch (device_type)

[patch] amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs (was: [patch] amdgcn: config.gcc - enable gfx1100 multilib; add gfx1100 to docs)

2024-01-26 Thread Tobias Burnus
Hi all, hi Richard & Andrew, On 24.01.24 at 17:01, Tobias Burnus wrote: This patch obviously depends on Andrew's; he wrote in the previous email of this thread regarding his patch: Andrew Stubbs wrote: This is enough to get gfx1100 working for most purposes, on top of the patch that Tobias

Re: [PATCH] genopinit: Split init_all_optabs [PR113575]

2024-01-26 Thread Richard Biener
On Fri, Jan 26, 2024 at 9:17 AM Robin Dapp wrote: > > Hi, > > init_all_optabs initializes > 1 patterns for riscv targets. This > leads to pathological situations in dataflow analysis (which can occur > with many adjacent stores). > To alleviate this, this patch makes genopinit split the

Re: [PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Jakub Jelinek
On Fri, Jan 26, 2024 at 01:00:28PM +0100, Richard Biener wrote: > The following avoids registering unsupported GCN offload devices > when iterating over available ones. With a Zen4 desktop CPU > you will have an IGPU (unsupported) which will otherwise be made > available. This causes testcases

[PATCH] Avoid registering unsupported OMP offload devices

2024-01-26 Thread Richard Biener
The following avoids registering unsupported GCN offload devices when iterating over available ones. With a Zen4 desktop CPU you will have an IGPU (unsupported) which will otherwise be made available. This causes testcases like libgomp.c-c++-common/non-rect-loop-1.c which iterate over all devices

Re: [PATCH] Fix architecture support in OMP_OFFLOAD_init_device for gcn

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 11:42, Richard Biener wrote: The following makes the existing architecture support check work instead of being optimized away (enum vs. -1). This avoids later asserts when we assume such devices are never actually used. Tested as previously, now the error is libgomp: GCN fatal

Re: [PATCH v2 2/2] libatomic: Add rcpc3 128-bit atomic operations for AArch64

2024-01-26 Thread Victor Do Nascimento
On 1/26/24 10:53, Richard Sandiford wrote: > Victor Do Nascimento writes: >> @@ -712,6 +760,27 @@ ENTRY (libat_test_and_set_16) >> END (libat_test_and_set_16) >> >> >> +/* Alias all LSE128_LRCPC3 ifuncs to their specific implementations, >> + that is, map it to LSE128, LRCPC or CORE as

[PATCH] Fix architecture support in OMP_OFFLOAD_init_device for gcn

2024-01-26 Thread Richard Biener
The following makes the existing architecture support check work instead of being optimized away (enum vs. -1). This avoids later asserts when we assume such devices are never actually used. Tested as previously, now the error is libgomp: GCN fatal error: Unknown GCN agent architecture Runtime

Re: [patch] gcn/gcn-hsa.h: Always pass --amdhsa-code-object-version= in ASM_SPEC

2024-01-26 Thread Tobias Burnus
Andrew Stubbs wrote: We can move on to COV5 for GCC 15, probably. I'm not aware of any great blocker, but it sets a minimum LLVM. And as our testing hardware showed, it also bumps the minimal ROCm to 5.2 (as 5.1 fails with COV5). Otherwise, as mentioned, COV5 was added to LLVM 14, but as we

Re: [patch] gcn/gcn-hsa.h: Always pass --amdhsa-code-object-version= in ASM_SPEC

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 10:39, Tobias Burnus wrote: Hi all, Andrew Stubbs wrote: On 26/01/2024 07:29, Richard Biener wrote: If you link against prebuilt objects with COV 5 it seems there's no way to override the COV version GCC uses?  That is, do we want to add a -mcode-object-version=... option to

[PATCH] c++/modules: Stream additional fields for DECL_STRUCT_FUNCTION [PR113580]

2024-01-26 Thread Nathaniel Shead
This patch just adds enough of the fields from 'function' to fix the ICE in the linked PR. I suppose there might be more fields from this type that should be propagated, but I don't know enough to find out which they might be yet, since a lot of them seem to be only set after gimplification.

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread chenglulu
On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote: On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote: On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: v3 -> v4: 1. Add macro support for TLS symbols 2. Added support for loading __get_tls_addr symbol address

Re: [PATCH] Avoid using an unsupported agent when offloading to GCN

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 10:40, Richard Biener wrote: The following avoids selecting an unsupported agent early, avoiding later asserts when we rely on it being supported. tested on x86_64-unknown-linux-gnu -> amdhsa-gcn on gfx1060 that's the alternative to the other patch. I do indeed seem to get the

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread Xi Ruoyao
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote: > > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: > > > v3 -> v4: > > > 1. Add macro support for TLS symbols > > > 2. Added support for loading __get_tls_addr symbol address using > > > call36.

Re: [PATCH v2 2/2] libatomic: Add rcpc3 128-bit atomic operations for AArch64

2024-01-26 Thread Richard Sandiford
Victor Do Nascimento writes: > @@ -712,6 +760,27 @@ ENTRY (libat_test_and_set_16) > END (libat_test_and_set_16) > > > +/* Alias all LSE128_LRCPC3 ifuncs to their specific implementations, > + that is, map it to LSE128, LRCPC or CORE as appropriate. */ > + > +ALIAS (libat_exchange_16,

[PATCH] Avoid using an unsupported agent when offloading to GCN

2024-01-26 Thread Richard Biener
The following avoids selecting an unsupported agent early, avoiding later asserts when we rely on it being supported. tested on x86_64-unknown-linux-gnu -> amdhsa-gcn on gfx1060 that's the alternative to the other patch. I do indeed seem to get the other (unsupported) agent selected somehow

Re: [PATCH] Avoid assert for unknown device ISAs in GCN libgomp plugin

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 10:30, Richard Biener wrote: When the agent reports a device ISA we don't support, avoid hitting an assert and instead report the raw integers as an error. I'm not sure whether -1 is special as I didn't figure out where that field is initialized. But I guess since agents are not rejected

Re: [patch] gcn/gcn-hsa.h: Always pass --amdhsa-code-object-version= in ASM_SPEC

2024-01-26 Thread Tobias Burnus
Hi all, Andrew Stubbs wrote: On 26/01/2024 07:29, Richard Biener wrote: If you link against prebuilt objects with COV 5 it seems there's no way to override the COV version GCC uses?  That is, do we want to add a -mcode-object-version=... option to allow the user to override this (and

[PATCH] Avoid assert for unknown device ISAs in GCN libgomp plugin

2024-01-26 Thread Richard Biener
When the agent reports a device ISA we don't support, avoid hitting an assert and instead report the raw integers as an error. I'm not sure whether -1 is special as I didn't figure out where that field is initialized. But I guess since agents are not rejected upfront when registering them I might be able

Re: [PATCH] amdgcn: additional gfx1100 support

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 10:22, Richard Biener wrote: On Fri, 26 Jan 2024, Andrew Stubbs wrote: On 26/01/2024 09:45, Richard Biener wrote: On Fri, 26 Jan 2024, Richard Biener wrote: === libgomp Summary === # of expected passes 29126 # of unexpected failures 697 # of

Re: [PATCH] amdgcn: additional gfx1100 support

2024-01-26 Thread Richard Biener
On Fri, 26 Jan 2024, Andrew Stubbs wrote: > On 26/01/2024 09:45, Richard Biener wrote: > > On Fri, 26 Jan 2024, Richard Biener wrote: > > > > === libgomp Summary === > > > > # of expected passes 29126 > > # of unexpected failures 697 > > # of unexpected

[PATCH] tree-optimization/113602 - datarefs of non-addressables

2024-01-26 Thread Richard Biener
We can end up creating ADDR_EXPRs of non-addressable entities during for example vectorization. The following plugs this in data-ref analysis when that would create such invalid ADDR_EXPR as part of analyzing the ref structure. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Re: [PATCH] amdgcn: additional gfx1100 support

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 09:45, Richard Biener wrote: On Fri, 26 Jan 2024, Richard Biener wrote: === libgomp Summary === # of expected passes 29126 # of unexpected failures 697 # of unexpected successes 1 # of expected failures 703 # of unresolved

Re: [patch] gcn/gcn-hsa.h: Always pass --amdhsa-code-object-version= in ASM_SPEC

2024-01-26 Thread Andrew Stubbs
On 26/01/2024 07:29, Richard Biener wrote: On Fri, Jan 26, 2024 at 12:04 AM Tobias Burnus wrote: When targeting AMD GPUs, the LLVM assembler (and linker) are used. Two days ago LLVM changed the default for the AMDHSA code object version (COV) from 4 to 5. In principle, we do not care which

Re: [PATCH] aarch64: Fix undefinedness while testing the J constraint [PR100204]

2024-01-26 Thread Alex Coplan
On 25/01/2024 11:57, Andrew Pinski wrote: > The J constraint can invoke undefined behavior due to it taking the > negative of the ival if ival was HWI_MIN. The fix is simple as casting > to `unsigned HOST_WIDE_INT` before doing the negative of it. This > does that. Thanks for doing this. > >
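The undefined behavior described above can be shown with a small standalone sketch. It uses `long long` in place of GCC's `HOST_WIDE_INT`, and the helper name `negate_as_unsigned` is hypothetical; this is an illustration of the casting technique, not the code from the patch.

```c
#include <limits.h>

/* Hypothetical helper: negate a signed value without signed overflow.
   Writing -ival directly is undefined when ival == LLONG_MIN, because the
   positive result does not fit in long long.  Casting to the unsigned
   type first makes the negation well defined: unsigned arithmetic wraps
   modulo 2^N.  */
static unsigned long long negate_as_unsigned(long long ival)
{
    return -(unsigned long long) ival;
}
```

With this approach, `negate_as_unsigned(-5)` gives 5, and `negate_as_unsigned(LLONG_MIN)` gives the magnitude of `LLONG_MIN` as an unsigned value instead of invoking undefined behavior.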

Re: [patch] gcn/gcn-hsa.h: Always pass --amdhsa-code-object-version= in ASM_SPEC

2024-01-26 Thread Andrew Stubbs
On 25/01/2024 23:03, Tobias Burnus wrote: When targeting AMD GPUs, the LLVM assembler (and linker) are used. Two days ago LLVM changed the default for the AMDHSA code object version (COV) from 4 to 5. In principle, we do not care which COV is used as long as it works; unfortunately,

Re: [PATCH] amdgcn: additional gfx1100 support

2024-01-26 Thread Richard Biener
On Fri, 26 Jan 2024, Richard Biener wrote: > On Wed, 24 Jan 2024, Andrew Stubbs wrote: > > > This is enough to get gfx1100 working for most purposes, on top of the > > patch that Tobias committed a week or so ago; there are still some test > > failures to investigate, and probably some tuning to

Re: [PATCH v2 3/5] C: Implement musttail attribute for returns

2024-01-26 Thread Joseph Myers
On Fri, 26 Jan 2024, Andi Kleen wrote: > > > I don't have tests for that but since it's not new behavior I suppose > > > that's sufficient. > > > > Each attribute should have tests that invalid uses are appropriately > > diagnosed. See gcc.dg/c23-attr-fallthrough-2.c for examples of such tests

Re: [PATCH v4 1/4] LoongArch: Merge template got_load_tls_{ld/gd/le/ie}.

2024-01-26 Thread chenglulu
On 2024/1/26 at 4:59 PM, chenglulu wrote: On 2024/1/26 at 4:52 PM, Xi Ruoyao wrote: On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: +(define_insn "@load_tls" [(set (match_operand:P 0 "register_operand" "=r") (unspec:P [(match_operand:P 1 "symbolic_operand" "")] -

Re: [PATCH v2 3/5] C: Implement musttail attribute for returns

2024-01-26 Thread Andi Kleen
> > I don't have tests for that but since it's not new behavior I suppose > > that's sufficient. > > Each attribute should have tests that invalid uses are appropriately > diagnosed. See gcc.dg/c23-attr-fallthrough-2.c for examples of such tests > in the case of the [[fallthrough]] attribute.

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread chenglulu
On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote: On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: v3 -> v4: 1. Add macro support for TLS symbols 2. Added support for loading __get_tls_addr symbol address using call36. 3. Merge template got_load_tls_{ld/gd/le/ie}. 4. Enable explicit reloc for

Re: [PATCH v4 1/4] LoongArch: Merge template got_load_tls_{ld/gd/le/ie}.

2024-01-26 Thread chenglulu
On 2024/1/26 at 4:52 PM, Xi Ruoyao wrote: On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: +(define_insn "@load_tls" [(set (match_operand:P 0 "register_operand" "=r") (unspec:P [(match_operand:P 1 "symbolic_operand" "")] - UNSPEC_TLS_GD))] +

Re: [PATCH] amdgcn: additional gfx1100 support

2024-01-26 Thread Richard Biener
On Wed, 24 Jan 2024, Andrew Stubbs wrote: > This is enough to get gfx1100 working for most purposes, on top of the > patch that Tobias committed a week or so ago; there are still some test > failures to investigate, and probably some tuning to do. > > It might also get gfx1030 working too.

Re: [PATCH v4 2/4] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-26 Thread Xi Ruoyao
On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: > +;; Use two registers to get the global symbol address from the got table. > +;; la.global rd, rt, sym > + > +(define_insn_and_split "movdi_symbolic_off64" > + [(set (match_operand:DI 0 "register_operand" "=r,r") > +   (match_operand:DI 1

Re: [PATCH v4 1/4] LoongArch: Merge template got_load_tls_{ld/gd/le/ie}.

2024-01-26 Thread Xi Ruoyao
On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: > +(define_insn "@load_tls" >    [(set (match_operand:P 0 "register_operand" "=r") >   (unspec:P >       [(match_operand:P 1 "symbolic_operand" "")] > -     UNSPEC_TLS_GD))] > +     UNSPEC_TLS))] /* snip */ > +{ > +  enum

Re: [PATCH v2 3/5] C: Implement musttail attribute for returns

2024-01-26 Thread Joseph Myers
On Thu, 25 Jan 2024, Andi Kleen wrote: > On Thu, Jan 25, 2024 at 08:08:23PM +, Joseph Myers wrote: > > On Wed, 24 Jan 2024, Andi Kleen wrote: > > > > > Implement a C23 clang compatible musttail attribute similar to the earlier > > > C++ implementation in the C parser. > > > > I'd expect

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread Xi Ruoyao
On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: > v3 -> v4: >   1. Add macro support for TLS symbols >   2. Added support for loading __get_tls_addr symbol address using call36. >   3. Merge template got_load_tls_{ld/gd/le/ie}. >   4. Enable explicit reloc for extreme TLS GD/LD with

[PATCH v2] LoongArch: Adjust cost of vector_stmt that match multiply-add pattern.

2024-01-26 Thread Li Wei
We found that when only 128-bit vectorization was enabled, 549.fotonik3d_r failed to vectorize effectively. For this reason, we adjust the cost of 128-bit vector_stmt that match the multiply-add pattern to facilitate 128-bit vectorization. The experimental results show that after the modification,

[Committed] RISC-V: Refine some codes of VSETVL PASS [NFC]

2024-01-26 Thread Juzhe-Zhong
gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info): Refine some codes. (pre_vsetvl::emit_vsetvl): Ditto. --- gcc/config/riscv/riscv-vsetvl.cc | 69 +--- 1 file changed, 27 insertions(+), 42 deletions(-) diff --git

Re: [pushed][PATCH] LoongArch: Split vec_selects of bottom elements into simple move

2024-01-26 Thread chenglulu
Pushed to r14-8447. On 2024/1/16 at 10:23 AM, Jiahao Xu wrote: The pattern below can be treated as a simple move because floating point and vector share a common register on loongarch64. (set (reg/v:SF 32 $f0 [orig:93 res ] [93]) (vec_select:SF (reg:V8SF 32 $f0 [115]) (parallel [

Re: [pushed][PATCH v1] LoongArch: Optimize implementation of single-precision floating-point approximate division.

2024-01-26 Thread chenglulu
Pushed to r14-8444. On 2024/1/24 at 5:44 PM, Li Wei wrote: We found that in the spec17 521.wrf program, some loop invariant code generated from single-precision floating-point approximate division calculation failed to propose a loop. This is because the pseudo-register that stores the intermediate

[PATCH] genopinit: Split init_all_optabs [PR113575]

2024-01-26 Thread Robin Dapp
Hi, init_all_optabs initializes > 1 patterns for riscv targets. This leads to pathological situations in dataflow analysis (which can occur with many adjacent stores). To alleviate this, this patch makes genopinit split the init_all_optabs function into several init_optabs_xx functions that

Re: [pushed][PATCH v3] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-26 Thread chenglulu
On 2024/1/26 at 3:32 PM, Richard Biener wrote: On Fri, Jan 26, 2024 at 7:23 AM chenxiaolong wrote: gcc/testsuite/ChangeLog: OK Pushed to r14-8445. Thank you everyone for your review! * gcc.dg/signbit-2.c: Added additional "-mlsx" compilation options. *

Re:[pushed] [PATCH v3] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT

2024-01-26 Thread chenglulu
Pushed to r14-8446. On 2024/1/16 at 10:32 AM, Jiahao Xu wrote: Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, for a short-circuit branch, use the short-circuit operation instead of the non-short-circuit operation. SPEC2017 performance evaluation shows 1% performance improvement for fprate GEOMEAN and no

Re: [PATCH v1] LoongArch: Adjust cost of vector_stmt that match multiply-add pattern.

2024-01-26 Thread chenglulu
On 2024/1/24 at 5:36 PM, Li Wei wrote: We found that when only 128-bit vectorization was enabled, 549.fotonik3d_r failed to vectorize effectively. For this reason, we adjust the cost of 128-bit vector_stmt that match the multiply-add pattern to facilitate 128-bit vectorization. The experimental

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-26 Thread Martin Uecker
I haven't looked at the patch, but it sounds you give the result the wrong type. Then patching up all use cases instead of the type seems wrong. Martin On Thursday, 25.01.2024 at 20:11 +, Qing Zhao wrote: > Thanks a lot for the testing. > > Yes, I can repeat the issue with the