Re: [PATCH] libstdc++: add ARM SVE support to std::experimental::simd

2024-01-03 Thread Srinivas Yadav
Hi, Thanks a lot for the review. Sorry for the very late reply. The following are my comments on the feedback. > The main thing that worries me is: > > #if _GLIBCXX_SIMD_HAVE_SVE > constexpr inline int __sve_vectorized_size_bytes = > __ARM_FEATURE_SVE_BITS / 8; > #else > constexpr
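
The quoted definition derives a vector size in bytes from a compile-time bit width. Below is a minimal sketch of that pattern, not the patch's code. `EXAMPLE_SVE_BITS`, `example_vector_size_bytes`, and `example_lanes` are hypothetical names; the macro stands in for `__ARM_FEATURE_SVE_BITS`, which (as far as I understand) only carries a usable value when the vector length is fixed at compile time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for __ARM_FEATURE_SVE_BITS so the sketch
// compiles on any target; 256 is an arbitrary illustrative width.
#ifndef EXAMPLE_SVE_BITS
#define EXAMPLE_SVE_BITS 256
#endif

static_assert(EXAMPLE_SVE_BITS % 8 == 0, "bit width must be a whole number of bytes");

// Same arithmetic as the quoted snippet: bits / 8 gives bytes.
constexpr int example_vector_size_bytes = EXAMPLE_SVE_BITS / 8;

// Number of elements of type T that fit in one vector of that size.
template <typename T>
constexpr std::size_t example_lanes() {
    return static_cast<std::size_t>(example_vector_size_bytes) / sizeof(T);
}
```

With the illustrative 256-bit width, one vector holds 32 bytes, i.e. eight 4-byte or four 8-byte elements.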

Re:[pushed] [PATCH v1] LoongArch: testsuite:Add loongarch to gcc.dg/vect/slp-26.c.

2024-01-03 Thread chenglulu
Pushed to r14-6911. On 2023/12/29 at 3:48 PM, chenxiaolong wrote: In the LoongArch architecture, GCC supports the vectorization function tested by vect/slp-26.c, but there is no detection of loongarch in dg-finals. Add loongarch to the appropriate dg-finals. gcc/testsuite/ChangeLog: *

[Committed] RISC-V: Refine LMUL computation for MASK_LEN_LOAD/MASK_LEN_STORE IFN

2024-01-03 Thread Juzhe-Zhong
Notice a case has "Maximum lmul = 16" which is incorrect. Correct LMUL estimation for MASK_LEN_LOAD/MASK_LEN_STORE. Committed. gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc (variable_vectorized_p): New function. (compute_nregs_for_mode): Refine LMUL.

Re:[pushed] [PATCH v1] LoongArch: testsuite:Fix FAIL in lasx-xvstelm.c file.

2024-01-03 Thread chenglulu
Pushed to r14-6909. On 2023/12/29 at 9:45 AM, chenxiaolong wrote: After implementing the cost model on the LoongArch architecture, the GCC compiler code has this feature turned on by default, which causes the lasx-xvstelm.c file test to fail. Through analysis, this test case can generate vectorization

Re:[pushed] [PATCH v2] LoongArch: Merge constant vector permutation implementations.

2024-01-03 Thread chenglulu
Pushed to r14-6908. On 2023/12/28 at 8:26 PM, Li Wei wrote: There are currently two versions of the implementations of constant vector permutation: loongarch_expand_vec_perm_const_1 and loongarch_expand_vec_perm_const_2. The implementations of the two versions are different. Currently, only the

Re: [PATCH 3/5][_Hashtable] Avoid redundant usage of rehash policy

2024-01-03 Thread François Dumont
Here is an updated version.     libstdc++: [_Hashtable] Avoid redundant usage of rehash policy     Bypass call to __detail::__distance_fwd and the check if rehash is needed when     instantiating from an iterator range or assigning an initializer_list to an     unordered_multimap or

Re: Generalizing DejaGnu timeout scaling (was: Re: [PATCH DejaGNU/GCC 0/1] Support per-test execution timeout factor)

2024-01-03 Thread Hans-Peter Nilsson
On Wed, 3 Jan 2024, Jacob Bachmeyer wrote: > Comments before I start on an implementation? I'd suggest to await the conclusion of the debate: I *think* I've proved that dg-timeout-factor is already active as intended (all parts of a test), specifically when the compilation result is executed

Re: [PATCH DejaGNU/GCC 0/1] Support per-test execution timeout factor

2024-01-03 Thread Hans-Peter Nilsson
On Wed, 3 Jan 2024, Maciej W. Rozycki wrote: > On Wed, 3 Jan 2024, Hans-Peter Nilsson wrote: > > > > The test execution timeout is different from the tool execution timeout > > > where it is GCC execution that is being guarded against taking excessive > > > amount of time on the test host

[committed, obvious] OpenMP: trivial cleanups to omp-general.cc

2024-01-03 Thread Sandra Loosemore
gcc/ChangeLog * omp-general.cc: Fix comment typos and misplaced/confusing comments. Delete redundant include of omp-general.h. --- gcc/omp-general.cc | 21 + 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc

Re: [PATCH 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-03 Thread Xi Ruoyao
On Thu, 2024-01-04 at 11:58 +0800, chenglulu wrote: > > On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote: > > On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote: > > > +(define_insn "movdi_pcrel64" > > > + [(set (match_operand:DI 0 "register_operand" "=") > > > +   (match_operand:DI 1

Re: [PATCH 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-03 Thread chenglulu
On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote: On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote: +(define_insn "movdi_pcrel64" + [(set (match_operand:DI 0 "register_operand" "=") +   (match_operand:DI 1 "symbolic_pcrel64_operand")) +  (unspec:DI [(const_int 0)] +    UNSPEC_MOV_PCREL64) +  (use

Re: [PATCH 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-03 Thread Xi Ruoyao
On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote: > +(define_insn "movdi_pcrel64" > + [(set (match_operand:DI 0 "register_operand" "=") > +   (match_operand:DI 1 "symbolic_pcrel64_operand")) > +  (unspec:DI [(const_int 0)] > +    UNSPEC_MOV_PCREL64) > +  (use (reg:DI T3_REGNUM)) > + 

Generalizing DejaGnu timeout scaling (was: Re: [PATCH DejaGNU/GCC 0/1] Support per-test execution timeout factor)

2024-01-03 Thread Jacob Bachmeyer
Maciej W. Rozycki wrote: On Wed, 3 Jan 2024, Hans-Peter Nilsson wrote: The test execution timeout is different from the tool execution timeout where it is GCC execution that is being guarded against taking excessive amount of time on the test host rather than the resulting test case

[PATCH] LoongArch: Fixed the problem of incorrect judgment of the immediate field of the [x]vld/[x]vst instruction.

2024-01-03 Thread Lulu Cheng
The [x]vld/[x]vst directive is defined as follows: [x]vld/[x]vst {x/v}d, rj, si12 When not modified, the immediate field of [x]vld/[x]vst is between 10 and 14 bits depending on the type. However, in loongarch_valid_offset_p, the immediate field is restricted first, so there is no error.

[PATCH v4] RISC-V: Add support for xtheadvector-specific intrinsics.

2024-01-03 Thread Jun Sha (Joshua)
This patch only involves the generation of xtheadvector special load/store instructions and vext instructions. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class th_loadstore_width): Define new builtin bases. (BASE): Define new builtin bases. *

[PATCH v4] RISC-V: Handle differences between XTheadvector and Vector

2024-01-03 Thread Jun Sha (Joshua)
This patch handles the differences in instruction generation between Vector and XTheadVector. In this version, we only support the subset of xtheadvector instructions that can be derived directly from the current RVV1.0 by simply adding the "th." prefix. For different name xtheadvector instructions but share

Re: [PATCH] libgfortran: Bugfix if not define HAVE_ATOMIC_FETCH_ADD

2024-01-03 Thread Lipeng Zhu
On 2024/1/3 19:12, Tobias Burnus wrote: On 22.12.23 03:36, Lipeng Zhu wrote: This patch tries to fix the bug that occurs when HAVE_ATOMIC_FETCH_ADD is not defined in the dec_waiting_unlocked function. libgfortran/ChangeLog:   * io/io.h (dec_waiting_unlocked): Use  

Re: [PATCH] libstdc++: testsuite: reduce max_size_type.cc exec time [PR113175]

2024-01-03 Thread Hans-Peter Nilsson
> From: Patrick Palka > Date: Tue, 2 Jan 2024 12:48:26 -0500 > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and release > branches (r14-205 was backported everywhere)? > > -- >8 -- > > The adjustment to max_size_type.cc in r14-205-g83470a5cd4c3d2 > inadvertently increased the

[committed] MIPS/testsuite: Include stdio.h in mipscop tests

2024-01-03 Thread YunQiang Su
gcc/testsuite * gcc.c-torture/compile/mipscop-1.c: Include stdio.h. * gcc.c-torture/compile/mipscop-2.c: Ditto. * gcc.c-torture/compile/mipscop-3.c: Ditto. * gcc.c-torture/compile/mipscop-4.c: Ditto. --- gcc/testsuite/gcc.c-torture/compile/mipscop-1.c | 1 +

[committed] MIPS: Add pattern insqisi_extended and inshisi_extended

2024-01-03 Thread YunQiang Su
This match pattern allows combining (zero_extract:DI 8, 24, QI) with a sign-extend into a 32-bit INS instruction on TARGET_64BIT. For SI mode, if the sign bit is modified by bitops, we will need a sign-extend operation. Since the 32-bit INS instruction guarantees that the result is sign-extended, and the
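
The point about the sign bit can be seen in a small software model. This is only a sketch under my own assumptions: `insert_bits32` and `sign_extend32` are hypothetical helpers that model a 32-bit bit-field insert and the sign-extended form in which a 64-bit register holds a 32-bit value; they are not GCC or MIPS APIs.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 32-bit bit-field insert: place the low `size` bits of `src`
// into `dst` starting at bit `pos`. Requires 0 < size and pos + size <= 32.
constexpr std::uint32_t insert_bits32(std::uint32_t dst, std::uint32_t src,
                                      unsigned pos, unsigned size) {
    const std::uint32_t field = (size >= 32) ? 0xFFFFFFFFu : ((1u << size) - 1u);
    const std::uint32_t mask = field << pos;
    return (dst & ~mask) | ((src << pos) & mask);
}

// Model of a 32-bit value held in a 64-bit register:
// bits 63..32 are copies of bit 31.
constexpr std::int64_t sign_extend32(std::uint32_t v) {
    return static_cast<std::int64_t>(static_cast<std::int32_t>(v));
}
```

Inserting an 8-bit field at bit 24 can change bit 31, so the upper 32 bits of the 64-bit representation change with it; an operation that leaves bits 63..32 untouched would then need a separate sign-extend.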

[committed] MIPS: Implement TARGET_INSN_COSTS

2024-01-03 Thread YunQiang Su
When combining some instructions, the generic `rtx_cost` may overestimate the cost of the resulting RTL, because the RTL may be quite complex and `rtx_cost` has no information that this RTL can be converted to simple hardware instruction(s). In this case, let's use `insn_count * perf_ratio` to estimate

[committed] MIPS: define_attr perf_ratio in mips.md

2024-01-03 Thread YunQiang Su
The accurate cost of a pattern can be obtained with insn_count * perf_ratio. The default value is set to 0 instead of 1, since we need to distinguish the default value from a value that is really set for a pattern. Since it is not set for most patterns yet, to use it, we will need to be sure
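
A minimal sketch of the idea, assuming that 0 acts as an "unset" sentinel and that an unset pattern falls back to some generic estimate. The fallback and the names `kPerfRatioUnset` and `example_insn_cost` are my assumptions for illustration, not the committed MIPS code.

```cpp
#include <cassert>

// Hypothetical sentinel: 0 means "perf_ratio was not set for this pattern".
// A default of 1 could not be told apart from a genuine ratio of 1.
constexpr int kPerfRatioUnset = 0;

// Use insn_count * perf_ratio when the attribute is set; otherwise keep
// the generic estimate passed in by the caller (assumed fallback).
constexpr int example_insn_cost(int insn_count, int perf_ratio, int generic_cost) {
    return perf_ratio == kPerfRatioUnset ? generic_cost
                                         : insn_count * perf_ratio;
}
```

For instance, a two-instruction pattern with a ratio of 3 would cost 6, while a pattern with no ratio set keeps whatever generic cost it was given.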

Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2024-01-03 Thread Richard Sandiford
YunQiang Su writes: > On TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true platforms, > if bit 31 or above is polluted by a bitop, we will need a > truncate. Let's emit one, and let's use the same hardreg > as input and output; the RTL may look like: > > (insn 21 20 24 2 (set (subreg/s/u:SI

ping: [PATCH] libcpp: Fix macro expansion for argument of __has_include [PR110558]

2024-01-03 Thread Lewis Hyatt
Hello- May I please ping this one? Thanks... https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html -Lewis On Tue, Dec 12, 2023 at 6:18 PM Lewis Hyatt wrote: > > Hello- > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110558 > > This is a small fix for the libcpp issue noted in

Re: [PATCH DejaGNU/GCC 0/1] Support per-test execution timeout factor

2024-01-03 Thread Richard Sandiford
"Maciej W. Rozycki" writes: > On Wed, 3 Jan 2024, Hans-Peter Nilsson wrote: > >> > The test execution timeout is different from the tool execution timeout >> > where it is GCC execution that is being guarded against taking excessive >> > amount of time on the test host rather than the

[Committed] RISC-V: Fix indent

2024-01-03 Thread Juzhe-Zhong
Fix the indentation of some code so that it aligns to 8 spaces. Committed. gcc/ChangeLog: * config/riscv/vector.md: Fix indent. --- gcc/config/riscv/vector.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md

[Committed V3] RISC-V: Fix bug of earliest fusion for infinite loop[VSETVL PASS]

2024-01-03 Thread Juzhe-Zhong
As reported in PR113206 and PR113209, the bug happens in the following situation: li a4,32 ... vsetvli zero,a4,e8,m8,ta,ma ... slliw a4,a3,24 sraiw a4,a4,24 bge a3,a1,.L8 sb a4,%lo(e)(a0) vsetvli zero,a4,e8,m8,ta,ma

Re: [patch] libgomp.texi: Document omp_display_env

2024-01-03 Thread Sandra Loosemore
On 1/3/24 11:31, Tobias Burnus wrote: Another small step in my side project of documenting all OpenMP routines in libgomp.texi. Here, only 'omp_display_env' is added. (New since OpenMP 5.1 but long available in GCC; some fine print in both the implementation and the documentation is based

Re: [PATCH RFA] opts: -Werror=foo always implies -Wfoo [PR106213]

2024-01-03 Thread Jeff Law
On 12/19/23 15:17, Jason Merrill wrote: Tested x86_64-pc-linux-gnu, OK for trunk? -- 8< -- -Werror=foo implying -Wfoo wasn't working for -Wdeprecated-copy-dtor, because it is specified as the value 2 of warn_deprecated_copy, which shows up as CLVC_EQUAL, which is not one of the three
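
The intended behavior named in the subject, that `-Werror=foo` also turns on `-Wfoo`, can be modeled with a toy flag parser. This is only an illustration of the expected outcome under my own assumptions; `WarningState` and `parse_warning_flags` are hypothetical and say nothing about how GCC's option machinery implements it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model of the warning configuration.
struct WarningState {
    std::set<std::string> enabled;    // warnings that are reported
    std::set<std::string> as_errors;  // warnings promoted to errors
};

// "-Wfoo" enables foo; "-Werror=foo" promotes foo to an error AND enables it.
// Plain "-Werror" and "-Wno-foo" are deliberately not modeled here.
WarningState parse_warning_flags(const std::vector<std::string>& args) {
    const std::string werror_eq = "-Werror=";
    const std::string w = "-W";
    WarningState st;
    for (const std::string& a : args) {
        if (a == "-Werror") {
            continue;  // out of scope for this sketch
        }
        if (a.compare(0, werror_eq.size(), werror_eq) == 0) {
            const std::string name = a.substr(werror_eq.size());
            st.as_errors.insert(name);
            st.enabled.insert(name);  // the implication under discussion
        } else if (a.compare(0, w.size(), w) == 0) {
            st.enabled.insert(a.substr(w.size()));
        }
    }
    return st;
}
```

In this model, passing only `-Werror=deprecated-copy-dtor` leaves `deprecated-copy-dtor` both enabled and promoted to an error, which is the outcome the thread says was not happening for that option.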

[PATCH 2/1] c++: access of class-scope partial tmpl spec

2024-01-03 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- Since partial template specializations can't be named directly, access control (when declared at class scope) doesn't apply to them, so we shouldn't have to set their TREE_PRIVATE / TREE_PROTECTED. This

[PATCH, committed] Fortran: fix FE memleak

2024-01-03 Thread Harald Anlauf
Dear all, I've committed the attached, simple & obvious patch for a gmp memory leak in gfc_get_nodesc_array_type that shows up when running f951 under valgrind e.g. on testcase gfortran.dg/class_optional_2.f90, after regtesting on x86_64-pc-linux-gnu. (Note that this does not address the

[PATCH] c++: explicit inst w/ many constrained partial specs [PR104634]

2024-01-03 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 13? -- >8 -- Here we neglect to emit the definitions of A::f2 and A::f4 despite the explicit instantiations ultimately because TREE_PUBLIC isn't set on the corresponding partial specializations, the

[patch] libgomp.texi: Document omp_display_env

2024-01-03 Thread Tobias Burnus
Another small step in my side project of documenting all OpenMP routines in libgomp.texi. Here, only 'omp_display_env' is added. (New since OpenMP 5.1 but long available in GCC; some fine print in both the implementation and the documentation is based on TR11.) * * * RFC - regarding

Re: [PATCH 2/5][_Hashtable] Fix implementation inconsistencies

2024-01-03 Thread François Dumont
On 21/12/2023 23:07, Jonathan Wakely wrote: On Thu, 23 Nov 2023 at 21:59, François Dumont wrote: libstdc++: [_Hashtable] Fix some implementation inconsistencies Get rid of the different usages of the mutable keyword. For _Prime_rehash_policy methods are exported from the

Re: [RFA] [V3] new pass for sign/zero extension elimination

2024-01-03 Thread Richard Sandiford
Richard Sandiford writes: > Jeff Law writes: >> [...] >> + if (GET_CODE (x) == ZERO_EXTRACT) >> +{ >> + /* If either the size or the start position is unknown, >> + then assume we know nothing about what is overwritten. >> + This is overly

Re: [PATCH DejaGNU/GCC 0/1] Support per-test execution timeout factor

2024-01-03 Thread Maciej W. Rozycki
On Wed, 3 Jan 2024, Hans-Peter Nilsson wrote: > > The test execution timeout is different from the tool execution timeout > > where it is GCC execution that is being guarded against taking excessive > > amount of time on the test host rather than the resulting test case > > executable run on

[committed] Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++

2024-01-03 Thread Kwok Cheung Yeung
On 09/11/2023 12:24 pm, Thomas Schwinge wrote: --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -350,6 +350,9 @@ enum omp_clause_code { /* OpenMP clause: doacross ({source,sink}:vec). */ OMP_CLAUSE_DOACROSS, + /* OpenMP clause: indirect [(constant-integer-expression)]. */ +

Re: Ping: [PATCH] enable ATOMIC_COMPARE_EXCHANGE opt for floating type or types contains padding

2024-01-03 Thread Jakub Jelinek
On Wed, Jan 03, 2024 at 11:42:58PM +0800, xndcn wrote: > Hi, I am new to this, and I really need your advice, thanks. > > I noticed PR71716 and I want to enable ATOMIC_COMPARE_EXCHANGE > internal-fn optimization > > for floating-point types or types containing padding (e.g., long double). > Please

Ping: [PATCH] enable ATOMIC_COMPARE_EXCHANGE opt for floating type or types contains padding

2024-01-03 Thread xndcn
Hi, I am new to this, and I really need your advice, thanks. I noticed PR71716 and I want to enable ATOMIC_COMPARE_EXCHANGE internal-fn optimization for floating-point types or types containing padding (e.g., long double). Please correct me if I happen to make any mistakes. Thanks! Firstly, about the

Re: [PATCH 1/1] gcc-14: document P1689R5 scanning output support

2024-01-03 Thread Ben Boeckel
On Mon, Nov 20, 2023 at 11:22:56 -0500, Ben Boeckel wrote: > --- > htdocs/gcc-14/changes.html | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html > index 7278f753..b506eeb1 100644 > --- a/htdocs/gcc-14/changes.html > +++

Re: [PATCH V2] RISC-V: Fix bug of earliest fusion for infinite loop[VSETVL PASS]

2024-01-03 Thread Kito Cheng
LGTM with only a few comment suggestions. On Wed, Jan 3, 2024 at 18:50, Juzhe-Zhong wrote: > As PR113206 and PR113209, the bugs happens on the following situation: > > li a4,32 > ... > vsetvli zero,a4,e8,m8,ta,ma > ... > slliw a4,a3,24 > sraiw a4,a4,24 >

[committed] Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++

2024-01-03 Thread Kwok Cheung Yeung
Hello I have committed the following trivial patch to emit FUNC_MAP or IND_FUNC_MAP in separate branches of an if statement. Kwok On 09/11/2023 12:24 pm, Thomas Schwinge wrote: Similar to how you have it here: --- a/gcc/config/nvptx/mkoffload.cc +++ b/gcc/config/nvptx/mkoffload.cc @@

Re: [PATCH] Add support for function attributes and variable attributes

2024-01-03 Thread Guillaume Gomez
Ping David. :) On Mon, Dec 18, 2023 at 23:27, Guillaume Gomez wrote: > > Ping David. :) > > On Sat, Dec 9, 2023 at 12:12, Guillaume Gomez > wrote: > > > > Added it. > > > > On Thu, Dec 7, 2023 at 18:13, Antoni Boucher wrote: > > > > > > It seems like you forgot to prefix the commit

[PATCH v2] c++/modules: Emit definitions of ODR-used static members imported from modules [PR112899]

2024-01-03 Thread Nathaniel Shead
Linaro CI tells me that this patch caused regressions on ARM. I don't have an ARM machine available to test on, but it appears to have been caused by attempting to stream vtables as static data members, and ARM having different behaviour with regards to when DECL_INTERFACE_KNOWN is marked on

Re: [RFA] [V3] new pass for sign/zero extension elimination

2024-01-03 Thread Richard Sandiford
Jeff Law writes: > I know we're deep into stage3 and about to transition to stage4. So if > the consensus is for this to wait, I'll understand > > This is the V3 of the ext-dce patch based on Joern's work from last year. > > Changes since V2: >Handle MINUS >Minor logic cleanup for

[committed] Small tweaks for update-copyright.py

2024-01-03 Thread Jakub Jelinek
Hi! update-copyright.py --this-year FAILs on two spots in the modula2 directories. One is gpl_v3_without_node.texi, I think that is similar to other license files which we already exclude from updates. And the other is GmcOptions.cc, which has lines like mcPrintf_printf0 ((const char *)

Re: [PATCH] libgfortran: Bugfix if not define HAVE_ATOMIC_FETCH_ADD

2024-01-03 Thread Tobias Burnus
On 22.12.23 03:36, Lipeng Zhu wrote: This patch tries to fix the bug that occurs when HAVE_ATOMIC_FETCH_ADD is not defined in the dec_waiting_unlocked function. libgfortran/ChangeLog: * io/io.h (dec_waiting_unlocked): Use __gthread_rwlock_wrlock/__gthread_rwlock_unlock or

Re: [PATCH] RISC-V: Fix bug of earliest fusion for infinite loop[VSETVL PASS]

2024-01-03 Thread juzhe.zh...@rivai.ai
While working on PR113209, I noticed it is the same issue, so this patch fixes not only PR113206 but also PR113209. Sent V2, adding the PR113209 test and PR target/113209: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641740.html juzhe.zh...@rivai.ai From: Juzhe-Zhong Date:

[PATCH V2] RISC-V: Fix bug of earliest fusion for infinite loop[VSETVL PASS]

2024-01-03 Thread Juzhe-Zhong
As reported in PR113206 and PR113209, the bug happens in the following situation: li a4,32 ... vsetvli zero,a4,e8,m8,ta,ma ... slliw a4,a3,24 sraiw a4,a4,24 bge a3,a1,.L8 sb a4,%lo(e)(a0) vsetvli zero,a4,e8,m8,ta,ma

Pushed: [PATCH] LoongArch: Provide fmin/fmax RTL pattern for vectors

2024-01-03 Thread Xi Ruoyao
On Wed, 2024-01-03 at 16:24 +0800, chenglulu wrote: > LGTM! > > Thanks! Pushed r14-6890. FWIW sometimes tree optimizer still fails to emit .reduc_f{max,min} or it emits them sub-optimally. I've commented in PR112457 but maybe I should've created a new ticket... > On 2024/1/1 at 3:15 AM, Xi Ruoyao

[PATCH] RISC-V: Fix bug of earliest fusion for infinite loop[VSETVL PASS]

2024-01-03 Thread Juzhe-Zhong
As reported in PR113206, the bug happens in the following situation: li a4,32 ... vsetvli zero,a4,e8,m8,ta,ma ... slliw a4,a3,24 sraiw a4,a4,24 bge a3,a1,.L8 sb a4,%lo(e)(a0) vsetvli zero,a4,e8,m8,ta,ma --> a4 is

[PATCH] c++: Export usings referring to global module fragment [PR109679]

2024-01-03 Thread Nathaniel Shead
Bootstrapped & regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- This patch stops 'add_binding_entity' from ignoring all names in the global module fragment, since they should still be exported if named in an exported using-declaration. PR c++/109679 gcc/cp/ChangeLog: *

Re: [PATCH gcc 1/3] Move GNU/Hurd startfile spec from config/i386/gnu.h to config/gnu.h

2024-01-03 Thread Richard Sandiford
Sergey Bugaev writes: > Since it's not i386-specific; this makes it possible to reuse it for other > architectures. > > Also, add a warning for the case gnu.h is specified before gnu-user.h, which > would cause gnu-user's version of the spec to override gnu's, and not the > other > way around as

Re: RE: [PATCH v7] libgfortran: Replace mutex with rwlock

2024-01-03 Thread Lipeng Zhu
On 2023/12/21 19:42, Thomas Schwinge wrote: Hi! On 2023-12-13T21:52:29+0100, I wrote: On 2023-12-12T02:05:26+, "Zhu, Lipeng" wrote: On 2023/12/12 1:45, H.J. Lu wrote: On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng wrote: On 2023/12/9 23:23, Jakub Jelinek wrote: On Sat, Dec 09, 2023 at

Re: [PATCH] LoongArch: Provide fmin/fmax RTL pattern for vectors

2024-01-03 Thread chenglulu
LGTM! Thanks! On 2024/1/1 at 3:15 AM, Xi Ruoyao wrote: We already had smin/smax RTL pattern using vfmin/vfmax instructions. But for smin/smax, it's unspecified what will happen if either operand contains any NaN operands. So we would not vectorize the loop with -fno-finite-math-only (the default for
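
The NaN concern can be illustrated in scalar C++. This sketch contrasts a comparison-based minimum, whose result with a NaN operand depends on operand order, with `std::fmin`, which returns the non-NaN operand when exactly one operand is NaN. The helper names `min_by_compare` and `min_ieee` are hypothetical, and the example assumes default floating-point options (no `-ffast-math`).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Comparison-based minimum: any comparison with NaN is false,
// so the second operand is returned whenever either operand is NaN.
inline double min_by_compare(double a, double b) { return a < b ? a : b; }

// fmin-style minimum: if exactly one operand is NaN, the other is returned.
inline double min_ieee(double a, double b) { return std::fmin(a, b); }
```

Because the two forms disagree on NaN inputs, a loop written with one cannot be silently rewritten into the other unless NaNs are known to be absent.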