[PATCH] i386: Handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx

2024-05-29 Thread Hu, Lin1
Hi, all This patch aims to extend __builtin_ia32_cmp[p|s][s|d] from avx to sse/sse2/avx, where its immediate is in range of [0, 7]. Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: * config/i386/avxintrin.h: Move cmp[p|s][s|d] to

[PATCH, rs6000] Optimize vector construction with two vector doubleword loads [PR103568]

2024-05-29 Thread HAO CHEN GUI
Hi, This patch optimizes vector construction with two vector doubleword loads. It generates an optimal insn sequence as "xxlor" has lower latency than "mtvsrdd" on Power10. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. OK for the trunk? Thanks Gui Haochen

Re: [PATCH] Fix -Wstringop-overflow warning in 23_containers/vector/types/1.cc

2024-05-29 Thread François Dumont
It looks like this new version fixes the warning just as well, without the issues reported here. All 23_containers/vector tests have been run in C++98/14/20 so far. OK to commit once I've completed the testsuite (or some bot has done it for me!)? I'll look for a PR to associate; if you have one in mind do

[PATCH] [libstdc++-v3] [rtems] enable filesystem support

2024-05-29 Thread Alexandre Oliva
mkdir, chdir and chmod functions are defined in librtemscpu, that doesn't get linked in during libstdc++-v3 configure, but applications use -qrtems for linking, which brings those symbols in, so it makes sense to mark them as available so that the C++ filesystem APIs are enabled. Regstrapped on

[PATCH] Fix some opindex for some options [PR115022]

2024-05-29 Thread Andrew Pinski
While looking at the index I noticed that some options had `-` in the front for the index, which is wrong. And then I noticed that some targets had no index for `mcmodel=` or had used `-mcmodel` incorrectly. This fixes both of those and regenerates the urls files; see that `-mcmodel=` option now has

Re: Reverted recent patches to resource.cc

2024-05-29 Thread Jeff Law
On 5/29/24 8:41 PM, Hans-Peter Nilsson wrote: I do bootstraps and regression testsuite runs on a variety of systems via qemu (alpha, m68k, aarch64, s390, ppc64, etc). It ain't fast, but it does work if QEMU is in pretty good shape and you can find a root filesystem to use. That might

Re: [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]

2024-05-29 Thread HAO CHEN GUI
Hi Kewen, On 2024/5/29 13:26, Kewen.Lin wrote: > I can understand re-using "unordered" and "eq" will save some efforts than > doing with unspecs, but they are actually RTL codes instead of bits on the > specific hardware CR, a downside is that people who isn't aware of this > design point can have

Re: [PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates

2024-05-29 Thread Hongyu Wang
Gently ping :) Hi Richard, Is it OK to adopt the ccmp change? Or do you know who can help to review this part? Thanks. On Thu, May 23, 2024 at 16:27, Hongyu Wang wrote: > > Gently ping for this :) > Hi Richard, Is it OK to adopt the ccmp change? Or did you know who can > help to review this part? > Thanks.

[PATCH-1v3] Value Range: Add range op for builtin isinf

2024-05-29 Thread HAO CHEN GUI
Hi, The builtin isinf is not folded at the front end if the corresponding optab exists. This causes range evaluation to fail on targets which have optab_isinf. For instance, range-sincos.c will fail on targets which have optab_isinf, as it calls builtin_isinf. This patch fixed the problem

[PATCH-3v2] Value Range: Add range op for builtin isnormal

2024-05-29 Thread HAO CHEN GUI
Hi, This patch adds the range op for builtin isnormal. It also adds two helper functions in frange to detect the range of normal floating-point values and the range of subnormal or zero values. Compared to the previous version, the main change is to set the range to 1 if it's a normal number, otherwise to 0.

[PATCH-2v4] Value Range: Add range op for builtin isfinite

2024-05-29 Thread HAO CHEN GUI
Hi, This patch adds the range op for builtin isfinite. Compared to the previous version, the main change is to set the range to 1 if it's a finite number, otherwise to 0. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652220.html Bootstrapped and tested on x86 and powerpc64-linux BE and LE

Re: Reverted recent patches to resource.cc

2024-05-29 Thread Hans-Peter Nilsson
> Date: Wed, 29 May 2024 20:07:22 -0600 > From: Jeff Law > > There appears to be only a single supported SPARC machine in > > cfarm: cfarm216, and I currently can't reach it due to what > > appears to be issues at my end. I guess I'll either fix > > that or breathe life into sparc-elf+sim. > Or

Re: Reverted recent patches to resource.cc

2024-05-29 Thread Jeff Law
On 5/29/24 7:28 PM, Hans-Peter Nilsson wrote: From: Hans-Peter Nilsson Date: Mon, 27 May 2024 19:51:47 +0200 2: Does not depend on 1, but corrects an incidentally found wart: find_basic_block calls fails too often. Replace it with "modern" insn-to-basic-block cross-referencing. 3: Just

Reverted recent patches to resource.cc

2024-05-29 Thread Hans-Peter Nilsson
> From: Hans-Peter Nilsson > Date: Mon, 27 May 2024 19:51:47 +0200 > 2: Does not depend on 1, but corrects an incidentally found wart: > find_basic_block calls fails too often. Replace it with "modern" > insn-to-basic-block cross-referencing. > > 3: Just an addendum to 2: removes an "if",

[PATCH] aarch64: Add vector floating point extend patterns [PR113880, PR113869]

2024-05-29 Thread Pengxuan Zheng
This patch improves vectorization of certain floating point widening operations for the aarch64 target by adding vector floating point extend patterns for V2SF->V2DF and V4HF->V4SF conversions. PR target/113880 PR target/113869 gcc/ChangeLog: *

[PATCH v3 1/2] RISC-V: add option -m(no-)autovec-segment

2024-05-29 Thread Patrick O'Neill
From: Greg McGary Add option -m(no-)autovec-segment to enable/disable autovectorizer from emitting vector segment load/store instructions. This is useful for performance experiments. gcc/ChangeLog: * config/riscv/autovec.md (vec_mask_len_load_lanes, vec_mask_len_store_lanes):

[PATCH v3 2/2] Prevent divide-by-zero

2024-05-29 Thread Patrick O'Neill
From: Greg McGary gcc/ChangeLog: * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent divide-by-zero. * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove dg-ice. --- No changes in v3. Depends on the risc-v backend option added in patch 1 to trigger the ICE. ---

[PATCH v3 0/2] RISC-V: add option -m(no-)autovec-segment

2024-05-29 Thread Patrick O'Neill
Sending v3 to fix up testsuite issues and a whitespace linter issue. v2 changelog: Rebased to squash Edwin's fixup into Greg's patch. Split out the middle-end change and xfailed the associated testcase so the second patch can land separately. Relying on pre-commit CI for full testing. v3

Re: [RFC/RFA] [PATCH 08/12] Add a new pass for naive CRC loops detection

2024-05-29 Thread Jeff Law
On 5/28/24 1:01 AM, Richard Biener wrote: On Fri, May 24, 2024 at 10:46 AM Mariam Arutunian wrote: This patch adds a new compiler pass aimed at identifying naive CRC implementations, characterized by the presence of a loop calculating a CRC (polynomial long division). Upon detection of a

Re: CFG edge visualization to path-printing bootstrap failure

2024-05-29 Thread David Malcolm
On Wed, 2024-05-29 at 15:26 -0400, David Edelsohn wrote: > On Mon, May 20, 2024 at 1:56 PM David Edelsohn > wrote: > > > Hi, David > > > > Unfortunately r15-636-g770657d02c986c causes a bootstrap failure on > > AIX > > when building f951 in stage2.  cc1 and cc1plus link successfully. > > There

Re: PING: Re: [PATCH] selftest: invoke "diff" when ASSERT_STREQ fails

2024-05-29 Thread David Malcolm
On Wed, 2024-05-29 at 16:35 -0400, Eric Gallager wrote: > On Tue, May 28, 2024 at 1:21 PM David Malcolm > wrote: > > > > Ping. > > > > This patch has actually been *very* helpful to me when debugging > > selftest failures involving ASSERT_STREQ. > > > > Thanks > > Dave > > > > Currently

Re: [pushed] wwwdocs: news: Google+ is no more

2024-05-29 Thread Eric Gallager
Maybe also add a mention of the toolchain's Mastodon account while you're there? https://fosstodon.org/@gnutools On Sun, May 26, 2024 at 6:05 PM Gerald Pfeifer wrote: > > Keep the reference as text; just not the link. > > Gerald > --- > htdocs/news.html | 3 +-- > 1 file changed, 1

Re: PING: Re: [PATCH] selftest: invoke "diff" when ASSERT_STREQ fails

2024-05-29 Thread Eric Gallager
On Tue, May 28, 2024 at 1:21 PM David Malcolm wrote: > > Ping. > > This patch has actually been *very* helpful to me when debugging > selftest failures involving ASSERT_STREQ. > > Thanks > Dave > Currently `diff` is only listed under the "Tools/packages necessary for modifying GCC" section of

Re: [PATCH] [testsuite] conditionalize dg-additional-sources on target and type

2024-05-29 Thread Mike Stump
On May 23, 2024, at 6:28 AM, Alexandre Oliva wrote: > I came up with an entirely different approach: > > > g++.dg/vect/pr95401.cc has dg-additional-sources, and that fails when > check_vect_support_and_set_flags finds vector support lacking for > execution tests: tests decay to compile tests,

Re: CFG edge visualization to path-printing bootstrap failure

2024-05-29 Thread David Edelsohn
On Mon, May 20, 2024 at 1:56 PM David Edelsohn wrote: > Hi, David > > Unfortunately r15-636-g770657d02c986c causes a bootstrap failure on AIX > when building f951 in stage2. cc1 and cc1plus link successfully. There > doesn't seem to be a similar failure for powerpc64-linux BE or LE. > > The

Re: [PATCH v9 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-05-29 Thread Qing Zhao
Richard and Joseph: > On May 28, 2024, at 17:09, Qing Zhao wrote: > >>> >>> diff --git a/gcc/varasm.cc b/gcc/varasm.cc >>> index fa17eff551e8..d75b23668925 100644 >>> --- a/gcc/varasm.cc >>> +++ b/gcc/varasm.cc >>> @@ -5082,6 +5082,11 @@ initializer_constant_valid_p_1 (tree value, tree >>>

[PATCH v2 10/12] OpenMP: Remove dead code from declare variant reimplementation

2024-05-29 Thread Sandra Loosemore
After reimplementing late resolution of "declare variant" to use the same mechanisms as metadirective, the declare_variant_alt and calls_declare_variant_alt flags on struct cgraph_node are no longer used by anything. For the purposes of marking functions that need late resolution, the

[PATCH v2 11/12] OpenMP: Update "declare target"/OpenMP context interaction

2024-05-29 Thread Sandra Loosemore
The code and test case previously implemented the OpenMP 5.0 spec, which said in section 2.3.1: "For functions within a declare target block, the target trait is added to the beginning of the set..." In OpenMP 5.1, this was changed to "For device routines, the target trait is added to the

[PATCH v2 06/12] OpenMP: common c/c++ testcases for metadirectives

2024-05-29 Thread Sandra Loosemore
gcc/testsuite/ChangeLog * c-c++-common/gomp/metadirective-1.c: New. * c-c++-common/gomp/metadirective-2.c: New. * c-c++-common/gomp/metadirective-3.c: New. * c-c++-common/gomp/metadirective-4.c: New. * c-c++-common/gomp/metadirective-5.c: New. *

[PATCH v2 09/12] OpenMP: Extend dynamic selector support to declare variant

2024-05-29 Thread Sandra Loosemore
This patch extends the mechanisms previously added to support dynamic selectors in metavariant constructs to also apply to "declare variant". The front-end mechanisms used to handle "declare variant" via attributes attached to the function decls remain the same, but the gimplifier now uses the

[PATCH v2 07/12] OpenMP: Fortran front-end support for metadirectives.

2024-05-29 Thread Sandra Loosemore
This patch adds support for metadirectives to the Fortran front end. gcc/fortran/ChangeLog * decl.cc (gfc_match_end): Handle metadirectives. * dump-parse-tree.cc (show_omp_node): Likewise. (show_code_node): Likewise. * gfortran.h (enum gfc_statement): Add

[PATCH v2 12/12] OpenMP: Update documentation of metadirective implementation status.

2024-05-29 Thread Sandra Loosemore
libgomp/ChangeLog * libgomp.texi (OpenMP 5.0): Mark metadirective and declare variant as implemented. (OpenMP 5.1): Mark target_device as supported. Add changed interaction between declare target and OpenMP context and dynamic selector support.

[PATCH v2 03/12] libgomp: runtime support for target_device selector

2024-05-29 Thread Sandra Loosemore
This patch implements the libgomp runtime support for the dynamic target_device selector via the GOMP_evaluate_target_device function. include/ChangeLog * cuda/cuda.h (CUdevice_attribute): Add definitions for CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and

[PATCH v2 02/12] OpenMP: middle-end support for metadirectives

2024-05-29 Thread Sandra Loosemore
This patch adds middle-end support for OpenMP metadirectives. Some context selectors can be resolved during gimplification, but others need to be deferred until the omp_device_lower pass, which requires that cgraph, LTO streaming, inlining, etc all know about this construct as well.

[PATCH v2 08/12] OpenMP: Reject other properties with kind(any)

2024-05-29 Thread Sandra Loosemore
The OpenMP spec says: "If trait-property any is specified in the kind trait-selector of the device selector set or the target_device selector sets, no other trait-property may be specified in the same selector set." GCC was not previously enforcing this restriction and several testcases included

[PATCH v2 05/12] OpenMP: C++ front-end support for metadirectives

2024-05-29 Thread Sandra Loosemore
This patch adds C++ support for metadirectives. It uses the c-family support committed with the corresponding C front end patch to do early parse-time metadirective resolution when possible. Additional C/C++ common testcases are provided in a subsequent patch in the series. gcc/cp/ChangeLog

[PATCH v2 04/12] OpenMP: C front end support for metadirectives

2024-05-29 Thread Sandra Loosemore
This patch adds support to the C front end to parse OpenMP metadirective constructs. It includes support for early parse-time resolution of metadirectives (when possible) that will also be used by the C++ front end. Additional common C/C++ testcases are in a later patch in the series.

[PATCH v2 01/12] OpenMP: metadirective tree data structures and front-end interfaces

2024-05-29 Thread Sandra Loosemore
This patch adds the OMP_METADIRECTIVE tree node and shared tree-level support for manipulating metadirectives. It defines/exposes interfaces that will be used in subsequent patches that add front-end and middle-end support, but nothing generates these nodes yet. This patch also adds compile-time

[PATCH v2 00/12] OpenMP: Metadirective support + "declare variant" improvements

2024-05-29 Thread Sandra Loosemore
This is an updated version of the patch series I posted a few weeks ago: https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650725.html I won't duplicate the full list of things implemented/fixed here from the original patch mail. The incremental changes since then include: * I rebased the

Re: [PATCH v2 2/2] Prevent divide-by-zero

2024-05-29 Thread Patrick O'Neill
On 5/29/24 00:20, Richard Biener wrote: On Wed, May 29, 2024 at 1:39 AM Patrick O'Neill wrote: From: Greg McGary gcc/ChangeLog: * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent divide-by-zero. * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove xfail.

Re: [PATCH v2] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*

2024-05-29 Thread Alexandre Oliva
On May 27, 2024, "Kewen.Lin" wrote: > OK with these nits tweaked and re-tested well, thanks! Thanks, here's what I've retested on ppc64le-linux-gnu, and will push onto trunk eventually, after retesting also on ppc- and ppc64-vx7r2: [testsuite] [powerpc] adjust -m32 counts for

Re: [PATCH 13/13 ver 3] rs6000, remove vector set and vector init built-ins.

2024-05-29 Thread Carl Love
This was patch 13 from the previous series. Note the previous series patch 12 was dropped. This patch is the same as the previous version. The additional work to remove __builtin_vec_set_v1ti, __builtin_vec_set_v2di, __builtin_vec_set_v2d per the feedback comments with equivalent gimple

Re: [PATCH 12/13 ver 3] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in

2024-05-29 Thread Carl Love
This was patch 11 from the previous series. Patch was updated to address feedback comments. Carl -- rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the

Re: [PATCH 9/13 ver 3] rs6000, remove __builtin_vsx_vperm_* built-ins

2024-05-29 Thread Carl Love
This was patch 8 in the previous series. Updated patch per the feedback comments. Carl rs6000, remove __builtin_vsx_vperm_* built-ins The undocumented built-ins: __builtin_vsx_vperm_16qi_uns,

Re: [PATCH 11/13 ver 3] rs6000, extend vec_xxpermdi built-in for __int128 args

2024-05-29 Thread Carl Love
This was patch 10 from the previous series. The patch was updated to address feedback comments. Carl --- rs6000, extend vec_xxpermdi built-in for __int128 args Add new signed and unsigned overloaded instances for

Re: [PATCH 10/13 ver 3] rs6000, remove __builtin_vsx_xvnegdp and, __builtin_vsx_xvnegsp built-ins

2024-05-29 Thread Carl Love
This was patch 9 in the previous series. It was previously approved. Reposting for completeness. Carl - rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins The undocumented

Re: [PATCH 8/13 ver 3] rs6000, remove the vec_xxsel built-ins, they are, duplicates

2024-05-29 Thread Carl Love
This was patch 7 in the previous series. Patch was updated to address the feedback comments. Carl rs6000, remove the vec_xxsel built-ins, they are duplicates The following undocumented

Re: [PATCH 6/13 ver 3] rs6000, remove duplicated built-ins of vecmergl and, vec_mergeh

2024-05-29 Thread Carl Love
This was patch 5 in the previous series. It was previously approved. No changes in this version. Being posted for completeness. Carl rs6000, remove duplicated built-ins of vecmergl and vec_mergeh The

Re: [PATCH 7/13 ver 3] rs6000, add overloaded vec_sel with int128 arguments

2024-05-29 Thread Carl Love
This was patch 6 in the previous series. Updated the documentation file per the comments. No functional changes to the patch. Carl rs6000, add overloaded vec_sel with int128 arguments Extend the vec_sel

Re: [PATCH 5/13 ver 3] rs6000, Remove redundant float/double type conversions

2024-05-29 Thread Carl Love
This is a new patch to remove the built-ins that were inadvertently missing in the previous series. Carl -- rs6000, Remove redundant float/double type conversions The following built-ins are redundant

Re: [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins

2024-05-29 Thread Carl Love
Updated the patch per the feedback comments from the previous version. Carl --- rs6000, extend the current vec_{un,}signed{e,o} built-ins The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds

Re: [PATCH 3/13 ver 3] rs6000, fix error in unsigned vector float to unsigned int built-in definition

2024-05-29 Thread Carl Love
This patch was updated per the feedback comment from the previous version in series 2. Carl --- rs6000, fix error in unsigned vector float to unsigned int built-in definitions The built-in

Re: [PATCH 2/13 ver 3] rs6000, Remove __builtin_vsx_xvcvspsxws built-in

2024-05-29 Thread Carl Love
I responded to comments about the patch from the previous patch series. No functional changes were made to this patch. Carl -- rs6000, Remove __builtin_vsx_xvcvspsxws built-in. The built-in __builtin_vsx_xvcvspsxws

Re: [PATCH 1/13 ver 3] s6000, Remove __builtin_vsx_cmple* builtins

2024-05-29 Thread Carl Love
This patch was approved in the previous series. There are no changes to this patch. Reposting for completeness. Carl --- rs6000, Remove __builtin_vsx_cmple* builtins The built-ins __builtin_vsx_cmple_u16qi,

[PATCH 0/13 ver 3] rs6000, built-in cleanup patch series

2024-05-29 Thread Carl Love
GCC maintainers: The following is an updated patch series to remove duplicate built-ins. There are patches to extend an existing overloaded built-in to cover additional input types. A new patch, 0005-rs6000-Remove-redundant-float-double-type-conversion.patch, was added to remove

[PATCH] aarch64: Split aarch64_combinev16qi before RA [PR115258]

2024-05-29 Thread Richard Sandiford
Two-vector TBL instructions are fed by an aarch64_combinev16qi, whose purpose is to put the two input data vectors into consecutive registers. This aarch64_combinev16qi was then split after reload into individual moves (from the first input to the first half of the output, and from the second

Re: [PATCH] Fix LTO type mismatch warning on transparent union

2024-05-29 Thread Richard Biener
> On 29.05.2024 at 15:30, Eric Botcazou wrote: > > Hi, > > Ada doesn't have an equivalent to transparent union types in GNU C so, when it > needs to interface a C function that takes a parameter of a transparent union > type, GNAT uses the type of the first member of the union on the Ada

[patch] libgomp.texi: Impl. update for USM and missing 5.2 item

2024-05-29 Thread Tobias Burnus
Now that unified-shared memory works (with some devices), mark it as 'Y' and link to the device-specific chapter. While there is always room for improvement (like having opt-in partial support for managed-memory semi-USM devices), it works sufficiently for a 'Y'. Additionally, I saw that 5.2

Compare loop bounds in ipa-icf

2024-05-29 Thread Jan Hubicka
Hi, this testcase shows another problem with missing comparators for metadata in ICF. With value ranges available to loop optimizations during early opts we can estimate the number of iterations based on a guarding condition that can be split away by the fnsplit pass. This patch disables ICF when number

Re: [COMMITTED] tree-optimization/115221 - Do not invoke SCEV if it will use a different range query.

2024-05-29 Thread Andrew MacLeod
On 5/29/24 03:19, Richard Biener wrote: On Tue, May 28, 2024 at 8:57 PM Andrew MacLeod wrote: The original patch causing the PR made ranger's cache re-entrant to enable SCEV to use the current range_query when called from within ranger. SCEV uses the currently active range query (via

RE: [PATCH v3] Match: Support more form for scalar unsigned SAT_ADD

2024-05-29 Thread Li, Pan2
Thanks Richard for suggestion and review. Did some tricky/ugly restrictions v3 for the phi gen as there are sorts of (cond in match.pd, will have a try with your proposal in v4. Thanks again for help. Pan -Original Message- From: Richard Biener Sent: Wednesday, May 29, 2024 8:36 PM

[PATCH] c-family: Introduce the -Winvalid-noreturn flag from clang with extra tuneability

2024-05-29 Thread Julian Waters
Currently, gcc warns about noreturn marked functions that return both explicitly and implicitly, with no way to turn this warning off. clang does have an option for these classes of warnings, -Winvalid-noreturn. However, we can do better. Instead of just having 1 option that switches the

[pushed] c++: pragma target and static init [PR109753]

2024-05-29 Thread Jason Merrill
Revised to drop the cgraph change so I can self-approve the remaining patch. Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- #pragma target and optimize should also apply to implicitly-generated functions like static initialization functions and defaulted special member functions.

[pushed] c++: add module extensions

2024-05-29 Thread Jason Merrill
Revised to change mkdeps and the docs. Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- There is a trend in the broader C++ community to use a different extension for module interface units, even though (in GCC) they are compiled in the same way as other source files. Let's recognize

Re: [PATCH v6 1/8] Improve must tail in RTL backend

2024-05-29 Thread Michael Matz
On Tue, 21 May 2024, Andi Kleen wrote: > - Give error messages for all causes of non sibling call generation > - When giving error messages clear the musttail flag to avoid ICEs > - Error out when tree-tailcall failed to mark a must-tail call > sibcall. In this case it doesn't know the true

[PATCH] Fix LTO type mismatch warning on transparent union

2024-05-29 Thread Eric Botcazou
Hi, Ada doesn't have an equivalent to transparent union types in GNU C so, when it needs to interface a C function that takes a parameter of a transparent union type, GNAT uses the type of the first member of the union on the Ada side (which is the type used to determine the passing mechanism

[PATCH 3/3] RISC-V: Avoid inserting after a GIMPLE_COND with SLP and early break

2024-05-29 Thread Richard Biener
When vectorizing an early break loop with LENs (do we miss some check here to disallow this?) we can end up deciding to insert stmts after a GIMPLE_COND when doing SLP scheduling and trying to be conservative with placing of stmts only dependent on the implicit loop mask/len. The following avoids

[PATCH 2/3] Reduce single-lane SLP testresult noise

2024-05-29 Thread Richard Biener
The following avoids dumping 'vectorizing stmts using SLP' for single-lane instances since that causes extra testsuite fallout. * tree-vect-slp.cc (vect_schedule_slp): Gate dumping 'vectorizing stmts using SLP' on > 1 lanes. --- gcc/tree-vect-slp.cc | 3 ++- 1 file changed, 2

[PATCH 1/3] Do single-lane SLP discovery for reductions

2024-05-29 Thread Richard Biener
The following performs single-lane SLP discovery for reductions. It requires a fixup for outer loop vectorization where a check for multiple types needs adjustments as otherwise bogus pointer IV increments happen when there are multiple copies of vector stmts in the inner loop. For the reduction

Re: [PATCH 1/2] RISC-V: add option -m(no-)autovec-segment

2024-05-29 Thread Robin Dapp
On 5/28/24 23:55, Patrick O'Neill wrote: > From: Greg McGary > > Add option -m(no-)autovec-segment to enable/disable autovectorizer > from emitting vector segment load/store instructions. This is useful for > performance experiments. I think the question was raised before but does a vector tune

Re: [PATCH v9 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-05-29 Thread Qing Zhao
> On May 29, 2024, at 02:57, Richard Biener wrote: > > On Tue, May 28, 2024 at 11:09 PM Qing Zhao wrote: >> >> Thank you for the comments. See my answers below: >> >> Joseph, please see the last question, I need your help on it. Thanks a lot >> for the help. >> >> Qing >> >>> On May 28,

Re: [PATCH v2] C/C++: add hints for strerror

2024-05-29 Thread Jason Merrill
Pushed, thanks! On 2/27/24 20:13, Oskari Pirhonen wrote: Add proper hints for implicit declaration of strerror. The results could be confusing depending on the other included headers. These example messages are from compiling a trivial program to print the string for an errno value. It only
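A minimal sketch of the kind of trivial program the message describes, assuming the fix a user wants is simply to include the header that declares `strerror` (`<string.h>` in C). The helper names `describe_errno` and `check_describe_errno` are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>   /* declares strerror; omitting it leaves the call implicitly declared */

/* Hypothetical helper: return the message string for an errno value.  */
const char *
describe_errno (int err)
{
  return strerror (err);
}

/* Small self-check: strerror returns a non-empty string for ENOENT.  */
void
check_describe_errno (void)
{
  const char *msg = describe_errno (ENOENT);
  assert (msg != NULL);
  assert (strlen (msg) > 0);
}
```

The exact message text is implementation-defined, so the check only looks at whether a non-empty string comes back.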

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-05-29 Thread Jakub Jelinek
On Wed, May 29, 2024 at 08:20:01AM +0200, Tobias Burnus wrote: > + if (num_devices > 0 > + && (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)) > +for (int dev = 0; dev < num_devices; dev++) > + { > + int pi; > + CUresult r; > + r = CUDA_CALL_NOCHECK

Re: [RFC/RFA] [PATCH 08/12] Add a new pass for naive CRC loops detection

2024-05-29 Thread David Malcolm
On Fri, 2024-05-24 at 12:42 +0400, Mariam Arutunian wrote: > This patch adds a new compiler pass aimed at identifying naive CRC > implementations, > characterized by the presence of a loop calculating a CRC (polynomial > long > division). > Upon detection of a potential CRC, the pass prints an

Re: [PATCH v3] Match: Support more form for scalar unsigned SAT_ADD

2024-05-29 Thread Richard Biener
On Mon, May 27, 2024 at 8:29 AM wrote: > > From: Pan Li > > After we support one gassign form of the unsigned .SAT_ADD, we > would like to support more forms including both the branch and > branchless. There are 5 other forms of .SAT_ADD, list as below: > > Form 1: > #define SAT_ADD_U_1(T)
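For readers unfamiliar with the idiom, here is a rough scalar sketch of a branchless unsigned saturating add. It is a commonly used form and is not claimed to be any of the forms listed in the patch, whose macro bodies are cut off in the excerpt above. The names `sat_add_u64` and `check_sat_add` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unsigned saturating add without a branch.
   If x + y wraps around, then sum < x, the mask becomes all ones and
   the result is clamped to UINT64_MAX; otherwise the mask is zero and
   the plain sum is returned.  */
uint64_t
sat_add_u64 (uint64_t x, uint64_t y)
{
  uint64_t sum = x + y;
  return sum | -(uint64_t) (sum < x);
}

/* Small self-check covering the wrapping and non-wrapping cases.  */
void
check_sat_add (void)
{
  assert (sat_add_u64 (1, 2) == 3);
  assert (sat_add_u64 (UINT64_MAX, 1) == UINT64_MAX);
}
```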

Re: [patch] libgomp: Enable USM for AMD APUs and MI200 devices

2024-05-29 Thread Jakub Jelinek
On Wed, May 29, 2024 at 02:15:07PM +0200, Tobias Burnus wrote: > + bool b; > + hsa_status_t status; > + status = hsa_fns.hsa_system_get_info_fn ( > + HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT, ); > + if (status != HSA_STATUS_SUCCESS) > + GOMP_PLUGIN_error (

[patch] libgomp: Enable USM for AMD APUs and MI200 devices

2024-05-29 Thread Tobias Burnus
This patch depends (on the libgomp/target.c parts) of the patch "[patch] libgomp: Enable USM for some nvptx devices", https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652987.html AMD GPUs that are either APU devices or MI200 [or MI300X] (with HSA_XNACK=1 set) can access host memory; the


Re: [PATCH 2/2] match: Add support for `a ^ CST` to bitwise_inverted_equal_p [PR115224]

2024-05-29 Thread Richard Biener
On Mon, May 27, 2024 at 2:48 AM Andrew Pinski wrote: > > While looking into something else, I noticed that `a ^ CST` needed > special casing in bitwise_inverted_equal_p as it would simplify to `a ^ ~CST` > for the bitwise not. > > Bootstrapped and tested on x86_64-linux-gnu with no
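The identity behind that remark, `~(a ^ CST) == a ^ ~CST`, can be checked with a small scalar sketch. The function names are hypothetical and only illustrate the arithmetic; they say nothing about how bitwise_inverted_equal_p is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise not applied to the xor.  */
uint32_t
not_of_xor (uint32_t a, uint32_t cst)
{
  return ~(a ^ cst);
}

/* The same value, with the not folded into the constant.  */
uint32_t
xor_with_not (uint32_t a, uint32_t cst)
{
  return a ^ ~cst;
}

/* Small self-check over a few sample values.  */
void
check_xor_not_identity (void)
{
  assert (not_of_xor (0x12345678u, 0xffu) == xor_with_not (0x12345678u, 0xffu));
  assert (not_of_xor (0u, 0u) == xor_with_not (0u, 0u));
}
```

Flipping every bit of `a ^ cst` is the same as flipping every bit of one operand before the xor, which is why the two functions agree for all inputs.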

Re: [PATCH 1/2] Match: Add maybe_bit_not instead of plain matching

2024-05-29 Thread Richard Biener
On Mon, May 27, 2024 at 2:47 AM Andrew Pinski wrote: > > While working on adding matching of negative expressions of `a - b`, > I noticed that we started to have "duplicated" patterns due to not having > a way to match maybe negative expressions. So I went back to what I did for > bit_not and

[PATCH v1] Vect: Support IFN SAT_SUB for unsigned vector int

2024-05-29 Thread pan2 . li
From: Pan Li This patch would like to support the .SAT_SUB for the unsigned vector int. Given we have below example code: void vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { for (unsigned i = 0; i < n; i++) out[i] = (x[i] - y[i]) & (-(uint64_t)(x[i] >= y[i])); }
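The example from the message, laid out as compilable C, together with a scalar helper that makes the masking trick explicit. The loop body is taken from the message; the helper `sat_sub_u64` and the self-check `check_sat_sub` are hypothetical additions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar helper: unsigned saturating subtract.
   When x >= y the mask is all ones and x - y is kept; when x < y the
   mask is zero and the result is clamped to 0.  */
uint64_t
sat_sub_u64 (uint64_t x, uint64_t y)
{
  return (x - y) & (-(uint64_t) (x >= y));
}

/* The loop from the message, which the patch aims to vectorize.  */
void
vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = (x[i] - y[i]) & (-(uint64_t) (x[i] >= y[i]));
}

/* Small self-check: the loop matches the scalar helper.  */
void
check_sat_sub (void)
{
  uint64_t x[3] = { 7, 5, 0 };
  uint64_t y[3] = { 5, 7, 0 };
  uint64_t out[3];
  vec_sat_sub_u64 (out, x, y, 3);
  for (unsigned i = 0; i < 3; i++)
    assert (out[i] == sat_sub_u64 (x[i], y[i]));
}
```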

Re: [V3 PATCH] Don't reduce estimated unrolled size for innermost loop.

2024-05-29 Thread Richard Biener
On Fri, May 24, 2024 at 9:29 AM liuhongt wrote: > > Update in V3: > > Since this was about vectorization can you instead add a testcase to > > gcc.dg/vect/ and check for > > vectorization to happen? > Move to vect/pr112325.c. > > > > I believe the if (unr_insn <= 0) check can go as well. >

Re: [RFC/RFA] [PATCH 08/12] Add a new pass for naive CRC loops detection

2024-05-29 Thread Mariam Arutunian
On Tue, May 28, 2024 at 8:20 AM Jeff Law wrote: > > > On 5/24/24 2:42 AM, Mariam Arutunian wrote: > > This patch adds a new compiler pass aimed at identifying naive CRC > > implementations, > > characterized by the presence of a loop calculating a CRC (polynomial > > long division). > > Upon

Re: [PATCH] tree-optimization/115252 - enhance peeling for gaps avoidance

2024-05-29 Thread Richard Biener
On Wed, 29 May 2024, Richard Biener wrote: > On Wed, 29 May 2024, Richard Sandiford wrote: > > > Richard Biener writes: > > > Code generation for contiguous load vectorization can already deal > > > with generalized avoidance of loading from a gap. The following > > > extends detection of

[PATCH] tree-optimization/114435 - pcom left around copies confusing SLP

2024-05-29 Thread Richard Biener
The following arranges for the pre-SLP vectorization scalar cleanup to be run when predictive commoning was applied to a loop in the function. This is similar to the complete unroll situation and facilitating SLP vectorization. Avoiding the SSA copies in predictive commoning itself isn't easy

[Ada] Fix PR ada/115270

2024-05-29 Thread Eric Botcazou
This fixes the link failure of the GNAT tools on 32-bit SPARC/Linux (as well as on 32-bit PowerPC/Linux probably) coming from an incorrect binding to the 64-bit compare-and-exchange builtin. Tested by Rainer on 32-bit SPARC/Linux, applied on mainline and 14 branch. 2024-05-29 Eric Botcazou

[PATCH] libgcc/aarch64: also provide AT_HWCAP2 fallback

2024-05-29 Thread Jan Beulich
Much like AT_HWCAP is already provided in case the platform headers don't have the value (yet). libgcc/ * config/aarch64/cpuinfo.c: Provide AT_HWCAP2. --- Observed as build failure with 14.1.0, so may want backporting there. --- a/libgcc/config/aarch64/cpuinfo.c +++
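A sketch of the usual fallback pattern for such a constant, assuming the value 26 that Linux uses for `AT_HWCAP2`. The diff in the message is cut off, so the exact form used in cpuinfo.c is not shown here, and the accessor `hwcap2_key` is purely hypothetical.

```c
#include <assert.h>

/* Fallback for platform headers that do not define AT_HWCAP2 yet.
   26 is the Linux auxiliary-vector key; treat it as an assumption
   about what the patch provides.  */
#ifndef AT_HWCAP2
#define AT_HWCAP2 26
#endif

/* Hypothetical accessor so the value can be inspected at run time.  */
unsigned long
hwcap2_key (void)
{
  return AT_HWCAP2;
}

/* Small self-check of the fallback value.  */
void
check_hwcap2_key (void)
{
  assert (hwcap2_key () == 26);
}
```

The `#ifndef` guard keeps the header's own definition when it exists, so the fallback only takes effect on older systems.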

Re: [PATCH] tree-optimization/115252 - enhance peeling for gaps avoidance

2024-05-29 Thread Richard Biener
On Wed, 29 May 2024, Richard Sandiford wrote: > Richard Biener writes: > > Code generation for contiguous load vectorization can already deal > > with generalized avoidance of loading from a gap. The following > > extends detection of peeling for gaps requirement with that, > > gets rid of the

Re: [PATCH 1/3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-05-29 Thread Richard Biener
On Thu, 23 May 2024, Hu, Lin1 wrote: > gcc/ChangeLog: > > PR target/107432 > * tree-vect-generic.cc > (supportable_indirect_narrowing_operation): New function for > support indirect narrowing convert. > (supportable_indirect_widening_operation): New function for >

Re: [PATCH] Avoid vector -Wfree-nonheap-object warnings

2024-05-29 Thread Jonathan Wakely
On Tue, 28 May 2024 at 21:55, François Dumont wrote: > > I can indeed restore _M_initialize_dispatch as it was before. It was not > fixing my initial problem. I simply kept the code simplification. > > libstdc++: Use RAII to replace try/catch blocks > > Move _Guard into std::vector

Re: [PATCH] tree-optimization/115252 - enhance peeling for gaps avoidance

2024-05-29 Thread Richard Sandiford
Richard Biener writes: > Code generation for contiguous load vectorization can already deal > with generalized avoidance of loading from a gap. The following > extends detection of peeling for gaps requirement with that, > gets rid of the old special casing of a half load and makes sure > when

Re: [PATCH 1/5] Do single-lane SLP discovery for reductions

2024-05-29 Thread Richard Sandiford
Richard Biener writes: > On Fri, 24 May 2024, Richard Biener wrote: > >> This is the second merge proposed from the SLP vectorizer branch. >> I have again managed without adding and using --param vect-single-lane-slp >> but instead this provides always enabled functionality. >> >> This makes us

[PATCH 3/3 v2] vect: support direct conversion under x86-64-v3.

2024-05-29 Thread Hu, Lin1
According to hongtao's suggestion, I support some trunc in mmx.md under x86-64-v3, and optimize ix86_expand_trunc_with_avx2_noavx512f. BRs, Lin gcc/ChangeLog: PR 107432 * config/i386/i386-expand.cc (ix86_expand_trunc_with_avx2_noavx512f): New function for generate a

Re: [PATCH 2/3 v2] vect: Support v4hi -> v4qi.

2024-05-29 Thread Hongtao Liu
On Wed, May 29, 2024 at 4:56 PM Hu, Lin1 wrote: > > Exclude add TARGET_MMX_WITH_SSE, I merge two patterns. Ok. > > BRs, > Lin > > gcc/ChangeLog: > > PR target/107432 > * config/i386/mmx.md > (VI2_32_64): New mode iterator. > (mmxhalfmode): New mode attr. > (mmxhalfmodelower):

[PATCH 2/3 v2] vect: Support v4hi -> v4qi.

2024-05-29 Thread Hu, Lin1
Exclude add TARGET_MMX_WITH_SSE, I merge two patterns. BRs, Lin gcc/ChangeLog: PR target/107432 * config/i386/mmx.md (VI2_32_64): New mode iterator. (mmxhalfmode): New mode attr. (mmxhalfmodelower): Ditto. (truncv2hiv2qi2): Extend mode v4hi and change name from

Re: [PATCH] i386: Fix ix86_option override after change [PR 113719]

2024-05-29 Thread Hongtao Liu
On Thu, May 16, 2024 at 5:15 PM Hongyu Wang wrote: > > Richard Biener wrote on Thu, May 16, 2024 at 15:05: > > > > > On Thu, May 16, 2024 at 8:25 AM Hongyu Wang wrote: > > > > > > Hi, > > > > > > In ix86_override_options_after_change, calls to ix86_default_align > > > and ix86_recompute_optlev_based_flags

Re: [PATCH] vect: Unify bbs in loop_vec_info and bb_vec_info

2024-05-29 Thread Richard Biener
On Wed, May 29, 2024 at 10:39 AM Feng Xue OS wrote: > > Ok. Then I will add a TODO comment on "bbs" field to describe it. Fine with me. Thanks, Richard. > Thanks, > Feng > > > > From: Richard Biener > Sent: Wednesday, May 29, 2024 3:14 PM > To: Feng

Re: [PATCH] vect: Unify bbs in loop_vec_info and bb_vec_info

2024-05-29 Thread Feng Xue OS
Ok. Then I will add a TODO comment on "bbs" field to describe it. Thanks, Feng From: Richard Biener Sent: Wednesday, May 29, 2024 3:14 PM To: Feng Xue OS Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] vect: Unify bbs in loop_vec_info and bb_vec_info

Re: [Patch, PR Fortran/90069] Polymorphic Return Type Memory Leak Without Intermediate Variable

2024-05-29 Thread Andre Vehreschild
Hi Harald, thanks for the review. Very much appreciated. Committed as 2f97d98d174e3ef9f3a9a83c179d787abde5e066. I have some patches for memory leaks I will post in the next few days. I am inclined to backport them together to 14-line, if no new bugs arise. About the SAVE_EXPR, Richard Biener shed

Re: [PATCH v2 2/2] Prevent divide-by-zero

2024-05-29 Thread Richard Biener
On Wed, May 29, 2024 at 1:39 AM Patrick O'Neill wrote: > > From: Greg McGary > > gcc/ChangeLog: > * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent > divide-by-zero. > * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove xfail. > --- >

Re: [COMMITTED] tree-optimization/115221 - Do not invoke SCEV if it will use a different range query.

2024-05-29 Thread Richard Biener
On Tue, May 28, 2024 at 8:57 PM Andrew MacLeod wrote: > > The original patch causing the PR made ranger's cache re-entrant to > enable SCEV to use the current range_query when called from within ranger. > > SCEV uses the currently active range query (via get_range_query()) for > picking up
