[PATCH] Add single-lane SLP support to .GOMP_SIMD_LANE vectorization

2024-10-04 Thread Richard Biener
The following adds basic support for single-lane SLP .GOMP_SIMD_LANE vectorization, in particular it enables SLP discovery. * tree-vect-slp.cc (no_arg_map): New. (vect_get_operand_map): Handle IFN_GOMP_SIMD_LANE. (vect_build_slp_tree_1): Likewise. * tree-vect-stmts.

[PATCH] Fixup dumping of re-trying without/with single-lane SLP

2024-10-04 Thread Richard Biener
The following fixes the order of decrementing the SLP mode and the dumping. Built on x86_64-unknown-linux-gnu, pushed. * tree-vect-loop.cc (vect_analyze_loop_2): Decrement 'slp' before dumping which stage we're starting. --- gcc/tree-vect-loop.cc | 6 +++--- 1 file changed, 3 inse

Re: [PATCH] diagnostic, pch: Fix up the new diagnostic PCH methods for ubsan checking [PR116936]

2024-10-04 Thread Richard Biener
On Fri, Oct 4, 2024 at 12:04 PM Jakub Jelinek wrote: > > Hi! > > The PR notes that the new pch_save/pch_restore methods I've added > recently invoke UB if either m_classification_history.address () > or m_push_list.address () is NULL (which can happen if those vectors > are empty (and in the pch_s

Re: [PATCH] libstdc++: Unroll loop in load_bytes function

2024-10-04 Thread Richard Biener
On Fri, Oct 4, 2024 at 11:20 AM Jonathan Wakely wrote: > > On Fri, 4 Oct 2024 at 10:19, Jonathan Wakely wrote: > > > > On Fri, 4 Oct 2024 at 07:53, Richard Biener > > wrote: > > > > > > On Wed, Oct 2, 2024 at 8:26 PM Jonathan Wakely wrote: > > >

[PATCH] Improve load permutation lowering

2024-10-04 Thread Richard Biener
The following makes sure the emitted even/odd extraction scheme follows one that ends up with actual trivial even/odd extract permutes. When we choose a level 2 extract we generate { 0, 1, 4, 5, ... } which for example the x86 backend doesn't recognize with just SSE and QImode elements. So this no

[PATCH] Relax gcc.dg/vect/pr65947-8.c

2024-10-04 Thread Richard Biener
When failing using forced SLP we do not print the non-SLP failure mode, which reads slightly differently. Massage the expectation a bit. Pushed. * gcc.dg/vect/pr65947-8.c: Adjust. --- gcc/testsuite/gcc.dg/vect/pr65947-8.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/

[PATCH] tree-optimization/99856 - fix testcase

2024-10-04 Thread Richard Biener
When making the testcase use aligned accesses I botched up the copy&paste. Fixed. Pushed. PR tree-optimization/99856 * gcc.dg/vect/pr99856.c: Fix copy&paste errors. --- gcc/testsuite/gcc.dg/vect/pr99856.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gc

Re: [PATCH 3/3] Record template specialization hash

2024-10-04 Thread Richard Biener
On Thu, 3 Oct 2024, Jason Merrill wrote: > On 10/2/24 7:53 AM, Richard Biener wrote: > > For a specific testcase a lot of compile-time is spent in re-hashing > > hashtable elements upon expansion. The following records the hash > > in the hash element. This speeds

Re: [PATCH 3/3] gimple: Add gimple_with_undefined_signed_overflow and use it [PR111276]

2024-10-04 Thread Richard Biener
On Thu, Oct 3, 2024 at 6:09 PM Andrew Pinski wrote: > > While looking into the ifcombine, I noticed that rewrite_to_defined_overflow > was rewriting already defined code. In the previous attempt at fixing this, > the review mentioned we should not be calling rewrite_to_defined_overflow > in those

Re: [PATCH 2/3] cfgexpand: Handle scope conflicts better [PR111422]

2024-10-04 Thread Richard Biener
On Thu, Oct 3, 2024 at 6:09 PM Andrew Pinski wrote: > > After fixing loop-im to do the correct overflow rewriting > for pointer types too. We end up with code like: > ``` > _9 = (unsigned long) &g; > _84 = _9 + 18446744073709551615; > _11 = _42 + _84; > _44 = (signed char *) _11; > ... >

Re: [PATCH 1/3] cfgexpand: Expand comment on when non-var clobbers can show up

2024-10-03 Thread Richard Biener
On Thu, Oct 3, 2024 at 6:09 PM Andrew Pinski wrote: > > The comment here is not wrong, just it would be better if mentioning > the C++ front-end instead of just the nested function lowering. OK > gcc/ChangeLog: > > * cfgexpand.cc (add_scope_conflicts_1): Expand comment > on when

Re: [PATCH] libstdc++: Unroll loop in load_bytes function

2024-10-03 Thread Richard Biener
On Wed, Oct 2, 2024 at 8:26 PM Jonathan Wakely wrote: > > On Wed, 2 Oct 2024 at 19:16, Jonathan Wakely wrote: > > > > On Wed, 2 Oct 2024 at 19:15, Dmitry Ilvokhin wrote: > > > > > > Instead of looping over every byte of the tail, unroll loop manually > > > using switch statement, then compilers
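
The idea quoted above (replace a per-byte tail loop with a switch that falls through) can be sketched in C. This is only an illustration under stated assumptions, not the libstdc++ patch itself: the function names `load_tail_loop` and `load_tail_switch`, the 1..7 byte tail length, and the byte order (p[0] in the least significant byte) are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Loop form: fold the n tail bytes (1 <= n <= 7) into one integer,
   with p[0] ending up in the least significant byte.  */
static uint64_t load_tail_loop(const unsigned char *p, int n)
{
    uint64_t result = 0;
    for (int i = n - 1; i >= 0; --i)
        result = (result << 8) + p[i];
    return result;
}

/* Switch form: the same folding, written as one straight-line path per
   tail length.  Each case deliberately falls through to the next.  */
static uint64_t load_tail_switch(const unsigned char *p, int n)
{
    uint64_t result = 0;
    switch (n) {
    case 7: result = (result << 8) + p[6]; /* fall through */
    case 6: result = (result << 8) + p[5]; /* fall through */
    case 5: result = (result << 8) + p[4]; /* fall through */
    case 4: result = (result << 8) + p[3]; /* fall through */
    case 3: result = (result << 8) + p[2]; /* fall through */
    case 2: result = (result << 8) + p[1]; /* fall through */
    case 1: result = (result << 8) + p[0];
    }
    return result;
}

/* Check that both forms agree for every tail length 1..7.
   p must point to at least 7 readable bytes.  */
static void check_tail_forms_agree(const unsigned char *p)
{
    for (int n = 1; n <= 7; ++n)
        assert(load_tail_loop(p, n) == load_tail_switch(p, n));
}
```

Whether the switch form is actually faster depends on the compiler and target; the sketch only shows that the two forms compute the same value.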

Re: [patch,testsuite] Fix gcc.c-torture/execute/ieee/pr108540-1.c

2024-10-03 Thread Richard Biener
On Thu, Oct 3, 2024 at 1:30 PM Georg-Johann Lay wrote: > > gcc.c-torture/execute/ieee/pr108540-1.c obviously requires that double > is a 64-bit type, hence add pr108540-1.x as an according filter. > > Ok for trunk? > > And is there a reason for why we are still putting test cases in > these old pa

Re: [PATCH] RISC-V: Define LOGICAL_OP_NON_SHORT_CIRCUIT to 1 [PR116615]

2024-10-03 Thread Richard Biener
On Thu, Oct 3, 2024 at 3:15 AM Andrew Waterman wrote: > > On Wed, Oct 2, 2024 at 4:41 PM Jeff Law wrote: > > > > > > > > On 10/2/24 4:39 PM, Andrew Waterman wrote: > > > On Wed, Oct 2, 2024 at 5:56 AM Jeff Law wrote: > > >> > > >> > > >> > > >> On 9/5/24 12:52 PM, Palmer Dabbelt wrote: > > >>> W

Re: [patch,testsuite,applied] Fix gcc.dg/signbit-6.c for int != 32-bit targets

2024-10-03 Thread Richard Biener
On Wed, Oct 2, 2024 at 5:01 PM Georg-Johann Lay wrote: > > This test failed on int != 32-bit targets due to > a[0] = b[0] = INT_MIN instead of using INT32_MIN. OK. Richard. > Johann > > -- > > testsuite/52641 - Fix gcc.dg/signbit-6.c for int != 32-bit targets. > > PR testsuite
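
The portability point in the quoted message can be illustrated with a small C sketch. It assumes the affected test data has 32-bit elements, which the INT32_MIN replacement suggests; the helper names below are hypothetical and are not taken from gcc.dg/signbit-6.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The minimum value of a 32-bit element, independent of how wide
   plain int is on the target.  */
static int32_t min_for_32bit_element(void)
{
    return INT32_MIN;
}

/* Returns 1 when INT_MIN and INT32_MIN coincide on this target.  On a
   target with 16-bit int this returns 0, so INT_MIN would be the wrong
   initializer for 32-bit data there.  */
static int int_min_matches_int32_min(void)
{
    return INT_MIN == INT32_MIN;
}

/* INT32_MIN is always -2^31, whatever the width of int.  */
static void check_int32_min_value(void)
{
    assert(min_for_32bit_element() == -2147483647 - 1);
}
```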

Re: [PATCH] testsuite: Make check-function-bodies work with LTO

2024-10-03 Thread Richard Biener
On Wed, Oct 2, 2024 at 3:48 PM Richard Sandiford wrote: > > This patch tries to make check-function-bodies automatically > choose between reading the regular assembly file and reading the > LTO assembly file. There should only ever be one right answer, > since check-function-bodies doesn't make s

Re: [PATCH 3/3] Handle non-grouped stores as single-lane SLP

2024-10-03 Thread Richard Biener
On Thu, 3 Oct 2024, Thomas Schwinge wrote: > Hi! > > On 2024-09-06T11:30:06+0200, Richard Biener wrote: > > On Thu, 5 Sep 2024, Richard Biener wrote: > >> The following enables single-lane loop SLP discovery for non-grouped stores > >> and adjusts vectorizab

[PATCH] Restore aarch64 bootstrap

2024-10-03 Thread Richard Biener
This zero-initializes vec_init to avoid a bogus maybe-uninitialized diagnostic. Built on x86_64-unknown-linux-gnu, pushed as obvious. * tree-vect-loop.cc (vectorizable_induction): Initialize vec_init. --- gcc/tree-vect-loop.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

Re: [PATCH] tree-optimization/116566 - single lane SLP for VLA inductions

2024-10-03 Thread Richard Biener
On Wed, 2 Oct 2024, Andrew Pinski wrote: > On Tue, Oct 1, 2024 at 5:04 AM Richard Biener wrote: > > > > The following adds SLP support for vectorizing single-lane inductions > > with variable length vectors. > > This introduces a bootstrap failure on aarch64 due

Re: [PATCH] testsuite: Unset torture_current_flags after use

2024-10-02 Thread Richard Biener
> On 02.10.2024 at 15:48, Richard Sandiford wrote: > > Before running a test with specific torture options, gcc-dg-runtest > sets the global variable torture_current_flags to the set of torture > options that will be used. However, it never unset the variable > afterwards, which meant that

[PATCH] Replace another missed iterative_hash_object

2024-10-02 Thread Richard Biener
I missed one that's actually hit quite a lot, hashing of the canonical type TYPE_HASH. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed as obvious after the previous approval. Richard. * pt.cc (iterative_hash_template_arg): Use iterative_hash_hashval_t to hash TYPE_HAS

Re: [PATCH]middle-end: support SLP early break

2024-10-02 Thread Richard Biener
mt = STMT_VINFO_STMT (stmt_info); >basic_block cond_bb = gimple_bb (stmt); > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > index > 490061aea2f6d465d9589eb97bbd34a920d76b1c..53483303c4ac3482760fe722354f602e0243e5a2 > 100644 > --- a/gcc/tree-vectorizer.h > +++ b/gcc/t

Re: [RFC PATCH] Allow limited extended asm at toplevel

2024-10-02 Thread Richard Biener
> } > + if (allows_reg && toplev_p) > + { > + error_at (loc, "invalid constraint outside of a function"); > + operand = error_mark_node; > + } > } > else >

[PATCH 3/3] Record template specialization hash

2024-10-02 Thread Richard Biener
For a specific testcase a lot of compile-time is spent in re-hashing hashtable elements upon expansion. The following records the hash in the hash element. This speeds up compilation by 20%. There's probably module-related uses that need to be adjusted. Bootstrap failed (guess I was expecting t

[PATCH 2/3] Release expanded template argument vector

2024-10-02 Thread Richard Biener
This reduces peak memory usage by 20% for a specific testcase. Bootstrapped and tested on x86_64-unknown-linux-gnu. It's very ugly so I'd appreciate suggestions on how to handle such situations better? gcc/cp/ * pt.cc (coerce_template_parms): Release expanded argument vector when

[PATCH 1/3] Speedup iterative_hash_template_arg

2024-10-02 Thread Richard Biener
Using iterative_hash_object is expensive compared to using iterative_hash_hashval_t which is fit for integer sized values. The following reduces the number of perf cycles spent in iterative_hash_template_arg and iterative_hash combined by 20%. Bootstrapped and tested on x86_64-unknown-linux-gnu.

[PATCH] Adjust gcc.dg/vect/vect-double-reduc-5.c

2024-10-02 Thread Richard Biener
The testcase XPASSes now and should do so everywhere I think. Pushed. * gcc.dg/vect/vect-double-reduc-5.c: Adjust. --- gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c

[PATCH] Adjust gcc.dg/vect/slp-12a.c

2024-10-02 Thread Richard Biener
We can now SLP the loop. There's PR116583 tracking that this still fails for VLA vectors when load-lanes doesn't support a group of size 8. We can't express this right now so the testcase keeps FAILing for aarch64 with SVE (but passes now for riscv). Pushed. * gcc.dg/vect/slp-12a.c: Adj

[PATCH] Adjust expectation for gcc.dg/vect/slp-19c.c

2024-10-02 Thread Richard Biener
We can now vectorize the first loop with SLP when using V2SImode vectors since then we can handle the non-power-of-two interleaving. We can also SLP the second loop reliably now after adding induction support for VLA vectors. Pushed. * gcc.dg/vect/slp-19c.c: Adjust expectation. --- gcc/t

[PATCH] un-XFAIL gcc.dg/vect/vect-double-reduc-5.c

2024-10-02 Thread Richard Biener
The testcase now passes, we can handle double reductions with multiple types fine. Pushed. * gcc.dg/vect/vect-double-reduc-5.c: Un-XFAIL everywhere. --- gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/gcc/testsuite/g

[PATCH] testsuite/116596 - fix gcc.dg/vect/slp-11a.c

2024-10-02 Thread Richard Biener
The condition on "vectorizing stmts using SLP" needs to match that of "vectorized 1 loops", obviously. Pushed. PR testsuite/116596 * gcc.dg/vect/slp-11a.c: Fix. --- gcc/testsuite/gcc.dg/vect/slp-11a.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsui

[PATCH] testsuite/116660 - adjust testcases unexpectedly failing on 32bit sparc

2024-10-02 Thread Richard Biener
Both testcases miss some effective target requires. Pushed. PR testsuite/116660 * gcc.dg/vect/no-scevccp-outer-12.c: Add vect_pack_trunc. * gcc.dg/vect/vect-multitypes-6.c: Add vect_char_add, remove explicit 32bit sparc XFAIL. --- gcc/testsuite/gcc.dg/vect/no-scev

Re: [PATCH] backprop: Fix deleting of a phi node [PR116922]

2024-10-02 Thread Richard Biener
On Wed, Oct 2, 2024 at 5:13 AM Andrew Pinski wrote: > > The problem here is remove_unused_var is called on a name that is > defined by a phi node but it deletes it like removing a normal statement. > remove_phi_node should be called rather than gsi_remove for phinodes. > > Note there is a possibil

Re: [PATCH] phiopt: Fix VCE moving by rewriting it into cast [PR116098]

2024-10-02 Thread Richard Biener
On Wed, Oct 2, 2024 at 1:11 AM Andrew Pinski wrote: > > Phiopt match_and_simplify might move a well defined VCE assign statement > from being conditional to being unconditional; that VCE might no longer > be defined. It will need a rewrite into a cast instead. > > This adds the rewriting code

Re: [PATCH][Backport][GCC12] tree-optimization/116585 - SSA corruption with split_constant_offset

2024-10-01 Thread Richard Biener
On Tue, 1 Oct 2024, Qing Zhao wrote: > From: Richard Biener > > Hi, this is the backport of the fix for PR116585 to GCC12. > bootstrapped and regress tested on both X86 and aarch64. > > Okay for committing? OK. > thanks. > > Qing. > > ==

Re: [PATCH][Backport][GCC13] tree-optimization/116585 - SSA corruption with split_constant_offset

2024-10-01 Thread Richard Biener
On Tue, 1 Oct 2024, Qing Zhao wrote: > From: Richard Biener > > Hi, this is the backport of the fix for PR116585 to GCC13. > bootstrapped and regress tested on both X86 and aarch64. > > Okay for committing? OK. > thanks. > > Qing. > > === > split_

Re: [PATCH][Backport][GCC14] tree-optimization/116585 - SSA corruption with split_constant_offset

2024-10-01 Thread Richard Biener
On Tue, 1 Oct 2024, Qing Zhao wrote: > From: Richard Biener > > Hi, this is the backport of the fix for PR116585 to GCC14. > bootstrapped and regress tested on both X86 and aarch64. > > Okay for committing? OK. >

[PATCH] Fix gcc.dg/pr116905.c

2024-10-01 Thread Richard Biener
I missed { dg-add-options float16 }. Pushed. * gcc.dg/pr116905.c: Add float16 options. --- gcc/testsuite/gcc.dg/pr116905.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/testsuite/gcc.dg/pr116905.c b/gcc/testsuite/gcc.dg/pr116905.c index 0a2b96ac1c1..89de8525b25 100644 --- a/gcc

[PATCH 2/2] testsuite/116654 - adjust gcc.target/powerpc/p9-vec-length-full-8.c

2024-10-01 Thread Richard Biener
gcc.target/powerpc/p9-vec-length-full-8.c was expecting all loops to use -with-len fully masked vectorization to avoid epilogues because the loops needed peeling for gaps. With SLP we have improved things here and the loops using V2D[IF]mode no longer need peeling for gaps since the target can com

[PATCH 1/2] testsuite/116654 - adjust gcc.dg/vect/costmodel/ppc/costmodel-slp-12.c

2024-10-01 Thread Richard Biener
As we now SLP non-grouped stores we have to adjust the expected count. Pushed. PR testsuite/116654 * gcc.dg/vect/costmodel/ppc/costmodel-slp-12.c: Adjust. --- gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-12.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --

Re: [PATCH] MATCH: Simplify `min(a, b) op max(a, b)` to `a op b` [PR109401]

2024-10-01 Thread Richard Biener
On Sun, Sep 29, 2024 at 5:28 PM Jeff Law wrote: > > > > On 9/25/24 2:30 AM, Eikansh Gupta wrote: > > This patch simplifies `min(a,b) op max(a,b)` to `a op b`. This optimization > > will work for all the binary commutative operations. So, the `op` here can > > be one of {plus, mult, bit_and, bit_xor,
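
Why the quoted simplification is valid can be checked with a short C sketch: min(a, b) and max(a, b) are always a and b in some order, so any commutative operation gives the same result on either pair. The sketch covers only the four operations visible in the truncated quote (plus, mult, bit_and, bit_xor) and uses unsigned arithmetic so that overflow wraps instead of being undefined. All function names are hypothetical; this is not the match.pd pattern.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Returns 1 when min(a,b) op max(a,b) == a op b holds for
   op in {+, *, &, ^} on the given pair.  */
static int min_max_identity_holds(uint32_t a, uint32_t b)
{
    uint32_t lo = min_u32(a, b);
    uint32_t hi = max_u32(a, b);
    return (lo + hi) == (a + b)
        && (lo * hi) == (a * b)
        && (lo & hi) == (a & b)
        && (lo ^ hi) == (a ^ b);
}

/* Spot-check the identity on a few pairs, including equal operands
   and values that wrap around under + and *.  */
static void check_min_max_identity(void)
{
    assert(min_max_identity_holds(3u, 7u));
    assert(min_max_identity_holds(7u, 3u));
    assert(min_max_identity_holds(5u, 5u));
    assert(min_max_identity_holds(0xFFFFFFFFu, 2u));
}
```

A non-commutative operation such as subtraction would fail this check, since min(a,b) - max(a,b) differs from a - b whenever a > b.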

Re: [PATCH] MATCH: Simplify `(trunc)copysign ((extend)x, CST)` to `copysign (x, -1.0/1.0)` [PR112472]

2024-10-01 Thread Richard Biener
On Tue, Sep 24, 2024 at 10:58 AM Eikansh Gupta wrote: > > This patch simplifies `(trunc)copysign ((extend)x, CST)` to `copysign (x, > -1.0/1.0)` > depending on the sign of CST. Previously, it was simplified to `copysign (x, > CST)`. > It can be optimized as the sign of the CST matters, not the val

[PATCH] tree-optimization/116902 - vectorizer load hosting breaks UID order #2

2024-10-01 Thread Richard Biener
This is another case of load hoisting breaking UID order in the preheader, this time between two hoistings. The easiest way out is to do what we do for the main stmt - copy instead of move. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/116902 P

Re: [PATCH] Fix wrong code out of NRV + RSO + inlining (take 2)

2024-10-01 Thread Richard Biener
On Tue, Oct 1, 2024 at 12:01 PM Eric Botcazou wrote: > > Hi, > > the attached Ada testcase compiled with -O -flto exhibits a wrong code issue > when the 3 optimizations NRV + RSO + inlining are applied to the same call: if > the LHS of the call is marked write-only before inlining, then it will ke

[PATCH] tree-optimization/116654 - missed dr_explicit_realign[_optimized] with SLP

2024-10-01 Thread Richard Biener
With single-lane SLP we fail to use the power realign loads, causing some testsuite FAILs. r14-2430-g4736ddd11874fe exempted SLP of non-grouped accesses because that could have been only splats where the scheme isn't used anyway, but now with single-lane SLP it can be contiguous accesses. Bootstra

Re: [PATCH] [PR86710][PR116826] match.pd: Fold logarithmic identities.

2024-10-01 Thread Richard Biener
gN(a), logN(a) + logN(b) -> logN(a*b), > and logN(a) - logN(b) -> logN(a/b). > > gcc/testsuite/ > * gcc.dg/tree-ssa/log_ident.c: New test. > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] phi-opt: Improve factor heurstic with constants and conversions from bool [PR116890]

2024-10-01 Thread Richard Biener
On Mon, Sep 30, 2024 at 11:50 PM Andrew Pinski wrote: > > Take: > ``` > if (t_3(D) != 0) > goto ; > else > goto ; > > > _8 = c_4(D) != 0; > _9 = (int) _8; > > > # e_2 = PHI <_9(3), 0(2)> > ``` > > We should factor out the conversion here as that will allow a simplification t

Re: [PATCH] middle-end: Fix ifcvt predicate generation for masked function calls

2024-10-01 Thread Richard Biener
On Mon, Sep 30, 2024 at 8:40 PM Tamar Christina wrote: > > Hi Victor, > > Thanks! This looks good to me with one minor comment: > > > -Original Message- > > From: Victor Do Nascimento > > Sent: Monday, September 30, 2024 2:34 PM > > To: gcc-patches@gcc.gnu.org > > Cc: Tamar Christina ; ri

[PATCH] tree-optimization/116566 - single lane SLP for VLA inductions

2024-10-01 Thread Richard Biener
The following adds SLP support for vectorizing single-lane inductions with variable length vectors. Bootstrapped and tested on x86_64-unknown-linux-gnu. PR tree-optimization/116566 * tree-vect-loop.cc (vectorizable_induction): Handle single-lane SLP for VLA vectors. --- g

[PATCH] tree-optimization/116906 - unsafe PRE with never executed edges

2024-10-01 Thread Richard Biener
When we're computing ANTIC for PRE we treat edges to not yet visited blocks as having a maximum ANTIC solution to get at an optimistic solution in the iteration. That assumes the edges visited eventually execute. This is a wrong assumption that can lead to wrong code (and not only non-optimality)

[PATCH] tree-optimization/116905 - ICE with bogus range ops

2024-10-01 Thread Richard Biener
The following avoids querying ranges of vector entities. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. PR tree-optimization/116905 * tree-vect-stmts.cc (supportable_indirect_convert_operation): Fix guard for vect_get_range_info. * gcc.dg

Re: [PATCH v1 1/2] Match: Support form 2 for scalar signed integer SAT_SUB

2024-10-01 Thread Richard Biener
On Thu, Sep 26, 2024 at 2:25 PM wrote: > > From: Pan Li > > This patch would like to support the form 2 of the scalar signed > integer SAT_SUB. Aka below example: > > Form 2: > #define DEF_SAT_S_SUB_FMT_2(T, UT, MIN, MAX) \ > T __attribute__((noinline)) \ > sat_s_sub_##T##

[PATCH] tree-optimization/116566 - single lane SLP for VLA inductions

2024-09-30 Thread Richard Biener
The following adds SLP support for vectorizing single-lane inductions with variable length vectors. This is a WIP patch, local testing for SVE and riscv is fine but the CI might discover issues. PR tree-optimization/116566 * tree-vect-loop.cc (vectorizable_induction): Handle singl

[PATCH] tree-optimization/116879 - failure to recognize non-empty latch

2024-09-30 Thread Richard Biener
When we relaxed the vectorizer's constraint on loop structure, verifying the emptiness of the latch became too loose, as can be seen in the case for PR116879 where the latch effectively contains two basic-blocks with one being an unmerged forwarder that's not empty. Bootstrapped and tested on x86_64-

RE: [PATCH]middle-end: check explicitly for external or constants when checking for loop invariant [PR116817]

2024-09-30 Thread Richard Biener
>if (check_bool_pattern (var, vinfo, bool_stmts)) > var = adjust_bool_stmts (vinfo, bool_stmts, type, stmt_vinfo); >else if (integer_type_for_mask (var, vinfo)) > return NULL; > else if (TREE_CODE (TREE_TYPE (var)) == BOOLEAN_TYPE > -&&

[PATCH] tree-optimization/113197 - bougs assert in PTA

2024-09-30 Thread Richard Biener
PTA asserts that EAF_NO_DIRECT_READ is not set when flags are set consistently which doesn't make sense. The following removes the assert. Bootstrap & regtest running on x86_64-unknown-linux-gnu. Richard. PR tree-optimization/113197 * tree-ssa-structalias.cc (handle_call_arg): R

Re: [RFC PATCH] More detailed diagnostics for section type conflicts

2024-09-29 Thread Richard Biener
On Sun, Sep 29, 2024 at 5:13 PM Florian Weimer wrote: > > Sometimes this is a user error, sometimes it is more of an ICE. > In either case, more information about the conflict is helpful. > > I used this to get a better idea about what is going on with > PR116887. The original diagnostics look

Re: [RFC/RFA] [PATCH v4 10/12] Verify detected CRC loop with symbolic execution and LFSR matching

2024-09-29 Thread Richard Biener
On Sun, Sep 29, 2024 at 8:01 PM Jeff Law wrote: > > > > On 9/13/24 5:06 AM, Mariam Arutunian wrote: > > Symbolically execute potential CRC loops and check whether the loop > > actually calculates CRC (uses LFSR matching). > > Calculated CRC and created LFSR are compared on each iteration of the >

[PATCH] tree-optimization/116842 - vectorizer load hosting breaks UID order

2024-09-28 Thread Richard Biener
The following fixes the case when vectorizing a load hoists an invariant load and dependent stmts, thereby breaking UID order of said stmts. While we duplicate the load we just move the dependences. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. PR tree-optimizat

Re: RFC: C++ and C23 zero initialization of padding bits

2024-09-28 Thread Richard Biener
On Sat, 28 Sep 2024, Jakub Jelinek wrote: > Hi! > > C++ has > https://eel.is/c++draft/dcl.init#general-6.2 > https://eel.is/c++draft/dcl.init#general-6.3 > which says that during zero-initialization padding bits of structures > and unions are zero initialized, and in > https://eel.is/c++draft/dcl

Re: [PATCH v3 3/4] tree-optimization/116024 - simplify C1-X cmp C2 for wrapping signed types

2024-09-28 Thread Richard Biener
On Fri, Sep 27, 2024 at 6:39 PM Artemiy Volkov wrote: > > On 9/27/2024 1:24 PM, Richard Biener wrote: > > On Mon, 23 Sep 2024, Artemiy Volkov wrote: > > > >> Implement a match.pd transformation inverting the sign of X in > >> C1 - X cmp C2, where C1 and C2 are

Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-28 Thread Richard Biener
On Fri, 27 Sep 2024, Jakub Jelinek wrote: > On Fri, Sep 27, 2024 at 12:14:47PM +0200, Richard Biener wrote: > > I can investigate a bit when there's a testcase showing the issue. > > The testcase is pr78687.C with Marek's cp-gimplify.cc patch. OK, I can reproduce. The

[PATCH] Fix sorting in Contributors.html

2024-09-27 Thread Richard Biener
The following moves my entry to where it belongs alphabetically (it wasn't moved when s/Guenther/Biener/). Pushed as obvious. * doc/contrib.texi (Richard Biener): Move entry. --- gcc/doc/contrib.texi | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/gc

[PATCH] tree-optimization/116785 - relax volatile handling in PTA

2024-09-27 Thread Richard Biener
When there's volatile qualified stores we do not have to treat the destination as pointing to ANYTHING. It's only when reading from it that we want to treat the resulting pointers as pointing to ANYTHING. Bootstrapped and tested on x86_64-unknown-linux-gnu. Richard. PR tree-optimization

Re: [PATCH v3 3/4] tree-optimization/116024 - simplify C1-X cmp C2 for wrapping signed types

2024-09-27 Thread Richard Biener
-tree-dump-times "gimple_simplified to.* \\+ -11.*\n.*>= > -21" 1 "forwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ -11.*\n.*>= > -30" 1 "forwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simp

Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-27 Thread Richard Biener
On Fri, 27 Sep 2024, Jakub Jelinek wrote: > On Fri, Sep 27, 2024 at 08:16:43AM +0200, Richard Biener wrote: > > > __attribute__((noinline)) > > > struct ref_proxy f () > > > { > > >struct ref_proxy ptr; > > >struct ref_proxy D.10036;

Re: [PATCH v3 4/4] tree-optimization/116024 - simplify some cases of X +- C1 cmp C2

2024-09-27 Thread Richard Biener
l < 20; // f() > -21u > +} > + > +int32_t i3b(void) > +{ > + uint32_t l = 30 + (uint32_t)f(); > + return l >= 30; // f() <= -31u > +} > + > +int32_t i3c(void) > +{ > + uint32_t l = 40 + (uint32_t)f(); > + return l > 39; // f() < -39u
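
The fold annotated on `i3b` in the quoted test (`30 + x >= 30` becomes `x <= -31u` for unsigned x) can be checked with a small C sketch. The helper names are hypothetical and the code only mirrors that one case; it is not the match.pd transformation. The reasoning: `30 + x` stays at or above 30 exactly when the addition does not wrap, that is when x <= UINT32_MAX - 30, and UINT32_MAX - 30 is the value written `-31u`.

```c
#include <assert.h>
#include <stdint.h>

/* Original form, as in the quoted i3b: add first, then compare.  */
static int add_then_compare(uint32_t x)
{
    uint32_t l = 30u + x;   /* wraps modulo 2^32 */
    return l >= 30u;
}

/* Folded form: compare x directly against the adjusted constant.
   (uint32_t)-31 equals UINT32_MAX - 30.  */
static int compare_directly(uint32_t x)
{
    return x <= (uint32_t)-31;
}

/* Both forms must agree, in particular around the wrap-around point.  */
static void check_fold_boundaries(void)
{
    const uint32_t samples[] = {
        0u, 1u, 29u, 30u, 31u,
        UINT32_MAX - 31u, UINT32_MAX - 30u, UINT32_MAX - 29u, UINT32_MAX
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(add_then_compare(samples[i]) == compare_directly(samples[i]));
}
```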

Re: [PATCH v3 2/4] tree-optimization/116024 - simplify C1-X cmp C2 for unsigned types

2024-09-27 Thread Richard Biener
rwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 19.*\n.*<= > 29" 1 "forwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ > 4294967196.*\n.*<= 100" 1 "forwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ > 4294967095.*\n.*<= 99" 1 "forwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 999.*\n.*> > 1999" 1 "forwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 2000.*\n.*> > 3000" 1 "forwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ > 4294957295.*\n.*> " 1 "forwprop1" } } */ > +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ > 4294947296.*\n.*> 1" 1 "forwprop1" } } */ >

Re: [PATCH v3 1/4] tree-optimization/116024 - simplify C1-X cmp C2 for UB-on-overflow types

2024-09-27 Thread Richard Biener
2; // return 0 > +} > + > +int32_t i1g(void) > +{ > + int32_t l = 2; > + l = INT32_MAX/2 + 30 - (int32_t)f(); > + return l <= INT32_MIN/2 - 30; // return 1 > +} > + > + > +/* { dg-final { scan-tree-dump-times "Removing dead stmt:.*?- _" 5 > "forwprop

Re: [PING] [PATCH v2] gimple ssa: Don't use __builtin_popcount in switch exp transform

2024-09-27 Thread Richard Biener
. I will take care to include the tag in the git commit message. OK. Thanks, Richard. > Thanks, > Filip Kastl >

Re: [PATCH v1] Widening-Mul: Fix one ICE when iterate on phi node

2024-09-27 Thread Richard Biener
On Fri, Sep 27, 2024 at 9:52 AM wrote: > > From: Pan Li > > We iterate over all phi nodes of a bb to try to match the SAT_* pattern > for scalar integer. We also remove the phi node when the relevant > pattern matches. > > Unfortunately the iterator may have no idea the phi node is removed > and continu
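
The hazard described in the quoted message (removing an element while an iterator still points at it) is a general one. The following C sketch shows it on a plain singly linked list rather than on GCC's phi iterators; the `node` type and every function name are hypothetical and unrelated to the GCC API. The safe pattern is to advance only when nothing was removed, so the traversal never steps through an unlinked node.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Unlink every node holding an even value and return how many were
   removed.  'link' always points at the pointer that leads to the
   current node, so after unlinking it already refers to the successor
   and must not be advanced.  */
static int remove_even(struct node **head)
{
    int removed = 0;
    struct node **link = head;
    while (*link != NULL) {
        struct node *cur = *link;
        if (cur->value % 2 == 0) {
            *link = cur->next;   /* unlink cur; do not advance */
            cur->next = NULL;
            ++removed;
        } else {
            link = &cur->next;   /* keep cur; advance past it */
        }
    }
    return removed;
}

static int list_length(const struct node *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        ++n;
    return n;
}

/* Build the list 1 -> 2 -> 3 -> 4 -> 5 on the stack, drop the even
   values, and return the remaining length.  */
static int demo_remove_even(void)
{
    struct node n[5];
    for (int i = 0; i < 5; ++i) {
        n[i].value = i + 1;
        n[i].next = (i < 4) ? &n[i + 1] : NULL;
    }
    struct node *head = &n[0];
    int removed = remove_even(&head);
    assert(removed == 2);
    return list_length(head);
}
```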

Re: [PATCH 1/2] JSON Dumping of GENERIC trees

2024-09-27 Thread Richard Biener
On Sun, Sep 22, 2024 at 5:49 AM -thor wrote: > > From: thor > > This is the second revision of: > > https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662849.html > > I've incorporated the feedback given both by Richard and David - I didn't > find any memory leaks when testing in valg

Re: [PATCH 2/2] HTML Dumping of trees from gdb

2024-09-27 Thread Richard Biener
On Sun, Sep 22, 2024 at 5:49 AM -thor wrote: > > From: thor > > This patch allows one to dump a tree as HTML from within gdb by invoking, > i.e, > htlml-tree tree I have managed to get a browser window launched with the following incremental patch (xdg-open should be a better default than

Re: [RFC] Prevent the scheduler from moving prefetch instructions when expanding __builtin_prefetch [PR 116713]

2024-09-27 Thread Richard Biener
On Fri, Sep 27, 2024 at 6:27 AM Pietro Monteiro wrote: > > The prefetch instruction that is emitted by __builtin_prefetch is re-ordered > on GCC, but not on clang[0]. GCC's behavior is surprising because when using > the builtin you want the instruction to be placed at the exact point where > y

Re: [PING][PATCH V4 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-09-26 Thread Richard Biener
!= 2147395600) > > + abort (); > > + > > + return 0; > > +} > > + > > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */ > > +/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: > > detected" 46 "vect" } } */ > > +/* { dg-final { scan-assembler "\[ \t\]udot\tz\[0-9\]+.s, z\[0-9\]+.h, > > z\[0-9\]+.h" } } */ > > +/* { dg-final { scan-assembler "\[ \t\]sdot\tz\[0-9\]+.s, z\[0-9\]+.h, > > z\[0-9\]+.h" } } */ > > diff --git a/gcc/testsuite/lib/target-supports.exp > > b/gcc/testsuite/lib/target-supports.exp > > index 11ba77ca404..ebbc2fb8015 100644 > > --- a/gcc/testsuite/lib/target-supports.exp > > +++ b/gcc/testsuite/lib/target-supports.exp > > @@ -4258,6 +4258,15 @@ proc check_effective_target_vect_int { } { > > }}] > > } > > > > +# Return 1 if the target supports two-way dot products on inputs of hi mode > > +# producing si outputs, 0 otherwise. > > + > > +proc check_effective_target_vect_dotprod_hisi { } { > > +return [check_cached_effective_target_indexed aarch64_sme2 { > > + expr { [check_effective_target_aarch64_sme2] > > +}}] > > +} > > + > > # Return 1 if the target supports vectorization of early breaks, > > # 0 otherwise. > > # > >

Re: [PATCH] unswitch: Replace manual ondemand maybe_undef with ssa_name_maybe_undef_p/mark_ssa_maybe_undefs [PR116848]

2024-09-26 Thread Richard Biener
On Thu, Sep 26, 2024 at 5:11 PM Andrew Pinski wrote: > > The ondemand maybe_undef that follows phis was added in > r7-6427-g8b670f93ab1136 > but then later ssa_name_maybe_undef_p/mark_ssa_maybe_undefs was added in > r13-972-gbe2861fe8c527a. This moves the ondemand one to use > mark_ssa_maybe_unde

Re: [PATCH] Fix PR58143 and dups

2024-09-26 Thread Richard Biener
On Thu, 26 Sep 2024, Andrew Pinski wrote: > On Tue, Oct 15, 2013 at 6:57 AM Richard Biener wrote: > > > > > > This is an alternate fix (see > > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00234.html for the other > > one) for the various PRs that show that

Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-26 Thread Richard Biener
273 = {CLOBBER(eos)};
> > > -}
> > > + D.9472 = {};
> > > + D.9472._storage._which = 2;
> > > + ref_proxy >::ref_proxy
> > > (&, D.9472);
> > > + return ;
> > >
> > > that makes sense: the new optimization "inlined" the
> > > eggs::variants::variant::variant call. So
> > > later
> > > .einline has nothing to expand here whereas previously
> > >
> > >    eggs::variants::variant::variant
> > > (&D.9472, &D.9273);
> > >
> > > was expanded into
> > >
> > >    MEM[(struct _storage *)&D.9472] ={v} {CLOBBER(bob)};
> > >    MEM[(struct _union *)&D.9472] ={v} {CLOBBER(bob)};
> > >    MEM[(struct _union *)&D.9472] ={v} {CLOBBER(bob)};
> > >    MEM[(struct _storage *)&D.9472]._which = 2;
> > >
> > > Then make_object_1 gets inlined. Something happens in SRA. And
> > > then we got rid of a lot of code. But now a lot of code remains.
> > >
> > > Is it simply the fact that with this opt we expand the ctor into
> > >
> > >    D.9472 = {};
> > >    D.9472._storage._which = 2;
> > >
> > > which means that we zero-init the variant object and then set _which to 2,
> > > while previously we just allocated storage and set _which to 2?
> >
> > That seems likely since the patch that fixed the bug before was dealing with
> > partially-initialized objects. Why does the optimization change that?
>
> CCing a few optimizer folks. To recap, we're talking about
> tree-ssa/pr78687.C.
> In it, there's:
>
> EGGS_CXX11_CONSTEXPR variant(U&& v)
> noexcept(
> std::is_nothrow_constructible::value)
> : _storage{detail::index{}, std::forward(v)}
> {}
>
> With a new C++ FE optimization this patch introduces, we can evaluate the call
> at compile time. Then .gimple looks a little different (see above). What is
> not clear is why we can't optimize the code as much as without this patch,
> when the variant call isn't evaluated at compile time, and instead we produce
> those MEMs as shown above.
>
> Any insights would be appreciated.
>
> This is .optimized with the opt on:
>
> __attribute__((noinline))
> struct ref_proxy f ()
> {
>    struct ref_proxy ptr;
>    struct ref_proxy D.10036;
>    struct ref_proxy type;
>    struct ref_proxy type;
>    struct qual_option D.10031;
>    struct ref_proxy D.10030;
>    struct qual_option inner;
>    struct variant t;
>    struct variant D.10026;
>    struct variant D.10024;
>    struct inplace_ref D.10023;
>    struct inplace_ref ptr;
>    struct ref_proxy D.9898;
>
>    [local count: 1073741824]:
>    MEM [(struct variant *)&D.10024] = {};

Without actually checking, it might be that SRA chokes on the above. The IL is basically a huge chain of aggregate copies interspersed with clobbers and occasional scalar inits, and I fear that we really only have SRA dealing with this. Is there any reason to use the char[40] init instead of an aggregate {} init of type variant?

I would suggest opening a bug report.

>    D.10024._storage._which = 2;
>    D.10026 = D.10024;
>    t = D.10026;
>    MEM[(struct variant_ref *)&D.9898] ={v} {CLOBBER(bob)};
>    MEM[(struct variant_ref *)&D.9898].inner_storage_ = t;
>    t ={v} {CLOBBER(eos)};
>    D.10026 ={v} {CLOBBER(eos)};
>    D.10024 ={v} {CLOBBER(eos)};
>    MEM [(struct ref_proxy *)&D.9898 + 40B] = 2;
>    D.10036 = D.9898;
>    ptr = D.10036;
>    MEM[(struct variant_ref *)&D.10030] ={v} {CLOBBER(bob)};
>    MEM[(struct variant_ref *)&D.10030].inner_storage_ =
> ptr.D.9270.inner_storage_;
>    ptr ={v} {CLOBBER(eos)};
>    D.10036 ={v} {CLOBBER(eos)};
>    MEM [(struct ref_proxy *)&D.10030 + 40B] = 2;
>    type = D.10030;
>    type = type;
>    MEM[(struct __as_base &)&D.10031] ={v} {CLOBBER(bob)};
>    D.10031.type_ = type;
>    type ={v} {CLOBBER(eos)};
>    type ={v} {CLOBBER(eos)};
>    MEM [(struct qual_option *)&D.10031 + 40B] = 2;
>    D.10031.quals_ = 0;
>    inner = D.10031;
>    D.10023 ={v} {CLOBBER(bob)};
>    D.10023.inner_ = inner;
>    inner ={v} {CLOBBER(eos)};
>    D.10030 ={v} {CLOBBER(eos)};
>    D.10031 ={v} {CLOBBER(eos)};
>    MEM [(struct inplace_ref *)&D.10023 + 40B] = 2;
>    MEM [(struct inplace_ref *)&D.10023 + 48B] = 0;
>    ptr = D.10023;
>    ={v} {CLOBBER(bob)};
>    .D.9858 = ptr;
>    ptr ={v} {CLOBBER(eos)};
>    D.9898 ={v} {CLOBBER(eos)};
>    return ;
>
> }
>
> and this is the result without this patch:
>
> __attribute__((noinline))
> struct ref_proxy f ()
> {
>    [local count: 1073741824]:
>    ={v} {CLOBBER(bob)};
>    MEM [(struct inplace_ref *)& + 40B] = 2;
>    MEM [(struct inplace_ref *)& + 48B] = 0;
>    return ;
>
> }
>

--
Richard Biener
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] tree-optimization/116850 - path isolation fails to update post-dominators

2024-09-26 Thread Richard Biener
On Thu, 26 Sep 2024, Jeff Law wrote: > > > On 9/26/24 5:06 AM, Richard Biener wrote: > > When path isolation performs CFG manipulations (block splitting) it > > fails to update post-dominators it computes on-demand. That both > > runs into dominance verification

Re: [PATCH] tree-optimization/114855 - speed up dom_oracle::register_transitives

2024-09-26 Thread Richard Biener
On Thu, 26 Sep 2024, Andrew MacLeod wrote: > > On 9/26/24 10:03, Andrew MacLeod wrote: > > > > On 9/26/24 03:07, Richard Biener wrote: > >> On Wed, 25 Sep 2024, Andrew MacLeod wrote: > >> > >>> > >>> > >>> I added a n

Re: [PATCH] tree-optimization/116850 - path isolation fails to update post-dominators

2024-09-26 Thread Richard Biener
On Thu, 26 Sep 2024, Richard Biener wrote: > When path isolation performs CFG manipulations (block splitting) it > fails to update post-dominators it computes on-demand. That both > runs into dominance verification issues when we compute post-dominators > again and possibly accessin

[PATCH] tree-optimization/114855 - profile prediction slowness

2024-09-26 Thread Richard Biener
The testcase in PR114855 shows profile prediction to evaluate the same SSA def via expr_expected_value for each condition or switch in a function. The following patch caches the expected value (and probability/predictor) for each visited SSA def, also protecting against recursion and thus obsoleti
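The caching idea described above can be illustrated with a small standalone sketch. This is not GCC's code: `Def`, `Expected` and `ExpectedValueCache` are hypothetical names, and the "expected value" rule (a constant, or the common value of all operands) is a deliberately simplified assumption. It only shows the pattern of memoizing one result per visited definition and seeding the cache with a placeholder so that cyclic definitions cannot recurse forever.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an SSA definition: either a constant, or a
// merge/copy of other definitions identified by index.
struct Def {
    bool is_const;
    long value;
    std::vector<int> operands;
};

// Result of the query: 'known' is false when no expected value exists.
struct Expected {
    bool known;
    long value;
};

class ExpectedValueCache {
public:
    explicit ExpectedValueCache(const std::vector<Def>& defs) : defs_(defs) {}

    Expected get(int id) {
        auto it = cache_.find(id);
        if (it != cache_.end())
            return it->second;          // finished result or in-progress placeholder

        // The placeholder makes a cyclic visit return "unknown" instead of recursing.
        cache_[id] = Expected{false, 0};
        ++computations_;

        const Def& d = defs_[id];
        Expected result{false, 0};
        if (d.is_const) {
            result = Expected{true, d.value};
        } else if (!d.operands.empty()) {
            // Known only if every operand agrees on the same expected value.
            result = get(d.operands[0]);
            for (std::size_t i = 1; i < d.operands.size() && result.known; ++i) {
                Expected other = get(d.operands[i]);
                if (!other.known || other.value != result.value)
                    result = Expected{false, 0};
            }
        }
        cache_[id] = result;
        return result;
    }

    // Number of definitions actually evaluated (cache misses).
    int computations() const { return computations_; }

private:
    const std::vector<Def>& defs_;
    std::unordered_map<int, Expected> cache_;
    int computations_ = 0;
};
```

With such a cache, querying the same definition from many conditions costs one evaluation per definition rather than one per query, which is the effect the patch description is after.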

[PATCH 2/2] tree-optimization/116818 - try VMAT_GATHER_SCATTER also for SLP

2024-09-26 Thread Richard Biener
When not doing SLP and we end up with VMAT_ELEMENTWISE we consider using strided loads, aka VMAT_GATHER_SCATTER. The following moves this logic down to also apply to SLP where we now can end up using VMAT_ELEMENTWISE as well. PR tree-optimization/116818 * tree-vect-stmts.cc (get_g

[PATCH 1/2] Fix bogus SLP nvector compute in check_load_store_for_partial_vectors

2024-09-26 Thread Richard Biener
We have a new overload for vect_get_num_copies that handles both SLP and non-SLP. Use it and avoid the division by group_size for SLP when not using load-store lanes. * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Use the new vect_get_num_copies overload. Only divid

[PATCH] tree-optimization/116850 - path isolation fails to update post-dominators

2024-09-26 Thread Richard Biener
When path isolation performs CFG manipulations (block splitting) it fails to update post-dominators it computes on-demand. That both runs into dominance verification issues when we compute post-dominators again and possibly accesses missing or broken post-dominance data when checking is disabled.

Re: [PATCH] pretty-print: Fix up allocate_object

2024-09-26 Thread Richard Biener
On Thu, 26 Sep 2024, Jakub Jelinek wrote: > On Thu, Aug 29, 2024 at 06:58:12PM -0400, David Malcolm wrote: > > The following patch rewrites the internals of pp_format. > > > The tokens and token lists are allocated on the chunk_obstack, and so > > there's no additional heap activity required, wit

Re: [PATCH] tree-optimization/114855 - speed up dom_oracle::register_transitives

2024-09-26 Thread Richard Biener
On Wed, 25 Sep 2024, Andrew MacLeod wrote: > > On 9/25/24 06:51, Richard Biener wrote: > > dom_oracle::register_transitives contains an unbound dominator walk > > which for the testcase in PR114855 dominates the profile. I've also > > noticed odd behavior in t

Re: [PATCH] ltmain.sh: allow more flags at link-time

2024-09-25 Thread Richard Biener
On Wed, 25 Sep 2024, Sam James wrote: > Alan Modra writes: > > > On Thu, Sep 19, 2024 at 11:52:48PM +0100, Sam James wrote: > >> Sam James writes: > >> > >> > Sam James writes: > >> > > >> >> libtool defaults to filtering flags passed at link-time. > >> >> > >> >> This brings the filtering in

Re: [PATCH] MATCH: add abs support for half float

2024-09-25 Thread Richard Biener
On Wed, Sep 25, 2024 at 12:12 PM Kugan Vivekanandarajah wrote: > > Hi Richard, > > > On 24 Sep 2024, at 6:16 pm, Richard Biener > > wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Mon, Sep 23,

Re: [PATCH v2] match: Fix `a != 0 ? a * b : 0` patterns for things that trap [PR116772]

2024-09-25 Thread Richard Biener
On Tue, Sep 24, 2024 at 2:59 AM Andrew Pinski wrote: > > For generic, `a != 0 ? a * b : 0` would match where `b` would be an expression > which traps (in the case of the testcase, it was an integer division but it > could be any). > > This adds a new helper function, expr_no_side_effects_p which t

Re: [PATCH v1 1/2] Match: Support form 3 for vector signed integer .SAT_ADD

2024-09-25 Thread Richard Biener
On Sat, Sep 21, 2024 at 4:23 PM wrote: > > From: Pan Li > > This patch would like to support the form 3 of the vector signed > integer .SAT_ADD. Aka below example: > > Form 3: > #define DEF_VEC_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \ > void __attribute__((noinline))

Re: [PATCH v4] match: Fix A || B not optimized to true when !B implies A [PR114326]

2024-09-25 Thread Richard Biener
On Thu, Sep 19, 2024 at 2:18 PM Konstantinos Eleftheriou wrote: > > From: kelefth > > In expressions like (a != b || ((a ^ b) & c) == d) and > (a != b || (a ^ b) == c), (a ^ b) is folded to false. > In the equivalent expressions (((a ^ b) & c) == d || a != b) and > ((a ^ b) == c || a != b) this i

Re: [PATCH v2] match: Change (A * B) + (-C) to (B - C/A) * A, if C multiple of A [PR109393]

2024-09-25 Thread Richard Biener
On Tue, Sep 17, 2024 at 9:43 AM Konstantinos Eleftheriou wrote: > > From: kelefth > > The following function: > > int foo(int *a, int j) > { > int k = j - 1; > return a[j - 1] == a[k]; > } > > does not fold to `return 1;` using -O2 or higher. The cause of this is that > the expression `4 * j

[PATCH] Use tree view for find_always_executed_bbs result

2024-09-25 Thread Richard Biener
The following makes us use bitmap tree view for the always-executed-BBs bitmap as computed by IPA utils find_always_executed_bbs and used by IPA modref (where it shows up in the profile for PR114855). Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. * ipa-utils.cc (find_always_

[PATCH] Speed up get_bitmask_from_range

2024-09-25 Thread Richard Biener
When min != max we know min ^ max != 0. Bootstrap and regtest running on x86_64-unknown-linux-gnu. * value-range.cc (get_bitmask_from_range): Remove redundant compare of xorv with zero. --- gcc/value-range.cc | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a
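The observation behind this patch can be shown on plain 32-bit integers. The following is a sketch only, not the code in value-range.cc: `possibly_varying_bits` is a hypothetical helper that works on `std::uint32_t` rather than GCC's wide-int types. Once the `min == max` case has been handled, `min ^ max` must be nonzero, so no second comparison against zero is needed before deriving the mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not GCC's API): returns a mask of the bit positions
// that may differ between values in [min, max]. Bits above the mask are
// shared by every value in the range.
inline std::uint32_t possibly_varying_bits(std::uint32_t min, std::uint32_t max)
{
    assert(min <= max);
    if (min == max)
        return 0;                      // singleton range: every bit is known

    // min != max here, so the XOR is guaranteed to be nonzero and needs
    // no further check against zero.
    std::uint32_t xorv = min ^ max;

    // Smear the highest differing bit into all lower positions.
    xorv |= xorv >> 1;
    xorv |= xorv >> 2;
    xorv |= xorv >> 4;
    xorv |= xorv >> 8;
    xorv |= xorv >> 16;
    return xorv;
}
```

For example, the range [8, 11] differs only in the two low bits, so the helper returns 3, while [6, 9] crosses a power-of-two boundary and returns 15.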

gcc-patches@gcc.gnu.org

2024-09-25 Thread Richard Biener
wide_int_storage shows up high in the profile for the testcase in PR114855 where the apparent issue is that the conditional jump on 'precision' after the (inlined) memcpy stalls the pipeline due to the data dependence and required store-to-load forwarding. We can add scheduling freedom by instead

[PATCH] remove dominator recursion from reassoc

2024-09-25 Thread Richard Biener
The reassoc pass currently walks dominators in a recursive way, with which I ran into a stack overflow. The following replaces it with worklists following patterns used elsewhere. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. * tree-ssa-reassoc.cc (break_up_subt
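The general shape of replacing a recursive dominator-tree walk with a worklist can be sketched as follows. This is an illustration under assumptions, not the reassoc change itself: `DomNode` and `preorder_ids` are hypothetical names, and the real pass operates on GCC basic blocks and its dominance API rather than on this toy tree.

```cpp
#include <vector>

// Hypothetical dominator-tree node: 'children' are the blocks it
// immediately dominates.
struct DomNode {
    int id;
    std::vector<DomNode*> children;
};

// Visit the tree in preorder using an explicit worklist, so the depth of
// the tree is bounded by heap memory rather than by the call stack.
std::vector<int> preorder_ids(DomNode* root)
{
    std::vector<int> order;
    if (!root)
        return order;

    std::vector<DomNode*> worklist;
    worklist.push_back(root);
    while (!worklist.empty()) {
        DomNode* node = worklist.back();
        worklist.pop_back();
        order.push_back(node->id);

        // Push children in reverse so the first child is processed first,
        // matching the order a recursive walk would produce.
        for (auto it = node->children.rbegin(); it != node->children.rend(); ++it)
            worklist.push_back(*it);
    }
    return order;
}
```

A walk that also needs work done after all dominated blocks are processed would push a marker or a (node, state) pair instead; that variant is omitted here for brevity.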

[PATCH] tree-optimization/114855 - speed up dom_oracle::register_transitives

2024-09-25 Thread Richard Biener
dom_oracle::register_transitives contains an unbound dominator walk which for the testcase in PR114855 dominates the profile. I've also noticed odd behavior in the case when set_one_relation returns NULL: we'd then completely abort processing other relations. The following fixes the latter by con

[PATCH] Remove recursion in simplify_control_stmt_condition_1 [PR114855].

2024-09-25 Thread Richard Biener
From: Aldy Hernandez Remove some ad-hoc simplification code in the forward threader, as the call into the ranger in m_simplifier->simplify() will handle anything we can do manually in simplify_control_stmt_condition_1. In PR114855, DOM time is reduced from 120s to 92s (-23%) and overall compilat

Re: [PATCH] i386, v2: Add GENERIC and GIMPLE folders of __builtin_ia32_{min,max}* [PR116738]

2024-09-25 Thread Richard Biener
On Wed, Sep 25, 2024 at 10:49 AM Hongtao Liu wrote: > > On Wed, Sep 25, 2024 at 4:42 PM Jakub Jelinek wrote: > > > > On Wed, Sep 25, 2024 at 10:17:50AM +0800, Hongtao Liu wrote: > > > > + for (int i = 0; i < 2; ++i) > > > > + { > > > > + unsigned count = vector
