Re: [PATCH] gimple: Remove special handling of COND_EXPR for COMPARISON_CLASS_P [PR116949, PR114785]

2024-10-30 Thread Richard Biener
On Wed, Oct 30, 2024 at 1:56 AM Andrew Pinski wrote: > > After r13-707-g68e0063397ba82, COND_EXPR for gimple assign no longer could > contain a comparison. > The vectorizer was building gimple assigns with comparison until > r15-4695-gd17e672ce82e69 > (which added an assert to make sure it no lo

Re: [PATCH #2/7] drop redundant ifcombine_ifandif parm (was: Re: [PATCH] fold fold_truth_andor field merging into ifcombine)

2024-10-30 Thread Richard Biener
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote: > > > In preparation to changes that may modify both inner and outer > conditions in ifcombine, drop the redundant parameter result_inv, that > is always identical to inner_inv. OK. > > for gcc/ChangeLog > > * tree-ssa-ifcombine.cc

[PATCH] Remove vectorizer finish_cost wrapper

2024-10-30 Thread Richard Biener
The inline function wraps the vector_cost class API and is no longer a good representation of the query style of that class, which also makes it difficult to extend. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. * tree-vectorizer.h (finish_cost): Inline everywhere and remove.

Re: [PATCH #6/7] ifcombine across noncontiguous blocks (was: Re: [PATCH] fold fold_truth_andor field merging into ifcombine)

2024-10-30 Thread Richard Biener
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote: > > > Rework ifcombine to support merging conditions from noncontiguous > blocks. This depends on earlier preparation changes. > > The function that attempted to ifcombine a block with its immediate > predecessor, tree_ssa_ifcombine_bb, now l

Re: [PATCH #5/7] extend ifcombine_replace_cond to handle noncontiguous ifcombine (was: Re: [PATCH] fold fold_truth_andor field merging into ifcombine)

2024-10-30 Thread Richard Biener
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote: > > > Prepare to handle noncontiguous ifcombine, introducing logic to modify > the outer condition when needed. There are two cases worth > mentioning: > > - when blocks are noncontiguous, we have to place the combined > condition in the ou

Re: [PATCH #4/7] adjust update_profile_after_ifcombine for noncontiguous ifcombine (was: Re: [PATCH] fold fold_truth_andor field merging into ifcombine)

2024-10-30 Thread Richard Biener
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote: > > > Prepare for ifcombining noncontiguous blocks, adding (still unused) > logic to the ifcombine profile updater to handle such cases. > > > for gcc/ChangeLog > > * tree-ssa-ifcombine.cc (known_succ_p): New. > (update_profil

Re: [PATCH #3/7] introduce ifcombine_replace_cond (was: Re: [PATCH] fold fold_truth_andor field merging into ifcombine)

2024-10-30 Thread Richard Biener
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote: > > > Refactor ifcombine_ifandif, moving the common code from the various > paths that apply the combined condition to a new function. > > > for gcc/ChangeLog > > * tree-ssa-ifcombine.cc (ifcombine_replace_cond): Factor out >

Re: [PATCH #1/7] allow vuses in ifcombine blocks (was: Re: [PATCH] fold fold_truth_andor field merging into ifcombine)

2024-10-30 Thread Richard Biener
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote: > > > Disallowing vuses in blocks for ifcombine is too strict, and it > prevents usefully moving fold_truth_andor into ifcombine. That > tree-level folder has long ifcombined loads, absent other relevant > side effects. OK. Richard. > > fo

Re: [PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-30 Thread Richard Biener
On Wed, Oct 30, 2024 at 8:47 AM Kugan Vivekanandarajah wrote: > > Hi Richard, > > > On 29 Oct 2024, at 8:33 pm, Richard Biener > > wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Tue, Oct 29, 2024 at 9:24

Re: [PATCH 1/5] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-10-30 Thread Richard Biener
> +tree > +build_cheap_unsigned_int_sat_add (tree op_0, tree op_1, tree type) > +{ > + /* (bit_ior @2 (negate (convert (lt @2 @0)) */ > + return build2 (BIT_IOR_EXPR, type, > +build2 (PLUS_EXPR, type, op_0, op_1), > +build1 (NEGATE_EXPR, type, >
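For readers following the thread: the expression in the quoted comment, with @2 standing for the sum, can be sketched in plain C++ as below. This is only an illustration on uint32_t, assuming @0 is one of the two addends; the function names are made up here and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Branchless form: sum | -(sum < x). When the addition wraps, sum < x is
// true, the negation yields all ones, and the OR saturates the result.
static uint32_t sat_add_branchless(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;  // wraps modulo 2^32
    return sum | -static_cast<uint32_t>(sum < x);
}

// Branchy reference, used only for comparison (hypothetical helper).
static uint32_t sat_add_reference(uint32_t x, uint32_t y)
{
    return x > UINT32_MAX - y ? UINT32_MAX : x + y;
}

// Spot-check that both forms agree around the overflow boundary.
static void check_sat_add()
{
    const uint32_t samples[] = {0u, 1u, 2u, 0x7fffffffu, 0x80000000u,
                                UINT32_MAX - 1u, UINT32_MAX};
    for (uint32_t x : samples)
        for (uint32_t y : samples)
            assert(sat_add_branchless(x, y) == sat_add_reference(x, y));
}
```

The branchless form trades the conditional for a compare, a negate and an OR, which is what makes it the "cheap" variant on most targets.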

Re: [PATCH v1] Doc: Add doc for standard name mask_len_strided_load{store}m

2024-10-30 Thread Richard Biener
On Wed, Oct 30, 2024 at 2:39 AM wrote: > > From: Pan Li > > This patch would like to add doc for the below 2 standard names. > > 1. strided load: v = mask_len_strided_load (ptr, stride, mask, len, bias) > 2. strided store: mask_len_strided_store (ptr, stride, v, mask, len, bias) > > gcc/ChangeLog:

Re: [RFC PATCH 1/5] vect: Force alignment peeling to vectorize more early break loops

2024-10-30 Thread Richard Biener
On Tue, 29 Oct 2024, Alex Coplan wrote: > On 29/10/2024 13:39, Richard Biener wrote: > > On Mon, 28 Oct 2024, Alex Coplan wrote: > > > > > This allows us to vectorize more loops with early exits by forcing > > > peeling for alignment to make sure that we're

Re: [PATCH] genmatch: Fix build on hppa64-hpux [PR117348]

2024-10-30 Thread Richard Biener
&& defined(HAVE_DECL_FMEMOPEN) > +#if defined(GENMATCH_SELFTESTS) && HAVE_DECL_FMEMOPEN > #pragma GCC diagnostic push > #pragma GCC diagnostic ignored "-Wsuggest-attribute=format" > > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH] Remove dead part of bool pattern recognition

2024-10-29 Thread Richard Biener
Given we no longer want vcond[u]{,_eq} and VEC_COND_EXPR or COND_EXPR with embedded GENERIC comparisons the whole check_bool_pattern and adjust_bool_stmts machinery is dead. It is effectively dead after r15-4713-g0942bb85fc5573 and the following patch removes it. Bootstrapped and tested on x86_64

Re: [PATCH v2 2/2] Match: make SAT_ADD case 7 commutative

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 4:45 PM Akram Ahmad wrote: > > Case 7 of unsigned scalar saturating addition defines > SAT_ADD = X <= (X + Y) ? (X + Y) : -1. This is the same as > SAT_ADD = Y <= (X + Y) ? (X + Y) : -1 due to usadd_left_part_1 > being commutative. > > The pattern for case 7 currently does
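The commutativity claim in the quoted text can be checked with a small sketch. The two functions below spell out case 7 once with X and once with Y in the comparison, on 8-bit values so that every input pair can be tested. The names are illustrative only and do not come from match.pd.

```cpp
#include <cassert>
#include <cstdint>

// Case 7 as quoted: X <= (X + Y) ? (X + Y) : -1.
static uint8_t sat_add_cmp_x(uint8_t x, uint8_t y)
{
    uint8_t sum = static_cast<uint8_t>(x + y);  // wraps modulo 256
    return x <= sum ? sum : static_cast<uint8_t>(-1);
}

// The same form with Y in the comparison instead of X.
static uint8_t sat_add_cmp_y(uint8_t x, uint8_t y)
{
    uint8_t sum = static_cast<uint8_t>(x + y);
    return y <= sum ? sum : static_cast<uint8_t>(-1);
}

// Exhaustively confirm that both forms agree for all 8-bit inputs.
static void check_case7_commutative()
{
    for (unsigned x = 0; x < 256; ++x)
        for (unsigned y = 0; y < 256; ++y)
            assert(sat_add_cmp_x(static_cast<uint8_t>(x), static_cast<uint8_t>(y))
                   == sat_add_cmp_y(static_cast<uint8_t>(x), static_cast<uint8_t>(y)));
}
```

Both forms agree because a wrapped unsigned sum is smaller than either addend, while a non-wrapped sum is at least as large as both.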

[PATCH 1/2] Remove dead code in vectorizer pattern recog

2024-10-29 Thread Richard Biener
The following removes the code path in vect_recog_mask_conversion_pattern dealing with comparisons in COND_EXPRs. That can no longer happen. * tree-vect-patterns.cc (vect_recog_mask_conversion_pattern): Remove COMPARISON_CLASS_P rhs1 of COND_EXPR case and assert it doesn't

Re: [RFC PATCH 5/5] vect: Also cost gconds for scalar

2024-10-29 Thread Richard Biener
1 file changed, 3 insertions(+), 1 deletion(-)

Re: [PATCH] Remove code in vectorizer pattern recog relying on vec_cond{u,eq,}

2024-10-29 Thread Richard Biener
On Sat, 26 Oct 2024, Richard Biener wrote: > With the intent to rely on vec_cond_mask and vec_cmp patterns > comparisons do not need rewriting into COND_EXPRs that eventually > combine to vec_cond{u,eq,}. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. So with thi

Re: [PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-10-29 Thread Richard Biener
ed on aarch64-linux-gnu, no regression. > OK for mainline? > > Signed-off-by: Soumya AR > > gcc/ChangeLog: > PR target/57492 > * match.pd: Added patterns to fold certain calls to pow to ldexp. > > gcc/testsuite/ChangeLog: > PR target/57492 >

Re: [PATCH] Match: Optimize log (x) CMP CST and exp (x) CMP CST operations

2024-10-29 Thread Richard Biener
h.pd: Fold logN(x) CMP CST -> x CMP expN(CST) > and expN(x) CMP CST -> x CMP logN(CST) > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/log_exp.c: New test.

Re: [PATCH 1/5] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-10-29 Thread Richard Biener
On Tue, Oct 29, 2024 at 9:27 AM wrote: > > From: Pan Li > > There are various forms for the unsigned SAT_ADD. Some of them are > complicated while others are cheap. This patch would like to simplify > the complicated form into the cheap ones. For example as below: > > From the form 4 (branch)

Re: [PATCH v2 2/3] Only do switch bit test clustering when multiple labels point to same bb

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote: > > From: Andi Kleen > > The bit cluster code generation strategy is only beneficial when > multiple case labels point to the same code. Do a quick check if > that is the case before trying to cluster. > > This fixes the switch part of PR117091 wh

Re: [RFC PATCH 5/5] vect: Also cost gconds for scalar

2024-10-29 Thread Richard Biener
On Tue, 29 Oct 2024, Richard Biener wrote: > On Mon, 28 Oct 2024, Alex Coplan wrote: > > > Currently we only cost gconds for the vector loop while we omit costing > > them when analyzing the scalar loop; this unfairly penalizes the vector > > loop in the case o

Re: [PATCH v2 3/3] Simplify switch bit test clustering algorithm

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote: > > From: Andi Kleen > > The current switch bit test clustering enumerates all possible case > cluster combinations to find ones that fit the bit test constraints > best. This causes performance problems with very large switches. > > For bit test

Re: [PATCH] ifcombine: For short circuit case, allow 2 defining statements [PR85605]

2024-10-29 Thread Richard Biener
On Tue, Oct 29, 2024 at 4:29 AM Andrew Pinski wrote: > > r0-126134-g5d2a9da9a7f7c1 added support for short circuiting and combining the ifs > into using either AND or OR. But it only allowed the inner condition > basic block having the conditional only. This changes to allow up to 2 > defining > statemen

Re: [PATCH v2 1/2] Match: support new case of unsigned scalar SAT_SUB

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 4:44 PM Akram Ahmad wrote: > > This patch adds a new case for unsigned scalar saturating subtraction > using a branch with a greater-than-or-equal condition. For example, > > X >= (X - Y) ? (X - Y) : 0 > > is transformed into SAT_SUB (X, Y) when X and Y are unsigned scalars
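The quoted form can be compared against the textbook definition of saturating subtraction with a short sketch. It uses 8-bit values so the check is exhaustive; the function names are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstdint>

// The new case as quoted: X >= (X - Y) ? (X - Y) : 0.
static uint8_t sat_sub_ge_form(uint8_t x, uint8_t y)
{
    uint8_t diff = static_cast<uint8_t>(x - y);  // wraps modulo 256
    return x >= diff ? diff : 0;
}

// Reference definition of unsigned saturating subtraction.
static uint8_t sat_sub_reference(uint8_t x, uint8_t y)
{
    return x > y ? static_cast<uint8_t>(x - y) : 0;
}

// Exhaustively confirm that the quoted form matches the reference.
static void check_sat_sub()
{
    for (unsigned x = 0; x < 256; ++x)
        for (unsigned y = 0; y < 256; ++y)
            assert(sat_sub_ge_form(static_cast<uint8_t>(x), static_cast<uint8_t>(y))
                   == sat_sub_reference(static_cast<uint8_t>(x), static_cast<uint8_t>(y)));
}
```

The comparison works because a wrapped difference is always larger than X, so X >= (X - Y) holds exactly when the subtraction did not underflow.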

Re: [RFC PATCH 2/5] vect: Don't guard scalar epilogue for inverted loops

2024-10-29 Thread Richard Biener
* tree-vect-loop-manip.cc (vect_do_peeling): Avoid emitting an > epilogue guard for inverted early-exit loops. > --- > gcc/tree-vect-loop-manip.cc | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-)

Re: [RFC PATCH 4/5] vect: Ensure we add vector skip guard even when versioning for aliasing

2024-10-29 Thread Richard Biener
gcc/tree-vect-loop-manip.cc | 2 +- > 2 files changed, 92 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_130.c

Re: [RFC PATCH 3/5] vect: Fix dominators when adding a guard to skip the vector loop

2024-10-29 Thread Richard Biener
| 24 ++ > 2 files changed, 49 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/vect/vect-early-break_6.cc

Re: [RFC PATCH 1/5] vect: Force alignment peeling to vectorize more early break loops

2024-10-29 Thread Richard Biener
On Mon, 28 Oct 2024, Alex Coplan wrote: > This allows us to vectorize more loops with early exits by forcing > peeling for alignment to make sure that we're guaranteed to be able to > safely read an entire vector iteration without crossing a page boundary. > > To make this work for VLA architectu

Re: [PATCH] config: add -Werror=lto-type-mismatch, odr to bootstrap-lto*

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 12:22 PM Sam James wrote: > > Sam James writes: > > > Sam James writes: > > > >> Add -Werror=lto-type-mismatch,odr to bootstrap-lto* configurations to > >> help stop LTO breakage/correctness issues sneaking in. > >> > >> We discussed -Werror=strict-aliasing but it runs ea

Re: [PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-29 Thread Richard Biener
On Sat, Oct 26, 2024 at 10:37 PM James K. Lowden wrote: > > On Sat, 26 Oct 2024 11:22:20 +0800 > Xi Ruoyao wrote: > > > The changelog is not formatted correctly. gcc/ has its own > > changelog. And gcc/cobol should have its own changelog too, like all > > other frontends. > > Thank you for point

[PATCH] tree-optimization/117343 - decide_masked_load_lanes and stale graph

2024-10-29 Thread Richard Biener
It turns out decide_masked_load_lanes accesses a stale SLP graph so the following re-builds it instead. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/117343 * tree-vect-slp.cc (vect_optimize_slp_pass::build_vertices): Support re-building

Re: [PATCH v4 1/2] Match: Simplify (x != 0 ? x + ~0 : 0) to (x - x != 0).

2024-10-29 Thread Richard Biener
On Sat, Oct 26, 2024 at 12:20 AM Andrew Pinski wrote: > > On Thu, Oct 24, 2024 at 6:22 PM Li Xu wrote: > > > > From: xuli > > > > When the imm operand op1=1 in the unsigned scalar sat_sub form2 below, > > we can simplify (x != 0 ? x + ~0 : 0) to (x - x != 0), thereby eliminating > > a branch ins
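A small sketch of the simplification named in the subject, reading (x - x != 0) as x - (x != 0), which is an assumption about the intended grouping (in plain C++ precedence the unparenthesized text would mean something else). The function names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Branch form: x != 0 ? x + ~0 : 0, i.e. decrement unless already zero.
static uint32_t dec_sat_branch(uint32_t x)
{
    return x != 0 ? x + ~UINT32_C(0) : 0;
}

// Branchless form: subtract the 0/1 result of the comparison.
static uint32_t dec_sat_branchless(uint32_t x)
{
    return x - static_cast<uint32_t>(x != 0);
}

// Spot-check agreement at the interesting boundaries.
static void check_dec_sat()
{
    const uint32_t samples[] = {0u, 1u, 2u, 0x80000000u, UINT32_MAX};
    for (uint32_t x : samples)
        assert(dec_sat_branch(x) == dec_sat_branchless(x));
}
```

Adding ~0 to an unsigned value is the same as subtracting 1 modulo 2^32, which is why the two forms coincide for every nonzero x.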

Re: [PATCH 2/5] Vect: Introduce MASK_LEN_STRIDED_LOAD{STORE} to loop vectorizer

2024-10-29 Thread Richard Biener
On Wed, Oct 23, 2024 at 12:47 PM wrote: > > From: Pan Li > > This patch would like to allow generation of MASK_LEN_STRIDED_LOAD{STORE} IR > for invariant stride memory access. For example as below > > void foo (int * __restrict a, int * __restrict b, int stride, int n) > { > for (int i = 0;

Re: [PATCH 1/5] Internal-fn: Introduce new IFN MASK_LEN_STRIDED_LOAD{STORE}

2024-10-29 Thread Richard Biener
On Wed, Oct 23, 2024 at 12:47 PM wrote: > > From: Pan Li > > This patch would like to introduce new IFN for strided load and store. > > LOAD: v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias) > STORE: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias) > > The IFN target below code ex

[PATCH 2/2] Remove dead vect_recog_mixed_size_cond_pattern

2024-10-29 Thread Richard Biener
vect_recog_mixed_size_cond_pattern only applies to COMPARISON_CLASS_P rhs1 COND_EXPRs which no longer appear - the following removes it. Its testcases still pass, I believe the situation is mitigated by bool pattern handling of the compare use in COND_EXPRs. * tree-vect-patterns.cc (type_c

[PATCH] tree-optimization/117333 - ICE with NULL access size DR

2024-10-29 Thread Richard Biener
dr_may_alias_p ICEs when TYPE_SIZE of DR->ref is NULL but this is valid IL when the access size of an aggregate copy can be inferred from the RHS. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/117333 * tree-data-ref.cc (dr_may_alias_p): Guard aga

Re: [PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-29 Thread Richard Biener
On Tue, Oct 29, 2024 at 9:24 AM Kugan Vivekanandarajah wrote: > > Hi Richard, > Thanks for the review. > > > On 28 Oct 2024, at 9:18 pm, Richard Biener > > wrote: > > > > External email: Use caution opening links or attachments > > > > > >

Re: [PATCH v2 6/8] gcn: Add else operand to masked loads.

2024-10-29 Thread Richard Biener
zer to expect an undefined value > > in the inactive elements. Did your tests return any reasonable result? > > Sorry, I forgot to report back. No, there were no suspicious test failures. > > Andrew

Re: [PATCH v2 1/3] Disable -fbit-tests and -fjump-tables at -O0

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote: > > From: Andi Kleen > > gcc/ChangeLog: OK. Thanks, Richard. > * common.opt: Enable -fbit-tests and -fjump-tables only at -O1. > * opts.cc (default_options_table): Ditto. > --- > gcc/common.opt | 4 ++-- > gcc/opts.cc | 2 ++

[PATCH] tree-optimization/117307 - STMT_VINFO_SLP_VECT_ONLY mis-computation

2024-10-28 Thread Richard Biener
STMT_VINFO_SLP_VECT_ONLY isn't properly computed as union of all group members and when the group is later split due to duplicates not all sub-groups inherit the flag. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/117307 * tree-vect-data-refs.cc

Re: [PATCH] Allow BB vectorisation of scalar loop when ifcvt versioned loop is not vectorized

2024-10-28 Thread Richard Biener
On Mon, Oct 28, 2024 at 9:35 AM Kugan Vivekanandarajah wrote: > > Hi, > > When ifcvt versions a loop, it sets dont_vectorize on the scalar loop. If the > vector loop is not vectorized and removed, the scalar loop is still left with > dont_vectorize. As a result, BB vectorization will not happen. >

Re: [PATCH] vec-lowering: Fix ABSU lowering [PR111285]

2024-10-28 Thread Richard Biener
On Mon, Oct 28, 2024 at 12:41 AM Andrew Pinski wrote: > > ABSU_EXPR lowering incorrectly used the resulting type > for the new expression but in the case of ABSU the resulting > type is an unsigned type and with ABSU is folded away. The fix > is to use a signed type for the expression instead. > >
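As background on why the signedness matters in the quoted description, here is a scalar sketch of what an ABSU-style operation computes: a signed operand goes in and an unsigned result of the same width comes out. The function name is hypothetical, and this is not the vector lowering code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Absolute value with an unsigned result. The comparison must see the
// operand as signed; doing it on the unsigned type would never detect a
// negative input, and the operation would degenerate into a copy.
static uint32_t absu32(int32_t x)
{
    uint32_t ux = static_cast<uint32_t>(x);
    return x < 0 ? 0u - ux : ux;  // unsigned negation is well defined
}

// Includes the case where a signed abs would overflow.
static void check_absu()
{
    assert(absu32(5) == 5u);
    assert(absu32(-5) == 5u);
    assert(absu32(INT32_MIN) == 2147483648u);
}
```

Unlike a signed absolute value, this form has a well-defined result for the most negative input, since that result fits in the unsigned type.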

Re: [PATCH] phiopt: Move check for maybe_undef_p slightly earlier

2024-10-28 Thread Richard Biener
On Mon, Oct 28, 2024 at 12:42 AM Andrew Pinski wrote: > > This moves the check for maybe_undef_p in match_simplify_replacement > slightly earlier before figuring out the true/false arg using arg0/arg1 > instead. > In most cases this is no difference in compile time; just in the case > there is an

Re: [PATCH] tree: Mark PAREN_EXPR and VEC_DUPLICATE_EXPR as non-trapping [PR117234]

2024-10-26 Thread Richard Biener
> Am 26.10.2024 um 19:06 schrieb Andrew Pinski : > > While looking to fix a possible trapping issue in PHI-OPT's factor, > I noticed that some tree codes could be marked as trapping even > though they don't have a possibility to trap. In the case of PAREN_EXPR, > it is basically a nop except w

[PATCH] Remove code in vectorizer pattern recog relying on vec_cond{u,eq,}

2024-10-26 Thread Richard Biener
With the intent to rely on vec_cond_mask and vec_cmp patterns comparisons do not need rewriting into COND_EXPRs that eventually combine to vec_cond{u,eq,}. Bootstrap and regtest running on x86_64-unknown-linux-gnu. * tree-vect-patterns.cc (check_bool_pattern): For comparisons we d

[PATCH 1/2] Remove dead code in vectorizer pattern recog

2024-10-26 Thread Richard Biener
The following removes the code path in vect_recog_mask_conversion_pattern dealing with comparisons in COND_EXPRs. That can no longer happen. Bootstrap and regtest running on x86_64-unknown-linux-gnu. * tree-vect-patterns.cc (vect_recog_mask_conversion_pattern): Remove COMPARISON_

[PATCH] Testing: remove dead code in adjust_bool_pattern

2024-10-26 Thread Richard Biener
I believe the comparison handling in adjust_bool_pattern is dead after no longer queueing them but there's a do_compare goto that's not obviously dead. So I'm looking for testcases - the code path would build a COND_EXPR with a comparison embedded. Bootstrap and regtest running on x86_64-unknown-

[PATCH 2/2] Remove dead vect_recog_mixed_size_cond_pattern

2024-10-26 Thread Richard Biener
vect_recog_mixed_size_cond_pattern only applies to COMPARISON_CLASS_P rhs1 COND_EXPRs which no longer appear - the following removes it. Its testcases still pass, I believe the situation is mitigated by bool pattern handling of the compare use in COND_EXPRs. Bootstrap and regtest running on x86_64-

[PATCH 3/3] Assert finished vectorizer pattern COND_EXPR transition

2024-10-25 Thread Richard Biener
The following places a few strategic asserts so we do not end up with COND_EXPRs with a comparison as the first operand during vectorization. Bootstrap and regtest running on x86_64-unknown-linux-gnu. These are the asserts I'd like to keep, I will followup with quite some dead code removal when

[PATCH 2/3] Finish vectorizer pattern proper COND_EXPR transition

2024-10-25 Thread Richard Biener
This fixes up vect_recog_ctz_ffs_pattern. Bootstrap and regtest running on x86_64-unknown-linux-gnu. * tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern): Create a separate pattern stmt for the comparison in the generated COND_EXPR. --- gcc/tree-vect-patterns.cc | 7 +

Re: [PATCH] Match: Simplify branch form 3 of unsigned SAT_ADD into branchless

2024-10-25 Thread Richard Biener
On Thu, Oct 24, 2024 at 4:08 PM wrote: > > From: Pan Li > > There are various forms for the unsigned SAT_ADD. Some of them are > complicated while others are cheap. This patch would like to simplify > the complicated form into the cheap ones. For example as below: > > From the form 3 (branch)

[PATCH 1/3] Finish vectorizer pattern proper COND_EXPR transition

2024-10-25 Thread Richard Biener
The following tries to finish building proper GIMPLE COND_EXPRs in vectorizer pattern recognition. Bootstrapped and tested on x86_64-unknown-linux-gnu. * tree-vect-patterns.cc (vect_recog_divmod_pattern): Build separate comparison pattern for the condition of a COND_EXPR pa

Re: [PATCH] match.pd: Add std::pow folding optimizations.

2024-10-25 Thread Richard Biener
On Wed, 23 Oct 2024, Jennifer Schmitz wrote: > > > > On 22 Oct 2024, at 13:14, Richard Biener wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Tue, 22 Oct 2024, Jennifer Schmitz wrote: > > > >>

Re: Patch ping - Re: [PATCH] genmatch: Add selftests to genmatch for diag_vfprintf

2024-10-25 Thread Richard Biener
ormat): Add two tests, one for %M$.*N$s > > and one for %M$.Ns. > > I'd like to ping this patch. OK if David doesn't have any further comments. Richard. > Thanks. > > Jakub

[PATCH] Default expand_vec_cond_expr_p code to ERROR_MARK

2024-10-25 Thread Richard Biener
As we want to transition to only vcond_mask expanders, the following makes it easier to distinguish expand_vec_cond_expr_p queries that rely on vcond from those that rely on vcond_mask, by having the comparison code default to ERROR_MARK for the latter. Bootstrapped and tested on x86_64

Re: [PATCH]middle-end: Lower all gconds during vector pattern matching [PR117176]

2024-10-25 Thread Richard Biener
ance) = roots; > - SLP_INSTANCE_REMAIN_DEFS (new_instance) = vNULL; > - SLP_INSTANCE_KIND (new_instance) = slp_inst_kind_gcond; > - new_instance->reduc_phis = NULL; > - new_instance->cost_vec = vNULL; > - new_instance->subgraph_entries = vNULL; > - vinfo->slp_instances.safe_push (new_instance); > - } > + vect_build_slp_instance (vinfo, slp_inst_kind_gcond, > +stmts, roots, remain, > +max_tree_size, &limit, > +bst_map, NULL, force_single_lane); > } > > /* Find and create slp instances for inductions that have been forced

Re: [PATCH] Assorted --disable-checking fixes [PR117249]

2024-10-25 Thread Richard Biener
NULL; >edge->caller = caller; > --- gcc/pair-fusion.cc.jj 2024-10-22 17:09:09.372091098 +0200 > +++ gcc/pair-fusion.cc2024-10-24 11:13:07.023744574 +0200 > @@ -1962,7 +1962,10 @@ pair_fusion_bb_info::fuse_pair (bool loa > >auto ignore = ignore_changing_in

Re: [PATCH] lto: Handle RAW_DATA_CST in compare_tree_sccs_1 [PR117201]

2024-10-25 Thread Richard Biener
0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, > + 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, > + 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, > + 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, > + 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, &g

Re: [RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-10-25 Thread Richard Biener
On Fri, Oct 25, 2024 at 12:22 AM Kugan Vivekanandarajah wrote: > > Hi, > > This patch sets param_vect_max_version_for_alias_checks to 15. > This was causing GCC to miss vectorization opportunities in one internal > application making it slower than LLVM by about ~14%. > > I've tested different pa

Re: [PATCH 1/5] vec: Add quick_emplace_push/safe_emplace_push

2024-10-25 Thread Richard Biener
On Thu, Oct 24, 2024 at 8:16 AM Andrew Pinski wrote: > > On Wed, Oct 23, 2024 at 12:28 AM Andrew Pinski wrote: > > > > On Tue, Oct 22, 2024 at 11:49 PM Richard Biener > > wrote: > > > > > > On Tue, Oct 22, 2024 at 5:31 PM Andrew Pinski > > >

Re: [PATCH 2/2] Match: make SAT_ADD case 7 commutative

2024-10-24 Thread Richard Biener
On Mon, Oct 21, 2024 at 4:23 PM Akram Ahmad wrote: > > Case 7 of unsigned scalar saturating addition defines > SAT_ADD = X <= (X + Y) ? (X + Y) : -1. This is the same as > SAT_ADD = Y <= (X + Y) ? (X + Y) : -1 due to usadd_left_part_1 > being commutative. > > The pattern for case 7 currently does

[PATCH] Restrict :c to commutative ops as intended

2024-10-24 Thread Richard Biener
genmatch was supposed to restrict :c to verifiable commutative operations while leaving :C to the "I know what I'm doing" case. The following enforces this, cleaning up parsing and amending the commutative_op helper. There's one pattern that needs adjustment, the pattern optimizing fmax (x, NaN) o

Re: [PATCH 1/2] Match: support new case of unsigned scalar SAT_SUB

2024-10-24 Thread Richard Biener
On Mon, Oct 21, 2024 at 4:22 PM Akram Ahmad wrote: > > This patch adds a new case for unsigned scalar saturating subtraction > using a branch with a greater-than-or-equal condition. For example, > > X >= (X - Y) ? (X - Y) : 0 > > is transformed into SAT_SUB (X, Y) when X and Y are unsigned

[PATCH 2/2][v2] tree-optimization/116575 - SLP masked load-lanes discovery

2024-10-24 Thread Richard Biener
The following implements masked load-lane discovery for SLP. The challenge here is that a masked load has a full-width mask with group-size number of elements; when this becomes a masked load-lanes instruction, one mask element gates all group members. We already have some discovery hints in place,

[PATCH 1/2][v2] Relax vect_check_scalar_mask check

2024-10-24 Thread Richard Biener
When the mask is not a constant or external def there's no need to check the scalar type, in particular with SLP and the mask being a VEC_PERM_EXPR there isn't a scalar operand ready to check (not one vect_is_simple_use will get you). We later check the vector type and reject non-mask types there.

[PATCH] tree-optimization/117277 - remove CLOBBERs before SLP code generation

2024-10-24 Thread Richard Biener
We have to remove CLOBBERs before SLP is code generated since for store-lanes we are inserting our own CLOBBERs that we want to survive. So the following refactors vect_transform_loop to remove unwanted stmts first. This resolves the gcc.target/aarch64/sve/store_lane_spill_1.c FAIL. Bootstrap and

Re: [PATCH] gcc: Remove trailing whitespace

2024-10-24 Thread Richard Biener
On Thu, Oct 24, 2024 at 10:17 AM Jakub Jelinek wrote: > > Hi! > > I've tried to build stage3 with > -Wleading-whitespace=blanks -Wtrailing-whitespace=blank > -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank > added to STRICT_WARN and that expectably resulted in about > 27

Re: [PATCH] asan: Fix up build_check_stmt gsi handling [PR117209]

2024-10-24 Thread Richard Biener
> Am 24.10.2024 um 09:29 schrieb Jakub Jelinek : > > Hi! > > gsi_safe_insert_before properly updates gsi_bb in gimple_stmt_iterator > in case it splits objects, but unfortunately build_check_stmt was in > some places (but not others) using a copy of the iterator rather than > the iterator pas

Re: [PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-23 Thread Richard Biener
On Tue, Oct 15, 2024 at 1:10 AM James K. Lowden wrote: > > Consequent to advice, I'm preparing the Cobol front-end patches as a > small number of hopefully meaningful patches covering many files. > > 1. meta files used by autotools etc. > 2. gcc/cobol/*.h > 3. gcc/cobol/*.{y,l,cc} > 4. libgcob

Re: [PATCH v3] Remove sys/user time in -ftime-report

2024-10-23 Thread Richard Biener
On Wed, Oct 9, 2024 at 6:18 PM Andi Kleen wrote: > > From: Andi Kleen > > Retrieving sys/user time in timevars is quite expensive because it > always needs a system call. Only getting the wall time is much > cheaper because operating systems have optimized paths for this. > > The sys time isn't t

[PATCH 2/2] tree-optimization/116575 - SLP masked load-lanes discovery

2024-10-23 Thread Richard Biener
The following implements masked load-lane discovery for SLP. The challenge here is that a masked load has a full-width mask with group-size number of elements; when this becomes a masked load-lanes instruction, one mask element gates all group members. We already have some discovery hints in place,

[PATCH 1/2] Relax vect_check_scalar_mask check

2024-10-23 Thread Richard Biener
When the mask is not a constant or external def there's no need to check the scalar type, in particular with SLP and the mask being a VEC_PERM_EXPR there isn't a scalar operand ready to check (not one vect_is_simple_use will get you). We later check the vector type and reject non-mask types there.

Re: [PATCH] match: Reject non-const internal functions [PR117260]

2024-10-23 Thread Richard Biener
On Wed, Oct 23, 2024 at 8:50 AM Richard Biener wrote: > > On Tue, Oct 22, 2024 at 7:21 PM Andrew Pinski > wrote: > > > > When internal functions support was added to match > > (r6-4979-gc9e926ce2bdc8b), > > the check for ECF_CONST was the builtin function

Re: [PATCH] match: Reject non-const internal functions [PR117260]

2024-10-22 Thread Richard Biener
On Tue, Oct 22, 2024 at 7:21 PM Andrew Pinski wrote: > > When internal functions support was added to match (r6-4979-gc9e926ce2bdc8b), > the check for ECF_CONST was the builtin function side. Though before > r15-4503-g8d6d6d537fdc, > there was no use of maybe_push_res_to_seq with non-const intern

Re: [PATCH 1/5] vec: Add quick_emplace_push/safe_emplace_push

2024-10-22 Thread Richard Biener
On Tue, Oct 22, 2024 at 5:31 PM Andrew Pinski wrote: > > This adds quick_emplace_push and safe_emplace_push to vec. > These are like std::vector's emplace_back so you don't need an extra > copy of the struct around. > > Since we require C++11 and also support partial non-PODs for vec, these > fun
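Since the quoted text compares the new functions to std::vector's emplace_back, the saved copy can be shown with the standard container itself. The sketch below counts copy constructions for push_back versus emplace_back. The tracked type is invented for the illustration and GCC's own vec API is deliberately not used here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct tracked
{
    static int copies;
    int a, b;
    tracked(int a_, int b_) : a(a_), b(b_) {}
    tracked(const tracked &o) : a(o.a), b(o.b) { ++copies; }
};
int tracked::copies = 0;

// Build a temporary object first, then copy it into the vector.
static int copies_with_push_back()
{
    tracked::copies = 0;
    std::vector<tracked> v;
    v.reserve(4);          // avoid reallocation copies
    tracked t(1, 2);
    v.push_back(t);
    return tracked::copies;
}

// Construct the element in place from the constructor arguments.
static int copies_with_emplace_back()
{
    tracked::copies = 0;
    std::vector<tracked> v;
    v.reserve(4);
    v.emplace_back(1, 2);
    return tracked::copies;
}

// push_back of an lvalue copies once, emplace_back does not copy at all.
static void check_copy_counts()
{
    assert(copies_with_push_back() == 1);
    assert(copies_with_emplace_back() == 0);
}
```

Forwarding the constructor arguments lets the container build the element directly in its storage, which is the "no extra copy of the struct" benefit the patch description mentions.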

Re: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-10-22 Thread Richard Biener
On Tue, 22 Oct 2024, Jennifer Schmitz wrote: > > > > On 18 Oct 2024, at 09:11, Richard Biener wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Thu, 17 Oct 2024, Richard Sandiford wrote: > > > >>

Re: [PATCH] varasm: Handle RAW_DATA_CST in compare_constant [PR117199]

2024-10-22 Thread Richard Biener
> Am 22.10.2024 um 20:07 schrieb Jakub Jelinek : > > Hi! > > On the following testcase without LTO we unnecessarily don't merge > two identical .LC* constants (constant hashing computes the same hash, > but as compare_constant returned false for the RAW_DATA_CST in it, > it never compares equ

Re: [PATCH] varasm: Fix up RAW_DATA_CST handling in array_size_for_constructor [PR117190]

2024-10-22 Thread Richard Biener
> Am 22.10.2024 um 20:04 schrieb Jakub Jelinek : > > Hi! > > CONSTRUCTOR indices for arrays have bitsize type, and the r15-4375 > patch actually got it right in 6 other spots, but not in this function, > where it used size_int rather than bitsize_int and so size_binop can ICE > on type mismat

Re: [PATCH] fold fold_truth_andor field merging into ifcombine

2024-10-22 Thread Richard Biener
On Mon, Oct 21, 2024 at 4:30 AM Alexandre Oliva wrote: > > On Oct 10, 2024, Richard Biener wrote: > > > Thanks for working on this. There's #if 0 portions in the patch - did you > > send the correct version? > > 'fraid so. Sorry, I'd forgotten about th

Re: [PATCH v2 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-10-22 Thread Richard Biener
d, &ssa_names); > + { > + gimple_seq call_seq > + = predicate_load_or_store (&gsi, stmt, mask, &ssa_names); > > - gsi_replace (&gsi, new_stmt, true); > + gsi_replace_with_seq (&gsi, call_seq, true); > + } > + else > + { > + gimple *new_stmt; > + new_stmt = predicate_rhs_code (stmt, mask, cond, &ssa_names); > + gsi_replace (&gsi, new_stmt, true); > + } > } > else if (((lhs = gimple_assign_lhs (stmt)), true) > && (INTEGRAL_TYPE_P (TREE_TYPE (lhs)) > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH v2 4/8] vect: Add maskload else value support.

2024-10-22 Thread Richard Biener
On Fri, 18 Oct 2024, Robin Dapp wrote: > This patch adds an else operand to vectorized masked load calls. > The current implementation adds else-value arguments to the respective > target-querying functions that is used to supply the vectorizer with the > proper else value. > > Right now, the onl

Re: [PATCH v2 2/8] ifn: Add else-operand handling.

2024-10-22 Thread Richard Biener
On Fri, 18 Oct 2024, Robin Dapp wrote: > This patch adds else-operand handling to the internal functions. > > gcc/ChangeLog: > > * internal-fn.cc (add_mask_and_len_args): Rename... > (add_mask_else_and_len_args): ...to this and add else handling. > (expand_partial_load_optab_fn

Re: [PATCHv2 1/2] cfgexpand: Handle scope conflicts better [PR111422]

2024-10-22 Thread Richard Biener
On Mon, Oct 21, 2024 at 6:04 PM Andrew Pinski wrote: > > On Mon, Oct 21, 2024 at 3:41 AM Richard Biener > wrote: > > > > On Thu, Oct 17, 2024 at 4:43 AM Andrew Pinski > > wrote: > > > > > > After fixing loop-im to do the correct overflow rewritin

Re: [PATCH] testsuite: Skip pr112305.c for -O[01] on simulators

2024-10-22 Thread Richard Biener
On Tue, Oct 22, 2024 at 1:19 PM Richard Sandiford wrote: > > gcc.dg/torture/pr112305.c contains an inner loop that executes > 0x8000_0014 times and an outer loop that executes 5 times, giving about > 10 billion total executions of the inner loop body. At -O2 and above we > are able to remove the
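
The "about 10 billion" figure quoted above can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the constant and function names are made up for the illustration and do not come from pr112305.c.

```cpp
#include <cstdint>

// Trip counts as stated in the mail: 0x8000_0014 inner, 5 outer.
constexpr std::uint64_t inner_iterations = 0x80000014ULL;  // 2^31 + 20
constexpr std::uint64_t outer_iterations = 5;

// Total number of times the inner loop body runs.
constexpr std::uint64_t total_inner_executions() {
    return inner_iterations * outer_iterations;
}

// 2147483668 * 5 = 10737418340, i.e. roughly 10.7 billion.
static_assert(total_inner_executions() == 10737418340ULL,
              "inner body count should be about 10.7 billion");
```

At a few instructions per iteration this is a small cost on hardware but a large one on an instruction-level simulator, which is consistent with the motivation given for skipping the test at -O0 and -O1.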

Re: [PATCH] match.pd: Add std::pow folding optimizations.

2024-10-22 Thread Richard Biener
On Tue, 22 Oct 2024, Jennifer Schmitz wrote: > > > > On 21 Oct 2024, at 10:51, Richard Biener wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Fri, 18 Oct 2024, Jennifer Schmitz wrote: > > > >

[PATCH] tree-optimization/117254 - ICE with access diangostics

2024-10-22 Thread Richard Biener
The diagnostics code fails to handle non-constant domain max. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/117254 * gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg): Check the array domain max is constant before using it. *

Re: [PATCH] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

2024-10-22 Thread Richard Biener
d, > enum out_edge_check *oe_check) > { > + if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) > +{ > + tree base = expr; > + if (TREE_CODE (expr) == ADDR_EXPR) > + base = get_base_address (TREE_OPERAND (expr, 0)); > + disqualify_

Re: [PATCH] match.pd: Add std::pow folding optimizations.

2024-10-22 Thread Richard Biener
On Tue, 22 Oct 2024, Jennifer Schmitz wrote: > > > > On 22 Oct 2024, at 11:05, Richard Biener wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Tue, 22 Oct 2024, Jennifer Schmitz wrote: > > > >

[PATCH] tree-optimization/117123 - missed PHI equivalence in VN

2024-10-22 Thread Richard Biener
Value-numbering can use its set of equivalences to prove that a PHI node with args <a_1, 5, 10> is equal to a_1 iff on the edges with the constants a_1 == 5 and a_1 == 10 hold. This breaks down when the order of PHI args is <5, 10, a_1> as then we drop to VARYING early. The following mitigates this by shuffl

Re: [PATCH 02/11] Vect: Try the pattern of vector signed integer SAT_TRUNC

2024-10-21 Thread Richard Biener
On Mon, Oct 14, 2024 at 1:12 PM wrote: > > From: Pan Li > > Almost the same as vector unsigned integer SAT_TRUNC, try to match > the signed version during the vector pattern matching. > > The below test suites are passed for this patch. > * The rv64gcv fully regression test. > * The x86 bootstrap

Re: [PATCH 01/11] Match: Support form 1 for vector signed integer SAT_TRUNC

2024-10-21 Thread Richard Biener
On Mon, Oct 14, 2024 at 1:12 PM wrote: > > From: Pan Li > > This patch would like to support the form 1 of the vector signed > integer SAT_TRUNC. Aka below example: > > Form 1: > #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ > void __attribute__((noinline))
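
The macro body is cut off in the preview above, so the exact "form 1" expression is not reproduced here. As a rough sketch of the semantics that a signed saturating truncation is expected to have, assuming NT is the narrow type and WT the wide type as the macro parameters suggest, one could write the following. The function name `sat_s_trunc` is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Clamp a wide signed value into the range of a narrower signed type.
// Values below NT's minimum saturate to the minimum, values above NT's
// maximum saturate to the maximum, everything else is truncated as is.
template <typename NT, typename WT>
constexpr NT sat_s_trunc(WT x) {
    return x < static_cast<WT>(std::numeric_limits<NT>::min())
               ? std::numeric_limits<NT>::min()
           : x > static_cast<WT>(std::numeric_limits<NT>::max())
               ? std::numeric_limits<NT>::max()
               : static_cast<NT>(x);
}
```

For example, truncating the 16-bit value 300 to 8 bits gives 127 rather than the wrapped value 44, and -300 gives -128. Whether the patch recognizes this particular spelling is not something the preview confirms.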

RE: [PATCH]middle-end: Handle more gcond lowering [PR117176]

2024-10-21 Thread Richard Biener
On Mon, 21 Oct 2024, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Monday, October 21, 2024 9:55 AM > > To: Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd > > Subject: Re: [PATCH]middle-end: Handle more gcond

Re: [PATCHv2 1/2] cfgexpand: Handle scope conflicts better [PR111422]

2024-10-21 Thread Richard Biener
On Thu, Oct 17, 2024 at 4:43 AM Andrew Pinski wrote: > > After fixing loop-im to do the correct overflow rewriting > for pointer types too. We end up with code like: > ``` > _9 = (unsigned long) &g; > _84 = _9 + 18446744073709551615; > _11 = _42 + _84; > _44 = (signed char *) _11; > ... >

Re: [PATCH 1/2] Match: Support IMM=max-1 for unsigned scalar .SAT_SUB IMM form 1

2024-10-21 Thread Richard Biener
On Mon, Oct 21, 2024 at 5:09 AM Li Xu wrote: > > From: xuli > > This patch would like to support .SAT_SUB when one of the op > is IMM = max - 1 of form1. > > Form 1: > #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ > T __attribute__((noinline)) \ > sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)

Re: [PATCH 1/2] Match: Support IMM=1 for unsigned scalar .SAT_SUB IMM form 1

2024-10-21 Thread Richard Biener
On Mon, Oct 21, 2024 at 6:13 AM Li Xu wrote: > > From: xuli > > This patch would like to support .SAT_SUB when one of the op > is IMM = 1 of form1. > > Form 1: > #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ > T __attribute__((noinline)) \ > sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ >
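
Since the macro body is truncated in the preview, the following is only a sketch of what unsigned saturating subtraction means, not the exact form 1 expression from the patch. It assumes the immediate is the minuend, which the single `T y` parameter of the quoted macro suggests; `sat_u_sub` is a hypothetical name.

```cpp
#include <cstdint>

// Unsigned saturating subtraction: x - y, clamped at zero instead of
// wrapping around when y is larger than x.
template <typename T>
constexpr T sat_u_sub(T x, T y) {
    return x >= y ? static_cast<T>(x - y) : static_cast<T>(0);
}
```

With an 8-bit type and IMM = 1, `sat_u_sub<std::uint8_t>(1, y)` is 1 for y == 0 and 0 for every other y. With IMM = max - 1 = 254, the result is 254 - y for y up to 254 and 0 for y == 255. These two boundary immediates are the cases the IMM=1 and IMM=max-1 patches in this listing address.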

Re: [PATCH] phiopt: factor could move a trapping statement across other trapping statements [PR117235]

2024-10-21 Thread Richard Biener
On Sun, Oct 20, 2024 at 7:01 PM Andrew Pinski wrote: > > After r15-4503-g8d6d6d537fdc75, phiopt could move a trapping statement > across another trapping statement causing the order to be incorrect (might > happen > with non-call exceptions). > To prevent this, a trapping statement has to be at t

Re: [PATCH] phiopt: do factor_out_conditional_operation for all phis [PR112418]

2024-10-21 Thread Richard Biener
On Sat, Oct 19, 2024 at 5:08 PM Jeff Law wrote: > > > > On 10/18/24 7:41 PM, Andrew Pinski wrote: > > Sometimes factor_out_conditional_operation can factor out > > an operation that causes a phi node to become the same element. > > Other times, we want to factor out a binary operator because > > i

Re: [PATCH v2] MATCH: Simplify `(trunc)copysign ((extend)x, CST)` to `copysign (x, -1.0/1.0)` [PR112472]

2024-10-21 Thread Richard Biener
On Wed, Oct 16, 2024 at 6:20 PM Eikansh Gupta wrote: > > This patch simplifies `(trunc)copysign ((extend)x, CST)` to `copysign (x, > -1.0/1.0)` > depending on the sign of CST. Previously, it was simplified to `copysign (x, > CST)`. > It can be optimized as the sign of the CST matters, not the valu
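
The reasoning that only the sign of CST matters can be illustrated at the source level. The sketch below compares the widened form with a direct float copysign whose second argument is just -1.0 or 1.0; the function names are hypothetical and this is plain C++ rather than the match.pd pattern itself.

```cpp
#include <cmath>

// (trunc)copysign((extend)x, cst): widen x to double, apply the sign of
// cst, then narrow back to float.
float copysign_via_double(float x, double cst) {
    return static_cast<float>(std::copysign(static_cast<double>(x), cst));
}

// copysign(x, -1.0 or 1.0): stay in float and use only the sign of cst.
float copysign_direct(float x, double cst) {
    return std::copysign(x, std::signbit(cst) ? -1.0f : 1.0f);
}
```

Both functions return the same value for any float x, because extending a float to double and truncating it back is exact, and copysign ignores the magnitude of its second argument.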
