Re: [PATCH V2] Fix wrong cost of MEM when addr is a lea.

2024-06-27 Thread Richard Biener
On Thu, Jun 27, 2024 at 5:57 AM liuhongt wrote: > > > But rtx_cost invokes targetm.rtx_cost which allows to avoid that > > recursive processing at any level. You're dealing with MEM [addr] > > here, so why's rtx_cost (addr, Pmode, MEM, 0, speed) not always > > the best way to deal with this?

Re: [PATCH v3] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-27 Thread Richard Biener
On Thu, Jun 27, 2024 at 3:31 AM wrote: > > From: Pan Li OK > The zip benchmark of coremark-pro have one SAT_SUB like pattern but > truncated as below: > > void test (uint16_t *x, unsigned b, unsigned n) > { > unsigned a = 0; > register uint16_t *p = x; > > do { > a = *--p; > *p =

[PATCH] RISC-V: tree-optimization/115652 - amend last fix

2024-06-26 Thread Richard Biener
The previous fix breaks in the degenerate case when the discovered last_stmt is equal to the first stmt in the block since then we undo a required stmt advancement. Bootstrap and regtest running on x86_64-unknown-linux-gnu. PR tree-optimization/115652 * tree-vect-slp.cc

[PATCH] tree-optimization/115493 - complete previous fix

2024-06-26 Thread Richard Biener
The following fixes the 2nd occurrence of new_temp missed with the previous fix. Bootstrap and regtest running on x86_64-unknown-linux-gnu. PR tree-optimization/115493 * tree-vect-loop.cc (vect_create_epilog_for_reduction): Use first scalar result. ---

Re: [PATCH] vect: Fix shift-by-induction for single-lane slp

2024-06-26 Thread Richard Biener
On Wed, Jun 26, 2024 at 4:58 PM Feng Xue OS wrote: > > Allow shift-by-induction for slp node, when it is single lane, which is > aligned with the original loop-based handling. OK. Did you try whether we handle multiple lanes correctly? The simplest case would be a loop body with say a[2*i]

RE: [PATCH]middle-end: Implement conditonal store vectorizer pattern [PR115531]

2024-06-26 Thread Richard Biener
On Wed, 26 Jun 2024, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Wednesday, June 26, 2024 2:23 PM > > To: Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > > Subject: Re: [PATCH]mi

Re: [PATCH v2] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-26 Thread Richard Biener
On Mon, Jun 24, 2024 at 3:55 PM wrote: > > From: Pan Li > > The zip benchmark of coremark-pro have one SAT_SUB like pattern but > truncated as below: > > void test (uint16_t *x, unsigned b, unsigned n) > { > unsigned a = 0; > register uint16_t *p = x; > > do { > a = *--p; > *p =

Re: [PATCH v1] Internal-fn: Support new IFN SAT_TRUNC for unsigned scalar int

2024-06-26 Thread Richard Biener
On Wed, Jun 26, 2024 at 3:46 AM wrote: > > From: Pan Li > > This patch would like to add the middle-end presentation for the > saturation truncation. Aka set the result of truncated value to > the max value when overflow. It will take the pattern similar > as below. > > Form 1: > #define
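
As a rough illustration of the behaviour described in the quoted text (the truncated result is set to the maximum value on overflow), a minimal C sketch follows. The function name `sat_trunc_u16_to_u8` and the choice of types are assumptions made for illustration only; this is not the Form 1 macro from the patch, which is cut off in the preview above.

```c
#include <stdint.h>

/* Hypothetical helper: saturating truncation from 16-bit to 8-bit unsigned.
   Values that do not fit the narrow type are clamped to its maximum.  */
uint8_t
sat_trunc_u16_to_u8 (uint16_t x)
{
  return x > UINT8_MAX ? UINT8_MAX : (uint8_t) x;
}
```

A plain cast `(uint8_t) x` would instead wrap modulo 256, which is the difference the saturating form is meant to capture.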

Re: [PATCH]middle-end: Implement conditonal store vectorizer pattern [PR115531]

2024-06-26 Thread Richard Biener
ion pattern > + won't hit on the pattern statement. */ > + cmp_ls = build_mask_conversion (vinfo, var, gs_vectype, stmt_vinfo); Isn't this somewhat redundant with the below call? I fear of bad [non-]interactions with bool pattern recognition btw. > +} > + > + tree mask = vect_conve

Re: [PING] Re: [PATCH 1/2] ivopts: Revert computation of address cost complexity

2024-06-26 Thread Richard Biener
On Wed, Jun 26, 2024 at 2:28 PM Aleksandar Rakic wrote: > > Hi! > > I'd like to ping the following patch: > > https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647966.html > a patch for the computation of the complexity for the unsupported > addressing modes in ivopts The thread starting

[PATCH] tree-optimization/115640 - outer loop vect with inner SLP permute

2024-06-26 Thread Richard Biener
The following fixes wrong-code when using outer loop vectorization and an inner loop SLP access with permutation. A wrong adjustment to the IV increment is then applied on GCN. Bootstrap and regtest running on x86_64-unknown-linux-gnu. PR tree-optimization/115640 *

[PATCH] tree-optimization/115652 - adjust insertion gsi for SLP

2024-06-26 Thread Richard Biener
The following adjusts how SLP computes the insertion location. In particular it advanced the insert iterator of the found last_stmt. The vectorizer will later insert stmts _before_ it. But we also have the constraint that possibly masked ops may not be scheduled outside of the loop and as we do

Re: [PATCH] Fix wrong cost of MEM when addr is a lea.

2024-06-26 Thread Richard Biener
On Wed, Jun 26, 2024 at 9:14 AM Hongtao Liu wrote: > > On Wed, Jun 26, 2024 at 2:52 PM Richard Biener > wrote: > > > > On Wed, Jun 26, 2024 at 8:09 AM liuhongt wrote: > > > > > > 416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8c1c0.

Re: [PATCH] Fix wrong cost of MEM when addr is a lea.

2024-06-26 Thread Richard Biener
On Wed, Jun 26, 2024 at 8:09 AM liuhongt wrote: > > 416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8c1c0. > The commit adjust rtx_cost of mem to reduce cost of (add op0 disp). > But Cost of ADDR could be cheaper than XEXP (addr, 0) when it's a lea. > It is the case in the PR, the

[PATCH] tree-optimization/115646 - ICE with pow shrink-wrapping from bitfield

2024-06-25 Thread Richard Biener
The following makes analysis and transform agree on constraints. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. PR tree-optimization/115646 * tree-call-cdce.cc (check_pow): Check for bit_sz values as allowed by transform. * gcc.dg/pr115646.c: New

[PATCH] GORI cleanups

2024-06-25 Thread Richard Biener
The following replaces conditional is_export_p calls as is_export_p handles a NULL bb itself. Bootstrap running on x86_64-unknown-linux-gnu, OK? Thanks, Richard. * gimple-range-gori.cc (gori_compute::may_recompute_p): Call is_export_p with NULL bb. --- gcc/gimple-range-gori.cc

[PATCH] tree-optimization/115629 - missed tail merging

2024-06-25 Thread Richard Biener
The following fixes a missed tail-merging observed for the testcase in PR115629. The issue is that when deps_ok_for_redirect doesn't compute both would be valid prevailing blocks it rejects the merge. The following instead makes sure to record the working block as prevailing. Also stmt

Re: [PATCH 1/3 v4] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-25 Thread Richard Biener
mode)) > + return false; > + } > + code1 = float_expr_p ? (tree_code) code : NOP_EXPR; > + code2 = float_expr_p ? NOP_EXPR : (tree_code) code; > + opt_scalar_mode mode_iter; > + FOR_EACH_2XWIDER_MODE (mode_iter, intermediate_mode) > + { > +

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-06-25 Thread Richard Biener
On Mon, 24 Jun 2024, Tamar Christina wrote: > > > > -Original Message- > > From: Richard Biener > > Sent: Thursday, June 20, 2024 8:49 AM > > To: Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > > Subject:

Re: [PATCH V2] [x86] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

2024-06-25 Thread Richard Biener
On Mon, Jun 24, 2024 at 1:28 AM liuhongt wrote: > > > I think the check for TYPE_UNSIGNED should be of TREE_TYPE (@0) rather > > than type here. > > Changed > > > Or maybe you need `types_match (type, TREE_TYPE (@0))` too. > And use tree_nop_conversion_p (type, TREE_TYPE (@0)) and add

Re: [PATCH][v2] Support single def-use cycle optimization for SLP reduction vectorization

2024-06-25 Thread Richard Biener
On Tue, 25 Jun 2024, Thomas Schwinge wrote: > Hi! > > On 2024-06-14T11:08:15+0200, Richard Biener wrote: > > We can at least mimic single def-use cycle optimization when doing > > single-lane SLP reductions and that's required to avoid regressing > > compared to no

Re: [PATCH] Add a late-combine pass [PR106594]

2024-06-25 Thread Richard Biener
On Mon, Jun 24, 2024 at 9:38 PM Segher Boessenkool wrote: > > I didn't see this before. Sigh. > > On Tue, Jan 02, 2024 at 09:47:11AM +, Richard Sandiford wrote: > > Segher Boessenkool writes: > > > On Tue, Oct 24, 2023 at 07:49:10PM +0100, Richard Sandiford wrote: > > >> This patch adds a

Re: [PATCH 7/8] vect: Support multiple lane-reducing operations for loop reduction [PR114440]

2024-06-25 Thread Richard Biener
On Tue, Jun 25, 2024 at 11:32 AM Feng Xue OS wrote: > > >> > >> >> - if (slp_node) > >> >> + if (slp_node && SLP_TREE_LANES (slp_node) > 1) > >> > > >> > Hmm, that looks wrong. It looks like SLP_TREE_NUMBER_OF_VEC_STMTS is off > >> > instead, which is bad. > >> > > >> >>

Re: [PATCH 7/8] vect: Support multiple lane-reducing operations for loop reduction [PR114440]

2024-06-24 Thread Richard Biener
> > > >> + for (unsigned i = 0; i < op.num_ops - 1; i++) > >> + { > >> + gcc_assert (vec_oprnds[i].length () == using_ncopies); > >> + vec_oprnds[i].safe_grow_cleared (reduc_ncopies); > >> + } > >> +} > >

RE: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-24 Thread Richard Biener
On Thu, 20 Jun 2024, Hu, Lin1 wrote: > > >else if (ret_elt_bits > arg_elt_bits) > > > modifier = WIDEN; > > > > > > + if (supportable_convert_operation (code, ret_type, arg_type, &code1)) > > > +{ > > > + g = gimple_build_assign (lhs, code1, arg); > > > + gsi_replace (gsi, g,

Re: [PATCH] cfg: propagate source location in gimple_split_edge [PR115564]

2024-06-24 Thread Richard Biener
On Sat, Jun 22, 2024 at 12:26 AM David Malcolm wrote: > > PR analyzer/115564 reports a missing warning from the analyzer > on this infinite loop at -O2 and above: > > void test (unsigned b) > { >for (unsigned i = b; i >= 0; --i) {} > } > > The issue is that there are no useful location_t
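
The loop in the quoted report never terminates because `i >= 0` is always true for an unsigned `i`. For comparison, here is a sketch of one common way to write a terminating unsigned countdown. The function name is hypothetical, the iteration range (`b - 1` down to `0`) is an assumption about intent, and none of this is part of the patch itself.

```c
/* Hypothetical helper: counts the iterations of an unsigned countdown
   that visits b - 1, b - 2, ..., 0 and then stops.  */
unsigned
countdown_iterations (unsigned b)
{
  unsigned count = 0;
  /* The test happens before the decrement, so the loop exits once i
     is 0 instead of continuing after wrap-around.  */
  for (unsigned i = b; i-- > 0;)
    count++;
  return count;
}
```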

Re: [PATCH v2] PR tree-opt/113673: Avoid load merging when potentially trapping.

2024-06-24 Thread Richard Biener
ed from outside > this file, and guaranteeing that it is dominated by stmt_can_throw_internal > checking. > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32} > with no new failures. Ok for mainline? OK. Th

Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-24 Thread Richard Biener
On Mon, Jun 24, 2024 at 1:34 PM Richard Sandiford wrote: > > Richard Biener writes: > > On Mon, Jun 24, 2024 at 10:03 AM Richard Sandiford > > wrote: > >> > >> Richard Biener writes: > >> > On Sat, Jun 22, 2024 at 6:50 PM Richard Sandifo

[PATCH] tree-optimization/115602 - SLP CSE results in cycles

2024-06-24 Thread Richard Biener
The following prevents SLP CSE from creating new cycles, which happened because of a 1:1 permute node being present where its child was then CSEd to the permute node. Fixed by making a node available to CSE only after recursing. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Re: [PATCH] Add -finline-functions-aggressive option [PR114531]

2024-06-24 Thread Richard Biener
On Mon, Jun 24, 2024 at 1:18 PM Malladi, Rama wrote: > > From: Rama Malladi Hmm, if we offer the ability to set -O3 inline limits why wouldn't we offer a way to set -O2 inline limits for example with -O3? So ... wouldn't a -finline-limit={default,O2,O3} option be a more generic and extensible

Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-24 Thread Richard Biener
On Mon, Jun 24, 2024 at 10:03 AM Richard Sandiford wrote: > > Richard Biener writes: > > On Sat, Jun 22, 2024 at 6:50 PM Richard Sandiford > >> The traditional (and IMO correct) way to handle this is to make the > >> pattern reserve the temporary

Re: Ping^3 [PATCHv5] Optab: add isnormal_optab for __builtin_isnormal

2024-06-24 Thread Richard Biener
On Mon, Jun 24, 2024 at 3:39 AM HAO CHEN GUI wrote: > > Hi, > Gently ping it. > https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653001.html OK > Thanks > Gui Haochen > > On 2024/6/17 13:30, HAO CHEN GUI wrote: > > Hi, > > Gently ping it. > >

Re: Ping^3 [PATCHv5] Optab: add isfinite_optab for __builtin_isfinite

2024-06-24 Thread Richard Biener
On Mon, Jun 24, 2024 at 3:38 AM HAO CHEN GUI wrote: > > Hi, > Gently ping it. > https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652991.html OK > Thanks > Gui Haochen > > On 2024/6/17 13:29, HAO CHEN GUI wrote: > > Hi, > > Gently ping it. > >

[PATCH] tree-optimization/115599 - reassoc qsort comparator issue

2024-06-23 Thread Richard Biener
The compare_repeat_factors comparator fails qsort checking eventually because it uses rf2->rank - rf1->rank to compare unsigned numbers which causes issues for ranks that interpret negative as signed. Fixed by re-writing the obvious way. I've also fixed the count comparison which suffers from
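
The pitfall described above can be sketched in plain C. The struct and function names below are hypothetical stand-ins, not the ones used in the reassociation pass; the sketch only assumes an entry with an unsigned rank that should sort in descending order, as `rf2->rank - rf1->rank` suggests.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an entry carrying an unsigned rank.  */
struct ranked_item
{
  unsigned rank;
};

/* Problematic: the unsigned difference wraps around, and converting it
   to int can report "less than" in both directions once two ranks are
   far enough apart, which a qsort consistency check would reject.  */
int
cmp_by_subtraction (const void *p1, const void *p2)
{
  const struct ranked_item *a = p1;
  const struct ranked_item *b = p2;
  return (int) (b->rank - a->rank);
}

/* Explicit comparison: descending by rank, consistent for all values.  */
int
cmp_explicit (const void *p1, const void *p2)
{
  const struct ranked_item *a = p1;
  const struct ranked_item *b = p2;
  if (a->rank != b->rank)
    return a->rank < b->rank ? 1 : -1;
  return 0;
}
```

With ranks `0` and `0x80000000`, the subtraction variant typically returns a negative value regardless of argument order, while the explicit variant stays antisymmetric.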

[PATCH] tree-optimization/115597 - allow CSE of two-operator VEC_PERM nodes

2024-06-23 Thread Richard Biener
The following makes sure to always CSE when there's SLP_TREE_SCALAR_STMTS as otherwise a chain of two-operator node operations can result in exponential behavior of the CSE process as likely seen when building 510.parest on aarch64. Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-23 Thread Richard Biener
On Sat, Jun 22, 2024 at 6:50 PM Richard Sandiford wrote: > > Takayuki 'January June' Suwa writes: > > On 2024/06/20 22:34, Richard Sandiford wrote: > >> This patch adds a combine pass that runs late in the pipeline. > >> There are two instances: one between combine and split1, and one > >> after

[PATCH] tree-optimization/115579 - fix wrong code with store-motion

2024-06-22 Thread Richard Biener
The recent change to relax store motion for variables that cannot have store data races broke the optimization to share flag vars for stores that all happen in the same single BB. The following fixes this. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. PR

Re: [PATCH v1] Ifcvt: Add cond tree reconcile for truncated .SAT_SUB

2024-06-22 Thread Richard Biener
0x1919f69 execute_function_todo > > /home/pli/gcc/555/riscv-gnu-toolchain/gcc/__RISCV_BUILD__/../gcc/passes.cc:2096 > 0x1918b46 do_per_function > > /home/pli/gcc/555/riscv-gnu-toolchain/gcc/__RISCV_BUILD__/../gcc/passes.cc:1688 > 0x191a116 execute_todo > >

Re: [PATCH] Add param for bb limit to invoke fast_vrp.

2024-06-22 Thread Richard Biener
On Fri, Jun 21, 2024 at 3:02 PM Andrew MacLeod wrote: > > This patch adds > > --param=vrp-block-limit=N > > When the basic block counter for a function exceeded 'N' , VRP is > invoked with the new fast_vrp algorithm instead. This algorithm uses a > lot less memory and processing power,

[PATCH] tree-optimization/115528 - fix vect alignment analysis for outer loop vect

2024-06-21 Thread Richard Biener
For outer loop vectorization of a data reference in the inner loop we have to look at both steps to see if they preserve alignment. What is special for this testcase is that the outer loop step is one element but the inner loop step four and that we now use SLP and the vectorization factor is

Re: [PATCH] middle-end/114189 - drop uses of vcond{,u,eq}_optab

2024-06-21 Thread Richard Biener
On Thu, 20 Jun 2024, Richard Sandiford wrote: > Richard Biener writes: > > On Mon, 17 Jun 2024, Richard Sandiford wrote: > > > >> Richard Biener writes: > >> > On Fri, 14 Jun 2024, Richard Biener wrote: > >> > > >> >> On Fri, 1

Re: [PATCH v1] Ifcvt: Add cond tree reconcile for truncated .SAT_SUB

2024-06-21 Thread Richard Biener
to do is simply to add a conversion stmt to the pattern sequence in case the types differ? But maybe I'm missing something. Richard. > Pan > > -Original Message- > From: Richard Biener > Sent: Friday, June 21, 2024 3:00 PM > To: Li, Pan2 > Cc: gcc-patches@gcc.

Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-21 Thread Richard Biener
On Fri, Jun 21, 2024 at 10:21 AM Richard Sandiford wrote: > > Richard Biener writes: > > [...] > > I wonder if you can amend doc/passes.texi, specifically noting differences > > between fwprop, combine and late-combine? > > Ooh, we have a doc/passes.texi? :) Some

Re: [PATCH] MATCH: Simplify (vec CMP vec) eq/ne (vec CMP vec) [PR111150]

2024-06-21 Thread Richard Biener
On Fri, Jun 21, 2024 at 9:12 AM Eikansh Gupta wrote: > > We can optimize (vec_cond eq/ne vec_cond) when vec_cond is a > result of (vec CMP vec). The optimization is because of the > observation that in vec_cond, (-1 != 0) is true. So, we can > generate vec_cond of xor of vec resulting in a single

Re: [PATCH v1] Ifcvt: Add cond tree reconcile for truncated .SAT_SUB

2024-06-21 Thread Richard Biener
On Fri, Jun 21, 2024 at 5:53 AM wrote: > > From: Pan Li > > The zip benchmark of coremark-pro have one SAT_SUB like pattern but > truncated as below: > > void test (uint16_t *x, unsigned b, unsigned n) > { > unsigned a = 0; > register uint16_t *p = x; > > do { > a = *--p; > *p =

Re: [PATCH] [match.pd] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

2024-06-21 Thread Richard Biener
On Fri, Jun 21, 2024 at 5:11 AM Andrew Pinski wrote: > > On Thu, Jun 20, 2024 at 7:56 PM liuhongt wrote: > > > > Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31 > > and x < 0 ? 1 : 0 into (unsigned) x >> 31. > > > > Move the optimization did in ix86_expand_int_vcond to match.pd > > > >
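
The two equivalences quoted above can be checked with a small C sketch for a 32-bit operand. The function names are hypothetical. Note that right-shifting a negative signed value is implementation-defined in ISO C; the sketch assumes the arithmetic shift that GCC documents for its targets.

```c
#include <stdint.h>

/* x < 0 ? -1 : 0, written as a conditional.  */
int32_t
sign_mask_select (int32_t x)
{
  return x < 0 ? -1 : 0;
}

/* The same value via an arithmetic right shift of the signed operand
   (implementation-defined in ISO C; arithmetic on GCC).  */
int32_t
sign_mask_shift (int32_t x)
{
  return x >> 31;
}

/* x < 0 ? 1 : 0, written as a conditional.  */
uint32_t
sign_bit_select (int32_t x)
{
  return x < 0 ? 1u : 0u;
}

/* The same value via a logical right shift of the unsigned operand.  */
uint32_t
sign_bit_shift (int32_t x)
{
  return (uint32_t) x >> 31;
}
```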

[PATCH] Remove outdated info from passes.texi

2024-06-21 Thread Richard Biener
This applies some maintenance to passes.texi by removing references to no longer existing passes. It also fixes a few minor things but doesn't fill the gaps that meanwhile exist. make pdf tested, pushed. * doc/passes.texi: Remove references to no longer existing passes. ---

Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-20 Thread Richard Biener
On Thu, Jun 20, 2024 at 3:37 PM Richard Sandiford wrote: > > This patch adds a combine pass that runs late in the pipeline. > There are two instances: one between combine and split1, and one > after postreload. > > The pass currently has a single objective: remove definitions by > substituting

Re: [PATCH] complex-lowering: Better handling of PAREN_EXPR [PR68855]

2024-06-20 Thread Richard Biener
> On 21.06.2024 at 04:35, Andrew Pinski wrote: > > When PAREN_EXPR tree code was added in r0-85884-gdedd42d511b6e4, > a simplified handling was added to complex lowering. Which means > we would get: > ``` > _9 = COMPLEX_EXPR <_15, _14>; > _11 = ((_9)); > _19 = REALPART_EXPR <_11>; > _20

Re: [PATCH] s390: define single step vector casts

2024-06-20 Thread Richard Biener
> On 20.06.2024 at 17:40, Stefan Schulze Frielinghaus wrote: > > On Thu, Jun 20, 2024 at 09:06:11AM +0200, Juergen Christ wrote: >> Some casts were missing, leading to missed or bad vectorizations where >> casting was done scalar followed by a vector creation from the >> individual

Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-06-20 Thread Richard Biener
> On 20.06.2024 at 16:05, Andrew MacLeod wrote: > > >> On 6/20/24 05:31, Richard Biener wrote: >>> On Thu, 20 Jun 2024, Aldy Hernandez wrote: >>> >>> Hi. >>> >>> I came around to this, and whipped up the proposed patch. Howeve

Re: [wwwdocs] simtest-howto.html: Remove cc0 mentions

2024-06-20 Thread Richard Biener
On Thu, Jun 20, 2024 at 1:32 PM Georg-Johann Lay wrote: > > cc0 has been removed long ago, removed mentions. OK > Johann > > diff --git a/htdocs/simtest-howto.html b/htdocs/simtest-howto.html > index ea69c9ed..f18a78f6 100644 > --- a/htdocs/simtest-howto.html > +++ b/htdocs/simtest-howto.html >

Re: [PATCH 3/7 v2] Lockfile.

2024-06-20 Thread Richard Biener
te lock. No other lock can be held on this lockfile. > + Blocking call. */ > + int lock_write (); > + > + /* Unique write lock. No other lock can be held on this lockfile. > + Only locks if this filelock is not locked by any other process. > + Return whether locking was successful. */ > + int try_lock_write (); > + > + /* Shared read lock. Only read lock can be held concurrently. > + If write lock is already held by this process, it will be > + changed to read lock. > + Blocking call. */ > + int lock_read (); > + > + /* Unlock all previously placed locks. */ > + void unlock (); > + > + /* Returns whether any lock is held. */ > + bool > + locked () > + { > +return fd < 0; > + } > + > + /* Are lockfiles supported? */ > + static bool lockfile_supported (); > +private: > + std::string filename; > + int fd; > +}; > + > +#endif > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH 7/8] vect: Support multiple lane-reducing operations for loop reduction [PR114440]

2024-06-20 Thread Richard Biener
On Sun, Jun 16, 2024 at 9:31 AM Feng Xue OS wrote: > > For lane-reducing operation(dot-prod/widen-sum/sad) in loop reduction, current > vectorizer could only handle the pattern if the reduction chain does not > contain other operation, no matter the other is normal or lane-reducing. > > Actually,

Re: [PATCH 4/8] vect: Determine input vectype for multiple lane-reducing

2024-06-20 Thread Richard Biener
short>(short_c0_lo, short_c1_lo, sum_v0); > sum_v1 = dot_prod<8 * short>(short_c0_hi, short_c1_hi, sum_v1); > } > > For this purpose, we need to track the vectype_in that results in > the most ncopies, for this case, the type is <8 * short>. So the VF

Re: [PATCH] testsuite/ubsan/overflow-div-3.c: Use SIGTRAP for MIPS

2024-06-20 Thread Richard Biener
On Thu, Jun 20, 2024 at 12:57 PM Maciej W. Rozycki wrote: > > On Thu, 20 Jun 2024, YunQiang Su wrote: > > > The DIV instructions of MIPS won't be trapped themself if the divisor > > is zero. The compiler will emit a conditional trap instruct for it. > > So the signal will be SIGTRAP instead of

GCC 11.4.1 Status Report (2024-06-20)

2024-06-20 Thread Richard Biener
Status == The gcc-11 branch nears its retirement with the last release from it, GCC 11.5, on the horizon. Please look through bugzilla and see which of your regression fixes for GCC 12 are also applicable for the GCC 11 branch and do the necessary backporting. Please err on the safe side

Re: [PATCH] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-06-20 Thread Richard Biener
obstack is released after each pass. But ranger instances are also not expected to be created multiple times each pass, right? I don't have a strong opinion. Richard. > Aldy > > On Mon, Apr 8, 2024 at 7:47 PM Richard Biener > wrote: > > > > > > > > > Am 08.04.2

GCC 12.4.1 Status Report (2024-06-20)

2024-06-20 Thread Richard Biener
Status == GCC 12.4 has been released and the branch is again open for regression and documentation fixes.
Quality Data
Priority    #     Change from last report
--------    ---   -----------------------
P1          0
P2          588   - 31
P3          76    - 1

RE: [PATCH][ivopts]: perform affine fold on unsigned addressing modes known not to overflow. [PR114932]

2024-06-20 Thread Richard Biener
On Wed, 19 Jun 2024, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Wednesday, June 19, 2024 1:14 PM > > To: Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > > Subject: Re:

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-06-20 Thread Richard Biener
On Wed, 19 Jun 2024, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Wednesday, June 19, 2024 12:55 PM > > To: Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com > > Subject: Re: [

Re: [PATCH] middle-end/114070 - folding breaking VEC_COND expansion

2024-06-20 Thread Richard Biener
th this pattern, it > requires vectors and it fails only on targets where there is no vector > support enabled. > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906 > > Thanks, > Andrew Pinski > > > > > > > Kind regards, > > Vasee > > > >

Re: [PATCH] bitint: Fix up lowering of COMPLEX_EXPR [PR115544]

2024-06-19 Thread Richard Biener
> On 19.06.2024 at 20:44, Jakub Jelinek wrote: > > Hi! > > We don't really support _Complex _BitInt(N), the only place we use > bitint complex types is for the .{ADD,SUB,MUL}_OVERFLOW internal function > results and COMPLEX_EXPR in the usual case should be either not present > yet because

Re: [PATCH][v2] Enhance if-conversion for automatic arrays

2024-06-19 Thread Richard Biener
> On 19.06.2024 at 20:25, Toon Moene wrote: > > On 6/17/24 16:05, Richard Biener wrote: > >> Automatic arrays that are not address-taken should not be subject to >> store data races. This applies to OMP SIMD in-branch lowered >> functions result array wh

Re: [PATCH] tree-optimization/114413 - SLP CSE after permute optimization

2024-06-19 Thread Richard Biener
On Wed, 19 Jun 2024, Richard Sandiford wrote: > Richard Biener writes: > > We currently fail to re-CSE SLP nodes after optimizing permutes > > which results in off cost estimates. For gcc.dg/vect/bb-slp-32.c > > this shows in not re-using the SLP node with the load and ar

Re: [PATCH 5/8] vect: Use an array to replace 3 relevant variables

2024-06-19 Thread Richard Biener
On Sun, Jun 16, 2024 at 9:27 AM Feng Xue OS wrote: > > It's better to place 3 relevant independent variables into array, since we > have requirement to access them via an index in the following patch. At the > same time, this change may get some duplicated code be more compact. OK. I might have

Re: [PATCH 6/8] vect: Tighten an assertion for lane-reducing in transform

2024-06-19 Thread Richard Biener
On Sun, Jun 16, 2024 at 9:28 AM Feng Xue OS wrote: > > According to logic of code nearby the assertion, all lane-reducing operations > should not appear, not just DOT_PROD_EXPR. Since "use_mask_by_cond_expr_p" > treats SAD_EXPR same as DOT_PROD_EXPR, and WIDEN_SUM_EXPR should not be > allowed >

Re: [PATCH 4/8] vect: Determine input vectype for multiple lane-reducing

2024-06-19 Thread Richard Biener
On Sun, Jun 16, 2024 at 9:25 AM Feng Xue OS wrote: > > The input vectype of reduction PHI statement must be determined before > vect cost computation for the reduction. Since lane-reducing operation has > different input vectype from normal one, so we need to traverse all reduction > statements

Re: [PATCH 3/8] vect: Use one reduction_type local variable

2024-06-19 Thread Richard Biener
On Sun, Jun 16, 2024 at 9:23 AM Feng Xue OS wrote: > > Two local variables were defined to refer same STMT_VINFO_REDUC_TYPE, better > to keep only one. OK. Richard. > Thanks, > Feng > > --- > gcc/ > * tree-vect-loop.cc (vectorizable_reduction): Remove v_reduc_type, and >

Re: [PATCH 2/8] vect: Remove duplicated check on reduction operand

2024-06-19 Thread Richard Biener
On Sun, Jun 16, 2024 at 9:22 AM Feng Xue OS wrote: > > In vectorizable_reduction, one check on a reduction operand via index could be > contained by another one check via pointer, so remove the former. OK. Thanks, Richard. > Thanks, > Feng > > --- > gcc/ > * tree-vect-loop.cc

Re: [PATH 1/8] vect: Add a function to check lane-reducing stmt

2024-06-19 Thread Richard Biener
On Sun, Jun 16, 2024 at 9:21 AM Feng Xue OS wrote: > > The series of patches are meant to support multiple lane-reducing reduction > statements. Since the original ones conflicted with the new single-lane slp > node patches, I have reworked most of the patches, and split them as small as >

[PATCH] tree-optimization/114413 - SLP CSE after permute optimization

2024-06-19 Thread Richard Biener
We currently fail to re-CSE SLP nodes after optimizing permutes which results in off cost estimates. For gcc.dg/vect/bb-slp-32.c this shows in not re-using the SLP node with the load and arithmetic for both the store and the reduction. The following implements CSE by re-bst-mapping nodes as

Re: [PATCH][ivopts]: perform affine fold on unsigned addressing modes known not to overflow. [PR114932]

2024-06-19 Thread Richard Biener
"original" iv->base to be used for code generation (and there only the unexpanded form) and a variant used for the various sorts of canonicalization/compare (I see we eventually add/subtract step and then compare against sth else). And then apply this normalization always to the not "original" form. The above STRIP_NOPS (expr) + expand might turn an unsigned affine combination into a signed one which might be problematic. So what happens if you change the above to simply always unsigned expand? Richard. >iv->base = base; >iv->base_object = determine_base_object (data, base); >iv->step = step;

Re: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-06-19 Thread Richard Biener
iv_step, use->iv->step) > + && affine_compare_eq (iv_addr_base, use->addr_base)) There's only this use of addr_base so I think the opportunity is to turn iv_use->addr_base into aff_tree (even though that's a quite big representation). For the testcase, what are the two IVs we are comparing? I wonder why you need the affine compare for iv->step? > break; > } >if (i == data->vgroups.length ()) > @@ -2231,6 +2248,14 @@ constant_multiple_of (tree top, tree bot, widest_int > *mul) >return true; > } > > + aff_tree aff_top, aff_bot; > + tree_to_aff_combination (top, TREE_TYPE (top), &aff_top); > + tree_to_aff_combination (bot, TREE_TYPE (bot), &aff_bot); > + poly_widest_int poly_mul; > + if (aff_combination_constant_multiple_p (&aff_top, &aff_bot, &poly_mul) > + && poly_mul.is_constant (mul)) > +return true; > + So why does stripping nops not work here? >code = TREE_CODE (top); >switch (code) > {

[PATCH] Improve gcc.dg/vect/bb-slp-32.c testcase

2024-06-19 Thread Richard Biener
The following adds a correctness check to the combined store/reduce vectorization. Tested on x86_64-unknown-linux-gnu, pushed. * gcc.dg/vect/bb-slp-32.c: Add check for correctness. --- gcc/testsuite/gcc.dg/vect/bb-slp-32.c | 26 -- 1 file changed, 20

Re: [PATCH v1] Match: Support more forms for the scalar unsigned .SAT_SUB

2024-06-19 Thread Richard Biener
vectorize_conversion?). > Would like to get some hint from you before the next step, thanks a lot. > > patt_34 = .SAT_SUB (m_11, wsize_12(D)); > patt_35 = (vector([8,8]) short unsigned int) patt_34; > > Pan > > -Original Message- > From: Richard Biener >

Re: [C PATCH] Fix ICE related to incomplete structures in C23 [PR114930,PR115502].

2024-06-19 Thread Richard Biener
On Wed, 19 Jun 2024, Martin Uecker wrote: > On Wednesday, 19.06.2024 at 08:57 +0200, Richard Biener wrote: > > On Wed, 19 Jun 2024, Martin Uecker wrote: > > > > > On Wednesday, 19.06.2024 at 08:04 +0200, Richard Biener wrote: > > > > > > &

Re: [C PATCH] Fix ICE related to incomplete structures in C23 [PR114930,PR115502].

2024-06-19 Thread Richard Biener
On Wed, 19 Jun 2024, Jakub Jelinek wrote: > On Wed, Jun 19, 2024 at 08:04:55AM +0200, Richard Biener wrote: > > >> Note a canonical type should always be unqualified (for > > >> classical qualifiers, not address space or atomic qualification) > > > >

Re: [C PATCH] Fix ICE related to incomplete structures in C23 [PR114930,PR115502].

2024-06-19 Thread Richard Biener
On Wed, 19 Jun 2024, Martin Uecker wrote: > On Wednesday, 19.06.2024 at 08:04 +0200, Richard Biener wrote: > > > > > On 18.06.2024 at 20:18, Martin Uecker wrote: > > > > > > On Tuesday, 18.06.2024 at 17:27 +0200, Richard Biener wrote: > &g

Re: [C PATCH] Fix ICE related to incomplete structures in C23 [PR114930, PR115502].

2024-06-19 Thread Richard Biener
> On 18.06.2024 at 20:18, Martin Uecker wrote: > > On Tuesday, 18.06.2024 at 17:27 +0200, Richard Biener wrote: >> >>>> On 18.06.2024 at 17:20, Martin Uecker wrote: >>> >>> >>> As discussed this replaces the use of check

Re: [C PATCH] Fix ICE related to incomplete structures in C23 [PR114930, PR115502].

2024-06-18 Thread Richard Biener
> On 18.06.2024 at 17:20, Martin Uecker wrote: > > As discussed this replaces the use of check_qualified_type with > a simple check for qualifiers as suggested by Jakub in > c_update_type_canonical. Note a canonical type should always be unqualified (for classical qualifiers, not

[PATCH] tree-optimization/115537 - ICE with SLP condition reduction vectorization

2024-06-18 Thread Richard Biener
The condition rejecting "multiple-type" SLP condition reduction lacks handling EXTRACT_LAST reductions. Bootstrap and regtest in progress on x86_64-unknown-linux-gnu. Richard. PR tree-optimization/115537 * tree-vect-loop.cc (vectorizable_reduction): Also reject SLP

Re: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-18 Thread Richard Biener
diate_mode) > target_size) > + break; > + > + scalar_mode cvt_mode; > + if (!int_mode_for_size > + (GET_MODE_BITSIZE (intermediate_mode), 0).exists (&cvt_mode)) > + break; > + > + cvt_type = build_nonstandard_integer_type

Re: [PATCH 0/8] Follow-on force_subreg patches

2024-06-18 Thread Richard Biener
On Mon, Jun 17, 2024 at 11:55 AM Richard Sandiford wrote: > > This series expands on the fix for PR115464 by using force_subreg > in more places. It also adds some convenience wrappers for lowpart > and highpart subregs. > > A part of this will need to be backported after a grace period, > but

Re: [PATCH v1] Match: Support form 11 for the unsigned scalar .SAT_SUB

2024-06-18 Thread Richard Biener
On Mon, Jun 17, 2024 at 9:07 AM wrote: > > From: Pan Li > > We missed one match pattern for the unsigned scalar .SAT_SUB, aka > form 11. > > Form 11: > #define SAT_SUB_U_11(T) \ > T sat_sub_u_11_##T (T x, T y) \ > { \ > T ret; \ > bool overflow = __builtin_sub_overflow (x, y, &ret); \
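
For readers unfamiliar with the idiom quoted above, the following is a minimal sketch of an unsigned saturating subtraction built on `__builtin_sub_overflow`. The quoted macro is cut off in this listing, so this is not claimed to be the exact body of form 11; the function name `sat_sub_u8` and the fixed `uint8_t` type are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: unsigned saturating subtraction for uint8_t.
   The name sat_sub_u8 is hypothetical and does not come from the patch.  */
uint8_t
sat_sub_u8 (uint8_t x, uint8_t y)
{
  uint8_t ret;
  /* Returns true when the mathematically exact x - y does not fit
     in uint8_t, which for unsigned operands means x < y.  */
  bool overflow = __builtin_sub_overflow (x, y, &ret);
  /* Saturate to 0 on underflow, otherwise keep the exact difference.  */
  return overflow ? 0 : ret;
}
```

For example, `sat_sub_u8 (5, 10)` yields 0 rather than wrapping around, while `sat_sub_u8 (10, 5)` yields 5.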

Re: [PATCH v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD

2024-06-18 Thread Richard Biener
On Mon, Jun 17, 2024 at 3:41 AM wrote: > > From: Pan Li > > When investigate the vectorization of .SAT_ADD, we notice there > are additional 2 forms, aka form 7 and 8 for .SAT_ADD. > > Form 7: > #define DEF_SAT_U_ADD_FMT_7(T) \ > T __attribute__((noinline)) \ >

Re: [PATCH] build: Fix missing variable quotes

2024-06-18 Thread Richard Biener
On Tue, Jun 18, 2024 at 10:35 AM Sam James wrote: > > YunQiang Su writes: > > > OK for trunk? > > It looks good to me, but I can't approve. (I'd dare say it's obvious, > even.) > > Richard, any chance you could give it a quick ack? OK

Re: [PATCH] function.h: eliminate macros "dom_computed" and "n_bbs_in_dom_tree"

2024-06-18 Thread Richard Biener
On Tue, Jun 18, 2024 at 2:11 AM David Malcolm wrote: > > Be explicit when we use "cfun". > > No functional change intended. > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > OK for trunk? > > gcc/ChangeLog: > * dominance.cc (compute_dom_fast_query): Replace uses of >

Re: [PATCH] middle-end/114189 - drop uses of vcond{,u,eq}_optab

2024-06-18 Thread Richard Biener
On Mon, 17 Jun 2024, Richard Sandiford wrote: > Richard Biener writes: > > On Fri, 14 Jun 2024, Richard Biener wrote: > > > >> On Fri, 14 Jun 2024, Richard Sandiford wrote: > >> > >> > Richard Biener writes: > >> > > On Fri, 14 J

RE: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]

2024-06-17 Thread Richard Biener
> > --git a/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c > > > b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c > > > new file mode 100644 > > > index 000..57cc00913a3 > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c > > > @@ -0,0 +1,14 @@ > > > +/* { dg-do compile } */ > > > +/* { dg-options "-O2" } */ > > > + > > > +void > > > +f (short *__restrict a, float *__restrict b) { > > > + a[0] = b[0]; > > > + a[1] = b[1]; > > > + a[2] = b[2]; > > > + a[3] = b[3]; > > > +} > > > + > > > +/* { dg-final { scan-assembler-times {fcvtzs\tv[0-9]+.4s, v[0-9]+.4s} > > > +1 } } */ > > > +/* { dg-final { scan-assembler-times {xtn\tv[0-9]+.4h, v[0-9]+.4s} 1 > > > +} } */ > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH][v2] Enhance if-conversion for automatic arrays

2024-06-17 Thread Richard Biener
Automatic arrays that are not address-taken should not be subject to store data races. This applies to OMP SIMD in-branch lowered functions result array which for the testcase otherwise prevents vectorization with SSE and for AVX and AVX512 ends up with spurious .MASK_STORE to the stack
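
The data-race argument above can be illustrated with a small sketch. It is a hand-written C analogue, not the testcase from the patch and not GCC's actual transformation; the names `sum_positive_branchy`, `sum_positive_ifconverted`, and `N` are hypothetical. The local array `r` never escapes the function, so no other thread can observe it, and replacing the conditional store with an unconditional store of a selected value gives the same result.

```c
enum { N = 8 };

/* Branchy form: r[i] is stored to only when the condition holds.  */
int
sum_positive_branchy (const int *in)
{
  int r[N] = { 0 };	/* automatic array, never escapes the function */
  for (int i = 0; i < N; i++)
    if (in[i] > 0)
      r[i] = in[i];
  int sum = 0;
  for (int i = 0; i < N; i++)
    sum += r[i];
  return sum;
}

/* Hand if-converted form: the store is unconditional and writes back
   the old value when the condition is false.  Because r is private to
   this call, the extra stores cannot race with another thread.  */
int
sum_positive_ifconverted (const int *in)
{
  int r[N] = { 0 };
  for (int i = 0; i < N; i++)
    r[i] = in[i] > 0 ? in[i] : r[i];
  int sum = 0;
  for (int i = 0; i < N; i++)
    sum += r[i];
  return sum;
}
```

Both functions return the same value for any input of `N` elements; the second form has no control flow in the loop body, which is the shape a vectorizer can handle without masked stores.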

[PATCH] tree-optimization/115493 - fix wrong code with SLP induction cond reduction

2024-06-17 Thread Richard Biener
The following fixes a bad final value being used when doing single-lane SLP integer induction cond reduction vectorization. Bootstrap and regtest running on x86_64-unknown-linux-gnu. PR tree-optimization/115493 * tree-vect-loop.cc (vect_create_epilog_for_reduction): Use

[PATCH] tree-optimization/115508 - fix ICE with SLP scheduling and extern vector

2024-06-17 Thread Richard Biener
When there's a permute after an extern vector we can run into a case that didn't consider the scheduled node being a permute which lacks a representative. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/115508 * tree-vect-slp.cc

Re: [COMMITTED] Do not assume LHS of call is an ssa-name.

2024-06-17 Thread Richard Biener
On Fri, Jun 14, 2024 at 9:20 PM Andrew MacLeod wrote: > > gimple_range_fold makes an assumption that if there is a LHS on a call > that it is an ssa_name. Especially later in compilation that may not be > true. It's always true if the LHS is of register type (is_gimple_reg_type) and never true

Re: [PATCH] middle-end/114189 - drop uses of vcond{,u,eq}_optab

2024-06-17 Thread Richard Biener
On Mon, 17 Jun 2024, Kewen.Lin wrote: > Hi Richi, > > on 2024/6/14 18:31, Richard Biener wrote: > > The following retires vcond{,u,eq} optabs by stopping to use them > > from the middle-end. Targets instead (should) implement vcond_mask > > and vec_cmp{,u,eq} optabs.

Re: [PATCH] Enhance if-conversion for automatic arrays

2024-06-16 Thread Richard Biener
On Fri, 14 Jun 2024, Andrew Pinski wrote: > On Fri, Jun 14, 2024 at 5:54 AM Richard Biener wrote: > > > > Automatic arrays that are not address-taken should not be subject to > > store data races. > > That seems conservative enough. Though I would think if the array &

RE: [PATCH] tree-optimization/114589 - remove profile based sink heuristics

2024-06-14 Thread Richard Biener
nts here? It definitely looks like a latent issue being triggered. Either in LRA or in how the target presents itself. Richard. > Pan > > -----Original Message- > From: Richard Biener > Sent: Wednesday, May 15, 2024 5:39 PM > To: gcc-patches@gcc.gnu.org > Subject: [PAT

[PATCH] Enhance if-conversion for automatic arrays

2024-06-14 Thread Richard Biener
Automatic arrays that are not address-taken should not be subject to store data races. This applies to OMP SIMD in-branch lowered functions result array which for the testcase otherwise prevents vectorization with SSE and for AVX and AVX512 ends up with spurious .MASK_STORE to the stack

Re: [PATCH] middle-end/114189 - drop uses of vcond{,u,eq}_optab

2024-06-14 Thread Richard Biener
On Fri, 14 Jun 2024, Richard Biener wrote: > On Fri, 14 Jun 2024, Richard Sandiford wrote: > > > Richard Biener writes: > > > On Fri, 14 Jun 2024, Richard Sandiford wrote: > > > > > >> Richard Biener writes: > > >> > The foll

Re: [PATCH] middle-end/114189 - drop uses of vcond{,u,eq}_optab

2024-06-14 Thread Richard Biener
On Fri, 14 Jun 2024, Richard Sandiford wrote: > Richard Biener writes: > > On Fri, 14 Jun 2024, Richard Sandiford wrote: > > > >> Richard Biener writes: > >> > The following retires vcond{,u,eq} optabs by stopping to use them > >> > from the
