On Wed, Oct 30, 2024 at 1:56 AM Andrew Pinski wrote:
>
> After r13-707-g68e0063397ba82, a COND_EXPR in a gimple assign could
> no longer contain a comparison.
> The vectorizer was building gimple assigns with comparisons until
> r15-4695-gd17e672ce82e69
> (which added an assert to make sure it no lo
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote:
>
>
> In preparation to changes that may modify both inner and outer
> conditions in ifcombine, drop the redundant parameter result_inv, that
> is always identical to inner_inv.
OK.
>
> for gcc/ChangeLog
>
> * tree-ssa-ifcombine.cc
The inline function wraps the vector_cost class API and is no longer
a good representation of that class's query style, which also makes it
difficult to extend.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
* tree-vectorizer.h (finish_cost): Inline everywhere and remove.
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote:
>
>
> Rework ifcombine to support merging conditions from noncontiguous
> blocks. This depends on earlier preparation changes.
>
> The function that attempted to ifcombine a block with its immediate
> predecessor, tree_ssa_ifcombine_bb, now l
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote:
>
>
> Prepare to handle noncontiguous ifcombine, introducing logic to modify
> the outer condition when needed. There are two cases worth
> mentioning:
>
> - when blocks are noncontiguous, we have to place the combined
> condition in the ou
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote:
>
>
> Prepare for ifcombining noncontiguous blocks, adding (still unused)
> logic to the ifcombine profile updater to handle such cases.
>
>
> for gcc/ChangeLog
>
> * tree-ssa-ifcombine.cc (known_succ_p): New.
> (update_profil
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote:
>
>
> Refactor ifcombine_ifandif, moving the common code from the various
> paths that apply the combined condition to a new function.
>
>
> for gcc/ChangeLog
>
> * tree-ssa-ifcombine.cc (ifcombine_replace_cond): Factor out
>
On Fri, Oct 25, 2024 at 4:39 PM Alexandre Oliva wrote:
>
>
> Disallowing vuses in blocks for ifcombine is too strict, and it
> prevents usefully moving fold_truth_andor into ifcombine. That
> tree-level folder has long ifcombined loads, absent other relevant
> side effects.
OK.
Richard.
>
> fo
On Wed, Oct 30, 2024 at 8:47 AM Kugan Vivekanandarajah
wrote:
>
> Hi Richard,
>
> > On 29 Oct 2024, at 8:33 pm, Richard Biener
> > wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Tue, Oct 29, 2024 at 9:24
> +tree
> +build_cheap_unsigned_int_sat_add (tree op_0, tree op_1, tree type)
> +{
> + /* (bit_ior @2 (negate (convert (lt @2 @0)))) */
> + return build2 (BIT_IOR_EXPR, type,
> +build2 (PLUS_EXPR, type, op_0, op_1),
> +build1 (NEGATE_EXPR, type,
>
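The truncated fragment above builds the cheap branchless SAT_ADD form in GENERIC. As a plain-C sketch of that expression (the function name and fixed-width type are illustrative, not from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Branchless unsigned saturating add: OR the wrapped sum with the
   negated overflow flag.  -(uint32_t)(sum < x) is all-ones exactly
   when the addition wrapped, forcing the result to UINT32_MAX.  */
static uint32_t sat_add_cheap (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return sum | -(uint32_t) (sum < x);
}
```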
On Wed, Oct 30, 2024 at 2:39 AM wrote:
>
> From: Pan Li
>
> This patch would like to add doc for the below 2 standard names.
>
> 1. strided load: v = mask_len_strided_load (ptr, stride, mask, len, bias)
> 2. strided store: mask_len_strided_store (ptr, stride, v, mask, len, bias)
>
> gcc/ChangeLog:
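The documented semantics can be sketched as a scalar model in C. This is an illustration under assumed semantics (the helper name and element type are invented, and the bias operand of the real optab is omitted), not the GCC implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a mask/len strided load: for each active lane i
   (i < len and mask[i] set), load one element from ptr + i * stride
   bytes; inactive lanes are left untouched.  */
static void
strided_load_model (int32_t *dest, const char *ptr, ptrdiff_t stride,
                    const unsigned char *mask, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      memcpy (&dest[i], ptr + (ptrdiff_t) i * stride, sizeof (int32_t));
}
```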
On Tue, 29 Oct 2024, Alex Coplan wrote:
> On 29/10/2024 13:39, Richard Biener wrote:
> > On Mon, 28 Oct 2024, Alex Coplan wrote:
> >
> > > This allows us to vectorize more loops with early exits by forcing
> > > peeling for alignment to make sure that we're
&& defined(HAVE_DECL_FMEMOPEN)
> +#if defined(GENMATCH_SELFTESTS) && HAVE_DECL_FMEMOPEN
> #pragma GCC diagnostic push
> #pragma GCC diagnostic ignored "-Wsuggest-attribute=format"
>
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
Given we no longer want vcond[u]{,_eq} and VEC_COND_EXPR or COND_EXPR
with embedded GENERIC comparisons, the whole check_bool_pattern
and adjust_bool_stmts machinery is dead. It is effectively dead
after r15-4713-g0942bb85fc5573 and the following patch removes it.
Bootstrapped and tested on x86_64
On Mon, Oct 28, 2024 at 4:45 PM Akram Ahmad wrote:
>
> Case 7 of unsigned scalar saturating addition defines
> SAT_ADD = X <= (X + Y) ? (X + Y) : -1. This is the same as
> SAT_ADD = Y <= (X + Y) ? (X + Y) : -1 due to usadd_left_part_1
> being commutative.
>
> The pattern for case 7 currently does
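The commutativity claim is easy to check in C. Both shapes below are sketches of the form quoted above (names and the fixed-width type are illustrative); unsigned overflow wraps, so `sum < x` and `sum < y` detect the same overflow:

```c
#include <assert.h>
#include <stdint.h>

/* Case 7 as written: SAT_ADD = X <= (X + Y) ? (X + Y) : -1.  */
static uint32_t sat_add_cmp_x (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return x <= sum ? sum : UINT32_MAX;
}

/* Equivalent form comparing against Y, by commutativity of the add.  */
static uint32_t sat_add_cmp_y (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return y <= sum ? sum : UINT32_MAX;
}
```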
The following removes the code path in vect_recog_mask_conversion_pattern
dealing with comparisons in COND_EXPRs. That can no longer happen.
* tree-vect-patterns.cc (vect_recog_mask_conversion_pattern):
Remove COMPARISON_CLASS_P rhs1 of COND_EXPR case and assert
it doesn't
1 file changed, 3 insertions(+), 1 deletion(-)
>
>
On Sat, 26 Oct 2024, Richard Biener wrote:
> With the intent to rely on vec_cond_mask and vec_cmp patterns,
> comparisons do not need rewriting into COND_EXPRs that eventually
> combine to vec_cond{u,eq,}.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
So with thi
ed on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Signed-off-by: Soumya AR
>
> gcc/ChangeLog:
> PR target/57492
> * match.pd: Added patterns to fold certain calls to pow to ldexp.
>
> gcc/testsuite/ChangeLog:
> PR target/57492
>
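One instance of the mathematical identity behind such a fold, sketched in C (the exact match.pd patterns are in the patch and are not reproduced here): for an integer exponent, a power-of-two `pow` is an exact `ldexp`.

```c
#include <assert.h>
#include <math.h>

/* Illustrative only: pow (2.0, n) == ldexp (1.0, n) for integer n,
   both computed exactly in binary floating point.  */
static double pow2_via_ldexp (int n)
{
  return ldexp (1.0, n);
}
```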
h.pd: Fold logN(x) CMP CST -> x CMP expN(CST)
> and expN(x) CMP CST -> x CMP logN(CST)
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/log_exp.c: New test.
>
>
On Tue, Oct 29, 2024 at 9:27 AM wrote:
>
> From: Pan Li
>
> There are various forms of the unsigned SAT_ADD. Some of them are
> complicated while others are cheap. This patch would like to simplify
> the complicated forms into the cheap ones. For example as below:
>
> From the form 4 (branch)
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote:
>
> From: Andi Kleen
>
> The bit cluster code generation strategy is only beneficial when
> multiple case labels point to the same code. Do a quick check if
> that is the case before trying to cluster.
>
> This fixes the switch part of PR117091 wh
On Tue, 29 Oct 2024, Richard Biener wrote:
> On Mon, 28 Oct 2024, Alex Coplan wrote:
>
> > Currently we only cost gconds for the vector loop while we omit costing
> > them when analyzing the scalar loop; this unfairly penalizes the vector
> > loop in the case o
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote:
>
> From: Andi Kleen
>
> The current switch bit test clustering enumerates all possible case
> cluster combinations to find the ones that fit the bit test constraints
> best. This causes performance problems with very large switches.
>
> For bit test
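For context, what a bit-test cluster compiles down to can be sketched in C (the case set and helper name are invented for illustration): membership of the controlling value in a small set becomes one shift and mask.

```c
#include <assert.h>

/* Bit-test lowering of a switch whose cases are {1, 3, 5, 7}:
   0xAA has exactly bits 1, 3, 5 and 7 set, so one shift and an
   AND replace four separate comparisons.  */
static int in_case_set (unsigned c)
{
  return c < 64 && ((1ULL << c) & 0xAAULL) != 0;
}
```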
On Tue, Oct 29, 2024 at 4:29 AM Andrew Pinski wrote:
>
> r0-126134-g5d2a9da9a7f7c1 added support for short-circuiting and combining
> the ifs into using either AND or OR. But it only allowed the inner condition
> basic block having the conditional only. This changes to allow up to 2
> defining
> statemen
On Mon, Oct 28, 2024 at 4:44 PM Akram Ahmad wrote:
>
> This patch adds a new case for unsigned scalar saturating subtraction
> using a branch with a greater-than-or-equal condition. For example,
>
> X >= (X - Y) ? (X - Y) : 0
>
> is transformed into SAT_SUB (X, Y) when X and Y are unsigned scalars
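The branchy form from the description can be sketched in C (the helper name and fixed-width type are illustrative): the wrapped difference exceeds X exactly when the subtraction would underflow, so the comparison clamps the result at zero.

```c
#include <assert.h>
#include <stdint.h>

/* X >= (X - Y) ? (X - Y) : 0 -- unsigned saturating subtraction.
   When Y > X the difference wraps to a value greater than X, so
   the comparison fails and the result clamps to 0.  */
static uint32_t sat_sub (uint32_t x, uint32_t y)
{
  uint32_t diff = x - y;
  return x >= diff ? diff : 0;
}
```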
* tree-vect-loop-manip.cc (vect_do_peeling): Avoid emitting an
> epilogue guard for inverted early-exit loops.
> ---
> gcc/tree-vect-loop-manip.cc | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>
gcc/tree-vect-loop-manip.cc | 2 +-
> 2 files changed, 92 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_130.c
>
>
| 24 ++
> 2 files changed, 49 insertions(+)
> create mode 100644 gcc/testsuite/g++.dg/vect/vect-early-break_6.cc
>
>
On Mon, 28 Oct 2024, Alex Coplan wrote:
> This allows us to vectorize more loops with early exits by forcing
> peeling for alignment to make sure that we're guaranteed to be able to
> safely read an entire vector iteration without crossing a page boundary.
>
> To make this work for VLA architectu
On Mon, Oct 28, 2024 at 12:22 PM Sam James wrote:
>
> Sam James writes:
>
> > Sam James writes:
> >
> >> Add -Werror=lto-type-mismatch,odr to bootstrap-lto* configurations to
> >> help stop LTO breakage/correctness issues sneaking in.
> >>
> >> We discussed -Werror=strict-aliasing but it runs ea
On Sat, Oct 26, 2024 at 10:37 PM James K. Lowden
wrote:
>
> On Sat, 26 Oct 2024 11:22:20 +0800
> Xi Ruoyao wrote:
>
> > The changelog is not formatted correctly. gcc/ has its own
> > changelog. And gcc/cobol should have its own changelog too, like all
> > other frontends.
>
> Thank you for point
It turns out decide_masked_load_lanes accesses a stale SLP graph
so the following re-builds it instead.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/117343
* tree-vect-slp.cc (vect_optimize_slp_pass::build_vertices):
Support re-building
On Sat, Oct 26, 2024 at 12:20 AM Andrew Pinski wrote:
>
> On Thu, Oct 24, 2024 at 6:22 PM Li Xu wrote:
> >
> > From: xuli
> >
> > When the imm operand op1=1 in the unsigned scalar sat_sub form2 below,
> > we can simplify (x != 0 ? x + ~0 : 0) to (x - (x != 0)), thereby eliminating
> > a branch ins
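The equivalence can be checked in C (helper names and the 8-bit type are illustrative): in unsigned arithmetic x + ~0 is x - 1, and the branch on x != 0 flattens into subtracting the boolean itself.

```c
#include <assert.h>
#include <stdint.h>

/* Branchy form with immediate 1: x != 0 ? x - 1 : 0.  */
static uint8_t sub1_branch (uint8_t x)
{
  return x != 0 ? (uint8_t) (x - 1) : 0;
}

/* Branchless equivalent: subtract the comparison result directly.  */
static uint8_t sub1_flat (uint8_t x)
{
  return x - (x != 0);
}
```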
On Wed, Oct 23, 2024 at 12:47 PM wrote:
>
> From: Pan Li
>
> This patch would like to allow generation of MASK_LEN_STRIDED_LOAD{STORE} IR
> for invariant stride memory access. For example as below
>
> void foo (int * __restrict a, int * __restrict b, int stride, int n)
> {
> for (int i = 0;
On Wed, Oct 23, 2024 at 12:47 PM wrote:
>
> From: Pan Li
>
> This patch would like to introduce new IFN for strided load and store.
>
> LOAD: v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias)
> STORE: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias)
>
> The IFN target below code ex
vect_recog_mixed_size_cond_pattern only applies to COMPARISON_CLASS_P
rhs1 COND_EXPRs, which no longer appear - the following removes it.
Its testcases still pass; I believe the situation is mitigated by
bool pattern handling of the compare use in COND_EXPRs.
* tree-vect-patterns.cc (type_c
dr_may_alias_p ICEs when TYPE_SIZE of DR->ref is NULL, but this is
valid IL when the access size of an aggregate copy can be inferred
from the RHS.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/117333
* tree-data-ref.cc (dr_may_alias_p): Guard aga
On Tue, Oct 29, 2024 at 9:24 AM Kugan Vivekanandarajah
wrote:
>
> Hi Richard,
> Thanks for the review.
>
> > On 28 Oct 2024, at 9:18 pm, Richard Biener
> > wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> >
zer to expect an undefined value
> > in the inactive elements. Did your tests return any reasonable result?
>
> Sorry, I forgot to report back No, there were no suspicious test failures.
>
> Andrew
>
>
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote:
>
> From: Andi Kleen
>
> gcc/ChangeLog:
OK.
Thanks,
Richard.
> * common.opt: Enable -fbit-tests and -fjump-tables only at -O1.
> * opts.cc (default_options_table): Dito.
> ---
> gcc/common.opt | 4 ++--
> gcc/opts.cc| 2 ++
STMT_VINFO_SLP_VECT_ONLY isn't properly computed as the union of all
group members, and when the group is later split due to duplicates
not all sub-groups inherit the flag.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/117307
* tree-vect-data-refs.cc
On Mon, Oct 28, 2024 at 9:35 AM Kugan Vivekanandarajah
wrote:
>
> Hi,
>
> When ifcvt versions a loop, it sets dont_vectorize on the scalar loop. If the
> vector loop is not vectorized and removed, the scalar loop is still left with
> dont_vectorize. As a result, BB vectorization will not happen.
>
On Mon, Oct 28, 2024 at 12:41 AM Andrew Pinski wrote:
>
> ABSU_EXPR lowering incorrectly used the resulting type
> for the new expression, but in the case of ABSU the resulting
> type is an unsigned type and the ABSU is folded away. The fix
> is to use a signed type for the expression instead.
>
>
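For context, ABSU's semantics can be sketched in C (the helper name and 32-bit width are illustrative): the absolute value of a signed operand is produced in the corresponding unsigned type, so even the most negative value has a well-defined result.

```c
#include <assert.h>
#include <stdint.h>

/* ABSU sketch: signed operand, unsigned result.  Converting to
   unsigned before negating avoids the signed-overflow trap that
   plain abs (INT32_MIN) would hit.  */
static uint32_t absu32 (int32_t v)
{
  return v < 0 ? 0u - (uint32_t) v : (uint32_t) v;
}
```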
On Mon, Oct 28, 2024 at 12:42 AM Andrew Pinski wrote:
>
> This moves the check for maybe_undef_p in match_simplify_replacement
> slightly earlier before figuring out the true/false arg using arg0/arg1
> instead.
> In most cases this is no difference in compile time; just in the case
> there is an
> Am 26.10.2024 um 19:06 schrieb Andrew Pinski :
>
> While looking to fix a possible trapping issue in PHI-OPT's factor,
> I noticed that some tree codes could be marked as trapping even
> though they don't have a possibility to trap. In the case of PAREN_EXPR,
> it is basically a nop except w
With the intent to rely on vec_cond_mask and vec_cmp patterns,
comparisons do not need rewriting into COND_EXPRs that eventually
combine to vec_cond{u,eq,}.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
* tree-vect-patterns.cc (check_bool_pattern): For comparisons
we d
The following removes the code path in vect_recog_mask_conversion_pattern
dealing with comparisons in COND_EXPRs. That can no longer happen.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
* tree-vect-patterns.cc (vect_recog_mask_conversion_pattern):
Remove COMPARISON_
I believe the comparison handling in adjust_bool_pattern is dead
now that we no longer queue comparisons, but there's a do_compare goto
that's not obviously dead. So I'm looking for testcases - the code path
would build a COND_EXPR with a comparison embedded.
Bootstrap and regtest running on x86_64-unknown-
vect_recog_mixed_size_cond_pattern only applies to COMPARISON_CLASS_P
rhs1 COND_EXPRs, which no longer appear - the following removes it.
Its testcases still pass; I believe the situation is mitigated by
bool pattern handling of the compare use in COND_EXPRs.
Bootstrap and regtest running on x86_64-
The following places a few strategic asserts so we do not end up
with COND_EXPRs with a comparison as the first operand during
vectorization.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
These are the asserts I'd like to keep, I will followup with quite
some dead code removal when
This fixes up vect_recog_ctz_ffs_pattern.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
* tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern): Create
a separate pattern stmt for the comparison in the generated
COND_EXPR.
---
gcc/tree-vect-patterns.cc | 7 +
On Thu, Oct 24, 2024 at 4:08 PM wrote:
>
> From: Pan Li
>
> There are various forms of the unsigned SAT_ADD. Some of them are
> complicated while others are cheap. This patch would like to simplify
> the complicated forms into the cheap ones. For example as below:
>
> From the form 3 (branch)
The following tries to finish building proper GIMPLE COND_EXPRs
in vectorizer pattern recognition.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
* tree-vect-patterns.cc (vect_recog_divmod_pattern): Build
separate comparison pattern for the condition of a COND_EXPR
pa
On Wed, 23 Oct 2024, Jennifer Schmitz wrote:
>
>
> > On 22 Oct 2024, at 13:14, Richard Biener wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Tue, 22 Oct 2024, Jennifer Schmitz wrote:
> >
> >&
ormat): Add two tests, one for %M$.*N$s
> > and one for %M$.Ns.
>
> I'd like to ping this patch.
OK if David doesn't have any further comments.
Richard.
> Thanks.
>
> Jakub
>
>
As we want to transition to only vcond_mask expanders, the following
makes it easier to distinguish expand_vec_cond_expr_p queries that
rely on vcond patterns from those that rely on vcond_mask, by having
the latter default the comparison code to ERROR_MARK.
Bootstrapped and tested on x86_64
ance) = roots;
> - SLP_INSTANCE_REMAIN_DEFS (new_instance) = vNULL;
> - SLP_INSTANCE_KIND (new_instance) = slp_inst_kind_gcond;
> - new_instance->reduc_phis = NULL;
> - new_instance->cost_vec = vNULL;
> - new_instance->subgraph_entries = vNULL;
> - vinfo->slp_instances.safe_push (new_instance);
> - }
> + vect_build_slp_instance (vinfo, slp_inst_kind_gcond,
> +stmts, roots, remain,
> +max_tree_size, &limit,
> +bst_map, NULL, force_single_lane);
> }
>
> /* Find and create slp instances for inductions that have been forced
>
>
>
>
>
NULL;
>edge->caller = caller;
> --- gcc/pair-fusion.cc.jj 2024-10-22 17:09:09.372091098 +0200
> +++ gcc/pair-fusion.cc2024-10-24 11:13:07.023744574 +0200
> @@ -1962,7 +1962,10 @@ pair_fusion_bb_info::fuse_pair (bool loa
>
>auto ignore = ignore_changing_in
0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11,
> + 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22,
> + 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22,
> + 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33,
> + 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33,
>
On Fri, Oct 25, 2024 at 12:22 AM Kugan Vivekanandarajah
wrote:
>
> Hi,
>
> This patch sets param_vect_max_version_for_alias_checks to 15.
> This was causing GCC to miss vectorization opportunities in one internal
> application making it slower than LLVM by about ~14%.
>
> I've tested different pa
On Thu, Oct 24, 2024 at 8:16 AM Andrew Pinski wrote:
>
> On Wed, Oct 23, 2024 at 12:28 AM Andrew Pinski wrote:
> >
> > On Tue, Oct 22, 2024 at 11:49 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Oct 22, 2024 at 5:31 PM Andrew Pinski
> > >
On Mon, Oct 21, 2024 at 4:23 PM Akram Ahmad wrote:
>
> Case 7 of unsigned scalar saturating addition defines
> SAT_ADD = X <= (X + Y) ? (X + Y) : -1. This is the same as
> SAT_ADD = Y <= (X + Y) ? (X + Y) : -1 due to usadd_left_part_1
> being commutative.
>
> The pattern for case 7 currently does
genmatch was supposed to restrict :c to verifiable commutative
operations while leaving :C to the "I know what I'm doing" case.
The following enforces this, cleaning up parsing and amending
the commutative_op helper. There's one pattern that needs adjustment,
the pattern optimizing fmax (x, NaN) o
On Mon, Oct 21, 2024 at 4:22 PM Akram Ahmad wrote:
>
> This patch adds a new case for unsigned scalar saturating subtraction
> using a branch with a greater-than-or-equal condition. For example,
>
> X >= (X - Y) ? (X - Y) : 0
>
> is transformed into SAT_SUB (X, Y) when X and Y are unsigned
The following implements masked load-lane discovery for SLP. The
challenge here is that a masked load has a full-width mask with a
group-size number of elements; when this becomes a masked load-lanes
instruction, one mask element gates all group members. We already
have some discovery hints in place,
When the mask is not a constant or external def there's no need to
check the scalar type, in particular with SLP and the mask being
a VEC_PERM_EXPR there isn't a scalar operand ready to check
(not one that vect_is_simple_use will get you). We later check the
vector type and reject non-mask types there.
We have to remove CLOBBERs before SLP code generation since for
store-lanes we are inserting our own CLOBBERs that we want to survive.
So the following refactors vect_transform_loop to remove unwanted
stmts first.
This resolves the gcc.target/aarch64/sve/store_lane_spill_1.c FAIL.
Bootstrap and
On Thu, Oct 24, 2024 at 10:17 AM Jakub Jelinek wrote:
>
> Hi!
>
> I've tried to build stage3 with
> -Wleading-whitespace=blanks -Wtrailing-whitespace=blank
> -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
> added to STRICT_WARN and that expectably resulted in about
> 27
> Am 24.10.2024 um 09:29 schrieb Jakub Jelinek :
>
> Hi!
>
> gsi_safe_insert_before properly updates gsi_bb in gimple_stmt_iterator
> in case it splits objects, but unfortunately build_check_stmt was in
> some places (but not others) using a copy of the iterator rather than
> the iterator pas
On Tue, Oct 15, 2024 at 1:10 AM James K. Lowden
wrote:
>
> Consequent to advice, I'm preparing the Cobol front-end patches as a
> small number of hopefully meaningful patches covering many files.
>
> 1. meta files used by autotools etc.
> 2. gcc/cobol/*.h
> 3. gcc/cobol/*.{y,l,cc}
> 4. libgcob
On Wed, Oct 9, 2024 at 6:18 PM Andi Kleen wrote:
>
> From: Andi Kleen
>
> Retrieving sys/user time in timevars is quite expensive because it
> always needs a system call. Only getting the wall time is much
> cheaper because operating systems have optimized paths for this.
>
> The sys time isn't t
On Wed, Oct 23, 2024 at 8:50 AM Richard Biener
wrote:
>
> On Tue, Oct 22, 2024 at 7:21 PM Andrew Pinski
> wrote:
> >
> > When internal functions support was added to match
> > (r6-4979-gc9e926ce2bdc8b),
> > the check for ECF_CONST was the builtin function
On Tue, Oct 22, 2024 at 7:21 PM Andrew Pinski wrote:
>
> When internal functions support was added to match (r6-4979-gc9e926ce2bdc8b),
> the check for ECF_CONST was the builtin function side. Though before
> r15-4503-g8d6d6d537fdc,
> there was no use of maybe_push_res_to_seq with non-const intern
On Tue, Oct 22, 2024 at 5:31 PM Andrew Pinski wrote:
>
> This adds quick_emplace_push and safe_emplace_push to vec.
> These are like std::vector's emplace_back so you don't need an extra
> copy of the struct around.
>
> Since we require C++11 and also support partial non-PODs for vec, these
> fun
On Tue, 22 Oct 2024, Jennifer Schmitz wrote:
>
>
> > On 18 Oct 2024, at 09:11, Richard Biener wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Thu, 17 Oct 2024, Richard Sandiford wrote:
> >
> >&g
> Am 22.10.2024 um 20:07 schrieb Jakub Jelinek :
>
> Hi!
>
> On the following testcase without LTO we unnecessarily don't merge
> two identical .LC* constants (constant hashing computes the same hash,
> but as compare_constant returned false for the RAW_DATA_CST in it,
> it never compares equ
> Am 22.10.2024 um 20:04 schrieb Jakub Jelinek :
>
> Hi!
>
> CONSTRUCTOR indices for arrays have bitsize type, and the r15-4375
> patch actually got it right in 6 other spots, but not in this function,
> where it used size_int rather than bitsize_int and so size_binop can ICE
> on type mismat
On Mon, Oct 21, 2024 at 4:30 AM Alexandre Oliva wrote:
>
> On Oct 10, 2024, Richard Biener wrote:
>
> > Thanks for working on this. There's #if 0 portions in the patch - did you
> > send the correct version?
>
> 'fraid so. Sorry, I'd forgotten about th
d, &ssa_names);
> + {
> + gimple_seq call_seq
> + = predicate_load_or_store (&gsi, stmt, mask, &ssa_names);
>
> - gsi_replace (&gsi, new_stmt, true);
> + gsi_replace_with_seq (&gsi, call_seq, true);
> + }
> + else
> + {
> + gimple *new_stmt;
> + new_stmt = predicate_rhs_code (stmt, mask, cond, &ssa_names);
> + gsi_replace (&gsi, new_stmt, true);
> + }
> }
> else if (((lhs = gimple_assign_lhs (stmt)), true)
> && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
>
On Fri, 18 Oct 2024, Robin Dapp wrote:
> This patch adds an else operand to vectorized masked load calls.
> The current implementation adds else-value arguments to the respective
> target-querying functions that is used to supply the vectorizer with the
> proper else value.
>
> Right now, the onl
On Fri, 18 Oct 2024, Robin Dapp wrote:
> This patch adds else-operand handling to the internal functions.
>
> gcc/ChangeLog:
>
> * internal-fn.cc (add_mask_and_len_args): Rename...
> (add_mask_else_and_len_args): ...to this and add else handling.
> (expand_partial_load_optab_fn
On Mon, Oct 21, 2024 at 6:04 PM Andrew Pinski wrote:
>
> On Mon, Oct 21, 2024 at 3:41 AM Richard Biener
> wrote:
> >
> > On Thu, Oct 17, 2024 at 4:43 AM Andrew Pinski
> > wrote:
> > >
> > > After fixing loop-im to do the correct overflow rewritin
On Tue, Oct 22, 2024 at 1:19 PM Richard Sandiford
wrote:
>
> gcc.dg/torture/pr112305.c contains an inner loop that executes
> 0x8000_0014 times and an outer loop that executes 5 times, giving about
> 10 billion total executions of the inner loop body. At -O2 and above we
> are able to remove the
On Tue, 22 Oct 2024, Jennifer Schmitz wrote:
>
>
> > On 21 Oct 2024, at 10:51, Richard Biener wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Fri, 18 Oct 2024, Jennifer Schmitz wrote:
> >
> >
The diagnostics code fails to handle non-constant domain max.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/117254
* gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg):
Check the array domain max is constant before using it.
*
d,
> enum out_edge_check *oe_check)
> {
> + if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
> +{
> + tree base = expr;
> + if (TREE_CODE (expr) == ADDR_EXPR)
> + base = get_base_address (TREE_OPERAND (expr, 0));
> + disqualify_
On Tue, 22 Oct 2024, Jennifer Schmitz wrote:
>
>
> > On 22 Oct 2024, at 11:05, Richard Biener wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Tue, 22 Oct 2024, Jennifer Schmitz wrote:
> >
> >&
Value-numbering can use its set of equivalences to prove that
a PHI node with args <a_1, 5, 10> is equal to a_1 iff on the
edges with the constants a_1 == 5 and a_1 == 10 hold. This
breaks down when the order of PHI args is <5, 10, a_1> as then
we drop to VARYING early. The following mitigates this by
shuffl
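A hypothetical source shape in which such a PHI arises (the function and variable names are invented for illustration): on the edges feeding the constants, a_1 == 5 resp. a_1 == 10 hold, so the PHI result equals a_1 on every incoming edge.

```c
#include <assert.h>

/* After if-conversion to SSA this yields r = PHI <5, 10, a_1>,
   which value-numbering can prove equal to a_1.  */
static int phi_source (int a_1)
{
  int r;
  if (a_1 == 5)
    r = 5;
  else if (a_1 == 10)
    r = 10;
  else
    r = a_1;
  return r;
}
```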
On Mon, Oct 14, 2024 at 1:12 PM wrote:
>
> From: Pan Li
>
> Almost the same as vector unsigned integer SAT_TRUNC, try to match
> the signed version during the vector pattern matching.
>
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
> * The x86 bootstrap
On Mon, Oct 14, 2024 at 1:12 PM wrote:
>
> From: Pan Li
>
> This patch would like to support the form 1 of the vector signed
> integer SAT_TRUNC. Aka below example:
>
> Form 1:
> #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
> void __attribute__((noinline))
On Mon, 21 Oct 2024, Tamar Christina wrote:
> > -Original Message-
> > From: Richard Biener
> > Sent: Monday, October 21, 2024 9:55 AM
> > To: Tamar Christina
> > Cc: gcc-patches@gcc.gnu.org; nd
> > Subject: Re: [PATCH]middle-end: Handle more gcond
On Thu, Oct 17, 2024 at 4:43 AM Andrew Pinski wrote:
>
> After fixing loop-im to do the correct overflow rewriting
> for pointer types too. We end up with code like:
> ```
> _9 = (unsigned long) &g;
> _84 = _9 + 18446744073709551615;
> _11 = _42 + _84;
> _44 = (signed char *) _11;
> ...
>
On Mon, Oct 21, 2024 at 5:09 AM Li Xu wrote:
>
> From: xuli
>
> This patch would like to support .SAT_SUB when one of the op
> is IMM = max - 1 of form1.
>
> Form 1:
> #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
> T __attribute__((noinline)) \
> sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)
On Mon, Oct 21, 2024 at 6:13 AM Li Xu wrote:
>
> From: xuli
>
> This patch would like to support .SAT_SUB when one of the op
> is IMM = 1 of form1.
>
> Form 1:
> #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
> T __attribute__((noinline)) \
> sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \
>
On Sun, Oct 20, 2024 at 7:01 PM Andrew Pinski wrote:
>
> After r15-4503-g8d6d6d537fdc75, phiopt could move a trapping statement
> across another trapping statement causing the order to be incorrect (might
> happen
> with non-call exceptions).
> To prevent this, a trapping statement has to be at t
On Sat, Oct 19, 2024 at 5:08 PM Jeff Law wrote:
>
>
>
> On 10/18/24 7:41 PM, Andrew Pinski wrote:
> > Sometimes factor_out_conditional_operation can factor out
> > an operation that causes a phi node to become the same element.
> > Other times, we want to factor out a binary operator because
> > i
On Wed, Oct 16, 2024 at 6:20 PM Eikansh Gupta wrote:
>
> This patch simplifies `(trunc)copysign ((extend)x, CST)` to `copysign (x,
> -1.0/1.0)`
> depending on the sign of CST. Previously, it was simplified to `copysign (x,
> CST)`.
> It can be optimized as the sign of the CST matters, not the valu
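The point that only CST's sign matters can be checked directly in C (the helper name is invented; this illustrates the identity, not the match.pd pattern):

```c
#include <assert.h>
#include <math.h>

/* copysign ignores the magnitude of its second operand, so any
   constant CST can be replaced by -1.0 or 1.0 according to its sign. */
static float sign_from_cst (float x, double cst)
{
  return copysignf (x, cst < 0 ? -1.0f : 1.0f);
}
```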