> -Original Message-
> From: Richard Sandiford
> Sent: Wednesday, May 22, 2024 10:48 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH 3/4]AArch64: add new altern
Hi All,
This patch adds new alternatives to the patterns which are affected. The new
alternatives with the conditional early clobbers are added before the normal
ones in order for LRA to prefer them in the event that we have enough free
registers to accommodate them.
In case register pressure
Hi All,
This enables the new tuning flag for Neoverse V1, Neoverse V2 and Neoverse N2.
It is kept off for generic codegen.
Note that the +sve is needed even though the tests are in aarch64-sve.exp: if the
testsuite is run with a forced SVE-off option, e.g. -march=armv8-a+nosve, then
the intrinsics
>
> Sorry for the bike-shedding, but how about something like "avoid_pred_rmw"?
> (I'm open to other suggestions.) Just looking for something that describes
> either the architecture or the end result that we want to achieve.
> And preferably something fairly short :)
>
> avoid_* would be
> -Original Message-
> From: Richard Sandiford
> Sent: Wednesday, May 15, 2024 10:31 PM
> To: Tamar Christina
> Cc: Richard Biener ; gcc-patches@gcc.gnu.org; nd
> ; Richard Earnshaw ; Marcus
> Shawcroft ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH 0/4]AArch64: s
> -Original Message-
> From: pan2...@intel.com
> Sent: Tuesday, May 21, 2024 2:13 AM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com; Pan Li
>
> Subject: [
Hi Pan,
> -Original Message-
> From: pan2...@intel.com
> Sent: Monday, May 20, 2024 12:01 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com; Pan Li
>
> Subject: [PATCH v1 1/2] Ma
> -Original Message-
> From: pan2...@intel.com
> Sent: Sunday, May 19, 2024 5:17 AM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com; Pan Li
>
> Subject: [PATCH v1] Match: Extract int
t: 1073741824]:
> _1 = x_3(D) + y_4(D);
> if (_1 >= x_3(D))
> goto ; [65.00%]
> else
> goto ; [35.00%]
>
>[local count: 697932184]:
>
> [local count: 1073741824]:
> # _2 = PHI <65535(2), _1(3)>
> return _2;
> }
>
>
> -Original Message-
> From: Richard Biener
> Sent: Friday, May 17, 2024 10:46 AM
> To: Tamar Christina
> Cc: Victor Do Nascimento ; gcc-
> patc...@gcc.gnu.org; Richard Sandiford ; Richard
> Earnshaw ; Victor Do Nascimento
>
> Subject: Re: [PATCH] middle-e
> -Original Message-
> From: Hongtao Liu
> Sent: Friday, May 17, 2024 3:14 AM
> To: Victor Do Nascimento
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford ;
> Richard Earnshaw ; Victor Do Nascimento
>
> Subject: Re: [PATCH] middle-end: Expand {u|s}dot product support in
> autovectorizer
> -Original Message-
> From: Richard Biener
> Sent: Friday, May 17, 2024 6:51 AM
> To: Victor Do Nascimento
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford ;
> Richard Earnshaw ; Victor Do Nascimento
>
> Subject: Re: [PATCH] middle-end: Expand {u|s}dot product support in
>
Hi,
> -Original Message-
> From: Victor Do Nascimento
> Sent: Thursday, May 16, 2024 2:57 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford ; Richard Earnshaw
> ; Victor Do Nascimento
>
> Subject: [PATCH] middle-end: Drop __builtin_prefetch calls in autovectorization
> [PR114061]
Hi Victor,
> -Original Message-
> From: Victor Do Nascimento
> Sent: Thursday, May 16, 2024 3:39 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford ; Richard Earnshaw
> ; Victor Do Nascimento
>
> Subject: [PATCH] middle-end: Expand {u|s}dot product support in autovectorizer
>
>
> -Original Message-
> From: pan2...@intel.com
> Sent: Thursday, May 16, 2024 5:06 AM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com; Richard Sandiford
> ; Pan Li
> Subject: [PATCH v2
> -Original Message-
> From: Richard Sandiford
> Sent: Wednesday, May 15, 2024 10:31 PM
> To: Tamar Christina
> Cc: Richard Biener ; gcc-patches@gcc.gnu.org; nd
> ; Richard Earnshaw ; Marcus
> Shawcroft ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH 0/4]AArch64: s
> >> On Wed, May 15, 2024 at 12:29 PM Tamar Christina
> >> wrote:
> >> >
> >> > Hi All,
> >> >
> >> > Some Neoverse Software Optimization Guides (SWoG) have a clause that
> >> > states
> >>
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, May 15, 2024 12:20 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; ktkac...@gcc.gnu.org; Richard Sandiford
>
> Subject: Re: [PATCH 0/4]AArch64
> -Original Message-
> From: Richard Sandiford
> Sent: Wednesday, May 15, 2024 11:56 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH 2/4]AArch64: add new tu
Hi All,
This converts the single alternative patterns to the new compact syntax such
that when I add the new alternatives it's clearer what's being changed.
Note that this will spew out a bunch of warnings from geninsn as it'll warn that
@ is useless for a single alternative pattern. These are
Hi All,
This adds a new tuning parameter EARLY_CLOBBER_SVE_PRED_DEST for AArch64 to
allow us to conditionally enable the early clobber alternatives based on the
tuning models.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
Hi All,
Some Neoverse Software Optimization Guides (SWoG) have a clause that states
that, for predicated operations that also produce a predicate, it is preferred
that the codegen use a different register for the destination than that
of the input predicate in order to avoid a performance
Hi Pan,
Thanks!
> -Original Message-
> From: pan2...@intel.com
> Sent: Wednesday, May 15, 2024 3:14 AM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com;
> hongtao@intel.com; Pan Li
>
> -Original Message-
> From: pan2...@intel.com
> Sent: Monday, May 13, 2024 3:54 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com;
> Tamar Christina ; Richard Sandiford
> ; Pan Li
> Subject: [PATCH v1
for each shape here? Both work for
> me.
>
Yeah, I think that's better than iterating over the statements twice. It also
fits better in the existing code.
Tamar.
> #define SAT_ADD_U_1(T) \
> T sat_add_u_1_##T(T x, T y) \
> { \
> return (T)(x + y) >= x ? (x + y) : -1; \
> }
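
For illustration, the quoted macro can be instantiated once per unsigned type. This is only a minimal sketch: it assumes the macro body closes with a brace right after the return statement, and the choice of uint8_t and uint16_t as instantiations is hypothetical, not taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Branch form of unsigned saturating add: if the truncated sum wrapped
   around (it compares smaller than x), return the all-ones maximum of T.  */
#define SAT_ADD_U_1(T) \
T sat_add_u_1_##T(T x, T y) \
{ \
  return (T)(x + y) >= x ? (x + y) : -1; \
}

/* Hypothetical instantiations, chosen only for this example.  */
SAT_ADD_U_1(uint8_t)
SAT_ADD_U_1(uint16_t)
```

Each instantiation defines a function such as sat_add_u_1_uint16_t, which returns the plain sum when it fits in the type and the type's maximum value otherwise.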
> -Original Message-
> From: Richard Biener
> Sent: Friday, May 10, 2024 2:07 PM
> To: Richard Biener
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Allow patterns in SLP reductions
>
> On Fri, Mar 1, 2024 at 10:21 AM Richard Biener wrote:
> >
> > The following removes the
Hi Pan,
> -Original Message-
> From: pan2...@intel.com
> Sent: Monday, May 6, 2024 3:49 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com;
> hongtao@intel.com; Pan Li
> Subj
Hi Pan,
> -Original Message-
> From: pan2...@intel.com
> Sent: Monday, May 6, 2024 3:48 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com;
> hongtao@intel.com; Pan Li
> Subject:
t y) noexcept
{
uint64_t z;
if (!__builtin_add_overflow(x, y, &z))
return z;
return -1u;
}
Is a valid and common way to do saturation too.
But for now, it's fine.
Cheers,
Tamar
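
As a rough sketch of the point above, the overflow-builtin form and the compare-against-operand form of unsigned saturating add can be written side by side. The function names sat_add_builtin and sat_add_branch are hypothetical, and the sketch assumes a 64-bit unsigned type with UINT64_MAX as the saturated value.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating add via the GCC/Clang overflow builtin: the builtin stores the
   wrapped sum in z and returns true when the addition overflowed.  */
uint64_t sat_add_builtin(uint64_t x, uint64_t y)
{
  uint64_t z;
  if (!__builtin_add_overflow(x, y, &z))
    return z;
  return UINT64_MAX;
}

/* Branch form: for unsigned operands the wrapped sum is smaller than x
   exactly when the addition overflowed.  */
uint64_t sat_add_branch(uint64_t x, uint64_t y)
{
  uint64_t sum = x + y;
  return sum >= x ? sum : UINT64_MAX;
}
```

Both functions return the same value for every pair of inputs, which is why a matcher for saturating add may want to recognise either shape.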
> Sorry not sure if my understanding is correct, feel free to correct me.
>
> Pan
>
> -Original Message-
> From: Li, Pan2
> Sent: Thursday, May 2, 2024 4:11 AM
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com;
> Liu, Hongtao
> Subject: RE: [PATCH v3] Internal-fn: Introduce
Hi,
> From: Pan Li
>
> Update in v3:
> * Rebase upstream for conflict.
>
> Update in v2:
> * Fix one failure for x86 bootstrap.
>
> Original log:
>
> This patch would like to add the middle-end presentation for the
> saturation add. Aka set the result of add to the max when overflow.
> It
Hi All,
As the reporter in PR114769 points out the control flow for the abd detection
is hard to follow. This is because vect_recog_absolute_difference has two
different ways it can return true.
1. It can return true when the widening operation is matched, in which case
unprom is set,
Hi All,
In PR114741 we see that we have a regression in codegen when SVE is enabled where
the simple testcase:
void foo(unsigned v, unsigned *p)
{
*p = v & 1;
}
generates
foo:
fmov s31, w0
and z31.s, z31.s, #1
str s31, [x1]
ret
instead of:
foo:
> On Tue, Apr 16, 2024 at 09:00:53AM +0200, Richard Biener wrote:
> > > PR tree-optimization/114403
> > > * gcc.dg/vect/vect-early-break_124-pr114403.c: Skip in ilp32.
> > >
> > > ---
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_124-pr114403.c
>
Hi all,
The testcase seems to fail vectorization on -m32 since the access pattern is
determined as too complex. This skips the vectorization check on ilp32 systems
as I couldn't find a better proxy for being able to do strided 64-bit loads and
I suspect it would fail on all 32-bit targets.
docs: document early break support and pragma novector
---
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index
b4c602a523717c1d64333e44aefb60ba0ed02e7a..aceecb86f17443cfae637e90987427b98c42f6eb
100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@
Hi All,
This is a story all about how the peeling for gaps introduces a bug in the upper
bounds.
Before I go further, I'll first explain how I understand this to work for loops
with a single exit.
When peeling for gaps we peel N < VF iterations to scalar.
This happens by removing N iterations
Hi All,
The report shows that we end up in a situation where the code has been peeled
for gaps and we have an early break.
The code for peeling for gaps assumes that a scalar loop needs to perform at
least one iteration. However this doesn't take into account early break where
the scalar loop
Hi All,
This is a backport of g:306713c953d509720dc394c43c0890548bb0ae07.
The AArch64 vector PCS does not allow simd calls with simdlen 1,
however due to a bug we currently do allow it for num == 0.
This causes us to emit a symbol that doesn't exist and we fail to link.
Bootstrapped Regtested
> -Original Message-
> From: Richard Biener
> Sent: Thursday, March 7, 2024 8:47 AM
> To: Robin Dapp
> Cc: gcc-patches ; Tamar Christina
>
> Subject: Re: [PATCH] vect: Do not peel epilogue for partial vectors
> [PR114196].
>
> On Wed, Mar 6, 202
e.
This would allow us to better understand what kind of gimple we would have to
deal with in ISEL and VECT if we decide not to lower early.
Thanks,
Tamar
> Pan
>
> -Original Message-
> From: Tamar Christina
> Sent: Tuesday, February 27, 2024 5:57 PM
> To: Richard Biener
> Am 19.02.24 um 08:36 schrieb Richard Biener:
> > On Sat, Feb 17, 2024 at 11:30 AM wrote:
> >>
> >> From: Pan Li
> >>
> >> This patch would like to add the middle-end presentation for the
> >> unsigned saturation add. Aka set the result of add to the max
> >> when overflow. It will take the
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, February 27, 2024 9:44 AM
> To: Tamar Christina
> Cc: pan2...@intel.com; gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai;
> yanzhang.w...@intel.com; kito.ch...@gmail.com;
> richard.sandiford@arm.com2;
> > The testcase shows an interesting case where we have multiple loops sharing
> > a
> > live value and have an early exit that go to the same location. The
> > additional
> > complication is that on x86_64 with -mavx we seem to also do prologue
> > peeling
> > on the loops.
> >
> > We
Hi All,
The testcase shows an interesting case where we have multiple loops sharing a
live value and have an early exit that go to the same location. The additional
complication is that on x86_64 with -mavx we seem to also do prologue peeling
on the loops.
We correctly identify which BB we need
Hi Pan,
> From: Pan Li
>
> Hi Richard & Tamar,
>
> Try the DEF_INTERNAL_INT_EXT_FN as your suggestion. By mapping
> us_plus$a3 to the RTL representation (us_plus:m x y) in optabs.def.
> And then expand_US_PLUS in internal-fn.cc. Not very sure if my
> understanding is correct for
Hi All,
In certain cases we can have a situation where the merge block has a vUSE
virtual PHI and the exits do not. In this case for instance the exits lead
to an abort so they have no virtual PHIs. If we have a store before the first
exit and we move it to a later block during vectorization we
> -Original Message-
> From: Li, Pan2
> Sent: Monday, February 19, 2024 12:59 PM
> To: Tamar Christina ; Richard Biener
>
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang
> ; kito.ch...@gmail.com
> Subject: RE: [PATCH v1] Internal-fn: Add new in
> -Original Message-
> From: Tamar Christina
> Sent: Thursday, February 15, 2024 11:05 AM
> To: Richard Earnshaw (lists) ; gcc-
> patc...@gcc.gnu.org
> Cc: nd ; Marcus Shawcroft ; Kyrylo
> Tkachov ; Richard Sandiford
>
> Subject: RE: [PATCH]AArch64:
Thanks for doing this!
> -Original Message-
> From: Li, Pan2
> Sent: Monday, February 19, 2024 8:42 AM
> To: Richard Biener
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang
> ; kito.ch...@gmail.com; Tamar Christina
>
> Subject: RE: [PATCH
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, February 15, 2024 2:56 PM
> To: Andrew Pinski
> Cc: gcc-patches@gcc.gnu.org; Tamar Christina
> Subject: Re: [PATCH] aarch64: Improve PERM<{0}, a, ...> (64bit) by adding
> whole
> vector shif
> -Original Message-
> From: Richard Earnshaw (lists)
> Sent: Thursday, February 15, 2024 11:01 AM
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; Marcus Shawcroft ; Kyrylo
> Tkachov ; Richard Sandiford
>
> Subject: Re: [PATCH]AArch64: xfail modes_1.f
Hi All,
This test has never worked on AArch64 since the day it was committed. It has
a number of issues that prevent it from working on AArch64:
1. IEEE does not require that FP operations raise a SIGFPE,
only that an exception is raised somehow.
2. Most Arm designed
Hi, this is a new version of the patch updating some additional tests
because some of the LTO tests required a newer binutils than my distro had.
---
The Arm Architectural Reference Manual (Version J.a, section A2.9 on FEAT_LS64)
shows that ls64 is an optional extension and should not be
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, February 1, 2024 4:42 PM
> To: Tamar Christina
> Cc: Andrew Pinski ; gcc-patches@gcc.gnu.org; nd
> ; Richard Earnshaw ; Marcus
> Shawcroft ; Kyrylo Tkachov
>
> Subject: Re: [PATCH]AArch64: up
Hi All,
The Arm Architectural Reference Manual (Version J.a, section A2.9 on FEAT_LS64)
shows that ls64 is an optional extension and should not be enabled by default
for Armv8.7-a.
This drops it from the mandatory bits for the architecture and brings GCC in line
with LLVM and the architecture.
>
> I think this isn't entirely good. For simple cases for
> do {} while the condition ends up in the latch while for while () {}
> loops it ends up in the header. In your case the latch isn't empty
> so it doesn't end up with the conditional.
>
> I think your patch is OK to the point of
Hi All,
Attaching a pragma to a loop which has a complex condition often gets the pragma
dropped. e.g.
#pragma GCC novector
while (i < N && parse_tables_n--)
before lowering this is represented as:
if (ANNOTATE_EXPR ) ...
But after lowering the condition is broken apart and attached to
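
For reference, a loop of the shape described in this message might look like the sketch below. The function name sum_prefix and its parameters are hypothetical, chosen only to give the pragma a two-part loop condition; compilers that do not know the pragma are expected to ignore it (possibly with a warning).

```c
#include <assert.h>

/* Sums at most `budget` leading elements of `data`.  The loop condition has
   two parts, which is the kind of condition that gets split during lowering. */
long sum_prefix(const int *data, int n, int budget)
{
  long sum = 0;
  int i = 0;
#pragma GCC novector
  while (i < n && budget--)
    sum += data[i++];
  return sum;
}
```

The pragma asks the compiler not to vectorize the loop that follows it, so it only has an effect if it stays attached to the loop condition through lowering.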
Hi All,
When doing early break vectorization we should treat the final iteration as
possibly being partial. This is so that when we calculate the vector loop upper
bounds we take into account that final iteration could have done some work.
The attached testcase shows that if we don't then cunroll
> -Original Message-
> From: Richard Biener
> Sent: Thursday, February 8, 2024 2:16 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: Re: [PATCH]middle-end: add two debug counters for early-break
> vectorization debuggi
Hi All,
This adds two new debug counters to aid in debugging early break code.
- vect_force_last_exit: when reached will always force the final loop exit.
- vect_skip_exit: when reached will skip selecting the current candidate exit
as the loop exit.
The first counter
> Please either drop lastprivate(k) clause or use linear(k:1)
> The iteration var of simd loop without collapse or with
> collapse(1) is implicitly linear with the step, and even linear
> means the value from the last iteration can be used after the
> simd construct. Overriding the data sharing
Hi All,
There's a bug in vectorizable_live_operation in that restart_loop is defined
outside the loop.
This variable is supposed to indicate whether we are doing a first or last
index reduction. The problem is that by defining it outside the loop it becomes
dependent on the order we visit the
Hi All,
I had missed a conversion from unsigned long to uint64_t.
This fixes the failing test on -m32.
Regtested on x86_64-pc-linux-gnu with -m32 and no issues.
Committed as obvious.
Thanks,
Tamar
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-early-break_110-pr113467.c: Change unsigned
> It looks like LOOP_VINFO_EARLY_BRK_STORES is "reverse"? Is that
> why you are doing gsi_move_before + gsi_prev? Why do gsi_prev
> at all?
>
As discussed on IRC, then how about this one.
Incremental building passed all tests and bootstrap is running.
Ok for master if bootstrap and regtesting
> > Ok for master?
>
> I think you need a lp64 target check for the large constants or
> alternatively use uint64_t?
>
Ok, how about this one.
Regtested on x86_64-pc-linux-gnu with -m32,-m64 and no issues.
Ok for master?
Thanks,
Tamar
gcc/testsuite/ChangeLog:
PR
> -Original Message-
> From: Richard Biener
> Sent: Monday, February 5, 2024 1:22 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: Re: [PATCH]middle-end: fix ICE when moving statements to empty BB
> [PR113731]
>
>
Hi All,
The report shows that if the FE leaves a label as the first thing in the dest
BB then we ICE because we move the stores before the label.
This is easy to fix if we know that there's still only one way into the BB.
We would have already rejected the loop if there was multiple paths into
Hi All,
We use gsi_move_before (_gsi, _gsi); to request that the new statement
be placed before any other statement. Typically this then moves the current
pointer to be after the statement we just inserted.
However it looks like when the BB is empty, this does not happen and the CUR
pointer
Hi All,
This just adds an additional runtime testcase for the fixed issue.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/testsuite/ChangeLog:
PR tree-optimization/113467
* gcc.dg/vect/vect-early-break_110-pr113467.c: New
> >
> > If the above is correct then I think I understand what you're saying and
> > will update the patch and do some Checks.
>
> Yes, I think that's what I wanted to say.
>
As discussed:
Bootstrapped Regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu no
issues.
Also checked both
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, February 1, 2024 2:24 PM
> To: Andrew Pinski
> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; nd
> ; Richard Earnshaw ; Marcus
> Shawcroft ; Kyrylo Tkachov
>
> Subject: Re: [PATCH]AArch64: up
Hi All,
In the vget_set_lane_1.c test the following entries now generate a zip1 instead
of an INS
BUILD_TEST (float32x2_t, float32x2_t, , , f32, 1, 0)
BUILD_TEST (int32x2_t, int32x2_t, , , s32, 1, 0)
BUILD_TEST (uint32x2_t, uint32x2_t, , , u32, 1, 0)
This is because the non-Q variant for
Hi All,
With recent updates to hwasan runtime libraries, the error reporting for
this particular check has been reworked.
I would question why it has lost this message. To me it looks strange
that num_descriptions_printed is incremented whenever we call
PrintHeapOrGlobalCandidate whether
Hi All,
Recent libhwasan updates[1] intercept various string and memory functions.
These functions have checking in them, which means there's no need to
inline the checking.
This patch marks said functions as intercepted, and adjusts a testcase
to handle the difference. It also looks for HWASAN
> -Original Message-
> From: Andrew Pinski
> Sent: Monday, January 29, 2024 9:55 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; ja...@redhat.com;
> do...@redhat.com; k...@google.com; dvyu...@google.com
> Subject: Re: [PATCH][libsanitizer]: Sync fixes
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, January 30, 2024 9:51 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: Re: [PATCH]middle-end: check memory accesses in the destination block
> [PR113588].
>
Hi All,
When analyzing loads for early break it was always the intention that for the
exit where things get moved to we only check the loads that can be reached from
the condition.
However the main loop checks all loads and we skip the destination BB. As such
we never actually check the loads
Hi All,
Recently something in the midend had started inverting the branches by inverting
the condition and the branches.
While this is fine, it makes it hard to actually test. In RTL I disable
scheduling and BB reordering to prevent this. But in GIMPLE there seems to be
nothing I can do.
Hi All,
This cherry-picks and squashes the differences between commits
d3e5c20ab846303874a2a25e5877c72271fc798b..76e1e45922e6709392fb82aac44bebe3dbc2ea63
from LLVM upstream from compiler-rt/lib/hwasan/ to GCC on the changes relevant
for GCC.
This is required to fix the linked PR.
As mentioned
Hi All,
The AArch64 vector PCS does not allow simd calls with simdlen 1,
however due to a bug we currently do allow it for num == 0.
This causes us to emit a symbol that doesn't exist and we fail to link.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master? and for
Hi All,
As suggested in the ticket this replaces the expansion by converting the
Advanced SIMD types to SVE types by simply printing out an SVE register for
these instructions.
This fixes the subreg issues since there are no subregs involved anymore.
Bootstrapped Regtested on
Hi All,
This renames main_exit_p to last_val_reduc_p to more accurately
reflect what the value is calculating.
Ok for master if bootstrap passes? Incremental build shows it's fine.
Thanks,
Tamar
gcc/ChangeLog:
* tree-vect-loop.cc (vect_get_vect_def,
Hi All,
This fixes a bug where vect_create_epilog_for_reduction does not handle the
case where all exits are early exits. In this case we should do like induction
handling code does and not have a main exit.
Bootstrapped Regtested on x86_64-pc-linux-gnu
with --enable-checking=release
Hi All,
This replaces two more usages of single_exit that I had missed before.
They both seem to happen when we re-use the ifcvt scalar loop for versioning.
The condition in versioning is the same as the one for when we don't re-use the
scalar loop.
I hit these during an LTO enabled bootstrap
Hi All,
This removes -save-temps from the tests I've introduced to fix the LTO
mismatches.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issue
Ok for master?
Thanks,
Tamar
gcc/testsuite/ChangeLog:
PR testsuite/113319
* gcc.dg/bic-bitmask-13.c:
Hi All,
Instead of searching for where to move stores to, they should always be in the
exit belonging to the latch. We can only ever delay stores and even if we
pick a different exit than the latch one as the main one, effects still
happen in program order when vectorized. If we don't move the
Hi All,
When we have a loop with more than 2 exits and a reduction I forgot to fill in
the PHI value for all alternate exits.
All alternate exits use the same PHI value so we should loop over the new
PHI elements and copy the value across since we call the reduction calculation
code only once
> But I'm afraid I have no idea how is this supposed to work on
> non-bitint targets or where __BITINT_MAXWIDTH__ is smaller than 9020.
> There is no loop at all there, so what should be vectorized?
>
Yeah, it was giving an unresolved and I didn't notice in the diff.
> I'd say introduce
> # Return 1
Hi All,
This changes the tests I committed for PR113287 to also
run on targets that don't support bitint.
Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues and tests run on both.
Ok for master?
Thanks,
Tamar
gcc/testsuite/ChangeLog:
PR tree-optimization/113287
> -Original Message-
> From: Jakub Jelinek
> Sent: Wednesday, January 10, 2024 2:42 PM
> To: Tamar Christina ; Richard Biener
>
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: Re: [PATCH]middle-end: correctly identify the edge taken when
Hi All,
The vectorizer needs to know during early break vectorization whether the edge
that will be taken if the condition is true stays or leaves the loop.
This is because the code assumes that if you take the true branch you exit the
loop. If you don't exit the loop it has to generate a
Hi All,
Should control enter the switch from one of the cases other than
the IVDEP one then the variable remains uninitialized.
This fixes it by initializing it to false.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues
Committed as obvious.
Thanks,
Tamar
ping
> -Original Message-
> From: Tamar Christina
> Sent: Friday, January 5, 2024 1:31 PM
> To: Xi Ruoyao ; Palmer Dabbelt
> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; Jeff Law
>
> Subject: RE: [PATCH]middle-end: Don't apply copysign optimizat
Hi All,
It looks like the previous patch had an unused variable.
It's odd that my bootstrap didn't catch it (I'm assuming
-Werror is still on for O3 bootstraps) but this fixes it.
Committed to fix bootstrap.
Thanks,
Tamar
gcc/ChangeLog:
* tree-vect-loop.cc
Hmm I'm confused as to why It didn't break mine.. just did one again.. anyway
I'll remove the unused variable.
> -Original Message-
> From: Rainer Orth
> Sent: Tuesday, January 9, 2024 4:06 PM
> To: Richard Biener
> Cc: Tamar Christina ; gcc-patches@gcc.gn
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, January 9, 2024 1:51 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: RE: [PATCH]middle-end: Fix dominators updates when peeling with
> multiple exits [PR11314
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, January 9, 2024 12:26 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: RE: [PATCH]middle-end: Fix dominators updates when peeling with
> multiple exits [PR1