https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109519
--- Comment #5 from Sebastian Pop ---
Thanks Andrew for the patch, it fixes the issue.
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
Steps to reproduce:
$ git clone https://github.com/sebpop/bitshuffle.git -b gcc-10-bug
$ cd bitshuffle/reproduce
$ make
$ ./a.out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
Sebastian Pop changed:
What|Removed |Added
CC||spop at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776
--- Comment #10 from Sebastian Pop ---
Patch for arm64:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607601.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107485
--- Comment #10 from Sebastian Pop ---
Thanks Richard.
The patch fixed the larger test as well.
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
On arm64-linux I see the following crash only on gcc-10.
I do not see the ICE on gcc-11, 12, and trunk.
$ ~/gcc-10/bld/gcc/cc1plus -fnon-call-exceptions f.ii
[...]
f.ii:29:23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776
Sebastian Pop changed:
What|Removed |Added
CC||spop at gcc dot gnu.org
--- Comment #9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105162
Sebastian Pop changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105162
Sebastian Pop changed:
What|Removed |Added
Attachment #52762|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105162
Sebastian Pop changed:
What|Removed |Added
Attachment #52755|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105162
--- Comment #4 from Sebastian Pop ---
The attached patch degrades performance on CPUs with LSE: the barrier is not
needed when outline-atomics execute an LSE instruction.
I was thinking of adding the barrier to the armv8.0 generic path (no LSE)
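For readers who want to look at the generated code, here is a minimal sketch of a __sync builtin call of the kind discussed in this bug. The function name fetch_add is made up for illustration and is not taken from the bug's testcase. Building it for AArch64 with -S, once with -moutline-atomics and once with -mno-outline-atomics, should show whether a barrier follows the atomic operation in each configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the bug report: atomically add DELTA to
   *COUNTER and return the value it held before the addition.  GCC
   documents the __sync builtins as full barriers.  */
long fetch_add(long *counter, long delta)
{
    assert(counter != NULL);
    return __sync_fetch_and_add(counter, delta);
}
```

The builtin is portable across GCC targets, so the function can be exercised anywhere, but only an AArch64 build shows the outline-atomics difference.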
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105162
Sebastian Pop changed:
What|Removed |Added
Attachment #52750|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105162
--- Comment #2 from Sebastian Pop ---
Created attachment 52750
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52750&action=edit
patch
Fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105162
--- Comment #1 from Sebastian Pop ---
Also happens when compiling with LSE: -march=armv8.1-a or later.
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
With -mno-outline-atomics gcc produces a `dmb ish` barrier on __sync builtins
as required by the Intel specification
(see fix
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
Created attachment 50289
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50289&action=edit
pre-processed reduced testcase
gcc-8, gcc-9, and gcc-10 from Ubu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99012
--- Comment #3 from Sebastian Pop ---
I do not see the bug with today's cc1plus from origin/releases/gcc-8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99012
Sebastian Pop changed:
What|Removed |Added
CC||spop at gcc dot gnu.org
--- Comment #2
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
The use of NEON intrinsics is inefficient and leads developers to prefer inline
assembly instead of intrinsics.
A similar performance bug
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
The following text in doc/invoke.texi seems to be outdated. To avoid confusion
the text needs to be more specific on which NEON implementations it applies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92665
--- Comment #7 from Sebastian Pop ---
Hi Andrew, have you committed the fix for this?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92692
--- Comment #23 from Sebastian Pop ---
> I don't see anything like that on the gcc-9 branch - are you sure you don't
> have an outstanding change somehow?
You are right, a part of the -moutline-atomics patch that I am working on
backporting to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92692
Sebastian Pop changed:
What|Removed |Added
CC||spop at gcc dot gnu.org
--- Comment #21
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
With gcc as of today I see dup instructions that could be optimized out:
$ cat red.c
#include "arm_neon.h"
int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86865
Sebastian Pop changed:
What|Removed |Added
CC||spop at gcc dot gnu.org
--- Comment #7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87917
Sebastian Pop changed:
What|Removed |Added
CC||spop at gcc dot gnu.org
--- Comment #3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82449
--- Comment #2 from Sebastian Pop ---
This part is not affine: {0, +, {1, +, 1}_1}_1
This is a polynomial of degree 2.
Are you sure the scev analysis reports this as affine?
I was trying to understand from the fortran code which part this scev
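To make the degree-2 claim concrete, here is a small sketch that evaluates the chain of recurrences {0, +, {1, +, 1}_1}_1 iteration by iteration and compares it with the closed form n * (n + 1) / 2. The function names are invented for this illustration, and the code models only the arithmetic of the chrec, not GCC's scev implementation.

```c
#include <assert.h>

/* Value of {0, +, {1, +, 1}_1}_1 after N iterations of loop 1.  The inner
   chrec {1, +, 1}_1 is the step of the outer one, and it grows by 1 on
   every iteration.  */
unsigned long chrec_value(unsigned n)
{
    unsigned long value = 0; /* outer chrec starts at 0 */
    unsigned long step = 1;  /* inner chrec starts at 1 */
    for (unsigned i = 0; i < n; i++) {
        value += step; /* outer: add the current step */
        step += 1;     /* inner: the step itself evolves */
    }
    return value;
}

/* Closed form of the same sequence: a polynomial of degree 2 in N.  */
unsigned long closed_form(unsigned n)
{
    return (unsigned long)n * (n + 1UL) / 2;
}

/* Check that both agree for every iteration count below LIMIT.  */
void check_quadratic(unsigned limit)
{
    for (unsigned n = 0; n < limit; n++)
        assert(chrec_value(n) == closed_form(n));
}
```

Because the step is not constant, the access function is not of the affine form {base, +, constant}.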
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69728
--- Comment #22 from Sebastian Pop ---
> I put it on my TODO to figure out how to "DCE" a stmt
> (or in this case it's rather the whole "loop body", right?).
The code generator would not even see a statement to be generated: it would
just
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81373
--- Comment #4 from Sebastian Pop ---
The patch looks good. Thanks!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622
--- Comment #10 from Sebastian Pop ---
> So a black-box would be a set of stmts rather than a whole GIMPLE BB
Correct: this can be an abstract view of the IR. The only place where we want
to start transforming the code is in the code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69728
--- Comment #19 from Sebastian Pop ---
> So how'd we properly handle a valid empty domain?
DCE the statement.
If the domain for a statement is empty, it means that the statement does not
execute: it is dead code.
I think we are better
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622
--- Comment #8 from Sebastian Pop ---
> I would have expected at least each memory op to be in a separate "black box"
We could have a pass before graphite that splits BBs with more than one write
into blocks that contain one data write with all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69728
--- Comment #15 from Sebastian Pop ---
It makes sense to fail early when the schedule builder gets confused and builds
an empty domain. Could you please also add a comment around the if that sets
schedule_error? The change looks good. Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622
--- Comment #4 from Sebastian Pop ---
Yes, that phi node looks like a reduction. We need to handle the phi as a
write to expose the loop carried reduction variable to the dependence analysis.
I think your change goes in the right direction.
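As an illustration of what such a reduction looks like at the source level, here is a minimal sketch. The function name and the SSA names in the comment are invented for this example and are only a schematic rendering, not a dump from the testcase in this bug.

```c
#include <assert.h>
#include <stddef.h>

/* Sum reduction: the value of `sum` flows from one iteration to the next.
   Schematically, in SSA form the loop header holds
     sum_1 = PHI <0 (preheader), sum_2 (latch)>
     sum_2 = sum_1 + a[i]
   so the phi both reads the previous value and defines the one used by
   the current iteration.  */
int sum_array(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

If only the read side of the phi is modeled, the dependence between iteration i and iteration i + 1 is invisible to the analysis, which is why the phi has to count as a write as well.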
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68823
--- Comment #15 from Sebastian Pop ---
> when DR_NUM_DIMENSIONS (dr1->dr) != DR_NUM_DIMENSIONS (dr2->dr) better "FAIL"?
Yes.
The patch looks good to me.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65972
--- Comment #9 from Sebastian Pop ---
In the link in the previous comment, Richi has a similar patch as suggested by
Dehao pending review/test/commit: let's close this bug when Richi's patch lands
in trunk.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65972
--- Comment #8 from Sebastian Pop ---
Yes please!
This patch also solves the problem I was chasing a week or so ago:
https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00067.html
I also know that this is ICE-ing on a large proprietary project when
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79637
--- Comment #3 from Sebastian Pop ---
As to why we call it a "finite state automaton" jump threading, that is because
this transform proves to be useful when the switch statement in the previous
example is contained in a loop, which is the way
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79637
--- Comment #2 from Sebastian Pop ---
Here is what I see in doc/invoke.texi:
@item max-fsm-thread-path-insns
Maximum number of instructions to copy when duplicating blocks on a
finite state automaton jump thread path. The default is 100.
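A minimal sketch of the kind of code this targets, assuming a switch on a state variable inside a loop. The word-counting automaton below is invented for illustration and is not the example from this bug.

```c
#include <assert.h>
#include <stddef.h>

/* Two-state automaton: state 0 is "between words", state 1 is "inside a
   word".  On every path through a case the next value of `state` is a
   known constant, so the jump back to the switch can in principle be
   threaded straight to the matching case label.  */
int count_words(const char *s)
{
    assert(s != NULL);
    int state = 0;
    int words = 0;
    for (; *s != '\0'; s++) {
        switch (state) {
        case 0:
            if (*s != ' ') {
                words++;
                state = 1; /* next iteration is known to reach case 1 */
            }
            break;
        case 1:
            if (*s == ' ')
                state = 0; /* next iteration is known to reach case 0 */
            break;
        }
    }
    return words;
}
```

Threading such a path duplicates the blocks between the assignment and the switch, and the parameter documented above bounds how many instructions may be copied while doing so.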
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69675
--- Comment #10 from Sebastian Pop ---
(In reply to Richard Biener from comment #9)
> Yeah, seems to be gone with ISL 0.18 here as well... (but with 0.16.1 I can
> still reproduce it). ISL 0.18 doesn't do anything to the loop. ISL 0.16.1
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68823
--- Comment #11 from Sebastian Pop ---
(In reply to Richard Biener from comment #10)
> But then with different number of subscripts (and also likely different
> DR_BASE_OBJECT) you can't do anything with them and have to assume
> dependence.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68823
--- Comment #9 from Sebastian Pop ---
/* Determines the base object and the list of indices of memory reference
DR, analyzed in LOOP and instantiated in loop nest NEST. */
static void
dr_analyze_indices (struct data_reference *dr, loop_p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68823
--- Comment #8 from Sebastian Pop ---
The code at fault is called from pdr_add_memory_accesses()
Maybe the problem is in parsing the gimple MEM[] into a data reference.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68823
--- Comment #7 from Sebastian Pop ---
(In reply to Martin Liška from comment #5)
> Created attachment 40662 [details]
> Isolated graphite dump for miscompiled function
>
> As shown in the dump file, there are dependencies for the problematic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68823
--- Comment #4 from Sebastian Pop ---
The data dependence relations are dumped in the output of
-fdump-tree-graphite-all.
graphite-dependences.c contains the code for the data dependence computations.
Looking at the gimple code it seems like a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77362
--- Comment #10 from Sebastian Pop ---
(In reply to Richard Biener from comment #9)
> Yeah, but the user can write such dependences himself so ideally we have
> a way to undo them, like by using local scratch memory? So
You are right.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77362
--- Comment #8 from Sebastian Pop ---
LIM in general is bad for loop transforms: it introduces loop carried
dependences. If we can move graphite before LIM that would solve some problems.
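One reading of this remark, sketched under assumptions: when an accumulator that lives in memory is promoted to a scalar temporary for the duration of the inner loop, the scalar is carried from one inner iteration to the next. The two functions below are invented for illustration and are not taken from the testcase in this bug. They compute the same result, the second written the way the loop might look after such a promotion.

```c
#include <assert.h>
#include <stddef.h>

/* Accumulator kept in memory: every inner iteration reads and writes
   a[i] through the array.  B is an N x M matrix stored row by row.  */
void row_sums_memory(size_t n, size_t m, const int *restrict b,
                     int *restrict a)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            a[i] += b[i * m + j];
}

/* Same computation with a[i] held in a scalar across the inner loop:
   `tmp` now carries a value from one j iteration to the next.  */
void row_sums_scalar(size_t n, size_t m, const int *restrict b,
                     int *restrict a)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        int tmp = a[i];
        for (size_t j = 0; j < m; j++)
            tmp += b[i * m + j];
        a[i] = tmp;
    }
}
```

In the first form the dependence is expressed through an array access that a polyhedral analysis can describe with subscripts. In the second it is a scalar dependence that has to be handled separately.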
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77362
--- Comment #7 from Sebastian Pop ---
The fix looks good. Thanks!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77605
--- Comment #6 from Sebastian Pop ---
The proposed change looks good to me.
"last_conflicts" is the max index in the conflicting functions for which there
is a dependence:
mem_access_a (conflicting_iterations_in_a (last_conflicts)) is in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70956
--- Comment #2 from Sebastian Pop ---
The change looks good to me.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159
--- Comment #9 from Sebastian Pop ---
Created attachment 37927
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37927&action=edit
patch for hoisting expressions
Updated the patch from PR23286 to hoist the redundant expressions:
:
inv_4 =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159
--- Comment #7 from Sebastian Pop ---
(In reply to Andrew Pinski from comment #6)
> Note this is both a hoisting and a sinking issue.
> Hoisting should happen before sinking.
> LLVM looks like it only implements sinking.
You are right: LLVM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159
--- Comment #2 from Sebastian Pop ---
Right, with -Ofast it is able to optimize away the branch or selects.
The original benchmark had something more complex than fadd to use the tmin and
tmax results. Here is one more test using the results in
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
$ cat h.c
float foo_p(float d, float min, float max, float a)
{
float tmin;
float tmax;
float inv = 1.0f / d;
if (inv >= 0) {
tmin = (min - a) * inv;
tmax = (max - a) *
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69545
Sebastian Pop changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69545
Sebastian Pop changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68343
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68398
--- Comment #4 from Sebastian Pop ---
Thanks Jeff for looking into this issue.
I was thinking about a heuristic as you mentioned in comment #2:
what about allowing creation of irreducible loops, multiple latches, etc. after
the loop optimizers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69341
Bug 69341 depends on bug 68692, which changed state.
Bug 68692 Summary: [6 Regression][graphite] ice: Segmentation fault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68692
What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68692
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69292
Sebastian Pop changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68692
--- Comment #9 from Sebastian Pop ---
*** Bug 69292 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68976
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68756
Sebastian Pop changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68659
Sebastian Pop changed:
What|Removed |Added
Status|RESOLVED|REOPENED
Resolution|DUPLICATE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68667
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68692
Sebastian Pop changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68693
--- Comment #3 from Sebastian Pop ---
Author: spop
Date: Fri Dec 4 21:36:55 2015
New Revision: 231309
URL: https://gcc.gnu.org/viewcvs?rev=231309&root=gcc&view=rev
Log:
fix PR68693: Check for loop structure when extending the SCoP
The check for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68693
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68550
Sebastian Pop changed:
What|Removed |Added
CC||sch...@linux-m68k.org
--- Comment #6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68659
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68659
--- Comment #6 from Sebastian Pop ---
I do not see the error on today's trunk at r231233. Could you please verify
that this has been fixed by our changes from yesterday?
Thanks!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68550
Sebastian Pop changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68550
--- Comment #4 from Sebastian Pop ---
Author: spop
Date: Wed Dec 2 20:40:17 2015
New Revision: 231206
URL: https://gcc.gnu.org/viewcvs?rev=231206&root=gcc&view=rev
Log:
fix PR68550: do not handle ISL loop peeled statements
In case ISL did some loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68550
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68565
--- Comment #2 from Sebastian Pop ---
Author: spop
Date: Mon Nov 30 20:39:16 2015
New Revision: 231086
URL: https://gcc.gnu.org/viewcvs?rev=231086&root=gcc&view=rev
Log:
check for ISL generated code that leads to division by zero
we used to generate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68565
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68453
Sebastian Pop changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67984
Sebastian Pop changed:
What|Removed |Added
Status|WAITING |RESOLVED
Known to work|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68493
Sebastian Pop changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68279
Sebastian Pop changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68314
--- Comment #2 from Sebastian Pop ---
This patch exposes the problem without valgrind:
diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 2054fad..b932dae 100644
--- a/gcc/graphite-sese-to-poly.c
+++
||2015-11-23
CC||spop at gcc dot gnu.org
Ever confirmed|0 |1
--- Comment #2 from Sebastian Pop ---
I cannot reproduce the error on GCC 6.0 trunk.
Also, please provide a reduced testcase, the attached
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68314
Sebastian Pop changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68279
--- Comment #5 from Sebastian Pop ---
After fixing the graphite failure, I get these warnings from the testcase in
comment 4:
FAIL: gfortran.dg/graphite/pr68279.f90 -O (test for excess errors)
Excess errors:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68493
--- Comment #1 from Sebastian Pop ---
Passes on ISL 0.14, fails with 0.15.
This patch fixes it: we will bootstrap and commit.
diff --git a/gcc/graphite-isl-ast-to-gimple.c
b/gcc/graphite-isl-ast-to-gimple.c
index 30c3a21..2783ac4 100644
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68314
Sebastian Pop changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68453
Sebastian Pop changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68335
--- Comment #4 from Sebastian Pop ---
testcase added in r230630
||spop at gcc dot gnu.org
Resolution|--- |FIXED
--- Comment #1 from Sebastian Pop ---
Fixed in r230632
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68335
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68341
Sebastian Pop changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63602
Sebastian Pop changed:
What|Removed |Added
Known to work||6.0
Summary|[4.9/5/6
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: spop at gcc dot gnu.org
Target Milestone: ---
We have seen a performance regression due to r229685.
We see fewer FSM jump threads on the reduced testcase.
CC=2015-11-02-23-23-28-d3063db-trunk/bin/gcc
$CC -O3 m.c -fdump
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68343
Sebastian Pop changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68279
Sebastian Pop changed:
What|Removed |Added
CC||spop at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66070
--- Comment #4 from Sebastian Pop ---
r227572
||spop at gcc dot gnu.org
Resolution|--- |FIXED
--- Comment #3 from Sebastian Pop ---
Fixed on trunk with a recent ISL-0.15 that contains the compute time out
functions.
$ time gcc -O2 -floop-parallelize-all -c rdft.i
real 0m1.763s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47598
Sebastian Pop changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---