Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
Created attachment 40395
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40395&
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78794
--- Comment #9 from Yuri Rumyantsev ---
Hi Uros,
I checked that with your patch performance is recovered on the Avoton machine:
                  before     after
462.libquantum    18.4000    20.9000    +13.58%
Best regards.
Yuri.
2016-12-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78794
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 40322
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40322&action=edit
test-case to reproduce
Compile with -O2 -march=slm -m32 options to reproduce.
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
STV phase does not recognize some patterns after this revision, regression can
be reproduced with
: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
Created attachment 40215
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40215&action=edit
test-case to reproduce
We noticed a huge performance reg
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
Created attachment 40131
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40131&action=edit
test-case to reproduce, compile with -O3 option.
We noticed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
--- Comment #5 from Yuri Rumyantsev ---
Yes, I think so.
2016-11-15 14:49 GMT+03:00 rguenth at gcc dot gnu.org
:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
>
> Richard Biener changed:
>
>What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77445
--- Comment #4 from Yuri Rumyantsev ---
Ping.
Do you have any progress on this?
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 40036
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40036&action=edit
test-case to reproduce
Must be compiled with -O3 option to reproduce.
: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
We noticed a huge (>15%) performance drop after a fix in the loop distribution phase.
Before the fix distribution is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78268
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78116
--- Comment #7 from Yuri Rumyantsev ---
Compiler was configured with:
Configured with: /configure --enable-languages=c,c++
--enable-clocale=gnu --enable-cloog-backend=isl --enable-shared
--disable-libsanitizer --disable-bootstrap --disable-nls -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78116
--- Comment #5 from Yuri Rumyantsev ---
Yes, some virtual registers are allocated on the stack and we get more loads from
the stack to get their values.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78116
--- Comment #3 from Yuri Rumyantsev ---
Created attachment 39910
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39910&action=edit
another test-case
Must be compiled with "-Ofast -fopenmp -funroll-loops -march=knl"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78116
--- Comment #2 from Yuri Rumyantsev ---
We also found a performance drop on another important benchmark with the same
symptoms after r241170, namely the loop marked with .L18 has 12 more fills from
the stack. The test-case will be attached.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78116
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 39892
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39892&action=edit
test-case to reproduce
Must be compiled with "-Ofast -funroll-loops -march=knl" options.
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
I attached a simple test-case to reproduce the issue.
Before this revision the loop marked with label .L27 had 25 instructions but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78007
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 39821
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39821&action=edit
test-case to reproduce
It is sufficient to compile it with the -Ofast option on an x86 platform.
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
The issue is related to missing support for __builtin_bswap32:
t1.c:9:3: note: function is not vectorizable.
t1.c:9:3: note: not vectorized: relevant stmt not supported
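For orientation, a loop of roughly the following shape is the kind of code such vectorizer notes are about. This is only a minimal sketch: the function name `swap_words` and its signature are hypothetical and it is not the test-case from the report, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, NOT the reporter's t1.c: reverse the byte
   order of every 32-bit element using the GCC builtin named above. */
void swap_words(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = __builtin_bswap32(src[i]);
}
```

Compiling such a loop with -O3 and GCC's -fopt-info-vec-missed option should report loops the vectorizer gave up on; whether this particular sketch produces the same note depends on the compiler revision and target.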
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77498
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 39574
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39574&action=edit
test-case to reproduce
Need to compile with -O2 -ffast-math to reproduce.
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
We noticed significant regression after
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=239414
I attached si
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71956
--- Comment #5 from Yuri Rumyantsev ---
This bug is fixed by
Author: ppalka
Date: Sat Aug 27 22:00:17 2016
New Revision: 239798
URL: https://gcc.gnu.org/viewcvs?rev=239798&root=gcc&view=rev
Log:
Fix folding of VECTOR_CST comparisons
gcc/ChangeL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77445
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 39535
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39535&action=edit
test-case to reproduce
It is sufficient to compile it with -Ofast option.
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
We noticed huge (32%) performance drop on coremark-pro/core (former coremark
benchmark) after
http://gcc.gnu.org/viewcvs/gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77344
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment
lla/show_bug.cgi?id=71077
>
> --- Comment #6 from patrick at parcs dot ath.cx ---
> On Fri, 12 Aug 2016, ysrumyan at gmail dot com wrote:
>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71077
>>
>> Yuri Rumyantsev changed:
>>
>>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71077
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71956
--- Comment #4 from Yuri Rumyantsev ---
This should read: "problem file is 176.gcc/src/sched.c, problem function
sched_analyze_insn".
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71956
--- Comment #3 from Yuri Rumyantsev ---
It turned out that after r235653 (with minor int->bool type change) 176.gcc
started RF. If we turn off vrp phase benchmark passes. The problem fail is
sched.c. Note that avx2 is essential for reproducing. T
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70467
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71956
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72850
--- Comment #4 from Yuri Rumyantsev ---
Created attachment 39093
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39093&action=edit
test-case to reproduce
It is sufficient to use the -Ofast option to compile on an x86 machine.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72850
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72794
--- Comment #6 from Yuri Rumyantsev ---
Thanks for clarification.
This bug can be closed as user misunderstanding.
2016-08-04 14:08 GMT+03:00 rguenth at gcc dot gnu.org
:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72794
>
> --- Comment #5 fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72794
--- Comment #4 from Yuri Rumyantsev ---
I assume that there is still an issue in the lto part of the compiler - even if
we ignore the "inline" attribute we (lto) must not delete such functions
from binaries. So this bug must be forwarded to the lto phase.
2016-08-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72794
--- Comment #2 from Yuri Rumyantsev ---
Yes, this option cures CF. Does it mean that we must compile spec2000
with this flag?
2016-08-03 19:08 GMT+03:00 pinskia at gcc dot gnu.org
:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72794
>
> --- Co
y: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
We noticed that after this commit the benchmark fails with the message:
/tmp/cchqWD0Q.ltrans0.ltrans.o: In function `yylex':
:(.text+0x566e): undefined re
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
We noticed that after this revision the following test fails:
FAIL: gcc.dg/vect/vect-mask-store-move-1.c scan-tree-dump-times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56688
--- Comment #8 from Yuri Rumyantsev ---
I checked that if we comment out the 'save' stmt in thin6d.f all loops will be
vectorized:
grep -c 'LOOP VECTORIZED' thin6d.f.149t.vect 32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56688
--- Comment #7 from Yuri Rumyantsev ---
I checked that the GCC 7 compiler still does not vectorize loops in the thin6d
function, which is the hottest function in the 200.sixtrack benchmark.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65698
--- Comment #3 from Yuri Rumyantsev ---
I see that this bug has not been considered for a while.
Here is my additional comment.
First of all, this test was extracted from bzip2 benchmark, mainGTU function.
The problem is that (1) tree optimizer collect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71734
--- Comment #7 from Yuri Rumyantsev ---
H.J.
I've just checked this test with my local fixed compiler and got:
Running /users/ysrumyan/workspaces/71261/gcc/testsuite/g++.dg/vect/vect.exp ...
PASS: g++.dg/vect/pr70729.cc -std=c++11 scan-tree-du
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #37 from Yuri Rumyantsev ---
Jakub,
I assume that your #C33 test-case is not correct, i.e. it cannot be
marked with pragma omp simd. For example, even if we turn off the lim
phase it will be aborted:
my_g++ -O3 -m64 t33.cpp -o t33.exe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #36 from Yuri Rumyantsev ---
#c33 testcase was not tested since I have some doubts about it. Note
that original problem was
#pragma omp simd
for (int i=0; i:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
>
> --- Comment #35 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #34 from Yuri Rumyantsev ---
Thanks a lot Jakub for your detailed comments.
I have a simple fix which cures the failures from 71734. The fix simply
checks that the ref in question belongs to a simd loop:
diff --git a/gcc/tre
lla/show_bug.cgi?id=70729
>
> --- Comment #24 from rguenther at suse dot de ---
> On Wed, 8 Jun 2016, ysrumyan at gmail dot com wrote:
>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
>>
>> --- Comment #23 from Yuri Rumyantsev ---
>> OK. I will try to prepare th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #23 from Yuri Rumyantsev ---
OK. I will try to prepare the second part of patch.
Few comments about vect-simd-clone-5.c test failure.
1. This loop is marked with safelen=MAX_INT.
2. It contains the following stmt's:
D.3301 = foo.simdc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71453
--- Comment #2 from Yuri Rumyantsev ---
Forgot to mention that the number of instructions is 10% higher: 632 vs 702 for
spills into vector registers.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71453
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 38659
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38659&action=edit
test-case to reproduce
Must be compiled with -O2 -march=core-avx2 -m32 options.
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
We noticed a significant performance regression on one important benchmark after
r235523.
Note that the fix is not responsible for it. The problem is related to spill/fill
to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #21 from Yuri Rumyantsev ---
Richard!
Are you planning to prepare the second part of the patch (zeroing safelen and
testing it in loop invariant motion phase as you proposed)?
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71437
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 38652
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38652&action=edit
test-case to reproduce
Needs to be compiled with -O3 -m32 -ffast-math on x86-64.
onent: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
We noticed a ~10% slowdown on one important benchmark used for Silvermont
testing. I can reproduce this performance gap using the attached test-case on
SandyB
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71347
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 38600
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38600&action=edit
test-case to reproduce
Needs to be compiled with -O2 -m32 -march=slm -ffast-math options on x86-64.
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
We noticed a significant regression (more than 10%) after this revision which can
be illustrated on the following simple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71275
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 38564
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38564&action=edit
test-case to reproduce
Must be compiled with -O2 -m32 -march=slm options.
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
The regression can be seen in the attached test-case. In the tail block of the
innermost loop a redundant fill was added:
before
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70935
--- Comment #3 from Yuri Rumyantsev ---
Jakub,
Here is a simple fix - do not take into consideration edges whose destination
is the loop latch block, i.e. the loop is endless:
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 38375
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38375&action=edit
test-case to reproduce
Must be compiled with -O2 -mavx2 -m32 options.
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
This degradation is caused by known issue with partial register dependency:
https
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #12 from Yuri Rumyantsev ---
Created attachment 38367
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38367&action=edit
modified patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #11 from Yuri Rumyantsev ---
Richard,
I slightly modified the patch you proposed:
1. Apply the loop->safelen check only if lim is invoked before loop vectorization
since its value could be incorrect (I simply added a bool param to it).
2.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70849
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 38365
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38365&action=edit
test-case to reproduce
Must be compiled with -O3 -mavx2 options
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
Simple test which will be attached is not vectorized as not profitable:
test.c:11:5: note: cost model: the vector iteration cost
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #6 from Yuri Rumyantsev ---
Richard,
I made the change you proposed but it still does not help since we have a
loop-carried dependency through this_4(D)->S_n:
:
_5 = this_4(D)->S_n;
...
:
pretmp_54 = this_4(D)->C2;
pretmp_57 =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70729
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 38309
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38309&action=edit
test-case to reproduce
Must be compiled with -Ofast -mavx2 -fopenmp options on x86 machine.
: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
Analyzing the performance of one important benchmark we found out that one of the
hot loops is not vectorized since the loop-invariant load of the class member has not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70482
--- Comment #2 from Yuri Rumyantsev ---
Richard,
The problem is in pattern matching:
/* Pattern detected. */
if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
"vect_recog_widen_mult_pattern: detected
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
If we compile bb-slp-pattern-1.c from gcc.dg/vect suite with -mavx pattern
vectorization won't happen since AVX has
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66142
--- Comment #27 from Yuri Rumyantsev ---
Created attachment 37940
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37940&action=edit
test-case to reproduce
Needs to be compiled with -Ofast -mavx2 -fopenmp options.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66142
--- Comment #26 from Yuri Rumyantsev ---
If we convert copy structures to copy structure fields test will be vectorized
and all mentions of GOMP_SIMD_LANE will be deleted. But if we slightly modify
test by introducing new function vdot and insert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69633
--- Comment #3 from Yuri Rumyantsev ---
Sorry for the confusion. The bug should be closed as a user mistake.
2016-03-07 19:18 GMT+03:00 bernds at gcc dot gnu.org
:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69633
>
> Bernd Schmidt changed:
>
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69942
--- Comment #3 from Yuri Rumyantsev ---
Created attachment 37822
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37822&action=edit
proposed patch
Patch to resolve ifcvt5.c failure.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69942
--- Comment #2 from Yuri Rumyantsev ---
I attached patch which resolves failure.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69942
--- Comment #1 from Yuri Rumyantsev ---
The cause of the issue is that before the ce1 phase a pde (or pre) transformation has
been done to remove partially redundant moves to variables i and j, i.e.
code
int i = x;
int j = y;
if (x > y)
{
i =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69783
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 37671
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37671&action=edit
test-case to reproduce
It needs to be compiled with -Ofast -funroll-loops on x86-64
: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
After the changes in vect_prune_runtime_alias_test_list() the number of merged
ranges decreased significantly:
Before fix
improved number of alias checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69052
--- Comment #13 from Yuri Rumyantsev ---
I checked that performance is back for the whole benchmark. Thanks a lot.
Yuri.
2016-02-09 14:17 GMT+03:00 amker at gcc dot gnu.org :
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69052
>
> --- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69652
--- Comment #5 from Yuri Rumyantsev ---
Jakub,
I'd like to clarify one of your remarks:
5) IMHO you should give up also for !is_gimple_assign, say trying to move an
elemental function call into the conditional is just wrong
What's wrong in call mo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69652
--- Comment #4 from Yuri Rumyantsev ---
Jakub,
Thanks a lot for your detailed comments!
I've just sent a patch for review to gcc-patches. Could you please
take a look at it?
Best regards.
Yuri.
2016-02-03 20:22 GMT+03:00 jakub at gcc dot gnu.or
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69652
--- Comment #2 from Yuri Rumyantsev ---
This is my fault - I forgot to fix vuse for scalar statements which are crossed
by masked stores during code motion. The fix is being tested and will be sent for
review tomorrow.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69633
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 37559
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37559&action=edit
test-case to reproduce
Needs to be compiled with -O2 -m32 -pie -fPIE.
Assume that -march=slm is not needed.
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
Sorry that we noticed this regression only now and not in September.
After Makarov's fix for 61578 ( and s390 regression) we no
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69467
--- Comment #3 from Yuri Rumyantsev ---
Richard,
I checked that performance is back with your patch.
Thanks.
2016-01-25 17:50 GMT+03:00 rguenth at gcc dot gnu.org
:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69467
>
> Richard Biener chang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69467
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 37462
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37462&action=edit
test-case to reproduce
Need to compile with -m32 at -O2 or -O3 -funroll-loops options.
In description the ass
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
This is caused by the same revision as 67438
http://gcc.gnu.org/viewcvs/gcc?view=revision&revi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69297
--- Comment #4 from Yuri Rumyantsev ---
Yes, this loop was added for avoiding dce phase.
Thanks.
Yuri.
2016-01-18 13:33 GMT+03:00 rguenth at gcc dot gnu.org
:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69297
>
> --- Comment #3 from Richard
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69297
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 37356
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37356&action=edit
test-case to reproduce
To reproduce, compile with -Ofast -march=core-avx2 options.
: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
This regression was found on spec2006/464.h264ref. The problem is related to
SLP vectorization of BB's and caused by the wrong calculation of scalar cost
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
After this simple fix we got a huge regression (> 16%) for spec2006/435.gromacs
on Haswell with "-O2 -ffast-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67145
--- Comment #6 from Yuri Rumyantsev ---
We checked that the proposed patch does not introduce new performance regressions
and I will prepare it for review after bootstrapping and regression testing
complete, likely tomorrow.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68522
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69052
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 37133
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37133&action=edit
test-case to reproduce
It should be compiled with -O2 -m32 options to reproduce.
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---
In loop_invariant phase additional function inv_can_prop_to_addr_use which
tried to determine if forward propagation for cheap address is possible through
call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67145
--- Comment #4 from Yuri Rumyantsev ---
I attached a simple untested patch which restores performance on x86. This
change is not perfect but using it I noticed a 2%-6% speed-up on the 32-bit x86
platform. The idea of patch is very simple - we do not bai
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67145
--- Comment #3 from Yuri Rumyantsev ---
Created attachment 37120
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37120&action=edit
non-tested patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68920
--- Comment #4 from Yuri Rumyantsev ---
You are quite right - the cost model is very poor. We did a simple experiment and
set the branch cost to 1 but noticed performance regressions on other
benchmarks. When we set it to 2 we did not see any di
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68906
--- Comment #3 from Yuri Rumyantsev ---
I've prepared simple fix which cures ICE. I will send it for review tomorrow.
2015-12-15 12:50 GMT+03:00 jakub at gcc dot gnu.org :
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68906
>
> Jakub Jelinek c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68898
--- Comment #1 from Yuri Rumyantsev ---
Created attachment 37028
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37028&action=edit
test-case to reproduce
Need to compile with -O2 -m32 -ffast-math options to reproduce. Note that
32-bit and -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68898
--- Comment #2 from Yuri Rumyantsev ---
Forgot to add stack trace:
Error: dominator of 6 status unknown
t2.f:41:0: internal compiler error: Segmentation fault
0xb4e62f crash_signal
/export/users/gnutester/stability/svn/trunk/gcc/toplev.c