On Tue, 4 Dec 2012, Jan Hubicka wrote:
here is the updated patch. It should get the bounds safe enough to not have
an effect on codegen of complete unrolling.
There is IMO no way to cut the walk of the loop body w/o affecting codegen in
unrolling for size mode. The condition for unrolling to happen is
unrolled_size * 2 / 3
... I believe I posted a patch?
Yes: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01799.html
I have found another fallout: I have some avatars of the polyhedron tests
where the REAL(8) have been replaced with REAL(10). Some of them are now
Should I open a new PR for that?
Cheers,
Dominique
My mailer has eaten a line in my previous mail. One should read:
I have found another fallout: I have some avatars of the polyhedron tests
where the REAL(8) have been replaced with REAL(10). Some of them are now ~50%
slower with the new value of max-completely-peeled-insns.
Should I open a new PR for that?
On Sun, 18 Nov 2012, Jan Hubicka wrote:
this patch reduces max-peeled-insns and max-completely-peeled-insns from 400
to 100. The reason why I am doing this is that I want to reduce code bloat
caused by my cunroll work that enabled a lot more unrolling than previously
Hi Jan,
this is the patch I will try to test once I have a chance :)
It simply prevents the unroller from analyzing loops when they are already too
large.
...
This patch breaks bootstrap with
...
/opt/gcc/p_build/./prev-gcc/g++ -B/opt/gcc/p_build/./prev-gcc/
FAIL: gcc.dg/graphite/interchange-8.c scan-tree-dump-times graphite will be interchanged 2
FAIL: gcc.dg/graphite/pr42530.c (internal compiler error)
FAIL: gcc.dg/graphite/pr42530.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/cunroll-1.c scan-tree-dump cunrolli Unrolled loop 1 completely
Hi,
here is the updated patch. It should get the bounds safe enough to not have an
effect on codegen of complete unrolling.
There is IMO no way to cut the walk of the loop body w/o affecting codegen in
unrolling for size mode. The condition for unrolling to happen is
unrolled_size * 2 / 3 original_size
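As a rough illustration of the size-mode condition quoted above: the comparison operator did not survive in the archived text, so the sketch below assumes it is `<`. The function name is hypothetical and this is not the code in tree-ssa-loop-ivcanon.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC's actual code.  Returns true when the
   estimated unrolled size, scaled by 2/3, stays below the original
   loop size.  ASSUMPTION: the comparison is `<`; the operator is
   missing from the archived message.  */
bool
unroll_for_size_ok (unsigned unrolled_size, unsigned original_size)
{
  /* Guard the multiplication below against unsigned wrap-around.  */
  assert (unrolled_size <= UINT_MAX / 2);

  /* Integer arithmetic, evaluated left to right as in the quoted
     expression: multiply by 2 first, then divide by 3.  */
  return unrolled_size * 2 / 3 < original_size;
}
```

Under that assumption, an estimated unrolled size of 12 against an original size of 10 would qualify (12 * 2 / 3 is 8), while 15 against 10 would not (15 * 2 / 3 is 10).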
Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again
takes a while to compile, so times out on slow machines:
...
On a 2.5 GHz Core2Duo, compiling the test with revision 192891 (2012-10-28)
takes a small fraction of a second, while with revision 193270 (2012-11-06)
Hi,
this is the patch I will try to test once I have a chance :)
It simply prevents the unroller from analyzing loops when they are already too large.
* tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Add UPPER_BOUND
parameter.
(try_unroll_loop_completely): Update.
Index:
this patch reduces max-peeled-insns and max-completely-peeled-insns from 400
to 100. The reason why I am doing this is that I want to reduce code bloat
caused by my cunroll work that enabled a lot more unrolling than previously,
causing considerable code size regression at -O3.
Did you notice that gcc.c-torture/compile/pr43186.c regressed? It now again
takes a while to compile, so times out on slow machines:
...
On a 2.5 GHz Core2Duo, compiling the test with revision 192891 (2012-10-28)
takes a small fraction of a second, while with revision 193270 (2012-11-06)
it
OK, here are multiple issues.
1) recursive inlining makes a huge loop nest (of 18 loops)
2) SCEV is very slow on answering simple_iv tests in this case because it
walks the nest
3) the unroller is computing loop body size even when it is clear the body is
much larger than the limit (the outer loop
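Point 3 suggests stopping the size computation once the limit is exceeded. A minimal sketch of that idea follows, with a hypothetical array of per-statement costs standing in for the loop body; GCC's tree_estimate_loop_size works on its own IR and is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch, not GCC code.  Sums per-statement costs and
   stops as soon as the running total exceeds UPPER_BOUND.  Returns
   true if the bound was exceeded; *SIZE then holds only the partial
   sum accumulated up to that point.  */
bool
estimate_size_bounded (const unsigned *costs, size_t n,
                       unsigned upper_bound, unsigned *size)
{
  assert (size != NULL);
  assert (costs != NULL || n == 0);

  *size = 0;
  for (size_t i = 0; i < n; i++)
    {
      *size += costs[i];
      if (*size > upper_bound)
        return true;  /* Too large already; skip the rest of the body.  */
    }
  return false;
}
```

The point of the early return is that the cost of the walk is bounded by the limit rather than by the size of the body, which matters when the body is a deep loop nest.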
On Thu, Nov 15, 2012 at 12:34:07AM +0100, Jan Hubicka wrote:
* params.def (max-peeled-insns, max-completely-peeled-insns): Reduce to
100.
Ok, thanks.
--- params.def (revision 193505)
+++ params.def (working copy)
@@ -290,7 +290,7 @@ DEFPARAM(PARAM_MAX_UNROLL_TIMES,
Hi,
this patch reduces max-peeled-insns and max-completely-peeled-insns from 400 to
100. The reason why I am doing this is that I want to reduce code bloat caused
by my cunroll work that enabled a lot more unrolling than previously, causing
considerable code size regression at -O3.
I do not think