On Sun, May 18, 2014 at 10:59 PM, Jan Hubicka <hubi...@ucw.cz> wrote: > Sandra, >> This patch seems quite similar in purpose to the >> remove_local_statics optimization that Mentor has proposed, although >> the implementation is quite different. Here is the last version of >> our patch, prepared by Bernd Schmidt last year: >> >> https://gcc.gnu.org/ml/gcc-patches/2013-06/msg00317.html > > Thanks for pointer, I did not notice this patch! > The approach is indeed very different. So the patch works on function basis > and cares only about local statics of functions that was not inlined? >> >> I think we can drop our patch from our local tree now, but it >> includes a large number of test cases which I think are worth >> keeping on mainline. A few of them fail with your implementation, >> though -- which might be genuine bugs, or just different limitations >> of the two approaches. Can you take a look? >> >> The failing tests are remove-local-statics-{4,5,7,12,14b}.c. > > +/* Verify that we don't eliminate a global static variable. */ > + > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +/* { dg-final { scan-assembler "global_static" } } */ > + > +static int global_static; > + > +int > +test1 (int x) > +{ > + global_static = x; > + > + return global_static + x; > +} > > here test1 optimizes into > > global_static=x; > return x+x; > > this makes global_static write only and thus it is correctly eliminated. > So I think this testcase is bogus. > > +++ b/gcc/testsuite/gcc.dg/remove-local-statics-5.c > @@ -0,0 +1,24 @@ > +/* Verify that we do not eliminate a static local variable whose uses > + are dominated by a def when the function calls setjmp. */ > + > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +/* { dg-final { scan-assembler "thestatic" } } */ > + > +#include <setjmp.h> > + > +int > +foo (int x) > +{ > + static int thestatic; > + int retval; > + jmp_buf env; > + > + thestatic = x; > + > + retval = thestatic + x; > + > + setjmp (env); > + > + return retval; > +} > > I belive this is similar case. I do not see setjmp changing anything here, > since > local optimizers turns retval = x+x; > What it was intended to test? > > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/remove-local-statics-7.c > @@ -0,0 +1,19 @@ > +/* Verify that we eliminate a static local variable where it is defined > + along all paths leading to a use. */ > + > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +/* { dg-final { scan-assembler-not "thestatic" } } */ > + > +int > +test1 (int x) > +{ > + static int thestatic; > + > + if (x < 0) > + thestatic = x; > + else > + thestatic = -x; > + > + return thestatic + x; > +} > > Here we get after early optimizations: > > int > test1 (int x) > { > static int thestatic; > int thestatic.0_5; > int thestatic.1_7; > int _8; > > <bb 2>: > if (x_2(D) < 0) > goto <bb 3>; > else > goto <bb 4>; > > <bb 3>: > thestatic = x_2(D); > goto <bb 5>; > > <bb 4>: > thestatic.0_5 = -x_2(D); > thestatic = thestatic.0_5; > > <bb 5>: > thestatic.1_7 = thestatic; > _8 = thestatic.1_7 + x_2(D); > return _8; > > } > > and thus we still have bogus read from thestatic. Because my analysis works > at IPA level, > we won't benefit from fact that dom2 eventually cleans it up as: > int > test1 (int x) > { > static int thestatic; > int thestatic.0_5; > int thestatic.1_7; > int _8; > int prephitmp_10; > > <bb 2>: > if (x_2(D) < 0) > goto <bb 3>; > else > goto <bb 4>; > > <bb 3>: > thestatic = x_2(D); > goto <bb 5>; > > <bb 4>: > thestatic.0_5 = -x_2(D); > thestatic = thestatic.0_5; > > <bb 5>: > # prephitmp_10 = PHI <x_2(D)(3), thestatic.0_5(4)> > thestatic.1_7 = prephitmp_10; > _8 = thestatic.1_7 + x_2(D); > return _8; > > } > > Richi, is there a way to teach early FRE to get this transformation? > I see it is a partial redundancy problem...
Yeah, nothing FRE can do about (needs insert of a PHI node). Eventually for this particular case running phiopt early would solve it. > +/* Verify that we do not eliminate a static variable when it is declared > + in a function that has nested functions. */ > + > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +/* { dg-final { scan-assembler "thestatic" } } */ > + > +int test1 (int x) > +{ > + static int thestatic; > + > + int nested_test1 (int x) > + { > + return x + thestatic; > + } > + > + thestatic = x; > + > + return thestatic + x + nested_test1 (x); > +} > > Here we work hard enough to optimize test1 as: > int > test1 (int x) > { > static int thestatic; > int _4; > int _5; > > <bb 2>: > thestatic = x_2(D); > _4 = x_2(D) + x_2(D); > _5 = _4 + _4; > return _5; > > } > > thus inlining nested_test1 during early optimization. This makes the removal > valid. > > +/* Verify that we do not eliminate a static local variable if the function > + containing it is inlined. */ > + > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +/* { dg-final { scan-assembler "thestatic" } } */ > + > +int > +test2 (int x) > +{ > + if (x < 0) > + return 0; > + else > + return test1 (x - 1); > +} > + > +inline int > +test1 (int x) > +{ > + static int thestatic; > + int y; > + > + thestatic = x; > + > + y = test2 (thestatic - 1); > + > + return y + x; > +} > > Here thestatic becomes write only during early optimization, so again we can > correctly eliminate it. > > Sandra, > do you think you can drop the testcases that are not valid and commit the > valid one minus > remove-local-statics-7.c for which we can fill in enhancement request? > > For cases like local-statics-7 your approach can be "saved" by adding simple > IPA analysis > to look for static vars that are used only by one function and keeping your > DSE code active > for them, so we can still get rid of this special case assignments during > late compilation. > I am however not quite convinced it is worth the effort - do you have some > real world > cases where it helps? > > I am rather thinking about cutting the passmanager queue once again after main > tree optimization and re-running IPA unreachable code removal after them. This > should help with rather common cases where we optimize out code as effect > of inlining. > > This would basically mean running pass_all_optimizations from late IPA pass > and scheduling one extra fixup_cfg and perhaps DCE pass at begginig of > pass_all_optimizations. > > Honza