On Tue, May 20, 2014 at 6:29 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> On 05/18/2014 08:45 PM, Sandra Loosemore wrote:
>> >On 05/18/2014 02:59 PM, Jan Hubicka wrote:
>> >>For cases like local-statics-7 your approach can be "saved" by adding
>> >>simple IPA analysis
>> >>to look for static vars that are used only by one function and keeping
>> >>your DSE code active
>> >>for them, so we can still get rid of this special case assignments
>> >>during late compilation.
>> >>I am however not quite convinced it is worth the effort - do you have
>> >>some real world
>> >>cases where it helps?
>> >
>> >Um, the well-known benchmark.  ;-)
>>
>> I looked at the generated code for this benchmark and see that your
>> patch is indeed not getting rid of all the pointless static
>> variables, while ours is, so this is quite disappointing.  I'm
>> thinking now that we will still need to retain our patch at least
>> locally, although we'd really like to get it on trunk if possible.
>
> Yours patch can really be improved by adding simple IPA analysis to work
> out what variables are refernced by one function only instead of using
> not-longer-that-relevant local static info.
> As last IPA pass for each static variable with no address taken, look at all
> references and see if they all come from one function or functions inlined to
> it.
>>
>> Here is another testcase vaguely based on the same kind of
>> signal-processing algorithm as the benchmark, but coded without
>> reference to it.  I have arm-none-eabi builds around for both 4.9.0
>> with our remove_local_statics patch applied, and trunk with your
>> patch.  With -O2, our optimization produces 23 instructions and gets
>> rid of all 3 statics, yours produces 33 and only gets rid of 1 of
>> them.
>>
>> Of course it's lame to use static variables in this way, but OTOH
>> it's lame if GCC can't recognize them and optimize them away, too.
>>
>> -Sandra
>>
>
>> void
>> fir (int *coeffs, int coeff_length, int *input, int input_length, int 
>> *output)
>> {
>>   static int *coeffp;
>>   static int *inputp;
>>   static int *outputp;
>
> Here inputp read is rmeoved only at 101.dceloop1 pass. That is really late.
> coeffp is removed late, too.
>
>>   int i, c, acc;
>>
>>   for (i = 0; i < input_length; i++)
>>     {
>>       coeffp = coeffs;
>>       inputp = input + i;
>>       outputp = output + i;
>>       acc = 0;
>>       for (c = 0; c < coeff_length; c++)
>>       {
>>         acc += *coeffp * *inputp;
>>         coeffp++;
>>         inputp--;
>>       }
>
> It seems like AA can not work out the fact that inputp is unchanged until that
> late.  Richi, any ideas?

Well, AA is perfectly fine - it's just that this boils down to a partial
redundancy problem.  The stores can be removed by DCE even with

Index: gcc/tree-ssa-dce.c
===================================================================
--- gcc/tree-ssa-dce.c  (revision 210635)
+++ gcc/tree-ssa-dce.c  (working copy)
@@ -278,10 +278,20 @@ mark_stmt_if_obviously_necessary (gimple
       break;

     case GIMPLE_ASSIGN:
-      if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
-         && TREE_CLOBBER_P (gimple_assign_rhs1 (stmt)))
-       return;
-      break;
+      {
+       tree lhs = gimple_assign_lhs (stmt);
+       if (TREE_CODE (lhs) == SSA_NAME
+           && TREE_CLOBBER_P (gimple_assign_rhs1 (stmt)))
+         return;
+       if (TREE_CODE (lhs) == VAR_DECL
+           && !TREE_ADDRESSABLE (lhs)
+           && TREE_STATIC (lhs)
+           && !TREE_PUBLIC (lhs) && !DECL_EXTERNAL (lhs)
+           /* ???  Make sure lhs is only refered to from cfun->decl.  */
+           && decl_function_context (lhs) == cfun->decl)
+         return;
+       break;
+      }

     default:
       break;

but I don't have a good way of checking the ??? prerequesite
(even without taking its address the statics may be refered to
by a) inline copies, b) ipa-split function parts, c) nested functions).
I'm sure IPA references are not kept up-to-date.

The last reads go away with PRE at the expense of two
additional induction variables.

If we know that a static variable does not have its address taken
and is only refered to from a single function we can in theory
simply rewrite it into SSA form during early opts (before inlining
its body), for example by SRA.  That isn't even restricted to
function-local statics (for statics used in multiple functions but
not having address-taken we can do the same with doing
function entry / exit load / store).  I think that would be a much
better IPA enabled optimization than playing games with
looking at individual stmts.

Richard.

> Honza
>>       *outputp = acc;
>>     }
>> }
>>
>>
>

Reply via email to