On Tue, 26 May 2015, Jan Hubicka wrote:

> > > Will do if we agree on having this.
> > > 
> > > I know you would like ipa-icf to keep original bodies and use them for 
> > > inlining declaring alias sets to be function local.  This is wrong plan.  
> > > Consder:
> > > 
> > > void t(int *ptr)
> > > {
> > >   *ptr=1;
> > > }
> > > 
> > > int a(int *ptr1, int *ptr2)
> > > {
> > >   int a = *ptr1;
> > >   t(ptr2)
> > >   return a+*ptr1;
> > > }
> > > 
> > > long b(long *ptr1, int *ptr2)
> > > {
> > >   int a = *ptr1;
> > >   t(ptr2)
> > >   return a+*ptr1;
> > > }
> > > 
> > > here aliasing leads to the two options to be optimizer differently:
> > > a:
> > > .LFB1:  
> > >         .cfi_startproc
> > >         movl    4(%esp), %edx
> > >         movl    8(%esp), %ecx
> > >         movl    (%edx), %eax
> > >         movl    $1, (%ecx)
> > >         addl    (%edx), %eax
> > >         ret
> > >         .cfi_endproc
> > > b:
> > > .LFB2:  
> > >         .cfi_startproc
> > >         movl    4(%esp), %eax
> > >         movl    8(%esp), %edx
> > >         movl    (%eax), %eax
> > >         movl    $1, (%edx)
> > >         addl    %eax, %eax
> > >         ret
> > >         .cfi_endproc
> > > 
> > > however with -fno-early-inlining the functions look identical (modulo 
> > > alias
> > > sets) at ipa-icf time.  If we merged a/b, we could get wrong code for a
> > > even though no inlining of a or b happens.
> > 
> > First of all the return types don't agree so the testcase is bogus.
> 
> With -m32 they are types_compatible_p because they are of same size.
> > 
> > > So either we match the alias sets or we need to verify that the alias sets
> > > permit precisely the same set of optimizations with taking possible 
> > > inlining
> > > into account.
> > 
> > Hmm, but then what makes ICF of a and b _with_ early inlining fail with
> > -fno-tree-fre1?  The casts from *ptr1 to int in the 'long' case.
> 
> Dereferencing *ptr1 that has different alias set in each function.
> > 
> > So I think I need to see a real testcase and then I'll show you
> > even with no inlining after ICF you get wrong-code thus it is a bug
> > in ICF ;)
> 
> I added the inline only to make it clear that the loads won't be optimized
> at early optimization time.
> long a(int *ptr1, int *ptr2)
> {
>   int a = *ptr1;
>   *ptr2=1;
>   return a+*ptr1;
> }
> 
> long b(long *ptr1, int *ptr2)
> {
>   int a = *ptr1;
>   *ptr2=1;
>   return a+*ptr1;
> }
> 
> with -fno-tree-fre may be more real
> 
> a (int * ptr1, int * ptr2)
> {
>   int a;
>   int D.1380;
>   long int D.1379;
>   int _4;
>   long int _5;
> 
>   <bb 2>:
>   a_2 = *ptr1_1(D);
>   *ptr2_3(D) = 1;
>   _4 = *ptr1_1(D);
>   _5 = _4 + a_2;
> 
> <L0>:
>   return _5;
> 
> }
> 
> ;; Function b (b, funcdef_no=1, decl_uid=1375, cgraph_uid=1)
> 
> b (long int * ptr1, int * ptr2)
> {
>   int a;
>   long int D.1383;
>   long int D.1382;
>   long int _4;
>   long int _5;
> 
>   <bb 2>:
>   a_2 = *ptr1_1(D);
>   *ptr2_3(D) = 1;
>   _4 = *ptr1_1(D);
>   _5 = _4 + a_2;
> 
> <L0>:
>   return _5;
> 
> }

Yes, so this shows using original bodies for inlining isn't the issue.
The issue is that we can't really ignore TBAA (completely?) when
merging function bodies, independent of any issues that pop up
when inlining merged bodies.  We should have the above as testcase
in the testsuite (with both source orders of a and b to make sure
ICF will eventually pick both).

Now the question is whether we can in some way still merge the above
two functions and retain (some) TBAA, like by making sure to adjust
all MEM_REFs to use union { type1; type2; } for the TBAA type... (eh).

No longer globbing all pointer types will even make std::vector<ptr>
no longer mergeable...

Richard.

> > 
> > > I also do not believe that TBAA should be function local.  I believe it is
> > > useful to propagate stuff interprocedurally, like ipa-prop could be able 
> > > to
> > > propagate this:
> > > 
> > > long *ptr1;
> > > int *ptr2;
> > > t(int *ptr)
> > > {
> > >   return *ptr;
> > > }
> > > wrap(int *ptr)
> > > {
> > >  *ptr1=1;
> > > }
> > > call()
> > > {
> > >   return wrap (*ptr2);
> > > }
> > > 
> > > and we could have ipa-reference style pass that collect alias sets 
> > > read/written by a function and uses it during local optimization to 
> > > figure out if there is a true dependence between function call and 
> > > memory store.
> > 
> > Sure, but after ICF there is no IPA propagation...
> Doesn't matter if you propagate before or after ICF. If you do before, ICF
> would need to match/merge the alias set in optimization summary to be sure 
> that
> the functions are same.
> 
> Honza
> > 
> > Richard.
> > 
> > -- 
> > Richard Biener <rguent...@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, 
> > Graham Norton, HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)

Reply via email to