On Fri, 26 Jan 2018, Richard Biener wrote:

> On Fri, 26 Jan 2018, Richard Sandiford wrote:
> 
> > Richard Biener <rguent...@suse.de> writes:
> > > On Thu, 25 Jan 2018, Richard Sandiford wrote:
> > >
> > >> Richard Sandiford <richard.sandif...@linaro.org> writes:
> > >> > Richard Biener <rguent...@suse.de> writes:
> > >> >> The following patch fixes PR84003 where RTL DSE removes a redundant
> > >> >> store (a store storing the same value as an earlier store) but in
> > >> >> doing this changing program semantics because the later store changes
> > >> >> the effective type of the memory location.  This in turn allows
> > >> >> sched2 to do an originally invalid transform.
> > >> >>
> > >> >> The fix is to harden DSE so that while removing the later store
> > >> >> the earlier stores alias-sets are changed to reflect both dynamic
> > >> >> types - which means using alias-set zero.
> > >> >>
> > >> >> On GIMPLE we simply avoid removing such type-changing stores because
> > >> >> we cannot easily get hold on the earlier store.
> > >> >>
> > >> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > >> >>
> > >> >> Disclaimer: this is a "local" fix, I didn't try to understand much
> > >> >> of the DSE data structures but only learned from +-10 lines around
> > >> >> the affected transform which I found by searching for the dumped
> > >> >> messages ...  Richard, you contributed the pass so please review.
> > >> >
> > >> > So the file says.  In reality I only wrote an early version, and what
> > >> > was committed contains very little of that.  So TBH this pass is almost
> > >> > a complete unknown to me.  That said...
> > >> >
> > >> > Might it be worth keeping the store instead?
> > >> 
> > >> ...that is, like gimple. :-)  Sorry, I spent so long thinking about this
> > >> that I forgot you'd said that gimple already does the same thing.
> > >
> > > Yeah, and it still runs into PR82224 ... which means it is not 
> > > conservative enough.
> > >
> > >> Richard
> > >> 
> > >> > Giving things alias set
> > >> > zero seems like a pretty big hammer and would turn the remaining store
> > >> > into something close to a scheduling barrier.
> > >> >
> > >> > IOW, how about checking alias_set_subset_of in:
> > >> >
> > >> >          /* Even if PTR won't be eliminated as unneeded, if both
> > >> >             PTR and this insn store the same constant value, we might
> > >> >             eliminate this insn instead.  */
> > >> >          if (s_info->const_rhs
> > >> >              && const_rhs
> > >> >              && known_subrange_p (offset, width,
> > >> >                                   s_info->offset, s_info->width)
> > >> >              && all_positions_needed_p (s_info, offset - 
> > >> > s_info->offset,
> > >> >                                         width))
> > >> >
> > >> > instead?
> > >
> > > That's what GIMPLE does (use alias_set_subset_of), but it arguably
> > > doesn't work for unions (ok, there even the equality case doesn't 
> > > work...).
> > >
> > > I guess I thought it's worth trying sth new and adjust the earlier
> > > store ;)  Sth that I can't easily do on GIMPLE but everything seemed
> > > to be in place in RTL DSE.
> > >
> > > So, when looking at the above code it seems like there are exactly
> > > two insns we look at?  s_info and 'mem' I guess.  And s_info is
> > > the earlier one.
> > 
> > Yeah.
> > 
> > > So sth like
> > >
> > > Index: gcc/dse.c
> > > ===================================================================
> > > --- gcc/dse.c   (revision 257077)
> > > +++ gcc/dse.c   (working copy)
> > > @@ -1532,7 +1532,12 @@ record_store (rtx body, bb_info_t bb_inf
> > >               && known_subrange_p (offset, width,
> > >                                    s_info->offset, s_info->width)
> > >               && all_positions_needed_p (s_info, offset - s_info->offset,
> > > -                                        width))
> > > +                                        width)
> > > +             /* We can only remove the later store if the earlier aliases
> > > +                at least all accesses the later one.  */
> > > +             && (MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)
> > > +                 || alias_set_subset_of (MEM_ALIAS_SET (mem),
> > > +                                         MEM_ALIAS_SET (s_info->mem))))
> > >             {
> > >               if (GET_MODE (mem) == BLKmode)
> > >                 {
> > >
> > > ?  I can confirm that this patch works as well.  Is that what we prefer?
> > 
> > Not sure I can call that one, but it seems safer (especially for
> > backports).
> > 
> > > It would be consistent with what we do on GIMPLE currently (irrespective
> > > of still existing bugs in this area...).
> > >
> > > Note that with the above instead of dse1 removing the later store
> > > dse2 now decides to remove the earlier one...  (which is fine!).
> > >
> > > So I'm testing the above now, ok if it succeeds?
> > 
> > LGTM (but I can't approve it).
> 
> I can self-approve ;)
> 
> > I was wondering later... if the problem is that we have:
> > 
> >    I1: store X to Y with alias set S1
> >    I2: load from Y with alias set S1
> >    I3: store X to Y with alias set S2
> >    I4: load from Y with alias set S2
> > 
> > and removing I3 allows I4 to be scheduled before I1, what would happen
> > if I1 and I3 store different values (with no dse)?  Would something stop
> > the scheduler from ordering it as:
> > 
> >    I1 I3 I2 I4
> > 
> > where I2 would load the wrong value?  If so, why didn't that same thing
> > work here?
> 
> The scheduler cannot use TBAA when moving a load down across a store so
> it has to prove the addresses are different.

So only RAW dependences can be disambiguated using TBAA, not
WAW or WAR ones.  In GCC terms only true_dependence uses TBAA.

Richard.

Reply via email to