On Wed, 13 Mar 2024, Jan Hubicka wrote:

> > On Tue, 12 Mar 2024, Jakub Jelinek wrote:
> > 
> > > On Tue, Mar 12, 2024 at 05:21:58PM +0100, Jakub Jelinek wrote:
> > > > On Tue, Mar 12, 2024 at 10:46:42AM +0100, Jan Hubicka wrote:
> > > > > I am sorry for delaying this.  I made the variant that simply compares
> > > > > value range of functions and prevents merging if they diverge and 
> > > > > wanted
> > > > > to make some bigger statistics.  This made me notice some performance
> > > > > problems on clang performance and libstdc++ RB-trees which disrailed 
> > > > > me
> > > > > from the original PR.  I will finish the statistics today.
> > > > 
> > > > With the posted patch, perhaps if we don't want to union jump_tables 
> > > > etc.,
> > > > all we could punt on is differences in the jump_table VRs rather than 
> > > > just
> > > > any SSA_NAME_RANGE_INFO differences.
> > > 
> > > To expand on this, I think we need to either union or punt on jump_func
> > > differences in any case, because for LTO we can't really punt on
> > > SSA_NAME_RANGE_INFO differences given that we don't stream that out and 
> > > in.
> 
> I noticed that yesterday too (I added my jump function testcase to
> testsuit and it fails with -flto, too).  I implemented comparator for
> them too and run the stats.  There was over 3000 functions in bootstrap
> where we run into differences in value-range and about 150k in LLVM
> build.
> 
> Inspecting random examples shown that those are usually false positives
> (pair of functions that are different but triggers hash colision) caused
> by the fact that we do not hash PHI arguments, so code like
> 
> int test (int a)
> {
> return a>0 ? CST1:  CST2;
> }
> 
> gets same hash value no matter what CST1/CST2 is.  I added hasher and I
> am re-running stats.

The hash should be commutative here at least.

> > > So the ipa_jump_func are I think the only thing that actually can differ
> > > on the ICF merging candidates from value range POV.
> > 
> > I agree.  Btw, I would have approved the original patch in this
> > thread that wipes SSA_NAME_INFO in merged bodies to mimic what LTO
> > effectively does right now.  That also looks most sensible to
> > backport.
> > 
> > But I'll defer to Honza in the end (but also want to point out we
> > need something suitable for backporting).
> 
> My main worry here is that I tried to relax matching of IL metadata in
> the past and it triggers interesting problems.  (I implemented TBAA
> merging and one needs to match additional things in summaries and loop
> structures)
> 
> If value range differs at IPA analysis time, we need to be sure that we
> compare everything that possibly depends on it. So it is always safer to
> just compare more than try to merge. Which is what we do in all cases so
> far.  Here, for the first time, is the problem is with LTO streaming
> missing the data though.
> 
> Thinking more about it, I wonder if different value ranges can be
> exploited to cause different decisions about function being finite
> (confuse pure/const) or different outcome of alias analysis yielding to
> different aggregate jump functions (confusing ipa-prop).

The obvious thing would be range info making it possible to prove
a stmt cannot trap, say for an array index or for arithmetic with
-ftrapv.  But I'm not sure that makes a differnce for pure/const-ness.

Making something looping vs. non-looping should be easy though,
just have range info for a loop with an unsigned IV that evolves
like { 0, +, increment } and with a != exit condition where for some 
'increment' we know we eventually reach equality but with some
other we know we never do.

But that just means pure/const state needs to be recomputed after
merging?  Or compared.

Richard.

Reply via email to