Richard Sandiford <rdsandif...@googlemail.com> writes:
> Alexandre Oliva <aol...@redhat.com> writes:
>> On Nov 25, 2011, Jakub Jelinek <ja...@redhat.com> wrote:
>>> The numbers I got with your patch (RTL checking) are below; it seems
>>> the cumulative numbers other than 100% are all bigger with the patched
>>> stage2, which unfortunately means a degradation in debug info quality.
>>
>> Not really.  I found some actual degradation after finally getting back
>> to it.  In some cases, I failed to reset NO_LOC_P, and this caused
>> expressions that depended on it to resort to alternate values or end up
>> unset.  In other cases, we created different cselib values for debug
>> temps and implicit ptrs, and merging them at dataflow confluences no
>> longer found a common value because the common value was in cselib's
>> static equivalence table.  I've fixed (and added an assertion to catch)
>> left-over NO_LOC_Ps, and arranged for values created for debug exprs,
>> implicit ptrs, entry values and parameter refs to be preserved across
>> basic blocks as constants within cselib.
>>
>> With that, the debug info we get is a strict improvement in terms of
>> coverage, even though a bunch of .o files still display a decrease in
>> 100% coverage.  In the handful of files I examined, the patched
>> compiler was emitting a loc list without full coverage, while the
>> original compiler was emitting a single loc expr that implicitly got
>> full coverage even though AFAICT it should really cover a narrower
>> range.
>> Full coverage was a false positive, and less-than-100% coverage in these
>> cases is not a degradation, but rather an improvement.
>>
>> Now, the reason we emit additional expressions is that the new
>> algorithm is more prone to emitting different (and better) expressions
>> when entering a basic block, because we don't try as hard as before to
>> keep using the same location expression.  Instead, we recompute all the
>> potentially-changed expressions, which tends to select better
>> expressions when they are available.
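>>
>> As a purely conceptual sketch of the difference (made-up names, not
>> var-tracking's actual data structures): at a basic-block entry, rather
>> than keeping the incoming expression whenever it is still valid, we
>> pick the best of all currently-known equivalent locations:
>>
>>   /* Lower rank is better, e.g. register < memory < computed.  */
>>   struct loc { int rank; const char *expr; };
>>
>>   static const struct loc *
>>   best_location (const struct loc *locs, int n)
>>   {
>>     const struct loc *best = 0;
>>     for (int i = 0; i < n; i++)
>>       if (locs[i].expr && (!best || locs[i].rank < best->rank))
>>         best = &locs[i];
>>     return best;
>>   }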
>>
>>> Otherwise the patch looks good to me.
>>
>> Thanks.  After the updated comparison data below, you can find the patch
>> I'm checking in, followed by the small interdiff from the previous
>> patch.
>>
>>
>> Happy GNU Year! :-)
>>
>>
>> The results below can be reproduced with r182723.
>>
>> stage1 sources are patched, stage2 and stage3 aren't, so
>> stage2 is built with a patched compiler, stage3 isn't.
>>
>> $ wc -l obj-{x86_64,i686}-linux-gnu/stage[23]-gcc/cc1plus.ev
>>   100784 obj-x86_64-linux-gnu/stage2-gcc/cc1plus.ev
>>   102406 obj-x86_64-linux-gnu/stage3-gcc/cc1plus.ev
>>    33275 obj-i686-linux-gnu/stage2-gcc/cc1plus.ev
>>    33944 obj-i686-linux-gnu/stage3-gcc/cc1plus.ev
>>
>> $ wc -l obj-{x86_64,i686}-linux-gnu/stage[23]-gcc/cc1plus.csv
>>    523647 obj-x86_64-linux-gnu/stage2-gcc/cc1plus.csv
>>    523536 obj-x86_64-linux-gnu/stage3-gcc/cc1plus.csv
>>    521276 obj-i686-linux-gnu/stage2-gcc/cc1plus.csv
>>    521907 obj-i686-linux-gnu/stage3-gcc/cc1plus.csv
>>
>> $ diff -yW80 obj-x86_64-linux-gnu/stage[23]-gcc/cc1plus.ls
>> cov%    samples cumul                   cov%    samples cumul
>> 0.0     150949/30%      150949/30%    | 0.0     150980/30%      150980/30%
>> 0..5    6234/1% 157183/31%            | 0..5    6254/1% 157234/31%
>> 6..10   5630/1% 162813/32%            | 6..10   5641/1% 162875/32%
>> 11..15  4675/0% 167488/33%            | 11..15  4703/0% 167578/33%
>> 16..20  5041/1% 172529/34%            | 16..20  5044/1% 172622/34%
>> 21..25  5435/1% 177964/35%            | 21..25  5466/1% 178088/35%
>> 26..30  4249/0% 182213/36%            | 26..30  4269/0% 182357/36%
>> 31..35  4666/0% 186879/37%            | 31..35  4674/0% 187031/37%
>> 36..40  6939/1% 193818/38%            | 36..40  6982/1% 194013/38%
>> 41..45  7824/1% 201642/40%            | 41..45  7859/1% 201872/40%
>> 46..50  8538/1% 210180/42%            | 46..50  8536/1% 210408/42%
>> 51..55  7585/1% 217765/43%            | 51..55  7611/1% 218019/43%
>> 56..60  6088/1% 223853/44%            | 56..60  6108/1% 224127/44%
>> 61..65  5545/1% 229398/45%            | 61..65  5574/1% 229701/46%
>> 66..70  7151/1% 236549/47%            | 66..70  7195/1% 236896/47%
>> 71..75  8068/1% 244617/49%            | 71..75  8104/1% 245000/49%
>> 76..80  18852/3%        263469/52%    | 76..80  18879/3%        263879/52%
>> 81..85  11958/2%        275427/55%    | 81..85  11954/2%        275833/55%
>> 86..90  15201/3%        290628/58%    | 86..90  15145/3%        290978/58%
>> 91..95  16814/3%        307442/61%    | 91..95  16727/3%        307705/61%
>> 96..99  17121/3%        324563/65%    | 96..99  16991/3%        324696/65%
>> 100     174515/34%      499078/100%   | 100     173994/34%      498690/100%
>>
>> $ diff -yW80 obj-i686-linux-gnu/stage[23]-gcc/cc1plus.ls
>> cov%    samples cumul                   cov%    samples cumul
>> 0.0     145453/27%      145453/27%    | 0.0     145480/27%      145480/27%
>> 0..5    6594/1% 152047/29%            | 0..5    6603/1% 152083/29%
>> 6..10   5664/1% 157711/30%            | 6..10   5671/1% 157754/30%
>> 11..15  4982/0% 162693/31%            | 11..15  4997/0% 162751/31%
>> 16..20  6155/1% 168848/32%            | 16..20  6169/1% 168920/32%
>> 21..25  5038/0% 173886/33%            | 21..25  5057/0% 173977/33%
>> 26..30  4925/0% 178811/34%            | 26..30  4918/0% 178895/34%
>> 31..35  4359/0% 183170/35%            | 31..35  4372/0% 183267/35%
>> 36..40  6977/1% 190147/36%            | 36..40  6972/1% 190239/36%
>> 41..45  8138/1% 198285/38%            | 41..45  8148/1% 198387/38%
>> 46..50  8538/1% 206823/39%            | 46..50  8538/1% 206925/39%
>> 51..55  5607/1% 212430/40%            | 51..55  5610/1% 212535/40%
>> 56..60  6629/1% 219059/41%            | 56..60  6629/1% 219164/42%
>> 61..65  5232/1% 224291/42%            | 61..65  5242/1% 224406/43%
>> 66..70  8827/1% 233118/44%            | 66..70  8824/1% 233230/44%
>> 71..75  6240/1% 239358/45%            | 71..75  6245/1% 239475/45%
>> 76..80  8573/1% 247931/47%            | 76..80  8577/1% 248052/47%
>> 81..85  8235/1% 256166/49%            | 81..85  8236/1% 256288/49%
>> 86..90  13385/2%        269551/51%    | 86..90  13365/2%        269653/51%
>> 91..95  21427/4%        290978/55%    | 91..95  21397/4%        291050/55%
>> 96..99  20791/3%        311769/59%    | 96..99  20739/3%        311789/59%
>> 100     209906/40%      521675/100%   | 100     209781/40%      521570/100%
>>
>>
>> for  gcc/ChangeLog
>> from  Alexandre Oliva  <aol...@redhat.com>
>>
>>      * cselib.h (cselib_add_permanent_equiv): Declare.
>>      (canonical_cselib_val): New.
>>      * cselib.c (new_elt_loc_list): Rework to support value
>>      equivalences.  Adjust all callers.
>>      (preserve_only_constants): Retain value equivalences.
>>      (references_value_p): Retain preserved values.
>>      (rtx_equal_for_cselib_1): Handle value equivalences.
>>      (cselib_invalidate_regno): Use canonical value.
>>      (cselib_add_permanent_equiv): New.
>>      * alias.c (find_base_term): Reset locs lists while recursing.
>>      * var-tracking.c (val_bind): New.  Don't add equivalences
>>      present in cselib table, compared with code moved from...
>>      (val_store): ... here.
>>      (val_resolve): Use val_bind.
>>      (VAL_EXPR_HAS_REVERSE): Drop.
>>      (add_uses): Do not create MOps for addresses.  Do not mark
>>      non-REG non-MEM expressions as requiring resolution.
>>      (reverse_op): Record reverse as a cselib equivalence.
>>      (add_stores): Use it.  Do not create MOps for addresses.
>>      Do not require resolution for non-REG non-MEM expressions.
>>      Simplify support for reverse operations.
>>      (compute_bb_dataflow): Drop reverse support.
>>      (emit_notes_in_bb): Likewise.
>>      (create_entry_value): Rename to...
>>      (record_entry_value): ... this.  Use cselib equivalences.
>>      (vt_add_function_parameter): Adjust.
>
> This has significantly increased the compile time of 20001226-1.c -O3 -g
> on mips64-linux-gnu:
>
> time for -O3:
>     real    0m25.656s
>     user    0m25.106s
>     sys     0m0.544s
>
> time for -O3 -g with this patch and the follow-ons reverted:
>     real    0m28.212s
>     user    0m27.586s
>     sys     0m0.624s
>
> time for -O3 -g with today's trunk:
>     Ctrl-C'ed after this long (sorry, I got bored):
>     real    15m11.725s
>     user    15m11.177s
>     sys     0m0.496s
>     
> I realise that 20001226-1.c isn't exactly the most representative
> testcase in the world, but it does look like a real problem.
>
> The testcase has a lot of identical "load high, add low" sequences.
> Even though there are only two values here (the high part and
> the result), we end up with thousands of value rtxs and thousands
> of distinct "reverse" locations attached to the "load high" value.
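>
> For reference, the kind of sequence I mean (illustrative C only, not
> the actual testcase; the mips expansion is sketched in the comment):
>
>   extern int flag;
>
>   int
>   use_flag (void)
>   {
>     /* The "load high, add low" pair for this address looks roughly
>        like:
>          lui    $2, %hi(flag)         # load high  -> one value
>          addiu  $2, $2, %lo(flag)     # add low    -> the result value
>        and the real testcase produces thousands of such pairs.  */
>     return flag;
>   }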
>
> One source of inefficiency seems to be that if an as-yet-unseen value
> is passed to cselib_add_permanent_equiv, we create a new value for it,
> then make the previous value (ELT) equivalent to it.  AIUI, ELT is still
> supposed to be the "canonical" representation of the value in this
> situation, so most (but presumably not all, given the bug) instances
> of the new value will be replaced by the old.
>
> Which led to the patch below.  However, it triggers an infinite loop
> in alias.c:memrefs_conflict_p when compiling libgcov.c, because we end
> up recording that
>
>   value V1 = value V2 + OFFSET
>   value V2 = value V1 - OFFSET
>
> and the function cycles endlessly between V1 and V2.
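>
> A stand-alone illustration of the failure mode (not alias.c itself;
> the names and the cycle check are made up): once both equivalences
> above are recorded, any resolver that simply chases equivalence links
> needs some form of cycle detection, otherwise it bounces between V1
> and V2 forever:
>
>   #include <stdio.h>
>   #include <stdbool.h>
>
>   struct value
>   {
>     const char *name;
>     struct value *equiv;   /* value this one is defined in terms of */
>     long offset;           /* ... plus this offset */
>     bool visited;
>   };
>
>   /* Follow equivalence links, accumulating offsets; bail out if we
>      ever revisit a value, which is exactly the V1 <-> V2 cycle.  */
>   static bool
>   resolve (struct value *v, long *off)
>   {
>     if (v->visited)
>       return false;                /* cycle detected */
>     v->visited = true;
>     if (!v->equiv)
>       {
>         *off = 0;                  /* reached a root value */
>         return true;
>       }
>     if (!resolve (v->equiv, off))
>       return false;
>     *off += v->offset;
>     return true;
>   }
>
>   int
>   main (void)
>   {
>     struct value v1 = { "V1", 0, 0, false };
>     struct value v2 = { "V2", 0, 0, false };
>     v1.equiv = &v2; v1.offset = 8;     /* V1 = V2 + OFFSET */
>     v2.equiv = &v1; v2.offset = -8;    /* V2 = V1 - OFFSET */
>
>     long off = 0;
>     if (!resolve (&v1, &off))
>       printf ("cycle between %s and %s detected\n", v1.name, v2.name);
>     return 0;
>   }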
>
> And this is what makes me a bit nervous about the original patch.
> AIUI, cselib values were always supposed to form a dag, whereas
> the patch above is deliberately recording cycles.  It seems a bit
> dangerous to change something like that at the end of stage 3.
> I couldn't convince myself that, without this patch, all these
> cycles were harmless, especially given the increased use of
> canonical_cselib_val in the follow-on patches.
>
> The follow-on patches also made me realise that I don't understand
> when cselib is supposed to use "canonical" values and when it should
> use (or can make do with) the non-canonical ones.
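>
> (My working mental model, which may well be wrong, is that
> canonicalisation just chases equivalence links to a representative,
> much like union-find's find -- roughly:
>
>   struct value { struct value *equiv; /* null when canonical */ };
>
>   static struct value *
>   canonical (struct value *v)
>   {
>     while (v->equiv)
>       v = v->equiv;
>     return v;
>   }
>
> -- but given the increased use of canonical_cselib_val in the
> follow-on patches, I'd like that confirmed.)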

Now filed as PR 52001 to make sure it doesn't get lost.

Richard
