On Tue, Mar 29, 2016 at 3:19 PM, Bill Schmidt
<wschm...@linux.vnet.ibm.com> wrote:
> Hi Jakub,
>
> On Tue, 2016-03-29 at 08:53 +0200, Jakub Jelinek wrote:
>> On Mon, Mar 28, 2016 at 07:38:46PM -0500, Bill Schmidt wrote:
>> > For a long time we've had hundreds of failing guality tests.  These
>> > failures don't seem to have any correlation with gdb functionality for
>> > POWER, which is working fine.  At this point the value of these tests to
>> > us seems questionable.  Fixing these is such low priority that it is
>> > unlikely we will ever get around to it.  In the meanwhile, the failures
>> > simply clutter up our regression test reports.  Thus I'd like to disable
>> > them, and that's what this patch does.
>> >
>> > Verified to remove hundreds of failure messages on
>> > powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?
>>
>> This is IMNSHO very wrong; you then lose the ability to track regressions
>> in debug info quality.  It is true that the debug info quality is already
>> pretty bad on powerpc*, and it would be really very much desirable if
>> anyone had time to analyze some of them and improve stuff, but we at
>> least shouldn't regress.  The guality testsuite has various FAILs and/or
>> XFAILs on lots of architectures; the problem is that the testing matrix
>> is simply too large to encode them in the testcases
>> - it depends on the target, on various ISA settings on the target, on the
>> optimization level (most of the guality tests are torture tested from
>> -O0 up to -O3 with extra flags), and in some cases also on the version of
>> GDB used.
>>
>> For guality, the most effective check for regressions is simply to run
>> contrib/test_summary after every bootstrap and then diff the result
>> against the summary from an earlier bootstrap.
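>>
>> For example, something along these lines (the directory names are just
>> illustrative):
>>
>>   # in the build/object directory, after make -k check
>>   $GCC_SRC/contrib/test_summary > /tmp/summary-new.txt
>>   # compare against the summary saved after the previous bootstrap;
>>   # new guality regressions show up as added FAIL lines
>>   diff -u /tmp/summary-old.txt /tmp/summary-new.txt | grep -i guality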
>
> And of course we do this, and we can keep doing it.  My main purpose in
> opening this issue is to try to understand whether we are getting any
> benefit from these tests, rather than just noise.
>
> When you say that "the debug info quality is already pretty bad on
> powerpc*," do you mean that it is known to be bad, or simply that we
> have a lot of guality failures that may or may not indicate that the
> debug info is bad?  I don't have first-hand evidence of bad debug info
> showing up during debugging sessions.  Perhaps these are corner cases
> that I will never encounter in practice?  Or perhaps the tests are just
> badly formed?
>
> The failing tests have all been bit-rotten (or never worked) since
> before I joined this project, and from what others tell me, for at least
> a decade.  As you suggest here, others have always told me just to
> ignore the existing guality failures.  However, this can easily lead to
> a culture of "ignore any guality failure, that stuff is junk," which can
> cause regressions to be missed.  (I can't say that I've actually
> observed this, but it is a concern I have.)
>
> I have been consistently told that the same situation exists on most of
> the supported targets, again because of the size of the testing matrix.
> I'd be interested in knowing if this is true, or just anecdotal.
>
> The other point, "it would be really very much desirable if
> anyone had time to analyze some of them and improve stuff," has to be
> answered by "apparently nobody does."  I am currently tracking well over
> 200 improvements I would like to see made to the powerpc64le target
> alone.  Investigating old guality failures isn't even on that list.  Our
> team won't have time for it, and if we have bounty money to spend, it
> will be spent on more important things.  That's just the economic
> reality, not a desire to disrespect the guality tests or anyone
> associated with them.
>
> From my limited perspective, it seems like the guality tests are unique
> within the test suite as a set of tests that everyone just expects to
> have lots of failures.  Is that healthy?  Will it ever change?
>
> That said, it is clear that you feel the guality tests provide at least
> some value in their present state, so we can continue to live with
> things as they are.  I'm just curious how others feel about the state of
> these tests.

I agree with Jakub that disabling the tests is not a good idea.  Just pick
a random testcase that FAILs on powerpc but not on x86_64-linux at all
optimization levels.  You can literally "debug" it manually, the same way
the guality harness would - there may be ABI issues that make the case
hard to handle, or there may be simple bugs (like a target reorg pass not
properly taking care of debug insns).
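
To make that concrete, one can rerun by hand roughly the same check the
guality harness performs.  The sketch below uses a placeholder test name,
line number and variable; substitute the values the failing test actually
checks (e.g. from its gdb-test directive, if it uses one):

  $ gcc -O2 -g gcc/testsuite/gcc.dg/guality/sometest.c -o sometest
  $ gdb -q ./sometest
  (gdb) break sometest.c:123    # the line the test checks at
  (gdb) run
  (gdb) print x                 # compare against the expected value

If gdb prints "<optimized out>" or a wrong value, that is exactly what the
harness reports as a FAIL, and it shows where the location information was
lost.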

Richard.

> Thanks,
> Bill
>
>>
>>       Jakub
>>
>
>
