On Tue, Mar 29, 2016 at 3:19 PM, Bill Schmidt <wschm...@linux.vnet.ibm.com> wrote:
> Hi Jakub,
>
> On Tue, 2016-03-29 at 08:53 +0200, Jakub Jelinek wrote:
>> On Mon, Mar 28, 2016 at 07:38:46PM -0500, Bill Schmidt wrote:
>> > For a long time we've had hundreds of failing guality tests. These
>> > failures don't seem to have any correlation with gdb functionality
>> > for POWER, which is working fine. At this point the value of these
>> > tests to us seems questionable. Fixing them is such a low priority
>> > that it is unlikely we will ever get around to it. In the meantime,
>> > the failures simply clutter up our regression test reports. Thus I'd
>> > like to disable them, and that's what this patch does.
>> >
>> > Verified to remove hundreds of failure messages on
>> > powerpc64le-unknown-linux-gnu. :) Is this ok for trunk?
>>
>> This is IMNSHO very wrong: you then lose tracking of regressions in
>> debug info quality. It is true that the debug info quality is already
>> pretty bad on powerpc*, and it would be really very much desirable if
>> anyone had time to analyze some of the failures and improve things,
>> but at the least we shouldn't regress. The guality testsuite has
>> various FAILs and/or XFAILs on lots of architectures; the problem is
>> that the testing matrix is simply too large to encode the expectations
>> in the testcases. It depends on the target, on various ISA settings
>> for the target, on the optimization level (most of the guality tests
>> are torture tested from -O0 up to -O3 with extra flags), and in some
>> cases also on the version of the GDB used.
>>
>> For guality, the most effective check for regressions is simply to
>> run contrib/test_summary after every bootstrap and diff its output
>> against the report from an earlier bootstrap.
>
> And of course we do this, and we can keep doing it. My main purpose in
> opening this issue is to try to understand whether we are getting any
> benefit from these tests, rather than just noise.
>
> When you say that "the debug info quality is already pretty bad on
> powerpc*," do you mean that it is known to be bad, or simply that we
> have a lot of guality failures that may or may not indicate that the
> debug info is bad? I don't have experiential evidence of bad debug
> info showing up during debugging sessions. Perhaps these are corner
> cases that I will never encounter in practice? Or perhaps the tests
> are just badly formed?
>
> The failing tests have all been bit-rotten (or never worked) since
> before I joined this project and, from what others tell me, for at
> least a decade. As you suggest here, others have always told me just
> to ignore the existing guality failures. However, this can easily lead
> to a culture of "ignore any guality failure, that stuff is junk,"
> which can cause regressions to be missed. (I can't say that I've
> actually observed this, but it is a concern I have.)
>
> I have been consistently told that the same situation exists on most
> of the supported targets, again because of the size of the testing
> matrix. I'd be interested in knowing whether this is true or just
> anecdotal.
>
> The other point, that "it would be really very much desirable if
> anyone had time to analyze some of the failures and improve things,"
> has to be answered by "apparently nobody does." I am currently
> tracking well over 200 improvements I would like to see made to the
> powerpc64le target alone. Investigating old guality failures isn't
> even on that list.
> Our team won't have time for it, and if we have bounty money to spend,
> it will be spent on more important things. That's just the economic
> reality, not a desire to disrespect the guality tests or anyone
> associated with them.
>
> From my limited perspective, it seems like the guality tests are
> unique within the test suite as a set of tests that everyone just
> expects to have lots of failures. Is that healthy? Will it ever
> change?
>
> That said, it is clear that you feel the guality tests provide at
> least some value in their present state, so we can continue to live
> with things as they are. I'm just curious how others feel about the
> state of these tests.
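For context, a guality test is an ordinary C testcase that the DejaGnu
harness compiles with -g at each torture level (-O0 through -O3 with
extra flags, as Jakub notes above), runs under GDB, and checks by
comparing what the debugger prints for a variable against an expected
value. Below is a minimal sketch in the style of gcc.dg/guality; the
file name, values, and the exact gdb-test linespec are illustrative
rather than authoritative, so consult
gcc/testsuite/lib/gcc-gdb-test.exp for the precise directive syntax.

/* vars-1.c: an illustrative guality-style testcase; the file name and
   the expected value are invented for this sketch.  The dg-final
   directive asks the harness to set a GDB breakpoint one line above
   the marked return (the .-1 linespec), run to it, and check that "y"
   prints as 43 at every torture optimization level.  */
/* { dg-do run } */
/* { dg-options "-g" } */

volatile int sink;

int __attribute__ ((noinline, noclone))
foo (int x)
{
  int y = x + 1;
  sink = y;
  return y;        /* { dg-final { gdb-test .-1 "y" "43" } } */
}

int
main (void)
{
  return foo (42) != 43;
}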
I agree with Jakub that disabling the tests is not a good idea. Just
pick a random testcase that FAILs on powerpc but not on x86_64-linux
at all optimization levels: you can literally "debug" it manually,
exactly as the guality harness would. There may be ABI issues that
make the case hard to handle, or there may be simple bugs (like a
target reorg pass not properly caring for debug insns).

Richard.

> Thanks,
> Bill
>
>> Jakub
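To make Richard's suggestion concrete, here is one way to replay by
hand what the guality harness automates, using a hypothetical reduced
copy of a failing test along the lines of the sketch above. The GDB
session in the comment is illustrative; the flags to use are whichever
torture variant the FAIL was reported under.

/* repro.c: a hypothetical reduced copy of a failing guality test.
   A manual session equivalent to one harness check:

       gcc -O2 -g repro.c -o repro
       gdb ./repro
       (gdb) break foo       # stop in foo, after the prologue
       (gdb) run
       (gdb) next            # step past "int y = x + 1;"
       (gdb) print y         # healthy debug info prints 43; a
                             #   guality-style FAIL shows a wrong
                             #   value or <optimized out>

   Repeating the session at -O0, -O1, -O2, and -O3 shows the torture
   level at which the variable's location information degrades.  */

volatile int sink;

int __attribute__ ((noinline, noclone))
foo (int x)
{
  int y = x + 1;
  sink = y;
  return y;
}

int
main (void)
{
  return foo (42) != 43;
}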