Hi Mike, thanks for your comments.
On Tue, Jul 28, 2020 at 06:37:26PM -0700, Mike Stump via Gcc-patches wrote: > I'll punt to the the C++ front-end folks to chime in. Usually we only check > in bugs that are fixed, as they are fixed, this is what makes it a regression > suite. Doing this does have advantages, like, the testsuite is small and > doesn't have duplicates and doesn't test anything that is known to fail that > isn't an actual regression. We also add sanity tests for new language features, for example. I also like adding (small) tests for DRs that happen to be already fixed to insure that the compiler continues to behave as expected. Avoiding duplicates is a good thing, obviously, but, with this new scheme, if you fix something and while testing the fix you find that we already have a test, you can choose not to introduce a new test. Whether or not a compiler bug is a regression shouldn't, IMHO, be a criterion for (not) including the test. tree(1) in testsuite/g++.dg says 70 directories, 13406 files. If I go wild and add 200 new C++ tests, that's ~1.5% increase in the number of tests. That seems reasonable. If it still causes grief for some people, and we go with the idea of using an unfixed/ directory, we could add GCC_TESTSUITE_TEST_UNFIXED envvar to enable or disable running such tests. > Ideally, it would be nice to have a way to test bugs out of bugzilla, and > report on those fixed bugs as they are fixed. If people want to keep the > test suite a regression suite, then my counter proposal would be to have a > branch with the non-regression bugs on it and then people can checkout and > test that branch. Most of the people don't (saving the testing time, which > is handy), and then more sporadically, the old bugs branch can be test and BZ > state can be moved along as bugs as fixed. A run once a week or even once a > month would seem to be plenty often. I'm afraid this would defeat the point of this proposal. If the additional tests are only available on a branch, no one is going to use it. Quite frankly, I'd just stick with my personal repo (where I can do whatever I want, include test with #includes, unreduced tests, ...) rather than to bother with rebasing a branch etc. Even if someone would actually use such a branch from time to time, we'd lose the benefit of "left shifting" bugs in the developer workflow -- you would only notice that your patch had changed something days or weeks after the commit at which point you likely have lost state on it. > You in general don't need to check in fixes from bug reports that have been > fixed in the past, as those fixes generally already have a test case for the > fix that went in with the fix. I agree, but experience shows that's not always the case. Just the few PRs I mentioned in my original mail prove that. There are patches that change something as a side-effect only, and, as a developer, you want to be aware of those side-effects. > As for how old is old, we'd leave that to the contributor of work to decide. > In theory, bugs can be added as soon as they come in, no need to wait. To > the extent that waiting saves work, well, that's a personal choice, for the > person doing the work to choose. I agree. I'd leave it up to developers. Just... be reasonable. No need to add every broken nonsense test lurking in Bugzilla. > Why not just use xfail and xpass? Seems less work than doing a setup_xfail. > Also, why not just use the existing directives instead of adding new > directives? I'm suspicious of expect_ice and accepts_invalid. You set them > to 1 all the time, but almost never set them to 0? I'm wondering if it > should be more like shouldfail? Good point, I missed that I could use xfail and xpass. Fixed. For accepts_invalid I think we *could* use the existing directives, if you know where to expect an error. But for ICEs we currently have nothing that would work well. I'll probably drop dg-accepts-invalid completely. I followed shouldfail's suit when it comes to setting them to 1, so that should be fine. Thanks, Marek > On Jul 28, 2020, at 2:44 PM, Marek Polacek via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > In Bugzilla, for the c++ component, we currently have over 3200 open bugs. > > In > > my experience, a good amount of them have already been fixed; my periodical > > sweeps always turn up a bunch of PRs that had already been fixed previously. > > Sometimes my sweeps are more or less random, but more often than not I'm > > just > > looking for duplicates of an existing PR. Sometimes the reason the already > > fixed PRs are still open is because a PR that was fixed had duplicates that > > we > > didn't catch earlier when confirming the PR. Sometimes a PR gets fixed as a > > side-effect of fixing another PR. Manual sweeps are tedious and > > time-consuming > > because often you need to grab the test from the Bugzilla yet again (and > > sometimes there are multiple tests). Even if you find a PR that was fixed, > > you > > still need to bisect the fix and perhaps add the test to our testsuite. > > That's > > draining and since the number of bugs only increases, never decreases, it > > is not > > sustainable. > > > > So I've started a personal repo where I've gathered dozens of tests and > > wrote a > > script that just compiles every test in the repo and reports if anything > > changed. One line from it: > > > > pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' || > > echo -e "$pr: ${msg_ice}" > > > > This has major drawbacks: you have to remember to run this manually, keep > > updating it, and it's yet another repo that people interested in this would > > have to clone, but the worst thing is that typically you would only discover > > that a patch fixed a different PR long after the patch was committed. And > > quite likely it wasn't even your patch. We know that finding problems > > earlier > > in the developer workflow reduces costs; if we can catch this before the > > original developer commits & pushes the changes, it's cheaper, because the > > developer already understands what the patch does. > > > > A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently > > by an unrelated (?) patch. Knowing that the tsubst_pack_expansion hunk in > > the patch had this effect would probably have been very useful. More > > testing > > will lead to a better compiler. > > > > Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after > > it was reported by a different change. > > > > Or another: https://gcc.gnu.org/91525 where the patch contained a test, but > > that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid. > > > > To alleviate some of these problems, I propose that we introduce a means to > > our > > DejaGNU infrastructure that allows adding tests for old bugs that have not > > been > > fixed yet, and re-introduce the keyword monitored (no longer used for > > anything > > -- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is > > tracked in the testsuite. I don't want any unnecessary moving tests > > around, so > > the tests would go where they would normally go; they have to be reduced and > > have proper targets, etc. Having such tests in the testsuite means that > > when > > something changes, you will know immediately, before you push any changes. > > > > My thinking is that for: > > > > * rejects-valid: use the existing dg-xfail-if > > * accepts-valid: use the new dg-accepts-invalid > > * ICEs: use the new dg-ice > > > > dg-ice can be used like this: > > > > // { dg-ice "build_over_call" { target c++11 } } > > > > and it means that if the test still ICEs, you'll get a quiet XFAIL. If the > > ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors, > > you'll get an XPASS + FAIL. Then you can close the old PR. > > > > Similarly, dg-accepts-invalid: > > > > // { dg-accepts-invalid "PR86500" } > > > > means that if the test still compiles without errors, you'll get a quiet > > XFAIL. > > If we start giving errors, you'll get an XPASS. > > > > If the bug is fixed, simply remove the directive. > > > > The patch implementing these new directives is appended. Once/if this is > > accepted, I can start adding the old tests we have in our Bugzilla. (I'm > > only concerned about the c++ component, if that wasn't already clear.) > > > > The question is what makes the bug "old": is it one year without it being > > assigned? 6 months? 3 months? Note: I *don't* propose to add every test > > for > > every new PR, just the reasonably old ones that are useful/important. Such > > additions should be done in batches, so that we don't have dozens of > > commits, > > each of them merely adding a single test. > > > > We will still have a surfeit of bugs that we've given short shrift to, but > > let's at least automate what we can. The initial addition of the relevant > > old(-ish) tests won't of course happen automagically, but it's a price I'm > > willing to pay. My goal here isn't merely to reduce the number of open PRs; > > it is to improve the testing of the compiler overall. > > > > Thoughts? > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu. > > > > [PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid. > > > > gcc/ChangeLog: > > > > * doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid. > > > > gcc/testsuite/ChangeLog: > > > > * lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and dg-accepts-invalid. > > * lib/prune.exp (prune_ices): New. > > * lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New. > > --- > > gcc/doc/sourcebuild.texi | 19 +++++++ > > gcc/testsuite/lib/gcc-dg.exp | 39 +++++++++++++- > > gcc/testsuite/lib/prune.exp | 9 ++++ > > gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++ > > 4 files changed, 134 insertions(+), 2 deletions(-) > > > > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi > > index a7a922d84a2..636d21d30dd 100644 > > --- a/gcc/doc/sourcebuild.texi > > +++ b/gcc/doc/sourcebuild.texi > > @@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the > > conditions (which are > > the same as for @code{dg-skip-if}) are met. > > @end table > > > > +@subsubsection Expect the compiler to crash > > + > > +@table @code > > +@item @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ > > @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @} > > +Expect the compiler to crash with an internal compiler error and return > > +a nonzero exit status if the conditions (which are the same as for > > +@code{dg-skip-if}) are met. Used for tests that test bugs that have not > > been > > +fixed yet. > > +@end table > > + > > @subsubsection Expect the test executable to fail > > > > @table @code > > @@ -1234,6 +1244,15 @@ has the same effect as @samp{target}. > > > > @item @{ dg-prune-output @var{regexp} @} > > Prune messages matching @var{regexp} from the test output. > > + > > +@table @code > > +@item @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{ > > @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @} > > +Expect the compiler to accept the test (even though it should be rejected > > with > > +a compile-time error), if the conditions (which are the same as for > > +@code{dg-skip-if}) are met. Used for tests that test bugs that have not > > been > > +fixed yet. > > +@end table > > + > > @end table > > > > @subsubsection Verify output of the test executable > > diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp > > index 45d97024883..6478eda283b 100644 > > --- a/gcc/testsuite/lib/gcc-dg.exp > > +++ b/gcc/testsuite/lib/gcc-dg.exp > > @@ -308,13 +308,48 @@ proc gcc-dg-test-1 { target_compile prog do_what > > extra_tool_flags } { > > verbose "$target_compile $prog $output_file $compile_type $options" 4 > > set comp_output [$target_compile "$prog" "$output_file" "$compile_type" > > $options] > > > > + global expect_ice > > # Look for an internal compiler error, which sometimes masks the fact > > # that we didn't get an expected error message. XFAIL an ICE via > > # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" } > > # to avoid a second failure for excess errors. > > - if [string match "*internal compiler error*" $comp_output] { > > + # "Error reporting routines re-entered" ICE says "Internal" rather than > > + # "internal", so match that too. > > + if [string match {*[Ii]nternal compiler error*} $comp_output] { > > upvar 2 name name > > - fail "$name (internal compiler error)" > > + if { $expect_ice == 0 } { > > + fail "$name (internal compiler error)" > > + } else { > > + # We expected an ICE and we got it. Emit an XFAIL. > > + setup_xfail "*-*-*" > > + fail "$name (internal compiler error)" > > + clear_xfail "*-*-*" > > + # Prune the ICE from the output. > > + set comp_output [prune_ices $comp_output] > > + } > > + } else { > > + upvar 2 name name > > + global accepts_invalid > > + if { $expect_ice == 1 } { > > + # We expected an ICE but we didn't get it. We want an XPASS, so > > + # call setup_xfail to set xfail_flag. > > + setup_xfail "*-*-*" > > + pass "$name (internal compiler error)" > > + clear_xfail "*-*-*" > > + } elseif { $accepts_invalid == 1 } { > > + if [string match {*error: *} $comp_output] { > > + # We expected that this test be (wrongly) accepted, but now we > > have > > + # seen error(s). Issue an XPASS to signal that. > > + setup_xfail "*-*-*" > > + pass "$name (accepts invalid)" > > + clear_xfail "*-*-*" > > + } else { > > + # This test is still (wrongly) accepted. Just emit an XFAIL. > > + setup_xfail "*-*-*" > > + fail "$name (accepts invalid)" > > + clear_xfail "*-*-*" > > + } > > + } > > } > > > > if { $do_what == "repo" } { > > diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp > > index 1c776249f1a..58a739684a5 100644 > > --- a/gcc/testsuite/lib/prune.exp > > +++ b/gcc/testsuite/lib/prune.exp > > @@ -118,6 +118,15 @@ proc prune_file_path { text } { > > return $text > > } > > > > +# Prune internal compiler error messages, including the "Please submit..." > > +# footnote. > > + > > +proc prune_ices { text } { > > + regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for > > instructions\[^\n\]*" $text "" text > > + regsub -all "(^|\n|')*Internal compiler error:.*for > > instructions\[^\n\]*" $text "" text > > + return $text > > +} > > + > > # Provide a definition of this if missing (delete after next dejagnu > > release). > > > > if { [info procs prune_warnings] == "" } then { > > diff --git a/gcc/testsuite/lib/target-supports-dg.exp > > b/gcc/testsuite/lib/target-supports-dg.exp > > index 2a21424b890..765f3a2e27a 100644 > > --- a/gcc/testsuite/lib/target-supports-dg.exp > > +++ b/gcc/testsuite/lib/target-supports-dg.exp > > @@ -495,6 +495,75 @@ proc dg-shouldfail { args } { > > } > > } > > > > +# Record whether the compiler is expected (at the moment) to ICE. > > +# Used for tests that test bugs that have not been fixed yet. > > + > > +set expect_ice 0 > > +set accepts_invalid 0 > > + > > +proc dg-ice { args } { > > + # Don't bother if we're already skipping the test. > > + upvar dg-do-what dg-do-what > > + if { [lindex ${dg-do-what} 1] == "N" } { > > + return > > + } > > + > > + global accepts_invalid > > + # Can't be combined with dg-accepts-invalid. > > + if { $accepts_invalid == 1 } { > > + error "dg-ice: cannot be combined with dg-accepts-invalid" > > + return > > + } > > + > > + global expect_ice > > + > > + set args [lreplace $args 0 0] > > + if { [llength $args] > 1 } { > > + set selector [list target [lindex $args 1]] > > + if { [dg-process-target-1 $selector] == "S" } { > > + # The target matches, now check the flags. > > + if [check-flags $args] { > > + set expect_ice 1 > > + } > > + } > > + } else { > > + set expect_ice 1 > > + } > > +} > > + > > +# Record whether the compiler should reject the testcase with an error, > > +# but currently doesn't do so. Used for accepts-invalid bugs. > > + > > +proc dg-accepts-invalid { args } { > > + # Don't bother if we're already skipping the test. > > + upvar dg-do-what dg-do-what > > + if { [lindex ${dg-do-what} 1] == "N" } { > > + return > > + } > > + > > + global expect_ice > > + # Can't be combined with dg-ice. > > + if { $expect_ice == 1 } { > > + error "dg-accepts-invalid: cannot be combined with dg-ice" > > + return > > + } > > + > > + global accepts_invalid > > + > > + set args [lreplace $args 0 0] > > + if { [llength $args] > 1 } { > > + set selector [list target [lindex $args 1]] > > + if { [dg-process-target-1 $selector] == "S" } { > > + # The target matches, now check the flags. > > + if [check-flags $args] { > > + set accepts_invalid 1 > > + } > > + } > > + } else { > > + set accepts_invalid 1 > > + } > > +} > > + > > # Intercept the call to the DejaGnu version of dg-process-target to > > # support use of an effective-target keyword in place of a list of > > # target triplets to xfail or skip a test. > > > > base-commit: f3665bd1111c1799c0421490b5e655f977570354 > > -- > > 2.26.2 > >