Re: RFC: Monitoring old PRs, new dg directives

Marek Polacek via Gcc-patches Tue, 04 Aug 2020 15:17:10 -0700

On Wed, Jul 29, 2020 at 04:00:27PM -0600, Martin Sebor wrote:
> I've created a much more rudimentary setup for myself to deal
> with the same problem.  I copy tests from Bugzilla, sometimes
> with tweaks, and compile them from time to time as I revisit
> unresolved bugs.  I've also thought about adding those to
> the test suite and marking them XFAIL but I don't think I've
> actually done it more than a handful of times.  I was told
> adding tests (passing or xfailing) is fine without approval.
> 
> I think your proposal to add tests for known failures is a good
> idea.  I don't have much of an opinion about extending the test
> harness to differentiate other kinds of failures (like ICEs) and
> mark them as expected.  I'm not sure I understand the benefit
> of adding directives like dg-accepts-invalid over using xfail.


Thanks for the feedback.  The benefit of dg-accepts-invalid was that you would
get an XPASS even for a test that should not be accepted, but you didn't know
what line to expect an error on, so you put a dg-error at the end of the test.

That's probably not necessary for the first incarnation of this patch so I've
dropped it.

Thanks,
Marek

> On 7/28/20 3:44 PM, Marek Polacek via Gcc-patches wrote:
> > In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  
> > In
> > my experience, a good amount of them have already been fixed; my periodical
> > sweeps always turn up a bunch of PRs that had already been fixed previously.
> > Sometimes my sweeps are more or less random, but more often than not I'm 
> > just
> > looking for duplicates of an existing PR.  Sometimes the reason the already
> > fixed PRs are still open is because a PR that was fixed had duplicates that 
> > we
> > didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a
> > side-effect of fixing another PR.  Manual sweeps are tedious and 
> > time-consuming
> > because often you need to grab the test from the Bugzilla yet again (and
> > sometimes there are multiple tests).  Even if you find a PR that was fixed, 
> > you
> > still need to bisect the fix and perhaps add the test to our testsuite.  
> > That's
> > draining and since the number of bugs only increases, never decreases, it 
> > is not
> > sustainable.
> > 
> > So I've started a personal repo where I've gathered dozens of tests and 
> > wrote a
> > script that just compiles every test in the repo and reports if anything
> > changed.  One line from it:
> > 
> > pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' || 
> > echo -e "$pr: ${msg_ice}"
> > 
> > This has major drawbacks: you have to remember to run this manually, keep
> > updating it, and it's yet another repo that people interested in this would
> > have to clone, but the worst thing is that typically you would only discover
> > that a patch fixed a different PR long after the patch was committed.  And
> > quite likely it wasn't even your patch.  We know that finding problems 
> > earlier
> > in the developer workflow reduces costs; if we can catch this before the
> > original developer commits & pushes the changes, it's cheaper, because the
> > developer already understands what the patch does.
> > 
> > A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently
> > by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in
> > the patch had this effect would probably have been very useful.  More 
> > testing
> > will lead to a better compiler.
> > 
> > Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after
> > it was reported by a different change.
> > 
> > Or another: https://gcc.gnu.org/91525 where the patch contained a test, but
> > that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.
> > 
> > To alleviate some of these problems, I propose that we introduce a means to 
> > our
> > DejaGNU infrastructure that allows adding tests for old bugs that have not 
> > been
> > fixed yet, and re-introduce the keyword monitored (no longer used for 
> > anything
> > -- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is
> > tracked in the testsuite.  I don't want any unnecessary moving tests 
> > around, so
> > the tests would go where they would normally go; they have to be reduced and
> > have proper targets, etc.  Having such tests in the testsuite means that 
> > when
> > something changes, you will know immediately, before you push any changes.
> > 
> > My thinking is that for:
> > 
> > * rejects-valid: use the existing dg-xfail-if
> > * accepts-valid: use the new dg-accepts-invalid
> > * ICEs: use the new dg-ice
> > 
> > dg-ice can be used like this:
> > 
> > // { dg-ice "build_over_call" { target c++11 } }
> > 
> > and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the
> > ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,
> > you'll get an XPASS + FAIL.  Then you can close the old PR.
> > 
> > Similarly, dg-accepts-invalid:
> > 
> > // { dg-accepts-invalid "PR86500" }
> > 
> > means that if the test still compiles without errors, you'll get a quiet 
> > XFAIL.
> > If we start giving errors, you'll get an XPASS.
> > 
> > If the bug is fixed, simply remove the directive.
> > 
> > The patch implementing these new directives is appended.  Once/if this is
> > accepted, I can start adding the old tests we have in our Bugzilla.  (I'm
> > only concerned about the c++ component, if that wasn't already clear.)
> > 
> > The question is what makes the bug "old": is it one year without it being
> > assigned?  6 months?  3 months?  Note: I *don't* propose to add every test 
> > for
> > every new PR, just the reasonably old ones that are useful/important.  Such
> > additions should be done in batches, so that we don't have dozens of 
> > commits,
> > each of them merely adding a single test.
> > 
> > We will still have a surfeit of bugs that we've given short shrift to, but
> > let's at least automate what we can.  The initial addition of the relevant
> > old(-ish) tests won't of course happen automagically, but it's a price I'm
> > willing to pay.  My goal here isn't merely to reduce the number of open PRs;
> > it is to improve the testing of the compiler overall.
> > 
> > Thoughts?
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu.
> > 
> > [PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid.
> > 
> > gcc/ChangeLog:
> > 
> >     * doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> >     * lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and dg-accepts-invalid.
> >     * lib/prune.exp (prune_ices): New.
> >     * lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New.
> > ---
> >   gcc/doc/sourcebuild.texi                 | 19 +++++++
> >   gcc/testsuite/lib/gcc-dg.exp             | 39 +++++++++++++-
> >   gcc/testsuite/lib/prune.exp              |  9 ++++
> >   gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++
> >   4 files changed, 134 insertions(+), 2 deletions(-)
> > 
> > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> > index a7a922d84a2..636d21d30dd 100644
> > --- a/gcc/doc/sourcebuild.texi
> > +++ b/gcc/doc/sourcebuild.texi
> > @@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the 
> > conditions (which are
> >   the same as for @code{dg-skip-if}) are met.
> >   @end table
> > +@subsubsection Expect the compiler to crash
> > +
> > +@table @code
> > +@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ 
> > @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}
> > +Expect the compiler to crash with an internal compiler error and return
> > +a nonzero exit status if the conditions (which are the same as for
> > +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not 
> > been
> > +fixed yet.
> > +@end table
> > +
> >   @subsubsection Expect the test executable to fail
> >   @table @code
> > @@ -1234,6 +1244,15 @@ has the same effect as @samp{target}.
> >   @item @{ dg-prune-output @var{regexp} @}
> >   Prune messages matching @var{regexp} from the test output.
> > +
> > +@table @code
> > +@item  @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{ 
> > @var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}
> > +Expect the compiler to accept the test (even though it should be rejected 
> > with
> > +a compile-time error), if the conditions (which are the same as for
> > +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not 
> > been
> > +fixed yet.
> > +@end table
> > +
> >   @end table
> >   @subsubsection Verify output of the test executable
> > diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
> > index 45d97024883..6478eda283b 100644
> > --- a/gcc/testsuite/lib/gcc-dg.exp
> > +++ b/gcc/testsuite/lib/gcc-dg.exp
> > @@ -308,13 +308,48 @@ proc gcc-dg-test-1 { target_compile prog do_what 
> > extra_tool_flags } {
> >       verbose "$target_compile $prog $output_file $compile_type $options" 4
> >       set comp_output [$target_compile "$prog" "$output_file" 
> > "$compile_type" $options]
> > +    global expect_ice
> >       # Look for an internal compiler error, which sometimes masks the fact
> >       # that we didn't get an expected error message.  XFAIL an ICE via
> >       # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" 
> > }
> >       # to avoid a second failure for excess errors.
> > -    if [string match "*internal compiler error*" $comp_output] {
> > +    # "Error reporting routines re-entered" ICE says "Internal" rather than
> > +    # "internal", so match that too.
> > +    if [string match {*[Ii]nternal compiler error*} $comp_output] {
> >     upvar 2 name name
> > -   fail "$name (internal compiler error)"
> > +   if { $expect_ice == 0 } {
> > +     fail "$name (internal compiler error)"
> > +   } else {
> > +     # We expected an ICE and we got it.  Emit an XFAIL.
> > +     setup_xfail "*-*-*"
> > +     fail "$name (internal compiler error)"
> > +     clear_xfail "*-*-*"
> > +     # Prune the ICE from the output.
> > +     set comp_output [prune_ices $comp_output]
> > +   }
> > +    } else {
> > +   upvar 2 name name
> > +   global accepts_invalid
> > +   if { $expect_ice == 1 } {
> > +     # We expected an ICE but we didn't get it.  We want an XPASS, so
> > +     # call setup_xfail to set xfail_flag.
> > +     setup_xfail "*-*-*"
> > +     pass "$name (internal compiler error)"
> > +     clear_xfail "*-*-*"
> > +   } elseif { $accepts_invalid == 1 } {
> > +       if [string match {*error: *} $comp_output] {
> > +         # We expected that this test be (wrongly) accepted, but now we 
> > have
> > +         # seen error(s).  Issue an XPASS to signal that.
> > +         setup_xfail "*-*-*"
> > +         pass "$name (accepts invalid)"
> > +         clear_xfail "*-*-*"
> > +       } else {
> > +         # This test is still (wrongly) accepted.  Just emit an XFAIL.
> > +         setup_xfail "*-*-*"
> > +         fail "$name (accepts invalid)"
> > +         clear_xfail "*-*-*"
> > +       }
> > +   }
> >       }
> >       if { $do_what == "repo" } {
> > diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
> > index 1c776249f1a..58a739684a5 100644
> > --- a/gcc/testsuite/lib/prune.exp
> > +++ b/gcc/testsuite/lib/prune.exp
> > @@ -118,6 +118,15 @@ proc prune_file_path { text } {
> >       return $text
> >   }
> > +# Prune internal compiler error messages, including the "Please submit..."
> > +# footnote.
> > +
> > +proc prune_ices { text } {
> > +  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for 
> > instructions\[^\n\]*" $text "" text
> > +  regsub -all "(^|\n|')*Internal compiler error:.*for 
> > instructions\[^\n\]*" $text "" text
> > +  return $text
> > +}
> > +
> >   # Provide a definition of this if missing (delete after next dejagnu 
> > release).
> >   if { [info procs prune_warnings] == "" } then {
> > diff --git a/gcc/testsuite/lib/target-supports-dg.exp 
> > b/gcc/testsuite/lib/target-supports-dg.exp
> > index 2a21424b890..765f3a2e27a 100644
> > --- a/gcc/testsuite/lib/target-supports-dg.exp
> > +++ b/gcc/testsuite/lib/target-supports-dg.exp
> > @@ -495,6 +495,75 @@ proc dg-shouldfail { args } {
> >       }
> >   }
> > +# Record whether the compiler is expected (at the moment) to ICE.
> > +# Used for tests that test bugs that have not been fixed yet.
> > +
> > +set expect_ice 0
> > +set accepts_invalid 0
> > +
> > +proc dg-ice { args } {
> > +    # Don't bother if we're already skipping the test.
> > +    upvar dg-do-what dg-do-what
> > +    if { [lindex ${dg-do-what} 1] == "N" } {
> > +      return
> > +    }
> > +
> > +    global accepts_invalid
> > +    # Can't be combined with dg-accepts-invalid.
> > +    if { $accepts_invalid == 1 } {
> > +      error "dg-ice: cannot be combined with dg-accepts-invalid"
> > +      return
> > +    }
> > +
> > +    global expect_ice
> > +
> > +    set args [lreplace $args 0 0]
> > +    if { [llength $args] > 1 } {
> > +   set selector [list target [lindex $args 1]]
> > +   if { [dg-process-target-1 $selector] == "S" } {
> > +       # The target matches, now check the flags.
> > +       if [check-flags $args] {
> > +           set expect_ice 1
> > +       }
> > +   }
> > +    } else {
> > +   set expect_ice 1
> > +    }
> > +}
> > +
> > +# Record whether the compiler should reject the testcase with an error,
> > +# but currently doesn't do so.  Used for accepts-invalid bugs.
> > +
> > +proc dg-accepts-invalid { args } {
> > +    # Don't bother if we're already skipping the test.
> > +    upvar dg-do-what dg-do-what
> > +    if { [lindex ${dg-do-what} 1] == "N" } {
> > +      return
> > +    }
> > +
> > +    global expect_ice
> > +    # Can't be combined with dg-ice.
> > +    if { $expect_ice == 1 } {
> > +      error "dg-accepts-invalid: cannot be combined with dg-ice"
> > +      return
> > +    }
> > +
> > +    global accepts_invalid
> > +
> > +    set args [lreplace $args 0 0]
> > +    if { [llength $args] > 1 } {
> > +   set selector [list target [lindex $args 1]]
> > +   if { [dg-process-target-1 $selector] == "S" } {
> > +       # The target matches, now check the flags.
> > +       if [check-flags $args] {
> > +           set accepts_invalid 1
> > +       }
> > +   }
> > +    } else {
> > +   set accepts_invalid 1
> > +    }
> > +}
> > +
> >   # Intercept the call to the DejaGnu version of dg-process-target to
> >   # support use of an effective-target keyword in place of a list of
> >   # target triplets to xfail or skip a test.
> > 
> > base-commit: f3665bd1111c1799c0421490b5e655f977570354
> > 
>

Re: RFC: Monitoring old PRs, new dg directives

Reply via email to