[
https://issues.apache.org/jira/browse/SOLR-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375701#comment-16375701
]
Erick Erickson commented on SOLR-12016:
---------------------------------------
[~thetaphi] The RM's job (and all volunteers who run smoke tests) is hard enough
as it is, so I'll start working on this. The initial cut shouldn't take long at
all given Hoss' reports.
The idea of a precommit test is a good one, but I confess I won't get to it.
I'll fold your build system changes in at the same time of course.
[~thetaphi] I have one suggestion though. Running with BadApple enabled on
Jenkins all the time still seems like it makes it harder to catch regressions.
If some % of runs ran with badapples=true and the rest with badapples=false
then _any_ failures with badapples=false would be cause for reverting the JIRA
that caused it or fixing immediately. The idea here is to keep more flakey
tests from creeping in.
It'd be easy enough to set up a mail filter to make failures with
badapples=false stand out and then we could complain loudly. That would still
allow Hoss' reports to be valid since there'd also be badapples=true tests for
them to chew on. In that case we might slightly modify the use of AwaitsFix to
mean "only tests that have a known cause". Or maybe just run with
badapples=false nightly. Or??? If/when we work through the current situation,
we could set badapples=true all the time. WDYT?
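A minimal sketch of the split suggested above, written in Python rather than anything Jenkins-specific. The Run record, the "[badapples=...]" subject tag, and the job and test names are all hypothetical; only the badapples=true/false distinction and the rule "any failure with badapples=false needs a revert or an immediate fix" come from the discussion.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Run:
    """One CI run. Every field name here is hypothetical."""
    job: str
    badapples: bool                      # value the run was started with
    failed_tests: list = field(default_factory=list)


def pick_badapples(rng: random.Random, fraction_enabled: float) -> bool:
    """Choose badapples=true for roughly `fraction_enabled` of all runs."""
    return rng.random() < fraction_enabled


def subject_for(run: Run) -> str:
    """Build a mail subject carrying a tag that a mail filter can match on.

    The tag format is an assumption, not an existing Jenkins convention.
    """
    status = "FAILED" if run.failed_tests else "OK"
    return f"[badapples={str(run.badapples).lower()}] {run.job}: {status}"


def needs_immediate_action(run: Run) -> bool:
    """A failure with badapples=false cannot be blamed on a known-flakey test."""
    return bool(run.failed_tests) and not run.badapples


if __name__ == "__main__":
    rng = random.Random(12016)           # fixed seed so the demo is repeatable
    for job in ("job-a", "job-b", "job-c"):
        run = Run(job, pick_badapples(rng, 0.5), ["TestExample"])
        marker = "  <-- revert or fix" if needs_immediate_action(run) else ""
        print(subject_for(run) + marker)
```

With a tag like this in the subject line, the mail filter only has to match the "[badapples=false]" prefix together with FAILED, while runs with badapples=true keep feeding the failure reports.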
I'll see if I can get the first cut ready to rock-n-roll this weekend. After
that I expect a (hopefully short) period of adding BadApple annotations as
less-frequently failing tests show up.
All:
Expect a number of JIRAs to change status over the next few days as I
regularize the current use of BadApple and AwaitsFix. I'll probably raise a new
JIRA we can point BadApple tests at explaining the situation rather than make
someone wade through this one. Unless they contain specific information that
would help someone trying to debug them, I'll point the tests at the new JIRA
and close the old ones.
> Reduce noise from flakey tests
> ------------------------------
>
> Key: SOLR-12016
> URL: https://issues.apache.org/jira/browse/SOLR-12016
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Tests
> Affects Versions: 7.2, master (8.0)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
> Attachments: SOLR-12016-buildsystem.patch,
> SOLR-12016-buildsystem.patch
>
>
> We had a discussion of this topic on the dev list, look for a thread titled:
> "Test failures are out of control.....". I'll try to summarize that
> discussion here and we can move this JIRA forward. This may become an
> umbrella issue.
> Current situation concerns:
> > There is so much noise from flakey tests (particularly Solr tests) that
> > they are difficult to use.
> > The number of tests that regularly fail is increasing
> > Failures are being ignored
> > The number of failing tests makes releasing more difficult.
> > The number of failing tests makes it harder to determine whether recent
> > changes actually caused problems. Running the tests again until they
> > succeed is used commonly at present, which is not robust.
> > e-mail notifications of failing tests are largely being ignored.
> Proposal:
> > Mark all currently "flakey" tests as BadApple or AwaitsFix
> > Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled.
> > Frequency TBD, depends partly on whether we can label emails from these
> > runs for easy filtering of the two flavors.
> >> Label these runs with something suitable in the subject line (wish list)
> > Weekly reports on the tests labeled BadApple or AwaitsFix
> >> Perhaps this could be incorporated in the reports linked below (wish list)
> > Committers should enable BadApple (or AwaitsFix) regularly as a sanity
> > check. Leave these as defaults.
> > We start getting _much_ more aggressive about not allowing _new_ flakey
> > tests.
> NOTE: It's perfectly acceptable to have failing flakey tests as long as
> someone is actively working on _fixing_ them.
> Concerns with solution
> > Decreases test coverage
> > Decreases visibility of flakey tests, making fixing them less likely.
> > Some tools (see below) that report on bad tests will not see tests that are
> > annotated with BadApple or AwaitsFix.
> > Running unit tests and reporting errors are being conflated
> To be decided:
> > Can we label e-mails with failing tests with something in the subject line
> > identifying whether they were run with BadApple/AwaitsFix enabled or
> > disabled? Can someone volunteer?
> > Is there any difference between BadApple and AwaitsFix? If not should we
> > deprecate one? I propose we just use AwaitsFix and deprecate BadApple.
> > Can the automated reports (see below) be enhanced to also report tests
> > labeled BadApple or AwaitsFix?
> Useful tools:
> > Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106)
> > Hoss has worked on aggregating all test failures from the 3 Jenkins systems
> > (ASF, Policeman, and Steve's), downloading the test results & logs, and
> > running some reports/stats on failures.
> >> http://fucit.org/solr-jenkins-reports/
> >> https://github.com/hossman/jenkins-reports/
> >> http://fucit.org/solr-jenkins-reports/failure-report.html
> I've assigned this JIRA to myself, but all volunteers welcome, especially
> anything that changes the build system.....
> I've decided to make this a SOLR jira on the theory that most of the
> offending tests are in the Solr hive, any sub-tasks for touching the build
> system can go under LUCENE if wanted.
> Also, I expect to add the annotation to some more tests for a few days as
> infrequent failures occur. Once we have stability (defined by there being
> little noise) that'll stop.
> 3 BadApple and 23 AwaitsFix annotations are currently in the code, linked to
> these issues:
> HADOOP-14044
> HADOOP-9893
> LUCENE-3869
> LUCENE-5575
> LUCENE-5595
> LUCENE-5737
> LUCENE-6709
> LUCENE-7161
> SOLR-2715
> SOLR-6213
> SOLR-6443
> SOLR-6944
> SOLR-7736
> SOLR-9036
> SOLR-10071
> SOLR-10107
> SOLR-10136
> SOLR-10734
> SOLR-10191
> SOLR-11134
> SOLR-11458
> SOLR-11714
> SOLR-11974
> Solr JIRAS about bad tests
> SOLR-2175
> SOLR-4147
> SOLR-5880
> SOLR-6423
> SOLR-6944
> SOLR-6961
> SOLR-6974
> SOLR-8122
> SOLR-8182
> SOLR-9869
> SOLR-10053
> SOLR-10070
> SOLR-10071
> SOLR-10139
> SOLR-10287
> SOLR-10815
> SOLR-11911