[ https://issues.apache.org/jira/browse/SOLR-10032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851620#comment-15851620 ]

Mark Miller commented on SOLR-10032:
------------------------------------

This report is not meant to reproduce all fails.

Some tests will fail for a variety of reasons: resources are too low, the 
Java/OS version, which tests they happen to run against, etc, etc.

So this will not exhaustively produce all flakey tests, nor is it trying to. In 
fact, I've tried to make sure there are *plenty* of resources to run the 
tests reasonably. My goal is to find flakey tests that pop out easily, and not 
due to very specific conditions. This should target and find obvious problems 
and then help clamp down on minor flakey tests in time. Jenkins and individual 
devs will still play an important role in outliers and other, hopefully much 
less common, fails.

That said, most things still end up popping out if you beast long enough, in my 
experience. Beasting for 100 runs would probably surface even more flakey 
tests. Producing this report with 30 is already quite time expensive though ;) 
I'll eventually do some longer reports as we whittle down the obvious issues. 
It's really a judgment call of time vs coverage, and in these early reports 30 
runs seemed like a reasonable bar to pass.

The other tests are not all cleared, but here is a very reasonable list of 
tests we should focus on - tests that even in a good, clean environment appear 
to fail too much.

I will also focus 100-run or longer beasting on the tests that this report 
surfaces as flakey, and likely some tests will enter and drop off the report 
from one report to the next. Those tests will end up needing more extensive 
individual beasting to pass as 100% clean.

"rock-solid" is not really a definitive judgment, just the rating for no fails. 
If you did a single run and it passed, it would be rock-solid. I can change 
that to something a little less confusing.

If you do have a specific test that seems to fail for you, I'm happy to beast 
it more extensively and let you know if fails pop out. I'll try ShardSplitTest. 
It may be that it has more severe resource problems when it ends up running 
alongside other intensive tests in 'ant test'.
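
For reference, something along these lines is what I mean by beasting a single 
test. This is just a rough sketch using the stock ant beast target rather than 
the exact harness behind the report, and the property names (beast.iters, 
testcase) are assumptions that may differ slightly depending on the branch:

  # Run ShardSplitTest 100 times through the standard Lucene/Solr ant build.
  # (Sketch only; adjust property names if your checkout's beast target differs.)
  cd solr/core
  ant beast -Dbeast.iters=100 -Dtestcase=ShardSplitTest

A batch of clean passes at that iteration count is a much stronger signal than 
a single clean 'ant test' run.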

> Create report to assess Solr test quality at a commit point.
> ------------------------------------------------------------
>
>                 Key: SOLR-10032
>                 URL: https://issues.apache.org/jira/browse/SOLR-10032
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public (Default Security Level. Issues are Public) 
>          Components: Tests
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>         Attachments: Lucene-Solr Master Test Beast Results 
> 01-24-2017-9899cbd031dc3fc37a384b1f9e2b379e90a9a3a6 Level Medium- Running 30 
> iterations, 12 at a time .pdf, Lucene-Solr Master Test Beasults 
> 02-01-2017-bbc455de195c83d9f807980b510fa46018f33b1b Level Medium- Running 30 
> iterations, 10 at a time.pdf
>
>
> We have many Jenkins instances blasting tests (some official, some Policeman, 
> and I and others have or had our own), and the email trail proves the power 
> of the Jenkins cluster to find test fails.
> However, I still have a very hard time with some basic questions:
> What tests are flakey right now? Which test fails actually affect devs most? 
> Did I break it? Was that test already flakey? Is that test still flakey? What 
> are our worst tests right now? Is that test getting better or worse?
> We really need a way to see exactly what tests are the problem, not because 
> of OS or environmental issues, but more basic test quality issues. Which 
> tests are flakey and how flakey are they at any point in time.


