[ 
https://issues.apache.org/jira/browse/SOLR-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705126#comment-16705126
 ] 

Hoss Man commented on SOLR-12932:
---------------------------------

Although it doesn't seem like there has been an apache jenkins build of master 
since mark's commit 75b1831967982, there have been 3 jenkins.thetaphi.de builds 
of master, and 1 jenkins.sarowe.net build of master...
 * jenkins.sarowe.net
 ** prior to mark's commit, the most recent master build had 2 test failures
 *** 
[http://fucit.org/solr-jenkins-reports/job-data/sarowe/Lucene-Solr-tests-master/19410/]
 ** The single build since mark's commit had no failures...
 *** 
[http://fucit.org/solr-jenkins-reports/job-data/sarowe/Lucene-Solr-tests-master/19411/]
 * jenkins.thetaphi.de
 ** prior to mark's commit, the most recent master build coincidentally 
succeeded w/o any test failures.
 *** 
[http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-master-Linux/23276/]
 ** All 3 builds of master since that commit have had 100+ suite level 
failures – although it should be noted they all used different JVMs
 *** 
[http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-master-Linux/23277/]
 *** 
[http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-master-Linux/23278/]
 *** 
[http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-master-Linux/23279/]

----
Skimming the logs from jenkins.thetaphi.de #23277, the suite failures mostly 
seem to fall into 2 types...
 # Thread Leaks
 ** SolrRrdBackendFactory-* and MetricsHistoryHandler-* threads seem to 
frequently leak in "pairs" – ie: it's very common to see 2 threads leaked w/one 
of each – but also 4 threads leaked w/2 of each, etc...
 ** there were other threads leaked in some tests to various degrees, not 
enough for a pattern to be obvious
 # Object Leaks
 ** Notably instances of [ZkStateReader, SolrZkClient] – also typically in 
pairs (if 2 objects leaked, one of each; if 24 objects leaked, 12 of each)
 ** there were a handful of object leak failures that sometimes included other 
objects: in particular some large lists w/multiple SolrCore objects being 
leaked and other objects you'd expect to see hanging off of a SolrCore

Hypothesis: I know mark has mentioned cleaning up a lot of "sleep" and "wait" 
type logic in tests. I'm guessing that doing this has exposed some "shutdown" 
logic bugs that weren't as obvious in the past because "slow" jenkins machines 
were waiting longer for other things due to hardcoded "waits", and were getting 
lucky that the threads/objects were being cleaned up in a timely manner.
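
To make the hypothesis concrete, here is a minimal sketch of the kind of race I mean. Nothing in it is Solr code: the Component class, the "SketchWorker" thread name and the 200ms "cleanup" are all made up for illustration, and the isAlive() check is only a crude stand-in for what the test framework's leak detection does. The idea is just that a close() which *requests* a stop without waiting for it looks clean if the test happens to sleep afterwards, and looks like a leak once that sleep is removed.

```java
import java.util.concurrent.CountDownLatch;

class ShutdownRaceSketch {

    /** Hypothetical stand-in for a component that owns one background thread. */
    static final class Component {
        private final CountDownLatch stopRequested = new CountDownLatch(1);
        private final boolean joinOnClose;
        final Thread worker;

        Component(boolean joinOnClose) {
            this.joinOnClose = joinOnClose;
            this.worker = new Thread(() -> {
                try {
                    stopRequested.await(); // run until close() is called
                    Thread.sleep(200);     // simulated cleanup work (assumed duration)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "SketchWorker");
            this.worker.setDaemon(true);
            this.worker.start();
        }

        void close() {
            stopRequested.countDown();     // only requests the stop
            if (joinOnClose) {
                try {
                    worker.join();         // the fix: wait until the thread is really gone
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
        }
    }

    /**
     * Closes a component, optionally sleeps the way an old hardcoded "wait"
     * in a test would have, then reports whether the worker is still alive.
     */
    static boolean leaksAfterClose(boolean joinOnClose, long incidentalSleepMs) {
        Component c = new Component(joinOnClose);
        c.close();
        try {
            Thread.sleep(incidentalSleepMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c.worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("no join, no sleep:   leaked=" + leaksAfterClose(false, 0));
        System.out.println("no join, 1000ms nap: leaked=" + leaksAfterClose(false, 1000));
        System.out.println("join, no sleep:      leaked=" + leaksAfterClose(true, 0));
    }
}
```

If this is what's happening, the fix belongs in the close/shutdown paths (wait for the thread to actually exit) rather than in putting sleeps back into the tests.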

> ant test (without badapples=false) should pass easily for developers.
> ---------------------------------------------------------------------
>
>                 Key: SOLR-12932
>                 URL: https://issues.apache.org/jira/browse/SOLR-12932
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Tests
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>            Priority: Major
>
> If we fix the tests we will end up here anyway, but we can shortcut this.
> Once I get my first patch in, anyone who mentions a test that fails locally 
> for them at any time (not jenkins), I will fix it.



