[ 
https://issues.apache.org/jira/browse/SOLR-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317085#comment-15317085
 ] 

Hoss Man commented on SOLR-9189:
--------------------------------

sarowe reminded me offline about the "buildTimeTrend" feature of jenkins -- 
while the ASF jenkins machines have only been running tests about once a day, 
so it's hard to spot an obvious pattern, uwe & sarowe's jenkins machines have 
been hammering on tests a lot faster, and you can really spot a trend...

http://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/buildTimeTrend
http://jenkins.thetaphi.de/view/All/job/Lucene-Solr-6.x-Linux/buildTimeTrend

http://jenkins.sarowe.net/job/Lucene-Solr-tests-master/buildTimeTrend
http://jenkins.sarowe.net/job/Lucene-Solr-tests-6.x/buildTimeTrend

...from sarowe's master job, build #7028 was the first build in a while to go 
over 20 minutes, and from that point on builds were reliably over 40 minutes 
until build #7035 which dropped down to 10 minutes....

* http://jenkins.sarowe.net/job/Lucene-Solr-tests-master/7028/
** 1e2ba9fe9be84f0b5defe4965735eae892fabf7b
** "Jun 4, 2016 7:14:24 AM"
** changes:
*** Revert "SOLR-9181: Fix test bug in ZkStateReaderTest" (detail)
* http://jenkins.sarowe.net/job/Lucene-Solr-tests-master/7035/
** c8570ed821654cdce5f92ae17d06a21f242524e2
** "Jun 6, 2016 1:08:05 PM"
** changes:
*** Revert "SOLR-9140: Replace some zk state polling with (detail)
*** LUCENE-7132: BooleanQuery sometimes assigned the wrong score when ranges (detail)

...that means the slow down didn't hit jenkins master until 3 days *after* i 
committed SOLR-9107 to that branch -- but it did start right when a 
SOLR-9181 commit happened.  Likewise the build #7035 speedup was *before* my 
SOLR-9189 commit to disable randomized ssl testing on master completely - 
and again, coincided with a SOLR-9140 commit.
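
For anyone who wants to repeat this kind of check on another job, here is a
rough sketch of the reasoning above: given per-build durations, find the first
build that crosses a threshold and the first later build that drops back under
it, then compare those two builds against the commit log. The function name
and the sample durations are made up for illustration (only the #7028 and
#7035 boundaries come from the trend graph); this is not part of any jenkins
tooling.

```python
from typing import List, Optional, Tuple


def find_slow_window(
    durations: List[Tuple[int, float]], threshold_min: float
) -> Optional[Tuple[int, Optional[int]]]:
    """Return (first_slow_build, first_recovered_build).

    durations is a list of (build_number, minutes) pairs.
    first_recovered_build is None if the builds never drop back to or
    below the threshold. Returns None if no build exceeds the threshold.
    """
    start: Optional[int] = None
    for number, minutes in sorted(durations):
        if start is None:
            if minutes > threshold_min:
                start = number  # first build over the threshold
        elif minutes <= threshold_min:
            return (start, number)  # first build back under it
    return (start, None) if start is not None else None


# Hypothetical durations in minutes; only the shape (slow from #7028,
# fast again at #7035) mirrors what the trend graph showed.
SAMPLE = [
    (7026, 12.0), (7027, 11.0), (7028, 25.0), (7029, 45.0),
    (7030, 43.0), (7031, 47.0), (7032, 44.0), (7033, 46.0),
    (7034, 44.0), (7035, 10.0),
]

print(find_slow_window(SAMPLE, 20.0))
```

The two build numbers it returns are the ones whose change lists are worth
reading first, which is how the SOLR-9181 and SOLR-9140 commits stood out
here.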

[~romseygeek] - definitely want to draw your attention to this issue -- your 
recent commits may have resolved the slowdowns (at least on master), but i 
want to make sure you're aware of the situation.


> explosion of timeout related failures in jenkins the past few days
> ------------------------------------------------------------------
>
>                 Key: SOLR-9189
>                 URL: https://issues.apache.org/jira/browse/SOLR-9189
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>            Priority: Critical
>
> In the past few days, something has gone seriously wonky with our jenkins 
> tests -- causing a serious explosion in the number of test failures -- 
> notably due to various sorts of timeouts...
> * "Unable to create core ... Timed out getting coreNodeName for ..."
> * "msg=SolrCore is loading,code=503"
> * "Timeout occured while waiting response from server"
> * "No registered leader was found after waiting for 30000ms"
> * "Unable to create core ... Caused by: Timed out getting shard id for core: 
> ..."


