[ https://issues.apache.org/jira/browse/SOLR-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682039#comment-16682039 ]
Hoss Man commented on SOLR-12313:
---------------------------------

[~caomanhdat] ... RecoveryAfterSoftCommitTest has been failing roughly 50% of the time the past few days - but only on master, and git bisect identifies your [13a8356|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=13a8356] commit as the cause...

Here is an example of a seed from jenkins that reproduces reliably for me (and fails at the same place every time: {{RecoveryAfterSoftCommitTest.java:87}} ) ...

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=RecoveryAfterSoftCommitTest -Dtests.method=test -Dtests.seed=9AB4E0C0AB3BEF87 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ru -Dtests.timezone=America/Indiana/Tell_City -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   [junit4] ERROR   78.5s | RecoveryAfterSoftCommitTest.test <<<
   [junit4]    > Throwable #1: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:52448/ol_wuc/collection1
   [junit4]    > 	at __randomizedtesting.SeedInfo.seed([9AB4E0C0AB3BEF87:12E0DF1A05C7827F]:0)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:654)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1107)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:884)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:817)
   [junit4]    > 	at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1260)
   [junit4]    > 	at org.apache.solr.cloud.RecoveryAfterSoftCommitTest.test(RecoveryAfterSoftCommitTest.java:87)
   [junit4]    > 	at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:1010)
   [junit4]    > 	at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:985)
   [junit4]    > 	at java.lang.Thread.run(Thread.java:748)
   [junit4]    > Caused by: java.net.SocketTimeoutException: Read timed out
   [junit4]    > 	at java.net.SocketInputStream.socketRead0(Native Method)
   [junit4]    > 	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
   [junit4]    > 	at java.net.SocketInputStream.read(SocketInputStream.java:171)
   [junit4]    > 	at java.net.SocketInputStream.read(SocketInputStream.java:141)
   [junit4]    > 	at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
   [junit4]    > 	at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
   [junit4]    > 	at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
   [junit4]    > 	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
   [junit4]    > 	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
   [junit4]    > 	at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
   [junit4]    > 	at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
   [junit4]    > 	at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
   [junit4]    > 	at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
   [junit4]    > 	at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
   [junit4]    > 	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
   [junit4]    > 	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
   [junit4]    > 	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
   [junit4]    > 	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
   [junit4]    > 	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
   [junit4]    > 	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
   [junit4]    > 	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
   [junit4]    > 	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:542)
   [junit4]    > 	... 50 more
{noformat}

> TestInjection#waitForInSyncWithLeader needs improvement.
> --------------------------------------------------------
>
>                 Key: SOLR-12313
>                 URL: https://issues.apache.org/jira/browse/SOLR-12313
>             Project: Solr
>          Issue Type: Test
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Mark Miller
>            Priority: Major
>
> This really should have some doc for why it would be used.
> I also think it causes BasicDistributedZkTest to take forever for sometimes
> and perhaps other tests?
> I think checking for uncommitted data is probably a race condition and should
> be removed.
> Checking index versions should follow the rules that replication does - if
> the slave is higher than the leader, it's in sync, being equal is not
> required. If it's expected for a test it should be a specific test that
> fails. This just introduces massive delays.
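
The index-version rule proposed in the quoted issue description (a replica whose index version is equal to or higher than the leader's counts as in sync) could look roughly like the sketch below. This is only an illustration: `IndexVersionCheck` and `isInSync` are hypothetical names, not part of Solr or of TestInjection, and the versions are plain `long` values standing in for whatever the real check would read from each core.

```java
// Hypothetical helper, not Solr code: illustrates the "higher or equal
// counts as in sync" rule suggested in the issue description.
final class IndexVersionCheck {

    private IndexVersionCheck() {
        // static utility, no instances
    }

    /**
     * Returns true when the replica's index version is at least the
     * leader's. Strict equality is deliberately NOT required.
     */
    static boolean isInSync(long leaderVersion, long replicaVersion) {
        return replicaVersion >= leaderVersion;
    }

    public static void main(String[] args) {
        // Assumed example versions, chosen only for demonstration.
        System.out.println(isInSync(100L, 100L)); // equal versions
        System.out.println(isInSync(100L, 101L)); // replica ahead of leader
        System.out.println(isInSync(100L, 99L));  // replica behind leader
    }
}
```

Under this rule a wait loop would stop as soon as the replica catches up or passes the leader, rather than polling until the two versions match exactly.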