[ 
https://issues.apache.org/jira/browse/SOLR-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682039#comment-16682039
 ] 

Hoss Man commented on SOLR-12313:
---------------------------------

[~caomanhdat] ...

RecoveryAfterSoftCommitTest has been failing roughly 50% of the time over the 
past few days, but only on master, and git bisect identifies your 
[13a8356|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=13a8356] 
commit as the cause...

Here is an example of a seed from Jenkins that reproduces reliably for me (and 
fails at the same place every time: {{RecoveryAfterSoftCommitTest.java:87}} ) ...
{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=RecoveryAfterSoftCommitTest -Dtests.method=test -Dtests.seed=9AB4E0C0AB3BEF87 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ru -Dtests.timezone=America/Indiana/Tell_City -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   [junit4] ERROR   78.5s | RecoveryAfterSoftCommitTest.test <<<
   [junit4]    > Throwable #1: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:52448/ol_wuc/collection1
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([9AB4E0C0AB3BEF87:12E0DF1A05C7827F]:0)
   [junit4]    >        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:654)
   [junit4]    >        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
   [junit4]    >        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
   [junit4]    >        at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
   [junit4]    >        at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
   [junit4]    >        at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1107)
   [junit4]    >        at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:884)
   [junit4]    >        at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:817)
   [junit4]    >        at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1260)
   [junit4]    >        at org.apache.solr.cloud.RecoveryAfterSoftCommitTest.test(RecoveryAfterSoftCommitTest.java:87)
   [junit4]    >        at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:1010)
   [junit4]    >        at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:985)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
   [junit4]    > Caused by: java.net.SocketTimeoutException: Read timed out
   [junit4]    >        at java.net.SocketInputStream.socketRead0(Native Method)
   [junit4]    >        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
   [junit4]    >        at java.net.SocketInputStream.read(SocketInputStream.java:171)
   [junit4]    >        at java.net.SocketInputStream.read(SocketInputStream.java:141)
   [junit4]    >        at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
   [junit4]    >        at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
   [junit4]    >        at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
   [junit4]    >        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
   [junit4]    >        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
   [junit4]    >        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
   [junit4]    >        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
   [junit4]    >        at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
   [junit4]    >        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
   [junit4]    >        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
   [junit4]    >        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
   [junit4]    >        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
   [junit4]    >        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
   [junit4]    >        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
   [junit4]    >        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
   [junit4]    >        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
   [junit4]    >        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
   [junit4]    >        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:542)
   [junit4]    >        ... 50 more

{noformat}
 

> TestInjection#waitForInSyncWithLeader needs improvement.
> --------------------------------------------------------
>
>                 Key: SOLR-12313
>                 URL: https://issues.apache.org/jira/browse/SOLR-12313
>             Project: Solr
>          Issue Type: Test
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Mark Miller
>            Priority: Major
>
> This really should have some doc for why it would be used.
> I also think it causes BasicDistributedZkTest to take forever sometimes, 
> and perhaps other tests too?
> I think checking for uncommitted data is probably a race condition and should 
> be removed.
> Checking index versions should follow the rules that replication does - if 
> the slave is higher than the leader, it's in sync, being equal is not 
> required. If it's expected for a test it should be a specific test that 
> fails. This just introduces massive delays.
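
The index-version rule in the issue description above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (InSyncCheck, isInSync) are hypothetical and are not taken from TestInjection, and index versions are assumed to be plain long values where a larger number means a newer index.

```java
// Hypothetical helper, not part of Solr: illustrates the rule
// "if the replica's index version is at or above the leader's, it is in sync".
final class InSyncCheck {

    private InSyncCheck() {
        // static utility, no instances
    }

    // Assumption: versions are comparable longs and larger means newer.
    // Equality is sufficient but not required; a replica that is ahead
    // of the leader also counts as in sync.
    static boolean isInSync(long leaderVersion, long replicaVersion) {
        return replicaVersion >= leaderVersion;
    }
}
```

Under this sketch a replica at version 7 with a leader at version 5 counts as in sync, while a replica at version 4 does not. A check that demands strict equality would keep waiting in the first case, which is the kind of delay the description complains about.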



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
