[
https://issues.apache.org/jira/browse/SOLR-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki reopened SOLR-13975:
-------------------------------------
Jenkins failure related to this change
([https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Windows/591/]):
{code:java}
[junit4] 2> java.io.IOException: Request processing has stalled for 102ms with 100 remaining elements in the queue.
[junit4] 2> at org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:446)
[junit4] 2> at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
[junit4] 2> at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClientTest$SendDocsRunnable.run(ConcurrentUpdateSolrClientTest.java:301)
[junit4] 2> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
[junit4] 2> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[junit4] 2> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[junit4] 2> at java.lang.Thread.run(Thread.java:748)
[junit4] 2> 46259 INFO (TEST-ConcurrentUpdateHttp2SolrClientTest.testConcurrentUpdate-seed#[2B5284771F7CB908]) [ ] o.a.s.SolrTestCaseJ4 ###Ending testConcurrentUpdate
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=ConcurrentUpdateHttp2SolrClientTest -Dtests.method=testConcurrentUpdate -Dtests.seed=2B5284771F7CB908 -Dtests.slow=true -Dtests.locale=uk-UA -Dtests.timezone=America/Punta_Arenas -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[junit4] FAILURE 5.10s J0 | ConcurrentUpdateHttp2SolrClientTest.testConcurrentUpdate <<<
[junit4] > Throwable #1: java.lang.AssertionError: Expected CUSS to send 500 but got 495
[junit4] > at __randomizedtesting.SeedInfo.seed([2B5284771F7CB908:D354DD8CCB44DC43]:0)
[junit4] > at org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClientTest.testConcurrentUpdate(ConcurrentUpdateHttp2SolrClientTest.java:102)
[junit4] > at java.lang.Thread.run(Thread.java:748)
{code}
> ConcurrentUpdateSolrClient connection stall prevention
> ------------------------------------------------------
>
> Key: SOLR-13975
> URL: https://issues.apache.org/jira/browse/SOLR-13975
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 8.3, 8.4
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: 8.4
>
> Attachments: SOLR-13975.patch, SOLR-13975.patch
>
>
> When a Solr process that hosts replicas of a collection is suspended
> (that is, the OS process is stopped with e.g. {{kill -STOP <pid>}}), a long
> stall may occur in ConcurrentUpdateSolrClient (CUSC) until a socket timeout
> is reached.
> During this stall, updates from the leader are not forwarded to any replica,
> even though other replicas are still active and can receive updates. If the
> sender uses CUSC (e.g. via {{CloudSolrClient}}), it stalls as well,
> because the leader stops processing updates.
> This situation is caused by several issues:
> * When a process is suspended, its sockets remain open, so there is no
> immediate disconnect as there would be if the process had died; the process
> simply becomes unresponsive. Eventually a socket timeout is reached
> ({{distribUpdateSoTimeout}}), but in the default version of {{solr.xml}} this
> is set to 10 min. During this time all indexing to that shard is stuck.
> * There are several infinite {{for}} loops in CUSC (e.g. in
> {{blockUntilFinished}}, {{waitForEmptyQueue}} and even in {{request}}) that
> rely on either the call succeeding relatively quickly or an exception being
> thrown. In this situation neither happens quickly: the call is
> stuck waiting for the remote end until soTimeout expires.
> This issue proposes adding stall prevention logic that breaks these
> infinite loops long before the socket timeout occurs, based on the progress of
> queue processing.
> This is a follow-up to SOLR-13896.
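
The progress-based check proposed in the description could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code from the attached patch: the class {{QueueStallDetector}}, its methods and the injected clock are hypothetical names. The idea is that each pass of a wait loop calls {{check()}}, which throws once the queue has stayed non-empty with no item processed for longer than the configured stall time.

```java
import java.io.IOException;
import java.util.function.IntSupplier;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration; not the class used in the SOLR-13975 patch. */
final class QueueStallDetector {
    private final long stallTimeMs;
    private final IntSupplier queueSize;   // reports how many elements are still queued
    private final LongSupplier nanoClock;  // injectable clock, e.g. System::nanoTime

    private long processed;                // items handed off since creation
    private long lastSeenProcessed = -1;   // value of 'processed' at the previous check
    private boolean stalling;              // true while no progress is being observed
    private long stallStartNanos;          // when the current no-progress period began

    QueueStallDetector(long stallTimeMs, IntSupplier queueSize, LongSupplier nanoClock) {
        this.stallTimeMs = stallTimeMs;
        this.queueSize = queueSize;
        this.nanoClock = nanoClock;
    }

    /** Called by the sender each time one queued item has been processed. */
    synchronized void markProcessed() {
        processed++;
    }

    /** Called on each pass of a wait loop; throws if no progress was made for too long. */
    synchronized void check() throws IOException {
        int remaining = queueSize.getAsInt();
        if (remaining == 0 || processed != lastSeenProcessed) {
            // Empty queue or visible progress: reset the stall timer.
            lastSeenProcessed = processed;
            stalling = false;
            return;
        }
        long now = nanoClock.getAsLong();
        if (!stalling) {
            // First check without progress: start timing.
            stalling = true;
            stallStartNanos = now;
            return;
        }
        long stalledMs = (now - stallStartNanos) / 1_000_000L;
        if (stalledMs > stallTimeMs) {
            throw new IOException("Request processing has stalled for " + stalledMs
                + "ms with " + remaining + " remaining elements in the queue.");
        }
    }
}
```

A loop such as the one in {{blockUntilFinished}} would then call {{check()}} on every pass instead of spinning until soTimeout expires. The stall time has to be long enough that a slow but healthy receiver is not reported as stalled.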