[ https://issues.apache.org/jira/browse/SOLR-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002508#comment-15002508 ]
Mike Drob commented on SOLR-8275:
---------------------------------
Yeah, we could put all of the info into one error message; that's fine too.
Using the wait loop logging to figure out what went wrong is not intuitive,
because it shows you a bunch of state parameters and then leaves you on your
own to work out which condition failed. I'll rewrite this patch to have a
single exception path.
Which error code do you think is more appropriate? SERVER_ERROR?
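For illustration only, here is a minimal sketch of what a single exception path could look like. It is not the attached patch and not Solr's actual code: the class name, the `Observed` type, and the `leaderActive` condition are all hypothetical, chosen just to show the idea of naming the failed condition instead of dumping every state parameter.

```java
import java.util.ArrayList;
import java.util.List;

public class WaitForStateSketch {

    // Hypothetical snapshot of what the waiting node observed; not a Solr type.
    static final class Observed {
        final String state;
        final boolean live;
        final boolean leaderActive; // assumed extra condition, for illustration only

        Observed(String state, boolean live, boolean leaderActive) {
            this.state = state;
            this.live = live;
            this.leaderActive = leaderActive;
        }
    }

    // Collects every unmet condition; an empty list means the wait succeeded.
    static List<String> failedConditions(String wantedState, Observed seen) {
        List<String> failures = new ArrayList<>();
        if (!wantedState.equals(seen.state)) {
            failures.add("state is '" + seen.state + "', expected '" + wantedState + "'");
        }
        if (!seen.live) {
            failures.add("node is not live");
        }
        if (!seen.leaderActive) {
            failures.add("leader is not active");
        }
        return failures;
    }

    // Single exception path: one message that lists only the conditions that failed.
    static void checkOrThrow(String wantedState, Observed seen) {
        List<String> failures = failedConditions(wantedState, seen);
        if (!failures.isEmpty()) {
            throw new IllegalStateException(
                "Timed out waiting for state '" + wantedState + "': "
                    + String.join("; ", failures));
        }
    }

    public static void main(String[] args) {
        // The state matches and the node is live, so only the third condition is reported.
        Observed seen = new Observed("recovering", true, false);
        try {
            checkOrThrow("recovering", seen);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a message built this way, a reader who sees the requested state equal to the observed state is told directly which other condition was not met, so the message no longer looks contradictory.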
> Unclear error message during recovery
> -------------------------------------
>
> Key: SOLR-8275
> URL: https://issues.apache.org/jira/browse/SOLR-8275
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.10.3
> Reporter: Mike Drob
> Attachments: SOLR-8275.patch
>
>
> A SolrCloud install got into a bad state (mostly around LeaderElection, I
> think) and during recovery one of the nodes was giving me this message:
> {noformat}
> 2015-11-09 13:00:56,158 ERROR org.apache.solr.cloud.RecoveryStrategy: Error
> while trying to recover.
> core=c1_shard1_replica4:java.util.concurrent.ExecutionException:
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I was
> asked to wait on state recovering for shard1 in c1 on node2:8983_solr but I
> still do not see the requested state. I see state: recovering live:true
> leader from ZK: http://node1:8983/solr/c1_shard1_replica2/
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:599)
> at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:370)
> at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:236)
> Caused by:
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I was
> asked to wait on state recovering for shard1 in c1 on node2:8983_solr but I
> still do not see the requested state. I see state: recovering live:true
> leader from ZK: http://node1:8983/solr/c1_shard1_replica2/
> at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:621)
> at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:292)
> at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:288)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The crux of this message: "I was asked to wait on state recovering for shard1
> in c1 on node2:8983_solr but I still do not see the requested state. I see
> state: recovering" seems contradictory. At a minimum, we should improve this
> error, but there might also be some erroneous logic going on.