[jira] [Commented] (SOLR-8275) Unclear error message during recovery

Mike Drob (JIRA) Fri, 13 Nov 2015 11:36:30 -0800

    [ https://issues.apache.org/jira/browse/SOLR-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004577#comment-15004577 ]


Mike Drob commented on SOLR-8275:
---------------------------------

The wait loop logging is at INFO level, while this message is at ERROR level.
Some places turn off INFO logging for a variety of reasons (ill-advised as
that may be). I'm hoping that a clearer error message will save some time for
the next developer who is stuck debugging this particular piece. Tracing
through the code took me about 20 minutes on the first pass, and I won't
claim that I am smart enough to remember what it means the next time I have
to look. Having the logs tell me explicitly what the problem is would be a
really useful improvement.
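
A minimal sketch of what an explicit message could look like, assuming the
wait checks several conditions and only some of them fail. This is not the
attached patch and not Solr code: the class WaitStateMessage, the Check type,
and the "leader state" condition are hypothetical names chosen for
illustration. The idea is only that the ERROR-level message lists every unmet
condition, so it stays useful when INFO logging is off.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch, not Solr code: builds a wait-for-state error message
 * that names every condition that was not satisfied.
 */
final class WaitStateMessage {

    /** One condition of the wait, with the expected and the observed value. */
    static final class Check {
        final String name;
        final String expected;
        final String actual;

        Check(String name, String expected, String actual) {
            this.name = name;
            this.expected = expected;
            this.actual = actual;
        }

        boolean passed() {
            return expected.equals(actual);
        }
    }

    /** Returns a message listing only the checks that did not pass. */
    static String describe(String core, List<Check> checks) {
        List<String> failures = new ArrayList<>();
        for (Check c : checks) {
            if (!c.passed()) {
                failures.add(c.name + " (expected " + c.expected
                        + ", saw " + c.actual + ")");
            }
        }
        if (failures.isEmpty()) {
            return "All wait conditions for " + core + " were satisfied";
        }
        return "Gave up waiting on " + core + "; unmet conditions: "
                + String.join("; ", failures);
    }

    public static void main(String[] args) {
        List<Check> checks = new ArrayList<>();
        checks.add(new Check("replica state", "recovering", "recovering"));
        checks.add(new Check("replica live", "true", "true"));
        // Assumed extra condition, included only to show a failing check
        // that a state-only message would hide.
        checks.add(new Check("leader state", "active", "down"));
        System.out.println(describe("c1_shard1_replica4", checks));
    }
}
```

With a message built this way, a reader sees that the replica state matched
and that some other condition was the one that failed, rather than a message
that appears to contradict itself.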

> Unclear error message during recovery
> -------------------------------------
>
>                 Key: SOLR-8275
>                 URL: https://issues.apache.org/jira/browse/SOLR-8275
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.10.3
>            Reporter: Mike Drob
>         Attachments: SOLR-8275.patch, SOLR-8275.patch
>
>
> A SolrCloud install got into a bad state (mostly around LeaderElection, I 
> think) and during recovery one of the nodes was giving me this message:
> {noformat}
> 2015-11-09 13:00:56,158 ERROR org.apache.solr.cloud.RecoveryStrategy: Error while trying to recover. core=c1_shard1_replica4:java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I was asked to wait on state recovering for shard1 in c1 on node2:8983_solr but I still do not see the requested state. I see state: recovering live:true leader from ZK: http://node1:8983/solr/c1_shard1_replica2/
>       at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>       at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>       at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:599)
>       at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:370)
>       at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:236)
> Caused by: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I was asked to wait on state recovering for shard1 in c1 on node2:8983_solr but I still do not see the requested state. I see state: recovering live:true leader from ZK: http://node1:8983/solr/c1_shard1_replica2/
>       at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:621)
>       at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:292)
>       at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:288)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The crux of this message: "I was asked to wait on state recovering for shard1 
> in c1 on node2:8983_solr but I still do not see the requested state. I see 
> state: recovering" seems contradictory. At a minimum, we should improve this 
> error, but there might also be some erroneous logic going on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
