[ https://issues.apache.org/jira/browse/SOLR-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004577#comment-15004577 ]
Mike Drob commented on SOLR-8275:
---------------------------------

The wait loop logging is at INFO level, while this is at ERROR level. Some places decide to turn off INFO logging for a variety of reasons (ill advised as it may be). I'm hoping that a clearer error message will save the next developer who is stuck debugging this particular piece some time.

Tracing through the code probably took me 20 minutes on the first pass, and I'm not going to claim that I am smart enough to remember what it means the next time I have to go look. I feel like letting the logs tell me what the problem is explicitly would be a really useful improvement.

> Unclear error message during recovery
> -------------------------------------
>
> Key: SOLR-8275
> URL: https://issues.apache.org/jira/browse/SOLR-8275
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.10.3
> Reporter: Mike Drob
> Attachments: SOLR-8275.patch, SOLR-8275.patch
>
>
> A SolrCloud install got into a bad state (mostly around LeaderElection, I
> think) and during recovery one of the nodes was giving me this message:
> {noformat}
> 2015-11-09 13:00:56,158 ERROR org.apache.solr.cloud.RecoveryStrategy: Error
> while trying to recover.
> core=c1_shard1_replica4:java.util.concurrent.ExecutionException:
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I was
> asked to wait on state recovering for shard1 in c1 on node2:8983_solr but I
> still do not see the requested state. I see state: recovering live:true
> leader from ZK: http://node1:8983/solr/c1_shard1_replica2/
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:599)
> at
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:370)
> at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:236)
> Caused by:
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: I was
> asked to wait on state recovering for shard1 in c1 on node2:8983_solr but I
> still do not see the requested state. I see state: recovering live:true
> leader from ZK: http://node1:8983/solr/c1_shard1_replica2/
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:621)
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:292)
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:288)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The crux of this message: "I was asked to wait on state recovering for shard1
> in c1 on node2:8983_solr but I still do not see the requested state. I see
> state: recovering" seems contradictory. At a minimum, we should improve this
> error, but there might also be some erroneous logic going on.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)