Ah, OK, that might be related to the auto add replicas feature. Since trying Solr 7 I have noticed that Solr moves my cores around on its own, which I did not see happening in Solr 6. I believe Solr 6 could also move replicas on HDFS around, but I never actually saw that happen.

According to CloudConfig.java, the default auto replica failover time is 30s. I used to wait 2min when restarting nodes, as otherwise I ran into problems with the overseer queue (those got fixed in later Solr 6 releases). I am currently experimenting with increasing the failover time to 5min so that my nodes can restart before the replicas get moved. Maybe that also resolves this type of problem. Issue SOLR-12114 makes changing the config a bit more tricky, but I got it updated.

thanks,
Hendrik

On 24.03.2018 18:31, Shawn Heisey wrote:
On 3/24/2018 11:22 AM, Hendrik Haddorp wrote:
Below is the full entry from the Solr log. I also found the list of implicit request handlers later on, but that makes it even stranger that Solr complains about a missing handler.

The "not found" is rather generic, and might not be referring to the handler.  I wonder if we can improve those not found messages to indicate *what* wasn't found.

2018-03-22 18:19:25.599 ERROR (updateExecutor-3-thread-7-processing-n:search-agent3:9007_solr x:collection-0005_shard1_replica_n2 s:shard1 c:collection-0005 r:core_node4) [c:collection-0005 s:shard1 r:core_node4 x:collection-0005_shard1_replica_n2] o.a.s.c.SyncStrategy http://search-agent3:9007/solr/collection-0005_shard1_replica_n2/: Could not tell a replica to recover:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://search-agent3:9007/solr: Unable to locate core collection-0005_shard1_replica_n1

Based on the end of what I quoted here, I think the issue might be that the *core* doesn't exist, not that the handler doesn't exist. That may mean that the info in ZooKeeper doesn't match the cores that are actually present and working.

If the core does exist on the disk, maybe Solr had a problem getting the core started.
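
To make the mismatch in that log entry easier to see, here is a small Python sketch. It is not part of Solr, and the function name cores_in_log_line is made up for illustration. It pulls out the core named in the logging context (the x: field) and the core the remote server could not locate, assuming the log line has the layout shown above.

```python
import re

# The log entry quoted above, copied verbatim.
LOG_LINE = (
    "2018-03-22 18:19:25.599 ERROR (updateExecutor-3-thread-7-processing-"
    "n:search-agent3:9007_solr x:collection-0005_shard1_replica_n2 s:shard1 "
    "c:collection-0005 r:core_node4) [c:collection-0005 s:shard1 r:core_node4 "
    "x:collection-0005_shard1_replica_n2] o.a.s.c.SyncStrategy "
    "http://search-agent3:9007/solr/collection-0005_shard1_replica_n2/: "
    "Could not tell a replica to recover:"
    "org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: "
    "Error from server at http://search-agent3:9007/solr: "
    "Unable to locate core collection-0005_shard1_replica_n1"
)


def cores_in_log_line(line):
    """Hypothetical helper: return (logging_core, missing_core).

    logging_core is the x: value from the bracketed logging context,
    missing_core is the name after "Unable to locate core".
    Either value is None when the pattern is not present.
    """
    # The bracketed context ends with "x:<core>]", so require the "]".
    ctx = re.search(r"\bx:(\S+?)\]", line)
    missing = re.search(r"Unable to locate core (\S+)", line)
    return (
        ctx.group(1) if ctx else None,
        missing.group(1) if missing else None,
    )


if __name__ == "__main__":
    logging_core, missing_core = cores_in_log_line(LOG_LINE)
    print("core that logged the error:", logging_core)
    print("core that was not found:   ", missing_core)
```

For this entry the two names differ (replica_n2 logged the error, replica_n1 is the one that was not found), which fits the reading that a core is missing rather than a handler.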

Thanks,
Shawn

