Ah, OK, then that might be related to the auto add replica feature.
Since trying Solr 7 I have noticed that Solr moves my cores around on its
own. I did not see that happening in Solr 6. I believe Solr 6 could also
move replicas on HDFS around, but I never actually saw that happen.
According to CloudConfig.java the default auto replica failover time is
30s, and I used to wait 2min when restarting nodes because otherwise I ran
into problems with the overseer queue, which got fixed in later Solr 6
releases. I am currently experimenting with increasing the failover
time to 5min so that my nodes can restart before the replicas get moved.
Maybe that will also resolve this type of problem. Issue SOLR-12114
makes changing the config a bit more tricky, but I got it updated.
thanks,
Hendrik

On 24.03.2018 18:31, Shawn Heisey wrote:
> On 3/24/2018 11:22 AM, Hendrik Haddorp wrote:
>> below is the full entry from the Solr log. I actually also found the
>> list of implicit request handlers later on. But that does make it
>> even more strange that Solr complains about a missing handler.
>
> The "not found" is rather generic, and might not be referring to the
> handler. I wonder if we can improve those not found messages to
> indicate *what* wasn't found.
>
>> 2018-03-22 18:19:25.599 ERROR
>> (updateExecutor-3-thread-7-processing-n:search-agent3:9007_solr
>> x:collection-0005_shard1_replica_n2 s:shard1 c:collection-0005
>> r:core_node4) [c:collection-0005 s:shard1 r:core_node4
>> x:collection-0005_shard1_replica_n2] o.a.s.c.SyncStrategy
>> http://search-agent3:9007/solr/collection-0005_shard1_replica_n2/:
>> Could not tell a replica to
>> recover:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://search-agent3:9007/solr: Unable to locate
>> core collection-0005_shard1_replica_n1
>
> Based on the end of what I quoted here, I think that the issue here
> might be that the *core* doesn't exist, not that the handler doesn't
> exist. Which may mean that the info in zookeeper doesn't match the
> cores that are actually present and working.
>
> If the core does exist on the disk, maybe Solr had a problem getting
> the core started.
>
> Thanks,
> Shawn
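
A minimal sketch of the check Shawn suggests above: compare the cores that the cluster state in ZooKeeper assigns to a node with the cores that node actually has loaded. It assumes the two inputs are the already-parsed JSON responses of the Collections API CLUSTERSTATUS action and the CoreAdmin STATUS action. The function names and the sample data (including the core_node3 entry and the placement of both replicas on one node) are made up for illustration and are not taken from a real cluster.

```python
"""Compare cores expected by the cluster state with cores loaded on a node."""


def expected_cores(cluster_status, node_name):
    """Return the core names the cluster state assigns to node_name.

    cluster_status is assumed to be the parsed CLUSTERSTATUS response.
    """
    cores = set()
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for collection in collections.values():
        for shard in collection.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                if replica.get("node_name") == node_name:
                    cores.add(replica["core"])
    return cores


def loaded_cores(core_status):
    """Return the core names a node reports in its CoreAdmin STATUS response."""
    return set(core_status.get("status", {}).keys())


def diff_cores(cluster_status, core_status, node_name):
    """Return (missing, unexpected) core names for node_name, both sorted.

    missing:    listed in the cluster state but not loaded on the node
    unexpected: loaded on the node but not listed in the cluster state
    """
    expected = expected_cores(cluster_status, node_name)
    loaded = loaded_cores(core_status)
    return sorted(expected - loaded), sorted(loaded - expected)


# Hypothetical sample data shaped like the two API responses.
SAMPLE_CLUSTER_STATUS = {
    "cluster": {
        "collections": {
            "collection-0005": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node3": {  # made-up replica name
                                "core": "collection-0005_shard1_replica_n1",
                                "node_name": "search-agent3:9007_solr",
                                "state": "down",
                            },
                            "core_node4": {
                                "core": "collection-0005_shard1_replica_n2",
                                "node_name": "search-agent3:9007_solr",
                                "state": "active",
                            },
                        }
                    }
                }
            }
        }
    }
}

SAMPLE_CORE_STATUS = {
    "status": {
        "collection-0005_shard1_replica_n2": {
            "name": "collection-0005_shard1_replica_n2"
        }
    }
}

if __name__ == "__main__":
    missing, unexpected = diff_cores(
        SAMPLE_CLUSTER_STATUS, SAMPLE_CORE_STATUS, "search-agent3:9007_solr"
    )
    print("missing:", missing)
    print("unexpected:", unexpected)
```

With the sample data, replica_n1 shows up as missing, which is the kind of mismatch that would produce an "Unable to locate core" error. Against a real cluster the two dictionaries would come from the node's own API responses rather than from hard-coded values.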