I wonder what the cluster state actually says. I can think of a few things that could possibly heal the cluster:
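To see what the cluster state actually contains, you can query the Collections API's CLUSTERSTATUS action. A minimal dry-run sketch; the host and port (localhost:8983) are assumptions, not values from this thread:

```shell
# Dry-run sketch: print the CLUSTERSTATUS call that dumps the cluster state
# (collections, shards, replica states). localhost:8983 is an assumption;
# replace `echo` with `curl -s` to actually run it against your cluster.
SOLR_URL="http://localhost:8983/solr"
echo "$SOLR_URL/admin/collections?action=CLUSTERSTATUS&wt=json"
```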
A rolling restart of all nodes may make Solr heal itself, but the risk is that some shards may be left without a replica, and if a node gets stuck in recovery during the restart you will have downtime.

Another way is to use the admin UI to remove all replicas from the defunct node, then reboot/reinstall that node, add the missing replicas back, and let Solr replicate the shards over to the new node.

A third, more defensive way is to add a fourth node, add replicas to it so that all collections are redundant again, then remove the replicas from the defunct node and finally decommission it.

Jan Høydahl

> On 28 Dec 2019, at 02:17, David Barnett <oand...@gmail.com> wrote:
>
> Happy holidays folks. We have a production deployment using Solr 7.3 in a
> three-node cluster. We have a number of collections set up, each with three
> shards and a replication factor of 2. The system has been fine, but we
> experienced issues with disk space on one of the nodes.
>
> Node 0 starts but does not show any cores / replicas; the solr.log is full of
> these: "o.a.s.c.ZkController org.apache.solr.common.SolrException: Replica
> core_node7 is not present in cluster state: null"
>
> Node 1 and Node 2 are OK; all data from all collections is accessible.
>
> Can I recreate node 0 as though it had failed completely? Is it OK to remove
> the references to the (missing) replicas and recreate them? Could you give me
> some guidance on the safest way to reintroduce node 0 given our situation?
>
> Many thanks
>
> Dave
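The remove-and-re-add flow in the second option above can also be done with the Collections API rather than the admin UI. A dry-run sketch follows; every name in it (mycollection, shard1, core_node7 as the replica name, the node address) is a placeholder for illustration, not a value taken from this thread:

```shell
# Dry-run sketch of the remove-and-re-add flow via the Collections API.
# All collection/shard/replica/node names are placeholders. Replace `echo`
# with `curl -s` on each printed URL to actually execute the calls.
SOLR_URL="http://localhost:8983/solr"

collections_api() {
    # Print (not execute) a Collections API call for the given action+params.
    echo "$SOLR_URL/admin/collections?action=$1"
}

# 1. Remove the defunct replica's reference from the cluster state:
collections_api "DELETEREPLICA&collection=mycollection&shard=shard1&replica=core_node7"

# 2. After reinstalling node 0, add a replica back on it; Solr then
#    replicates the shard data over from a healthy replica:
collections_api "ADDREPLICA&collection=mycollection&shard=shard1&node=node0.example.com:8983_solr"
```

You would repeat the pair of calls for each shard of each collection that had a replica on the defunct node.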