Hello.

In one of our test environments, we have a SolrCloud cluster of 8 nodes
and a ZooKeeper quorum of 5 nodes.
We have only 2 collections; all SolrCloud nodes are identical, and each
holds a single replica of each collection.

I noticed that when I shut down one of the Solr nodes and refresh the
Solr admin UI, the Cloud->Graph view immediately shows the node/shards
as Gone/Down (in gray), which is what I expected.

However, when I go to the Tree view in the UI and browse under the
individual collections, state.json still shows all nodes as "active" (up).
I expected it to show "down". This is the main issue.


I also looked at state.json directly in ZK, and all nodes are marked as
"active" there as well.
So, it seems the overseer is not writing to ZK?
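
To make the comparison concrete, here is a minimal Python sketch of the
check I have in mind: list the replicas that state.json marks as "active"
although their node is not among the live nodes. It assumes the usual
state.json layout (collection -> shards -> replicas, each replica carrying
"node_name" and "state") and that the live node names are the children of
/live_nodes in ZK. The function name, host names and sample data are
hypothetical, and fetching the two inputs from ZK is left out.

```python
def stale_active_replicas(state_json, live_nodes):
    """Return (collection, shard, replica, node_name) tuples for replicas
    marked "active" in state.json whose node is not in live_nodes."""
    live = set(live_nodes)
    stale = []
    for coll_name, coll in state_json.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                node = replica.get("node_name")
                if replica.get("state") == "active" and node not in live:
                    stale.append((coll_name, shard_name, replica_name, node))
    return stale


# Hypothetical, trimmed state.json content for one collection.
example_state = {
    "collection1": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"node_name": "host1:8983_solr", "state": "active"},
                    "core_node2": {"node_name": "host2:8983_solr", "state": "active"},
                }
            }
        }
    }
}

# Hypothetical live node list: host2 has been shut down.
example_live_nodes = ["host1:8983_solr"]

for entry in stale_active_replicas(example_state, example_live_nodes):
    print(entry)
```

If this list is non-empty while CLUSTERSTATUS reports the same replicas as
down, the difference would be between the stored state.json and the state
computed from it together with the live nodes.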


Note that when I use the API
/solr/admin/collections?action=CLUSTERSTATUS, I get the expected result,
i.e. 1 host is down.
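
For reference, a minimal Python sketch of how the CLUSTERSTATUS response
can be scanned for replicas that are not active. It assumes the JSON form
of the response (wt=json) with the cluster -> collections -> shards ->
replicas nesting; the function name, host names and the trimmed sample
response are hypothetical, and the HTTP call itself is left out.

```python
import json


def non_active_replicas(cluster_status):
    """Return (collection, shard, replica, node_name, state) tuples for
    every replica whose reported state is not "active"."""
    found = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                state = replica.get("state")
                if state != "active":
                    found.append((coll_name, shard_name, replica_name,
                                  replica.get("node_name"), state))
    return found


# Hypothetical, trimmed CLUSTERSTATUS response with one replica down.
sample_response = json.loads("""
{
  "cluster": {
    "collections": {
      "collection1": {
        "shards": {
          "shard1": {
            "replicas": {
              "core_node1": {"node_name": "host1:8983_solr", "state": "active"},
              "core_node2": {"node_name": "host2:8983_solr", "state": "down"}
            }
          }
        }
      }
    },
    "live_nodes": ["host1:8983_solr"]
  }
}
""")

for entry in non_active_replicas(sample_response):
    print(entry)
```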

When I call
/solr/admin/collections?action=OVERSEERSTATUS
no failed operation is shown.

So far, we have noticed this issue in one of our test environments.
When I deploy a local cluster on my machine, I cannot reproduce this stale
state.json issue.

Any idea or hint about what could be causing this would be much appreciated.

Thank you.


Arcadius.
