Why was it down? For example, did it OOM? If so, the recommended
approach is to kill the process on OOM rather than leave it in the
cluster in a zombie state. I ask because I had similar issues when my
nodes OOM'd. That said, you can get /clusterstate.json, which contains
ZooKeeper's view of each node's status, using a request like:
http://localhost:8983/solr/zookeeper?detail=true&path=%2Fclusterstate.json
That requires some basic JSON processing to dig the status of the node
of interest out of the response, so you may want to implement a custom
request handler instead.
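
For the JSON processing, here is a rough Python sketch of what I mean.
It is not tested against your cluster and makes two assumptions you
should verify against a real response: that the servlet returns the
znode content as a JSON string under "znode" -> "data", and that
clusterstate.json is laid out as collection -> "shards" -> shard ->
"replicas" -> replica, with "node_name" and "state" on each replica.
The function names are my own, not part of Solr.

```python
import json
from urllib.request import urlopen

# URL from the request above; adjust host and port for the node you query.
ZK_URL = ("http://localhost:8983/solr/zookeeper"
          "?detail=true&path=%2Fclusterstate.json")


def replica_states(zk_response, node_name):
    """Return {(collection, shard, replica): state} for replicas on node_name.

    ASSUMPTION: the znode content is a JSON string at
    zk_response["znode"]["data"], shaped as
    collection -> "shards" -> shard -> "replicas" -> replica -> properties.
    """
    cluster = json.loads(zk_response["znode"]["data"])
    states = {}
    for coll_name, coll in cluster.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for rep_name, rep in shard.get("replicas", {}).items():
                if rep.get("node_name") == node_name:
                    states[(coll_name, shard_name, rep_name)] = rep.get("state")
    return states


def node_is_active(zk_response, node_name):
    """True only if the node hosts at least one replica and all are 'active'."""
    states = replica_states(zk_response, node_name)
    return bool(states) and all(s == "active" for s in states.values())


def fetch_cluster_state(url=ZK_URL, timeout=5):
    """Fetch and decode the servlet response (performs a network call)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

You would call node_is_active(fetch_cluster_state(), "<node_name>") with
the node name exactly as it appears in clusterstate.json, and have the
load balancer's health check act on the result.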

On Mon, Jul 22, 2013 at 9:55 AM, jimtronic <jimtro...@gmail.com> wrote:
> I've run into a problem recently that's difficult to debug and search for:
>
> I have three nodes in a cluster and this weekend one of the nodes went
> partially down. It no longer responds to distributed updates and it is
> marked as GONE in the Cloud view of the admin screen. That's not ideal, but
> there's still two boxes up so not the end of the world.
>
> The problem is that it is still responding to ping requests and returning
> queries successfully. In my setup, I have the three servers on an haproxy
> load balancer so that I can distribute requests and have clients stick to a
> specific solr box. Because the bad node is still returning OK to the ping
> requests and still returns results for simple queries, the load balancer
> does not remove it from the group.
>
> Is there a ping like request handler that would tell me whether the given
> box I'm hitting is still "in the cloud"?
>
> Thanks!
> Jim Musil
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Node-down-but-not-out-tp4079495.html
> Sent from the Solr - User mailing list archive at Nabble.com.
