From the wiki: "SolrCloud can continue to serve results without interruption
as long as at least one server hosts every shard. You can demonstrate this
by judiciously shutting down various instances and looking for results. If
you have killed all of the servers for a particular shard, requests to other
servers will result in a 503 error. To return just the documents that are
available in the shards that are still alive (and avoid the error), add the
following query parameter: shards.tolerant=true"
That doesn't completely answer your question, but is an important part of
the puzzle.
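As a rough illustration of the parameter quoted above, here is a minimal Python sketch that builds a select URL with shards.tolerant=true. The host, port, collection name (collection1) and the helper build_select_url are assumptions made up for the example, not something taken from the wiki; adjust them to your own setup.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, tolerant=True):
    """Build a Solr select URL, optionally asking for partial results.

    base_url and collection are placeholders (assumptions); only the
    shards.tolerant parameter itself comes from the wiki text.
    """
    params = {"q": query, "wt": "json"}
    if tolerant:
        # Ask Solr to return documents from the shards that are still
        # alive instead of failing the whole request.
        params["shards.tolerant"] = "true"
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical local node and collection name, for illustration only.
url = build_select_url("http://localhost:8983/solr", "collection1", "*:*")
print(url)
```

Sending that URL to any live node should then return whatever the surviving shards can supply, rather than the 503 described in the wiki.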
-- Jack Krupansky
-----Original Message-----
From: Dennis Haller
Sent: Friday, May 03, 2013 3:21 PM
To: solr-user@lucene.apache.org
Subject: disaster recovery scenarios for solr cloud and zookeeper
Hi,
Solr 4.x is architected with a dependency on ZooKeeper, and ZooKeeper is
expected to have very high (perfect?) availability. With 3 or 5 ZooKeeper
nodes, it is possible to manage ZooKeeper maintenance and keep online
availability close to 100%. But what is the worst case for Solr if,
for some unanticipated reason, all ZooKeeper nodes go offline?
Could someone comment on a couple of possible scenarios in which all ZK
nodes are offline? What would happen to Solr, and what would be needed to
recover in each case?
1) a brief interruption, say <2 minutes
2) a longer downtime, say 60 minutes
Thanks
Dennis