In general, when determining the number of ZooKeeper serving nodes to deploy (the size of an ensemble), you need to think in terms of reliability, not performance.

Reliability:

A single ZooKeeper server (standalone) is essentially a coordinator with no reliability (a single serving node failure brings down the ZK service).

A 3-server ensemble allows a single server to fail while the service stays available. You need to jump to 3 and not 2 because ZK works on simple majority voting: with 2 servers the majority is 2, so losing either one stops the service.

So if you want reliability, go with at least 3. We typically recommend having 5 servers in "online" production serving environments. This allows you to take 1 server out of service (say, for planned maintenance) and still sustain an unexpected outage of one of the remaining servers without interrupting the service.
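
To make the majority arithmetic above concrete, here is a minimal sketch. The function names are made up for illustration and are not part of ZooKeeper; the only assumption is the simple-majority rule described above.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Print the quorum and failure tolerance for ensembles of 1 to 7 servers.
for n in range(1, 8):
    print(f"{n} servers: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Under that rule, 2 servers tolerate no more failures than 1, and 4 no more than 3, which is why odd sizes are the usual recommendation.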

Performance:

Write performance actually _decreases_ as you add ZK servers, while read performance increases modestly: http://bit.ly/9JEUju

See this page for a recent survey I did looking at operational latency with both a standalone server and an ensemble of size 3: http://bit.ly/4ekN8G You'll notice that a single-core machine running a standalone ZK server (an ensemble of 1) is still able to process 15k requests per second. This is orders of magnitude greater than what HBase currently uses ZK for (this may change in the future). (background: http://bit.ly/csQLQ5)

Patrick

Michał Podsiadłowski wrote:
Hey all,
I was asking about the minimum number of ZooKeepers, and usually everybody was
saying an odd number >= 3. Are there any reasons for this? Have you encountered
any problems with a single ZooKeeper? As far as I know, HBase already does
very few operations using ZooKeeper, so the load on it is insignificant.
If I have only one master and one namenode, I already have 2 SPOFs, so another
one is not a big deal. Currently we have 3 ZooKeepers running on Xen OS, with
datanode/hregion on a physical machine.
Can someone advise?

Thanks,
Michal
