Re: HBase Insert Performance

Patrick Hunt Fri, 12 Feb 2010 09:59:30 -0800

In general, when determining the number of ZooKeeper serving nodes to deploy (the size of an ensemble), you need to think in terms of reliability, and not performance.


Reliability:

A single ZooKeeper server (standalone) is essentially a coordinator with no reliability (a single serving node failure brings down the ZK service).

A 3 server ensemble (you need to jump to 3 and not 2 because ZK works based on simple majority voting) allows for a single server to fail and the service will still be available.

So if you want reliability go with at least 3. We typically recommend having 5 servers in "online" production serving environments. This allows you to take 1 server out of service (say planned maintenance) and still be able to sustain an unexpected outage of one of the remaining servers w/o interruption of the service.
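
The majority-voting arithmetic behind these numbers can be sketched as follows. This is only an illustration of simple majority counting, not ZooKeeper code; the function names quorum_size and tolerated_failures are made up for this example.

```python
# Illustrative sketch of simple-majority arithmetic (not ZooKeeper code).
# The function names below are hypothetical helpers for this example.

def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} server(s): majority = {quorum_size(n)}, "
              f"tolerated failures = {tolerated_failures(n)}")
```

With 2 servers the majority is still 2, so no failure is tolerated, which is why the jump is from 1 to 3. With 5 servers, 2 can be down: one for planned maintenance plus one unexpected outage.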


Performance:

Write performance actually _decreases_ as you add ZK servers, while read performance increases modestly: http://bit.ly/9JEUju

See this page for a recent survey I did looking at operational latency with both a standalone server and an ensemble of size 3: http://bit.ly/4ekN8G You'll notice that a single core machine running a standalone ZK ensemble (1 server) is still able to process 15k requests per second. This is orders of magnitude greater than what HBase currently uses ZK for (may change in future). (background: http://bit.ly/csQLQ5)


Patrick

Michał Podsiadłowski wrote:

Hey all,
I was asking about the minimum number of zookeepers and usually everybody was
saying an odd number >= 3. Are there any reasons for this? Have you encountered
any problems with a single zookeeper?  As far as I know, hbase already does
very few operations using zookeeper, so the load on it is insignificant.
If I have only one master and one namenode, I already have 2 SPOFs, so another one
is not a big deal.  Currently we have 3 zookeepers running on Xen OS with
datanode/hregion on a physical machine.
Can someone advise?

Thanks,
Michal
