FYI, how HBase uses ZK is documented here: http://bit.ly/cWGUSa
This link, http://bit.ly/4ekN8G, gives some insight into the kind of
load ZK can handle on typical hardware/setups and the latencies
seen. AFAIK HBase currently puts a pretty light load on ZK.
The ZK docs detail the performance impact of ensemble size:
http://bit.ly/4B8gC7
Patrick
Andrew Purtell wrote:
The Zookeeper devs suggest giving 1 GB heap to each process. I run it
with default heap (256 MB) and it's stable for me, but I run relatively
small clusters.
ZK wants its own disk for the transaction log. So if you can, dedicate a
disk, or run ZK on separate servers.
Our EC2 scripts start a separate ZK quorum ensemble.
It's really better to run ZK on separate servers if you can spare them.
This decouples ZK from any HBase or HDFS loading. ZK is especially
sensitive to latencies introduced by CPU or I/O contention.
ZK is a 2N+1 fault tolerant system: an ensemble of 2N+1 instances
tolerates the loss of N. Run 3 to tolerate the loss of 1 instance. Run 5
to tolerate the loss of 2. Etc. Based on literature I've seen, there are
diminishing returns after an ensemble size of about 9.
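To make the 2N+1 arithmetic concrete, here is a minimal sketch in Python.
The function names are made up for illustration and are not part of any
ZK API; the code only does the majority-quorum math described above.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Instances that can be lost while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size, and tolerated failures side by side.
for n in (3, 4, 5, 7, 9):
    print(n, quorum_size(n), tolerated_failures(n))
```

Note that an even-sized ensemble tolerates no more failures than the odd
size just below it (4 instances still only survive the loss of 1), which
is why odd sizes are the usual choice.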
Increase the number of instances in the ensemble on roughly a log scale
as your cluster size increases, e.g. use 3 for a cluster of 4-50 servers,
5 for 50-1000, 7 for 1000+, 9 for 10000+. There's no hard rule there.
I recommend monitoring average read and write latency and adjusting the
quorum size as needed.
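One way to watch latency, as a rough sketch: ZK servers answer the
four-letter command "stat" on the client port with a line of the form
"Latency min/avg/max: a/b/c" (milliseconds). The Python below sends that
command and parses the line. Assumptions: the helper names are mine, the
port is the default 2181, and the reported figure is overall request
latency, not split into reads and writes. Newer ZK releases may need the
command enabled in the server config before it responds.

```python
import re
import socket

# Matches e.g. "Latency min/avg/max: 0/1/25" (avg may be fractional).
LATENCY_RE = re.compile(r"Latency min/avg/max:\s*([\d.]+)/([\d.]+)/([\d.]+)")


def fetch_stat(host: str, port: int = 2181, timeout: float = 5.0) -> str:
    """Send the 'stat' four-letter command and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_latency(stat_output: str):
    """Return (min, avg, max) latency in ms, or None if the line is absent."""
    match = LATENCY_RE.search(stat_output)
    if match is None:
        return None
    return tuple(float(group) for group in match.groups())


# Example (hypothetical host name):
#   print(parse_latency(fetch_stat("zk1.example.com")))
```

Polling each ensemble member this way and graphing the average over time
gives a baseline to compare against before and after changing the
ensemble size.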
Hope that helps,
- Andy
----- Original Message ----
From: Michał Podsiadłowski <[email protected]>
To: [email protected]
Sent: Tue, February 9, 2010 8:10:50 AM
Subject: Zookeeper - usage and load
Hi all!
Can someone drop me a few words about how exactly HBase currently uses
ZooKeeper? What kind of load does it take during intensive load on HBase?
How much heap space does it need to operate correctly, and how much disk
space? How many instances are needed if we have only 3 region servers and
one HMaster? Since there is only one, there isn't much to elect in case of
its failure. Are there any other operations apart from master
election/lookup?
I tried to google it, but there isn't much I can find except for a few
JIRA issues.
Thanks,
Michal