Re: size of data / number of znodes

Patrick Hunt Tue, 15 Dec 2009 09:37:05 -0800

See this recent benchmark I did: http://bit.ly/4ekN8G

In this case I have 20 clients doing 10k znodes each (200k znodes of size 100 bytes each, with 1 million watches). However, I have also tested a similar setup with 400 clients (so 4 million znodes and 20 million watches).
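The numbers above can be cross-checked with a little arithmetic. The sketch below is only illustrative: the class and method names are made up for this example, and it assumes the 400-client run used the same 10k znodes per client and 100 bytes per znode as the 20-client run. It computes raw payload only; actual heap usage is higher because of per-znode and per-watch overhead, which is not estimated here.

```java
/** Hypothetical helper for sanity-checking the benchmark figures quoted above. */
final class BenchmarkArithmetic {
    private BenchmarkArithmetic() {}

    /** Total znodes created by all clients. */
    static long totalZnodes(long clients, long znodesPerClient) {
        return clients * znodesPerClient;
    }

    /** Raw data payload in bytes (excludes all server-side overhead). */
    static long payloadBytes(long znodes, long bytesPerZnode) {
        return znodes * bytesPerZnode;
    }

    /** Average number of watches per znode. */
    static long watchesPerZnode(long watches, long znodes) {
        return watches / znodes;
    }

    public static void main(String[] args) {
        long small = totalZnodes(20, 10_000);   // 20-client run
        long large = totalZnodes(400, 10_000);  // 400-client run (assumed same per-client load)
        System.out.println(small + " znodes, " + payloadBytes(small, 100) + " payload bytes");
        System.out.println(large + " znodes, " + payloadBytes(large, 100) + " payload bytes");
        System.out.println(watchesPerZnode(1_000_000, small) + " watches per znode (small run)");
        System.out.println(watchesPerZnode(20_000_000, large) + " watches per znode (large run)");
    }
}
```

Both runs work out to the same ratio of watches to znodes, and the raw payload stays well under half a gigabyte even in the larger run.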

As Ben mentioned, there are no limits other than memory (obviously CPU, disk size, network performance, etc. are also issues, but memory, and memory management in particular, is the most important).


I found the following critical in my testing:

1) Use a recent JVM. Some of the older JVMs would crash (JVM fault, not our code) with these loads. After upgrading to 1.6.0_17 I no longer saw this.

2) Provide sufficient memory in the JVM heap. Tune your GC: in particular, you need to turn on incremental/CMS GC in the JVM. Turn on GC logging so that if you do see issues you can review the GC logs for pauses. Keep in mind that CMS tends to fragment the heap. G1 will help with this, but it's not ready in 1.6; hopefully it will be more stable in 1.7.

3) Use a dedicated transactional log device (separate disk) if performance is critical, i.e. if you need guaranteed low latency.
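As a complement to point 2, here is a minimal sketch of how to check from inside a JVM which collectors are active and how much time they have accumulated. It uses the standard java.lang.management API; the class name GcSummary is made up for this example. On HotSpot 1.6, CMS is commonly enabled with -XX:+UseConcMarkSweepGC and GC logging with -verbose:gc or -Xloggc:<file>, but the message above does not name specific flags, so treat those as an assumption about your JVM. The figures reported here are cumulative totals, not individual pauses, so this does not replace reading the GC logs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: summarizes the collectors of the JVM it runs in. */
final class GcSummary {
    private GcSummary() {}

    /** One line per collector: name, collection count, and cumulative time in ms. */
    static List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: collections=%d, totalTimeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Collector names vary by JVM and by the GC flags in effect.
        for (String line : summarize()) {
            System.out.println(line);
        }
    }
}
```

This only reports on the JVM it is executed in; to inspect a running server you would read the same MXBeans remotely over JMX.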


See the troubleshooting page for some common issues: http://bit.ly/5WwS44

Patrick


Benjamin Reed wrote:

there aren't any limits on the number of znodes, it's just limited by your memory. there are two things (probably more :) to keep in mind:
1) the 1M limit also applies to the children list. you can't grow the list of children to more than 1M (the sum of the names of all of the children), otherwise you cannot do a getChildren(). so, yes, you need to do some bucketing to keep the number of children to something reasonable. assuming your names will be less than 100 bytes, you probably want to limit the number of children to 10,000.
2) since there are times that you need to do a state transfer between servers (dump all the state from one to the other to bring it online), it may take a while depending on your network speed. you may need to bump up the default initLimit, so make sure you do some benchmarking on your platform to test your configuration parameters.
ben
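Ben's two points can be sketched in code. This is a rough illustration, not ZooKeeper API: the class ZnodeSizing and all of its methods are hypothetical, the hash-based bucket scheme is just one possible way to do the bucketing, and counting UTF-8 bytes of the names only approximates the serialized size of a children list. The initLimit estimate assumes the state transfer is bounded by network bandwidth alone and relies on initLimit being expressed in ticks of tickTime milliseconds; real sync time also includes loading and writing the snapshot, so benchmark as Ben suggests.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sizing helpers for the two points above. */
final class ZnodeSizing {
    private ZnodeSizing() {}

    /** Bucket index in [0, numBuckets), derived from the name's hash. */
    static int bucketFor(String name, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        return Math.floorMod(name.hashCode(), numBuckets);
    }

    /** Builds root/bNNNN/name so that children are spread over numBuckets parents. */
    static String bucketedPath(String root, String name, int numBuckets) {
        return String.format("%s/b%04d/%s", root, bucketFor(name, numBuckets), name);
    }

    /** Sum of the UTF-8 lengths of the child names (approximates the children list size). */
    static long childListBytes(Iterable<String> names) {
        long total = 0;
        for (String n : names) {
            total += n.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Smallest number of ticks covering a transfer of stateBytes at bytesPerSecond. */
    static long initLimitTicks(long stateBytes, long bytesPerSecond, long tickTimeMs) {
        long transferMs = ceilDiv(stateBytes * 1000, bytesPerSecond);
        return ceilDiv(transferMs, tickTimeMs);
    }

    private static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        // 1,000,000 znodes over 100 buckets averages 10,000 children per bucket.
        System.out.println(bucketedPath("/app", "item-42", 100));
        System.out.println(childListBytes(Arrays.asList("item-1", "item-2")) + " bytes of names");
        // Illustrative figures only: 400 MB of state, 10 MB/s, tickTime of 2000 ms.
        System.out.println(initLimitTicks(400_000_000L, 10_000_000L, 2000) + " ticks");
    }
}
```

With 10,000 children per bucket and names under 100 bytes, each children list stays below the 1M limit Ben describes.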

Michael Bauland wrote:
Hello,

I'm new to the Zookeeper project and wondering whether our use case is a
good one for Zookeeper. I read the documentation, but couldn't find an
answer. At some point it says that
A common property of the various forms of coordination data is that they are relatively small: measured in kilobytes. The ZooKeeper client and the server implementations have sanity checks to ensure that znodes have less than 1M of data
I couldn't find any limits on the number of znodes used, only that each
znode should only contain little data. We were planning to use a million
znodes (each containing a few hundred bytes of data). Would this use
case be acceptable for Zookeeper? And if so, does it matter if we have a
flat hierarchy (i.e., all nodes have the root node as their direct
ancestor) or should we introduce some (artificial) hierarchy levels to
have a more tree-like structure?

Thanks in advance for your answer.
Cheers,

Michael
