On 10/06/2011 10:00 PM, Edward Capriolo wrote:
On Fri, Jun 10, 2011 at 8:22 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
On Jun 10, 2011, at 6:32 AM, si...@ugcv.com wrote:
Dear all,
I'm looking for ways to reduce the namenode heap usage of an 800-node,
10PB testing Hadoop cluster that stores around 30 million files.
Here's some info:
1 x namenode: 32GB RAM, 24GB heap size
800 x datanode: 8GB RAM, 13TB hdd
33050825 files and directories, 47708724 blocks = 80759549 total.
Heap Size is 22.93 GB / 22.93 GB (100%)
From the cluster summary report, it seems the heap usage is always full
and never drops. Do you know of any ways to reduce it? So far I don't
see any namenode OOM errors, so it looks like the memory assigned to the
namenode process is (just) enough. But I'm curious: which factors
account for the heap being fully used?
The advice I give to folks is to plan on 1GB of heap for every million
objects. It's an over-estimate, but I prefer to be on the safe side. Why
not increase the heap size to 28GB? That should buy you some time.
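(By that rule, the ~81M objects above would eventually call for much
more heap than 28GB, which is why this only buys time.) For what it's
worth, the bump is a one-line change; a minimal sketch, assuming your NN
options live in conf/hadoop-env.sh as on stock 0.20-era installs:

  # conf/hadoop-env.sh (assumed location; adjust for your layout)
  # Raise the NN heap to 28GB; pinning -Xms to the same value avoids resizing
  export HADOOP_NAMENODE_OPTS="-Xms28g -Xmx28g ${HADOOP_NAMENODE_OPTS}"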
You can turn on compressed pointers, but your best bet is really going to
be spending some more money on RAM.
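Concretely, "compressed pointers" is -XX:+UseCompressedOops on 64-bit
Sun JVMs; it cuts object references from 8 bytes to 4, and it only works
for heaps under roughly 32GB, so it still applies at 24-28GB. A sketch
of where it would go, same hadoop-env.sh assumption as above:

  # Compressed oops: 4-byte references on a 64-bit JVM (heaps < ~32GB only)
  export HADOOP_NAMENODE_OPTS="-XX:+UseCompressedOops ${HADOOP_NAMENODE_OPTS}"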
Brian
The problem with the "buy more RAM" philosophy is that JVMs tend to have
trouble operating without pausing on large heaps, and NameNode JVM
pauses are not a good thing. The number of files and the number of
blocks are what matter, so larger block sizes make for less NN memory
usage.
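For a rough sense of scale: each block costs the NN on the order of a
couple hundred bytes of heap, so doubling the block size for large files
roughly halves that part of the bill. The cluster-wide default (new
files only; existing files keep their block size) is set in
hdfs-site.xml; a sketch, assuming a 0.20-era release where the property
is dfs.block.size (it became dfs.blocksize on later releases):

  <!-- conf/hdfs-site.xml -->
  <property>
    <name>dfs.block.size</name>
    <value>268435456</value>  <!-- 256MB default for newly created files -->
  </property>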
Also, your node list does not mention a secondary namenode. Do you have
one? It needs slightly more RAM than the NN.
The NN started with an 8GB heap, then 16GB, and is currently at 24GB.
We'll most likely raise it to 28GB next month, but that looks close to
the max.
I would add more RAM for sure, but there's a hardware limitation. What
if the motherboard can't support more than, say, 128GB? It seems I
can't keep adding RAM to resolve it.
By compressed pointers, do you mean turning on the JVM's compressed
references? I haven't tried that before; how has your experience been?
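A quick way to check on my side, I guess, would be something like the
line below (assuming a 6u21+ HotSpot JVM, where -XX:+PrintFlagsFinal is
available):

  # Shows the effective value of UseCompressedOops at the production heap size
  java -Xmx24g -XX:+PrintFlagsFinal -version | grep UseCompressedOops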
I'm running a secondary NN with exactly the same hardware spec as the
NN. Both use a 24GB heap, which is supposedly enough to handle the
syncing/merging of the namespace. If the secondary NN needs more RAM
than the NN, do you suggest adding more to the NN as well?
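For what it's worth, the secondary's heap can be raised independently of
the NN's; a sketch, assuming a stock hadoop-env.sh with the usual
variable names:

  # conf/hadoop-env.sh - give the secondary NN extra headroom for checkpoint merging
  export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx26g ${HADOOP_SECONDARYNAMENODE_OPTS}"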