On 06/09/2011 03:59 PM, Mark wrote:
Can someone give some brief guidance on some (ballpark) specs for a
hadoop cluster?
- What are the typical size and memory requirements for a
dedicated namenode?
This depends on cluster size.
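To get a very rough feel for why namenode RAM scales with cluster size: the namenode keeps every file, directory, and block as an in-memory object, often ballparked at ~150 bytes apiece. A quick sketch (the 150-byte figure and the example counts are just rule-of-thumb assumptions, not exact numbers):

```python
# Rough namenode heap estimate. The namenode holds one in-memory object per
# file, per directory, and per block; ~150 bytes each is a common rule of thumb.
BYTES_PER_OBJECT = 150  # assumption, not an exact figure

def estimate_namenode_heap_gb(num_files, num_dirs, avg_blocks_per_file):
    # Total namespace objects = files + directories + blocks.
    objects = num_files + num_dirs + num_files * avg_blocks_per_file
    return objects * BYTES_PER_OBJECT / 2**30

# Illustrative cluster: 10M files, 1M directories, 1.5 blocks per file.
print(round(estimate_namenode_heap_gb(10_000_000, 1_000_000, 1.5), 2))  # 3.63
```

So even tens of millions of files only need a few GB of heap, but the number grows linearly with the namespace, which is why namenode sizing tracks cluster size rather than job mix.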
- What are the typical size and memory requirements for
individual datanodes?
This depends on the jobs you run and whether they are MR or HBase, and it's
a bit of a moving target as the number of cores and HDDs per node increases.
I'm going to avoid making any recommendations.
- Should there be more, less or equal amounts of RAM on each datanode?
It's good to have them similar, or at least to keep a stable amount of RAM
per core across the nodes.
- Should the namenode have more, less or an equal amount of RAM
compared to the datanodes?
More. You also want the namenode's storage to be HA/RAIDed, or something
else reliable remote-mounted. You need lots of swap space too.
The secondary namenode isn't a failover node, but it is needed to bring
clusters back up faster after a restart; it should have similar HW specs to
the namenode.
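What the secondary namenode actually does is periodically merge the namenode's edit log into a fresh fsimage checkpoint, which is what shortens restarts. How often that happens is tunable; a sketch of the relevant 0.20-era properties (the values shown are the stock defaults):

```xml
<!-- hdfs-site.xml: secondary namenode checkpoint tuning -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value> <!-- seconds between checkpoints (default: 1 hour) -->
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value> <!-- force a checkpoint once edits reach 64 MB -->
</property>
```

Because the merge loads the whole namespace into memory, the secondary needs roughly as much RAM as the namenode itself, which is why the HW specs should match.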
There's also the job tracker. In big clusters this can use a lot of RAM,
depending on the number of jobs and the number of counters each job has.
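If the job tracker does get tight on memory, you can raise its heap independently of the other daemons in hadoop-env.sh. A sketch (the 4 GB figure is purely illustrative; size it to your own job and counter volume):

```shell
# hadoop-env.sh: give the JobTracker its own, larger heap rather than raising
# HADOOP_HEAPSIZE, which applies to every daemon started from this file.
export HADOOP_JOBTRACKER_OPTS="-Xmx4g ${HADOOP_JOBTRACKER_OPTS}"
```

Keeping per-daemon opts separate also lets the namenode and job tracker be sized differently when they share a machine.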