What Patrick describes below is not uncommon to encounter. The "commodity hardware" talk around MR and BigTable is a bit of a joke -- you can do that if you can afford thousands or tens of thousands of commodity components, custom assembled. Hadoop+HBase users, obviously, want to do more with less. Colocating computation with storage has its price: either you scale wide horizontally, or you go vertical enough on each node to handle the load you are throwing at the cluster you can afford.
Sizing clusters is a black art. As for the spec of each individual node, I can share our current generation hardware spec:

    CPU:  dual 6-core AMD (12 cores total)
    RAM:  32 GB
    DISK: 320 GB x 2 (RAID-1) system disks
          500 GB x 8 (JBOD) data disks for HDFS
    Custom 1U chassis

We give 8 GB of RAM to the HBase region servers. All other Hadoop and HBase daemons (DataNode, ZooKeeper, TaskTracker, etc.) use the default of 1 GB. The remainder of the CPU and RAM is left for user tasks (MR). Reads are best served from RAM via the block cache. The more spindles, the higher the I/O parallelism, and therefore the higher the aggregate throughput. The above is a good trade-off between horizontal and vertical scaling for us. (A sketch of where these knobs live in the config files follows Patrick's note below.)

Hope that helps.

> From: Patrick Hunt
> Subject: Re: About test/production server configuration
>
> The ZK servers are sensitive to disk (I/O) latency. I just
> troubleshot an issue last week where a user was seeing 80-second
> (seconds!) latencies. It turns out they were running the ZK server,
> NameNode, TaskTracker, and HBase region server all on the same box;
> that box had a single spindle for all I/O activity and was at 100%
> utilization for long periods of time. If you want decent ZK API
> latencies (<100 ms), you really need to ensure that there's at
> least a separate spindle available for the ZK transaction logs.
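To make the memory split above concrete, here is roughly where those numbers live in a stock Hadoop/HBase install. This is a minimal sketch, not our exact files -- HBASE_HEAPSIZE applies to HBase daemons started by the bundled scripts, and the values are in MB:

    # hbase-env.sh -- heap for HBase daemons launched by the shipped scripts
    export HBASE_HEAPSIZE=8000

    # hadoop-env.sh -- Hadoop daemons (DataNode, TaskTracker, ...) keep the 1 GB default
    export HADOOP_HEAPSIZE=1000

The block cache that serves reads out of that region server heap is sized as a fraction of the heap; the 0.2 shown below is the shipped default, included only to point at the knob:

    <!-- hbase-site.xml: fraction of the region server heap given to the block cache -->
    <property>
      <name>hfile.block.cache.size</name>
      <value>0.2</value>
    </property>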
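To act on Patrick's advice, ZooKeeper can be told to keep its snapshots and its transaction log on different devices via zoo.cfg. A minimal sketch, with illustrative mount points (assume /disk1 and /disk2 are separate physical disks):

    # zoo.cfg -- put the transaction log on its own spindle
    dataDir=/disk1/zookeeper      # snapshots and myid
    dataLogDir=/disk2/zookeeper   # transaction log; fsync'd on every write

If dataLogDir is left unset, the transaction log shares dataDir, which is exactly the single-spindle contention Patrick describes.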
