It depends on what stack you want to run. A quick cut:

- Worker machines (DataNode, HBase RegionServer, Spark worker):
  - Dual 6-core CPUs
  - 64 to 128 GB RAM
  - 3 x 3 TB disks (JBOD)
- Master node (NameNode, HBase Master, Spark master):
  - Dual 6-core CPUs
  - 64 to 128 GB RAM
  - 2 x 3 TB disks (RAID 1+0)
- Start with a 5-node setup and scale out as needed.
- If your load is MapReduce over HDFS, run YARN.
- If your load is HBase over HDFS, scale according to your computational and storage needs.
- If you are running Spark over HDFS, scale appropriately; you might need more memory in the worker nodes.
- In any case, draw up a topology and the processes each node would run (a rough config sketch follows below).

As Soumya suggests, you can prototype at an appropriate scale using AWS.
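To make the topology concrete, here is a minimal sketch of how the worker specs above might map onto Hadoop 2.x configuration. The mount points (/data/1 and so on) and the memory and vcore figures are assumptions for illustration, not recommended values; adjust them to your actual disk layout and RAM:

    <!-- hdfs-site.xml: point the DataNode at the three JBOD disks
         (mount paths below are hypothetical) -->
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
    </property>

    <!-- yarn-site.xml: on a 64 GB worker, leave headroom for the OS
         and the DataNode daemon and hand roughly 48 GB to YARN
         containers; expose the 12 physical cores as vcores -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>49152</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>12</value>
    </property>

If you go the Spark-heavy route, you would shift more of each worker's RAM toward the executors instead; the split above is just one plausible starting point.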
Cheers

On Wed, May 21, 2014 at 5:14 PM, Upender Nimbekar <upent...@gmail.com> wrote:

> Hi,
> I would like to set up an Apache platform on a mini cluster. Is there any
> recommendation for the hardware I can buy to set it up? I am thinking
> about processing a significant amount of data, in the range of a few
> terabytes.
>
> Thanks
> Upender