Depending on which distribution you use and what your data center power
limits are, you may save a lot of money by going with machines that have
12 x 2 TB or 12 x 3 TB drives.  With suitable engineering margins and 3x
replication you can have 5 TB of net data per node and 20 nodes per rack.
If you want to go all cowboy with 2x replication and little space to spare,
then you can double that density.
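
To make the arithmetic concrete, here is a rough back-of-the-envelope
sketch in Python. The function names are made up for illustration, 1 PB is
taken as 1000 TB, and the 0.625 fill fraction is my assumption: it is simply
the value that turns 12 x 2 TB of raw disk at 3x replication into the 5 TB
net per node mentioned above. Treat it as a starting point, not a sizing
tool.

```python
import math

def net_tb_per_node(raw_tb, replication, fill_fraction):
    """Net (post-replication) data one node holds, in TB.

    fill_fraction is the share of disk you allow HDFS data to occupy;
    the rest is engineering margin (temp space, growth, failures).
    """
    return raw_tb / replication * fill_fraction

def nodes_needed(usable_tb, raw_tb, replication, fill_fraction):
    """Number of data nodes required to hold usable_tb of net data."""
    per_node = net_tb_per_node(raw_tb, replication, fill_fraction)
    return math.ceil(usable_tb / per_node)

# Assumed inputs: 1 PB = 1000 TB, 12 x 2 TB drives = 24 TB raw per node.
TARGET_TB = 1000
NODES_PER_RACK = 20

# Conservative case: 3x replication, 0.625 fill (assumed) -> 5 TB net/node.
nodes = nodes_needed(TARGET_TB, raw_tb=24, replication=3, fill_fraction=0.625)
racks = math.ceil(nodes / NODES_PER_RACK)
print(nodes, "nodes in", racks, "racks")

# Cross-check of the quoted estimate: 12 TB raw per node, 2x replication,
# no margin at all.
print(nodes_needed(TARGET_TB, raw_tb=12, replication=2, fill_fraction=1.0))
```

With these inputs the conservative case comes out at 200 nodes in 10 racks,
and the no-margin cross-check at 167 nodes, which matches the "close to 170
servers" figure quoted below.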

On Monday, November 7, 2011, Rita <rmorgan...@gmail.com> wrote:
> For a 1PB installation you would need close to 170 servers with 12 TB
> disk pack installed on them (with replication factor of 2). That's a
> conservative estimate
> CPUs: 4 cores with 16gb of memory
>
> Namenode: 4 core with 32gb of memory should be ok.
>
>
> On Fri, Oct 21, 2011 at 5:40 PM, Steve Ed <sediso...@gmail.com> wrote:
>>
>> I am a newbie to Hadoop and trying to understand how to Size a Hadoop
>> cluster.
>>
>>
>>
>> What are factors I should consider deciding the number of datanodes ?
>>
>> Datanode configuration ?  CPU, Memory
>>
>> Amount of memory required for namenode ?
>>
>>
>>
>> My client is looking at 1 PB of usable data and will be running
>> analytics on TB size files using mapreduce.
>>
>>
>>
>>
>>
>> Thanks
>>
>> ….. Steve
>>
>>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>
