I have been running with 2x replication on a 500 TB cluster. No issues
whatsoever. 3x is for the super paranoid.
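
For anyone who wants to check the arithmetic behind the estimates quoted
below, here is a rough sketch. It assumes decimal units (1 PB = 1000 TB) and
takes the per-node figures from the thread at face value; the function and
variable names are made up for illustration, and nothing here accounts for
headroom beyond what the quoted figures already include.

```python
import math

def servers_needed(usable_tb, replication, raw_tb_per_server):
    """Servers required to store usable_tb of data at a given replication factor."""
    raw_tb = usable_tb * replication  # every block is stored `replication` times
    return math.ceil(raw_tb / raw_tb_per_server)

def racks_needed(usable_tb, net_tb_per_node, nodes_per_rack):
    """Nodes and racks required when each node holds net_tb_per_node of net data."""
    nodes = math.ceil(usable_tb / net_tb_per_node)
    return nodes, math.ceil(nodes / nodes_per_rack)

USABLE_TB = 1000  # assumption: 1 PB of usable data, decimal units

# Estimate based on 12 TB of disk per server and a replication factor of 2.
servers_2x = servers_needed(USABLE_TB, replication=2, raw_tb_per_server=12)

# Estimate based on 5 TB net per node (3x replication, with margins), 20 nodes per rack.
nodes_3x, racks_3x = racks_needed(USABLE_TB, net_tb_per_node=5, nodes_per_rack=20)

# "Cowboy" variant: 2x replication and little spare space doubles the net density.
nodes_dense, racks_dense = racks_needed(USABLE_TB, net_tb_per_node=10, nodes_per_rack=20)

print(servers_2x, nodes_3x, racks_3x, nodes_dense, racks_dense)
```

The first figure lands just under the "close to 170 servers" number quoted
below; the other two show how much the node count moves with the net
capacity you assume per node.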


On Mon, Nov 7, 2011 at 5:06 PM, Ted Dunning <tdunn...@maprtech.com> wrote:

> Depending on which distribution you use and what your data center power
> limits are, you may save a lot of money by going with machines that have
> 12 x 2 TB or 3 TB drives.  With suitable engineering margins and 3x
> replication you can have 5 TB net data per node and 20 nodes per rack.
> If you want to go all cowboy with 2x replication and little space to
> spare, then you can double that density.
>
> On Monday, November 7, 2011, Rita <rmorgan...@gmail.com> wrote:
> > For a 1 PB installation you would need close to 170 servers with a 12 TB
> > disk pack installed on each (with a replication factor of 2). That's a
> > conservative estimate.
> > CPUs: 4 cores with 16 GB of memory
> >
> > Namenode: 4 cores with 32 GB of memory should be OK.
> >
> >
> > On Fri, Oct 21, 2011 at 5:40 PM, Steve Ed <sediso...@gmail.com> wrote:
> >>
> >> I am a newbie to Hadoop and trying to understand how to size a Hadoop
> >> cluster.
> >>
> >>
> >>
> >> What factors should I consider when deciding the number of datanodes?
> >>
> >> Datanode configuration?  CPU, memory
> >>
> >> Amount of memory required for the namenode?
> >>
> >>
> >>
> >> My client is looking at 1 PB of usable data and will be running
> >> analytics on TB-size files using MapReduce.
> >>
> >>
> >>
> >>
> >>
> >> Thanks
> >>
> >> ….. Steve
> >>
> >>
> >
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
> >
>



-- 
--- Get your facts first, then you can distort them as you please.--
