I am new to Hadoop and am trying to understand how to size a Hadoop cluster.
What factors should I consider when deciding on the number of datanodes, the datanode configuration (CPU, memory), and the amount of memory required for the namenode? My client is looking at 1 PB of usable data and will be running analytics on TB-size files using MapReduce.

Thanks,
Steve
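
To make the 1 PB figure concrete, the storage side of the sizing can be sketched as a back-of-the-envelope calculation. This is only a rough sketch: it assumes HDFS's default replication factor of 3, and the 25% allowance for temporary/intermediate MapReduce data and the 48 TB of disk per datanode are purely hypothetical placeholders, as is the function name `estimate_datanodes`. It covers disk capacity only, not CPU or memory.

```python
import math


def estimate_datanodes(usable_tb, replication=3, temp_overhead=0.25,
                       disk_per_node_tb=48.0):
    """Rough count of datanodes needed for storage alone.

    usable_tb        -- usable (pre-replication) data in TB
    replication      -- HDFS replication factor (3 is the HDFS default)
    temp_overhead    -- ASSUMED fraction of raw disk reserved for
                        intermediate/temporary data (hypothetical value)
    disk_per_node_tb -- ASSUMED raw disk per datanode in TB (hypothetical)
    """
    # Every block is stored `replication` times.
    raw_tb = usable_tb * replication
    # Only (1 - temp_overhead) of each disk is available for HDFS blocks.
    raw_with_overhead_tb = raw_tb / (1.0 - temp_overhead)
    # Round up: a fraction of a node still means one more node.
    return math.ceil(raw_with_overhead_tb / disk_per_node_tb)


if __name__ == "__main__":
    # 1 PB taken as 1000 TB for simplicity.
    print(estimate_datanodes(1000))
```

Changing the assumed overhead or per-node disk shifts the result considerably, so the inputs matter more than the formula.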