The datanodes don't consume much memory; we run ours with 1 GB and give the rest to the region servers.
BTW, if you want to serve the whole dataset then, depending on your SLA, you might want to try HDFS-347, since concurrent HDFS access is rather slow. The other choice would be to make sure you can hold everything in the block cache, which means very little data per region server.

J-D

On Fri, Apr 22, 2011 at 2:17 AM, Iulia Zidaru <[email protected]> wrote:
> Hi all,
>
> Supposing we have to constantly hit all the data stored, what is a good ratio
> between the HDFS space used and the HBase heap size allocated per node? Do
> you calculate it somehow?
> Also, is there a ratio between the Hadoop heap size and the HBase heap size
> that we should take into consideration?
>
> Thank you,
> Iulia
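
To make the block-cache option above concrete, here is a rough sizing sketch. The cache fraction, headroom factor, and example numbers are assumptions for illustration, not figures from this thread; hfile.block.cache.size governs the actual fraction on a given cluster.

import math

# Back-of-envelope sizing for the "hold everything in the block cache" option.
# BLOCK_CACHE_FRACTION mirrors hfile.block.cache.size (assumed ~0.2 here);
# CACHE_HEADROOM leaves room for index/bloom blocks and cache churn.
BLOCK_CACHE_FRACTION = 0.2
CACHE_HEADROOM = 0.85

def cacheable_data_per_rs_gb(rs_heap_gb):
    """Approximate data (GB) one region server can keep fully in block cache."""
    return rs_heap_gb * BLOCK_CACHE_FRACTION * CACHE_HEADROOM

def region_servers_needed(hot_data_gb, rs_heap_gb):
    """Region servers required to keep hot_data_gb entirely cached."""
    return math.ceil(hot_data_gb / cacheable_data_per_rs_gb(rs_heap_gb))

# Example: 12 GB region server heaps cache roughly 2 GB of data each, so
# ~500 GB of constantly-hit data needs on the order of 250 region servers,
# which is why this option implies "very little data per region server".
print(cacheable_data_per_rs_gb(12))      # ~2.0
print(region_servers_needed(500, 12))    # 246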
