Re: Sane max storage size for DN

2012-12-13 Thread Mohammad Tariq
Thank you so much Hemanth. Regards, Mohammad Tariq On Thu, Dec 13, 2012 at 8:21 PM, Hemanth Yamijala wrote: > This is a dated blog post, so it would help if someone with current HDFS > knowledge can validate it: > http://developer.yahoo.com/blogs/hadoop/posts/2010/05/scalability_of_the_ha

Re: Sane max storage size for DN

2012-12-13 Thread Hemanth Yamijala
This is a dated blog post, so it would help if someone with current HDFS knowledge can validate it: http://developer.yahoo.com/blogs/hadoop/posts/2010/05/scalability_of_the_hadoop_dist/ . There is a bit about the RAM required for the Namenode and how to compute it: You can look at the 'Namespace

Re: Sane max storage size for DN

2012-12-12 Thread Mohammad Tariq
Hello Chris, Thank you so much for the valuable insights. I was actually using the same principle. I made a blunder, though, and did the maths for the entire (9*3)PB. Seems I am higher than you, that too without drinking ;) Many thanks. Regards, Mohammad Tariq On Thu, Dec 13, 2012 at 10:38 AM,

Re: Sane max storage size for DN

2012-12-12 Thread Chris Embree
Hi Mohammed, The amount of RAM on the NN is related to the number of blocks... so let's do some math. :) 1G of RAM to 1M blocks seems to be the general rule. I'll probably mess this up so someone check my math: 9 PB ~ 9,216 TB ~ 9,437,184 GB of data. Let's put that in 128MB blocks: according
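Chris's rule of thumb (roughly 1 GB of NameNode heap per 1 million blocks) can be redone as a quick back-of-the-envelope calculation. This is just a sketch of the arithmetic in the thread, not an authoritative sizing guide; real NameNode memory also depends on file and directory counts, not blocks alone:

```python
# Rough NameNode RAM estimate from the "1 GB per 1M blocks" rule of thumb.
# Blocks alone are counted here; files and directories also consume
# namespace memory, so treat the result as a floor, not a budget.

data_pb = 9                       # raw data size, exclusive of replication
data_gb = data_pb * 1024 * 1024   # 9 PB ~ 9,437,184 GB
block_mb = 128                    # HDFS block size

blocks = data_gb * 1024 // block_mb   # number of 128 MB blocks
nn_ram_gb = blocks / 1_000_000        # ~1 GB of heap per million blocks

print(f"{blocks:,} blocks -> ~{nn_ram_gb:.0f} GB NameNode heap")
```

With these round numbers, 9 PB in 128 MB blocks comes to about 75.5 million blocks, i.e. on the order of 75 GB of NameNode heap.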

Re: Sane max storage size for DN

2012-12-12 Thread Mohammad Tariq
Hello Michael, It's an array. The actual size of the data could be somewhere around 9PB (exclusive of replication) and we want to keep the number of DNs as low as possible. Computations are not too frequent, as I have specified earlier. If I have 500TB in 1 DN, the number of DNs would be around 49.
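The node count can be sanity-checked with the same kind of back-of-the-envelope arithmetic. The assumptions below are mine, not stated in the thread: 3x replication and 500 TB of usable HDFS capacity per DataNode. With these round numbers the count lands in the mid-fifties; the ~49 figure above presumably reflects somewhat different capacity assumptions:

```python
import math

# Hypothetical DataNode-count sketch for the sizing discussed above.
# Assumptions (mine, not from the thread): 3x replication and exactly
# 500 TB of usable HDFS capacity per DataNode, with no other overheads.

raw_pb = 9          # data size, exclusive of replication
replication = 3
per_dn_tb = 500

total_tb = raw_pb * 1024 * replication    # total storage to provision
dns = math.ceil(total_tb / per_dn_tb)     # whole DataNodes needed

print(f"{total_tb:,} TB total -> {dns} DataNodes")
```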

Re: Sane max storage size for DN

2012-12-12 Thread Michael Segel
500 TB? How many nodes in the cluster? Is this attached storage or is it in an array? I mean if you have 4 nodes for a total of 2PB, what happens when you lose 1 node? On Dec 12, 2012, at 9:02 AM, Mohammad Tariq wrote: > Hello list, > > I don't know if this question makes any s

Re: Sane max storage size for DN

2012-12-12 Thread Mohammad Tariq
Thank you so much for the valuable response Ted. No, there would be dedicated storage for NN as well. Any tips on RAM & N/W? Computations are not really frequent. Thanks again. Regards, Mohammad Tariq On Wed, Dec 12, 2012 at 9:14 PM, Ted Dunning wrote: > > Yes it does make sense, dep

Re: Sane max storage size for DN

2012-12-12 Thread Ted Dunning
Yes it does make sense, depending on how much compute each byte of data will require on average. With ordinary Hadoop, it is reasonable to have half a dozen 2TB drives. With specialized versions of Hadoop considerably more can be supported. From what you say, it sounds like you are suggesting t

Sane max storage size for DN

2012-12-12 Thread Mohammad Tariq
Hello list, I don't know if this question makes any sense, but I would like to ask, does it make sense to store 500TB (or more) of data in a single DN? If yes, then what should be the spec of other parameters, *viz*. NN & DN RAM, N/W etc? If no, what could be the alternative? Many thanks. Re