Re: HDFS Read ThroughPut and DISK Read ThroughPut

2010-05-14 Thread stu24mail
23:00 To: hdfs-user@hadoop.apache.org
In addition, the cluster runs a maximum of 44 maps at a time.
Regards, Rohan

Rohan Rai wrote:
> Hi Todd,
>
> Each node has multiple disks (7 to be precise), and there are 6 data
> nodes.

Re: HDFS Read ThroughPut and DISK Read ThroughPut

2010-05-14 Thread Rohan Rai
In addition, the cluster runs a maximum of 44 maps at a time. Regards, Rohan. Rohan Rai wrote: Hi Todd, each node has multiple disks (7 to be precise), and there are 6 data nodes. The measurement used is the one provided by TestDFSIO, which comes with hadoop*test.jar. With the defined block size

Re: HDFS Read ThroughPut and DISK Read ThroughPut

2010-05-14 Thread Rohan Rai
Hi Todd, each node has multiple disks (7 to be precise), and there are 6 data nodes. The measurement used is the one provided by TestDFSIO, which comes with hadoop*test.jar. With the defined block size of 128 MB, 44 files of 120 MB were written, giving a throughput of 90 MB/s. 44 files of 100 MB gave

Re: HDFS Read ThroughPut and DISK Read ThroughPut

2010-05-14 Thread Todd Lipcon
Hi Rohan, How are you measuring throughput? The throughput from a single client will not scale up as the cluster size increases, as it does not parallelize reads across multiple nodes. Of course it will also be limited by the inbound bandwidth of that node. -Todd On Fri, May 14, 2010 at 1:23 AM,
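Todd's point can be put as a rough upper bound. This is a minimal sketch, not a measurement: it assumes a single client reads from one datanode at a time, so its rate is capped by the slower of the serving side and the client's inbound link. The function name and the 125 MB/s link figure (roughly 1 Gbit/s) are illustrative assumptions and do not come from the thread; the 120 MB/s disk figure is the one quoted in the original question.

```python
def single_client_read_bound_mb_s(source_mb_s, client_inbound_mb_s):
    """Rough upper bound for one client reading one stream (hypothetical helper).

    The read is not spread across nodes, so adding nodes does not raise
    this bound; it is the minimum of the two limits.
    """
    return min(source_mb_s, client_inbound_mb_s)


# 120 MB/s disk (figure from the thread), 125 MB/s inbound link (assumed).
bound = single_client_read_bound_mb_s(120, 125)
print(bound)
```

Under these assumptions the bound stays the same whether the cluster has 6 nodes or 60, which is why the way throughput is measured matters.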

HDFS Read ThroughPut and DISK Read ThroughPut

2010-05-14 Thread Rohan Rai
Hi, is there a relationship between HDFS read throughput and disk read throughput? If yes, what would it be? Let's say we have a disk giving us 120 MB/s and a cluster of 6 nodes, each node having 6 disks. So in an absolutely ideal world it should give us a throughput of 120*6*6 MB/s if used
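The ideal-case arithmetic in the question can be written out as a small sketch. It assumes every disk on every node is read in parallel at its full rate with no network, replication, or scheduling overhead, which is the "absolutely ideal world" described in the question and not something a real cluster is expected to reach. The function name is a hypothetical helper.

```python
def ideal_aggregate_read_mb_s(disk_mb_s, nodes, disks_per_node):
    """Ideal aggregate read rate if all disks stream in parallel (hypothetical helper)."""
    return disk_mb_s * nodes * disks_per_node


# Figures from the question: 120 MB/s per disk, 6 nodes, 6 disks per node.
ideal = ideal_aggregate_read_mb_s(120, 6, 6)
print(ideal)  # 4320
```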