On 01/25/2011 05:49 PM, M. C. Srivas wrote:
On Tue, Jan 25, 2011 at 12:33 PM, Da Zheng<zhengda1...@gmail.com>  wrote:

Hello,

I am trying to measure the performance of HDFS, but the write rate is quite
low. When the replication factor is 1, the rate of writing to HDFS is about
60MB/s. When the replication factor is 3, the rate drops significantly to
about 15MB/s. Even though the actual rate of writing data to the disk is
about 45MB/s, it's still much lower than when the replication factor is 1. The
link between two nodes in the cluster is 1Gbps. The CPU is a Dual-Core AMD
Opteron(tm) Processor 2212, so the CPU isn't the bottleneck either. I thought I
should be able to saturate the disk very easily. I wonder where the
bottleneck is. What is the throughput for writing on a Hadoop cluster when
the replication factor is 3?

The numbers above seem correct as per my observations. If your data is
3-way replicated, the data-node writes about 3x the actual data written.
Consequently, your write-rate will be limited to 1/3 of how fast the disk can
write, minus some overhead for replication.

The aggregate write-rate can get much higher if you use more drives, but a
single stream throughput is limited to the speed of one disk spindle.
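
As a rough check of the arithmetic above, here is a minimal Python sketch.
It assumes the simple model described in this reply (every replica competes
for the same disk bandwidth) and ignores replication overhead. The function
names are hypothetical and only serve the illustration; the figures are the
ones quoted in the original question.

```python
def disk_write_rate(client_mb_per_s, replication):
    """Disk write rate implied by a client write rate.

    Assumption: each byte the client writes is written `replication`
    times to disk, with no other overhead.
    """
    return client_mb_per_s * replication


def max_client_write_rate(disk_mb_per_s, replication):
    """Upper bound on the client write rate under the same assumption."""
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return disk_mb_per_s / replication


if __name__ == "__main__":
    # 15MB/s seen by the client at replication factor 3.
    print(disk_write_rate(15, 3))
    # Bound at replication factor 3 if the disk sustains the 60MB/s
    # observed at replication factor 1.
    print(max_client_write_rate(60, 3))
```

Under this model, 15MB/s at replication factor 3 corresponds to the 45MB/s
of disk writes reported in the question, and a disk that sustains 60MB/s
would cap the client at about 20MB/s before overhead.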

You are right. I measured the performance of the hard drive. It seems the
bottleneck is the hard drive, which is a little too slow: its average write
rate is 50MB/s.
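
For anyone who wants to repeat this kind of disk measurement, a minimal
Python sketch follows. It is not the method used for the number above; it
is only one way to time a sequential write, assuming that a single large
file written in fixed-size chunks and flushed with fsync is representative
enough. The function name, file size, and chunk size are arbitrary choices
for illustration.

```python
import os
import tempfile
import time


def measure_write_mb_per_s(directory, total_mb=64, chunk_mb=4):
    """Time a sequential write of about total_mb megabytes in `directory`.

    Returns the observed rate in MB/s. The fsync is included in the timing
    so that data still sitting in the page cache does not inflate the result.
    """
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    chunks = max(1, total_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # force the data out to the device
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the temporary file
    return (chunks * chunk_mb) / elapsed


if __name__ == "__main__":
    rate = measure_write_mb_per_s(tempfile.gettempdir())
    print(f"{rate:.1f} MB/s")
```

The directory passed in should be on the drive that HDFS actually uses for
its data. A small file like this mostly exercises caches, so a much larger
total_mb gives a more realistic figure, and a filesystem that compresses
data would make the all-zero payload misleading.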

Thanks,
Da
