[ https://issues.apache.org/jira/browse/HDFS-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770408#action_12770408 ]

Hong Tang commented on HDFS-738:
--------------------------------

+1 on the direction.

I have been brewing the idea that we should have a shared I/O load monitor 
that publishes the load (as util% or queue size) through shared memory, 
allowing task processes to use the same information to decide which disk to 
write to.
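To make the idea concrete, here is a minimal sketch of the shared-memory part: a monitor publishes per-disk queue sizes into a memory-mapped file, and a reader process maps the same file and picks the least-loaded disk. The file layout (one int per disk) and the class/method names are my assumptions for illustration, not an existing API.

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class SharedLoadMonitor {
    static final int NUM_DISKS = 4;

    // Monitor side: publish per-disk queue sizes into a memory-mapped file.
    static void publish(Path file, int[] queueSizes) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4L * NUM_DISKS);
            for (int q : queueSizes) buf.putInt(q);
        }
    }

    // Task side: map the same file read-only and pick the least-loaded disk.
    static int pickDisk(Path file) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, 4L * NUM_DISKS);
            int best = 0, bestQueue = buf.getInt(0);
            for (int d = 1; d < NUM_DISKS; d++) {
                int q = buf.getInt(4 * d);   // getInt takes a byte offset
                if (q < bestQueue) { bestQueue = q; best = d; }
            }
            return best;
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("disk-load", ".mmap");
        publish(file, new int[]{7, 2, 9, 5});   // disk 1 has the shortest queue
        System.out.println(pickDisk(file));      // prints 1
        Files.delete(file);
    }
}
```

A real monitor would of course refresh the map periodically and add some versioning/locking; the sketch only shows the publish/consume shape.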

> Improve the disk utilization of HDFS
> ------------------------------------
>
>                 Key: HDFS-738
>                 URL: https://issues.apache.org/jira/browse/HDFS-738
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Zheng Shao
>
> HDFS data node currently assigns writers to disks randomly. This is good if 
> there are a large number of readers/writers on a single data node, but it 
> might create a lot of contention if there are only 4 readers/writers on a 
> 4-disk node.
> A better way is to introduce a base class DiskHandler, for registering all 
> disk operations (read/write), as well as getting the best disk for writing 
> new blocks. A good strategy for the DiskHandler would be to distribute the 
> write load to the disks with more free space as well as less recent 
> activity. There can be many strategies.
> This could help improve the HDFS multi-threaded write throughput a lot - we 
> are seeing <25MB/s/disk on a 4-disk/node 4-node cluster (replication is 
> already considered) given 8 concurrent writers (24 writers considering 
> replication). I believe we can improve that to 2x.
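One such strategy from the description (favor disks with more free space and less recent activity) could look roughly like this sketch. The scoring formula, weights, and class shape are my own illustrative assumptions, not anything decided in the issue:

```java
import java.util.List;

// Sketch of the DiskHandler idea: score each disk by free space and recent
// activity, and place new blocks on the highest-scoring one.
public class DiskHandler {
    static class Disk {
        final long freeBytes;
        final int recentOps;   // read/write operations registered recently
        Disk(long freeBytes, int recentOps) {
            this.freeBytes = freeBytes;
            this.recentOps = recentOps;
        }
    }

    // Higher score = more attractive target: favor free space, penalize
    // recent activity. The 0.1 penalty per recent op is an arbitrary weight.
    static double score(Disk d, long maxFree) {
        double freeFrac = maxFree == 0 ? 0.0 : (double) d.freeBytes / maxFree;
        return freeFrac - 0.1 * d.recentOps;
    }

    static int bestDiskForWrite(List<Disk> disks) {
        long maxFree = disks.stream().mapToLong(d -> d.freeBytes).max().orElse(0);
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < disks.size(); i++) {
            double s = score(disks.get(i), maxFree);
            if (s > bestScore) { bestScore = s; best = i; }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Disk> disks = List.of(
            new Disk(800L << 30, 5),   // roomy but busy
            new Disk(900L << 30, 0),   // roomiest and idle -> chosen
            new Disk(100L << 30, 0),
            new Disk(400L << 30, 3));
        System.out.println(bestDiskForWrite(disks));  // prints 1
    }
}
```

The random assignment the description complains about would sometimes put two of four writers on the same disk; a scoring policy like this keeps the four writers on four distinct disks, which is where the hoped-for throughput gain would come from.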

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
