[ https://issues.apache.org/jira/browse/HDFS-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770408#action_12770408 ]
Hong Tang commented on HDFS-738:
--------------------------------

+1 on the direction. I have been brewing the idea that we should have a shared I/O load monitor that publishes the load (as util% or queue size) through shared memory, allowing task processes to use the same information to decide which disk to write to.

> Improve the disk utilization of HDFS
> ------------------------------------
>
>                 Key: HDFS-738
>                 URL: https://issues.apache.org/jira/browse/HDFS-738
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Zheng Shao
>
> The HDFS data node currently assigns writers to disks randomly. This works well when there are a large number of readers/writers on a single data node, but it can create a lot of contention when there are only 4 readers/writers on a 4-disk node.
>
> A better way is to introduce a base class DiskHandler that registers all disk operations (read/write) and selects the best disk for writing new blocks. A good strategy for the DiskHandler would be to distribute write load toward the disks with more free space and less recent activity. There can be many strategies.
>
> This could help improve HDFS multi-threaded write throughput a lot - we are seeing <25MB/s/disk on a 4-disk/node 4-node cluster (replication already considered) given 8 concurrent writers (24 writers considering replication). I believe we can improve that to 2x.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
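The "more free space, less recent activity" strategy described above can be sketched as follows. This is only an illustration, not code from the issue or any patch: the class and method names (`DiskSelector`, `Volume`, `pickForWrite`) and the per-operation penalty constant are hypothetical, and a real DataNode implementation would hook into the existing volume-management code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a DiskHandler-style write-placement policy:
// instead of picking a volume at random, score each volume by its free
// space minus a penalty for recent I/O activity, and pick the best one.
public class DiskSelector {

    /** Per-disk state: mount path, free bytes, and recent block operations. */
    public static final class Volume {
        final String path;
        volatile long freeBytes;
        final AtomicLong recentOps = new AtomicLong();

        public Volume(String path, long freeBytes) {
            this.path = path;
            this.freeBytes = freeBytes;
        }
    }

    // Illustrative penalty: each recent/in-flight operation "costs" this
    // many bytes of virtual free space when scoring a volume.
    private static final long OP_PENALTY_BYTES = 256L * 1024 * 1024;

    /** Returns the highest-scoring volume and records the new writer on it. */
    public static Volume pickForWrite(List<Volume> volumes) {
        Volume best = null;
        long bestScore = Long.MIN_VALUE;
        for (Volume v : volumes) {
            long score = v.freeBytes - v.recentOps.get() * OP_PENALTY_BYTES;
            if (score > bestScore) {
                bestScore = score;
                best = v;
            }
        }
        if (best != null) {
            best.recentOps.incrementAndGet(); // register the write
        }
        return best;
    }
}
```

With this scoring, a disk that has slightly more free space but several active writers loses to a quieter disk, which is the contention-avoidance behavior the issue is after; the shared-memory load monitor suggested in the comment would simply replace `recentOps` with externally published util% or queue-size figures.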