[ https://issues.apache.org/jira/browse/HDFS-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770558#action_12770558 ]
Raghu Angadi commented on HDFS-738:
-----------------------------------

The DN has always picked disks in round-robin order. Are you using vanilla HDFS? There were proposals in the past to make it pick a partition randomly, but that was not committed for precisely the reason mentioned above ( https://issues.apache.org/jira/browse/HDFS-325?focusedCommentId=12560037&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12560037 ).

+1 for a smarter disk handler. It may not improve simple cases like the 24-writers-on-16-disks test you mentioned, but in practice it should help more (especially with readers and uneven disk performance).

> Improve the disk utilization of HDFS
> ------------------------------------
>
>                 Key: HDFS-738
>                 URL: https://issues.apache.org/jira/browse/HDFS-738
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Zheng Shao
>
> The HDFS data node currently assigns writers to disks randomly. This is fine when there are a large number of readers/writers on a single data node, but it can create a lot of contention when there are only, say, 4 readers/writers on a 4-disk node.
>
> A better approach is to introduce a base class DiskHandler that registers all disk operations (read/write) and returns the best disk for writing new blocks. One good strategy for the DiskHandler would be to direct writes to the disks with more free space and less recent activity; many strategies are possible.
>
> This could substantially improve HDFS multi-threaded write throughput. We are seeing <25MB/s/disk on a 4-disk/node, 4-node cluster (replication already considered) with 8 concurrent writers (24 writers counting replication). I believe we can improve that by 2x.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
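The DiskHandler strategy proposed in the issue (prefer disks with more free space and less recent activity) could be sketched roughly as below. All class and method names here are hypothetical illustrations, not part of the actual HDFS DataNode code, and the recent-write penalty of one block size per in-flight write is an assumption chosen only to make the idea concrete:

```java
import java.util.List;

// Hypothetical sketch of the DiskHandler idea: choose the volume with the
// most "effective" free space, where each recently registered write is
// treated as having already reserved one block's worth of space.
// None of these names exist in HDFS; this only illustrates the strategy.
class DiskHandlerSketch {
    static class Volume {
        final String path;
        long freeBytes;     // current free space on this disk
        int recentWrites;   // writes registered in the recent window

        Volume(String path, long freeBytes) {
            this.path = path;
            this.freeBytes = freeBytes;
        }
    }

    // Assumed penalty per recent write: one 64 MB block (hypothetical).
    static final long BLOCK_SIZE = 64L * 1024 * 1024;

    // Register a write so later placement decisions see the load.
    static void registerWrite(Volume v) {
        v.recentWrites++;
    }

    // Pick the volume with the highest free space minus recent-write penalty.
    static Volume bestForWrite(List<Volume> volumes) {
        Volume best = null;
        long bestScore = Long.MIN_VALUE;
        for (Volume v : volumes) {
            long score = v.freeBytes - (long) v.recentWrites * BLOCK_SIZE;
            if (score > bestScore) {
                bestScore = score;
                best = v;
            }
        }
        return best;
    }
}
```

Under such a policy, a second concurrent writer is steered away from a disk that just accepted a block, which is exactly the few-writers-per-few-disks contention the issue describes; with many writers the penalties average out and behavior approaches round-robin.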