[ https://issues.apache.org/jira/browse/HDFS-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770593#action_12770593 ]
Hong Tang commented on HDFS-738:
--------------------------------

I have made some empirical observations: on Linux, "iostat -dkx 10" provides two useful metrics, %util and avgqu-sz. %util is a fairly good indicator of disk utilization (though it occasionally reads above 100%), and a high %util combined with a large avgqu-sz (tens to hundreds) indicates an overloaded disk.

> Improve the disk utilization of HDFS
> ------------------------------------
>
>                 Key: HDFS-738
>                 URL: https://issues.apache.org/jira/browse/HDFS-738
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Zheng Shao
>
> The HDFS data node currently assigns writers to disks randomly. This works well when there are many readers/writers on a single data node, but it can create a lot of contention when there are only, say, 4 readers/writers on a 4-disk node.
>
> A better way is to introduce a base class DiskHandler that registers all disk operations (reads/writes) and selects the best disk for writing new blocks. A good strategy for the DiskHandler would be to direct writes to the disks with more free space and less recent activity. Many strategies are possible.
>
> This could substantially improve HDFS multi-threaded write throughput: we are seeing <25MB/s/disk on a 4-disk/node, 4-node cluster (replication already accounted for) with 8 concurrent writers (24 writers counting replication). I believe we can improve that by 2x.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
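The overload heuristic from the comment above (high %util together with a large avgqu-sz) can be sketched as a small parser over `iostat -dkx` device lines. This is an illustrative sketch, not HDFS code: the thresholds, the sample lines, and the class name `IostatOverload` are assumptions; column positions are resolved from the iostat header rather than hard-coded.

```java
import java.util.Arrays;
import java.util.List;

public class IostatOverload {
    // Thresholds are illustrative assumptions, not HDFS or sysstat constants.
    static final double UTIL_THRESHOLD = 90.0;   // percent
    static final double QUEUE_THRESHOLD = 10.0;  // "10s to 100s" per the comment

    /** Decide whether one "iostat -dkx" device line looks overloaded. */
    static boolean isOverloaded(String header, String deviceLine) {
        // Locate columns by name so we do not depend on a fixed field order.
        List<String> cols = Arrays.asList(header.trim().split("\\s+"));
        String[] vals = deviceLine.trim().split("\\s+");
        double util = Double.parseDouble(vals[cols.indexOf("%util")]);
        double queue = Double.parseDouble(vals[cols.indexOf("avgqu-sz")]);
        return util >= UTIL_THRESHOLD && queue >= QUEUE_THRESHOLD;
    }

    public static void main(String[] args) {
        String header =
            "Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util";
        // Sample lines (made up for illustration): a saturated disk with a
        // deep request queue, and a mostly idle one.
        String busy = "sda 0.00 12.00 5.00 420.00 20.00 48000.00 226.0 85.3 201.0 2.3 98.7";
        String idle = "sdb 0.00 0.40 0.10 2.00 0.40 9.60 9.5 0.01 4.8 1.1 0.2";
        System.out.println(isOverloaded(header, busy)); // true
        System.out.println(isOverloaded(header, idle)); // false
    }
}
```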