[ https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420776#comment-13420776 ]
Suresh Srinivas commented on HDFS-3672:
---------------------------------------

Todd, thanks for describing the intent and use cases in detail. These APIs for experimentation sort of make sense. However, I want to highlight the following:

There are multiple daemons reading from and writing to disk in Hadoop: Datanodes, MapReduce shuffle, and possibly HBase short-circuit reads. Given this, the view from the Datanode alone would not reflect the complete reality. Also, given that many applications on HDFS are reading from and writing to the disks as well, the view of a single application (in this case HBase or MapReduce) is also incomplete.

While an application can make locally optimized scheduling decisions, those decisions still may not result in better scheduling overall. The improvements one sees are going to be best-effort and would be unpredictable.

> Expose disk-location information for blocks to enable better scheduling
> -----------------------------------------------------------------------
>
>                 Key: HDFS-3672
>                 URL: https://issues.apache.org/jira/browse/HDFS-3672
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-3672-1.patch
>
>
> Currently, HDFS exposes on which datanodes a block resides, which allows
> clients to make scheduling decisions for locality and load balancing.
> Extending this to also expose on which disk on a datanode a block resides
> would enable even better scheduling, on a per-disk rather than coarse
> per-datanode basis.
> This API would likely look similar to Filesystem#getFileBlockLocations, but
> also involve a series of RPCs to the responsible datanodes to determine disk
> ids.
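
To make the per-disk scheduling idea and the caveat about incomplete views concrete, here is a minimal sketch of a client that picks the replica whose disk has the fewest reads it has issued itself. It does not use any HDFS API: the names Replica, disk_id, and DiskAwareScheduler are hypothetical, and the counters only track this one client's reads, so load from other daemons or applications on the same disk stays invisible to it.

```python
from collections import defaultdict
from typing import NamedTuple


class Replica(NamedTuple):
    """One replica of a block (hypothetical type, not an HDFS class)."""
    datanode: str  # hostname of the datanode holding the replica
    disk_id: int   # assumed per-datanode disk index returned by the new API


class DiskAwareScheduler:
    """Chooses replicas using only the reads this client has outstanding."""

    def __init__(self):
        # Outstanding reads per (datanode, disk), as seen by this client only.
        # Reads issued by other daemons or applications are not counted.
        self.outstanding = defaultdict(int)

    def pick(self, replicas):
        # min() returns the first minimum, so ties go to the earliest replica.
        best = min(
            replicas,
            key=lambda r: self.outstanding[(r.datanode, r.disk_id)],
        )
        self.outstanding[(best.datanode, best.disk_id)] += 1
        return best

    def done(self, replica):
        # Call when a read finishes so the disk counts as idle again.
        self.outstanding[(replica.datanode, replica.disk_id)] -= 1


scheduler = DiskAwareScheduler()
block_a = [Replica("dn1", 0), Replica("dn2", 1)]
block_b = [Replica("dn1", 0), Replica("dn2", 1)]

first = scheduler.pick(block_a)   # both disks look idle, first replica wins
second = scheduler.pick(block_b)  # dn1 disk 0 now looks busy to this client
```

Because the counters are local, a disk that looks idle here may be saturated by a MapReduce shuffle or another HDFS client, which is why any gain from this kind of scheduling would be best-effort.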