[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420776#comment-13420776
 ] 

Suresh Srinivas commented on HDFS-3672:
---------------------------------------

Todd, thanks for describing the intent and use cases in detail. These APIs for 
experimentation sort of make sense.

However, I want to highlight the following:
There are multiple daemons reading from and writing to disk in Hadoop: 
Datanodes, MapReduce shuffle, and possibly HBase short-circuit reads. Given 
this, a view from the Datanode alone would not reflect the complete picture. 
Also, since many applications on HDFS are reading from and writing to the 
disks as well, the view of a single application (in this case HBase or 
MapReduce) is also incomplete. While an application can make locally optimized 
scheduling decisions, those decisions may still not result in better 
scheduling overall. The improvements one sees are going to be best-effort and 
unpredictable.
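
To illustrate the point with made-up numbers (a toy sketch, not HDFS code; the 
disk names, load figures, and the least_loaded helper are all hypothetical): a 
scheduler that sees only the Datanode's share of disk I/O can pick a different 
disk than one that sees all of the I/O.

```python
# Hypothetical outstanding I/O requests per disk, split by who issued them.
# Only the "datanode" component would be visible through a Datanode-side API.
disk_load = {
    "disk0": {"datanode": 2, "shuffle": 9, "short_circuit": 4},
    "disk1": {"datanode": 5, "shuffle": 0, "short_circuit": 1},
    "disk2": {"datanode": 3, "shuffle": 1, "short_circuit": 6},
}


def least_loaded(load, sources):
    """Return the disk with the fewest outstanding requests, counting only `sources`."""
    return min(load, key=lambda disk: sum(load[disk][s] for s in sources))


# What a client would choose using only Datanode-reported load.
datanode_view = least_loaded(disk_load, ["datanode"])
# What it would choose if it could see every process touching the disks.
full_view = least_loaded(disk_load, ["datanode", "shuffle", "short_circuit"])

print("choice from Datanode view:", datanode_view)
print("choice from full view:", full_view)
```

With these numbers the Datanode-only view picks disk0, which is actually the 
busiest disk once shuffle and short-circuit reads are counted.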

                
> Expose disk-location information for blocks to enable better scheduling
> -----------------------------------------------------------------------
>
>                 Key: HDFS-3672
>                 URL: https://issues.apache.org/jira/browse/HDFS-3672
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-3672-1.patch
>
>
> Currently, HDFS exposes on which datanodes a block resides, which allows 
> clients to make scheduling decisions for locality and load balancing. 
> Extending this to also expose on which disk on a datanode a block resides 
> would enable even better scheduling, on a per-disk rather than coarse 
> per-datanode basis.
> This API would likely look similar to FileSystem#getFileBlockLocations, but 
> also involve a series of RPCs to the responsible datanodes to determine disk 
> ids.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
