[ https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431802#comment-13431802 ]
Aaron T. Myers commented on HDFS-3672:
--------------------------------------

bq. Why is this API marked @InterfaceAudience.Public? I think we should remove it and just leave InterfaceStability.Unstable

I was under the impression that all public classes needed to have an @InterfaceAudience annotation, and that all public classes also needed an @InterfaceStability annotation unless they're marked @InterfaceAudience.Private. Am I wrong about that?

bq. Configuration to turn off this functionality should be on the server side also. Otherwise a client can just enable this functionality without the admin having control over it.

I thought about this a fair bit while reviewing the code. The conclusion I came to is that the stated reason Arun wanted this feature disabled by default was "so that people who use this understand that this isn't necessarily supported." A client-side-only config seems to serve that purpose. Making this config server-side as well only forces the admin to enable the config and restart the cluster before a client that wants to try this functionality can give it a shot. That seems to me a strictly unnecessary pain for both the admin and the user, and it doesn't further Arun's stated goal. For that matter, why would an admin want to prevent clients from calling this API?

If you insist on having a server-side config for this, I'd like to suggest two separate configs: a server-side one that defaults to enabled, so that an admin may consciously disable it, and a client-side one that defaults to disabled, so that users of this API must consciously configure their client. The latter supports Arun's stated goal of making sure people are aware that it's an experimental API.
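To make the two-config suggestion concrete, here is a minimal sketch of how the two defaults would combine. It uses plain java.util.Properties rather than Hadoop's Configuration class, and the key names and the DiskLocationGate class are hypothetical, made up for illustration; they are not taken from any patch on this issue.

```java
import java.util.Properties;

public class DiskLocationGate {
    // Hypothetical key names, for illustration only (not from the patch).
    static final String CLIENT_KEY = "example.client.disk-location.enabled";
    static final String SERVER_KEY = "example.datanode.disk-location.enabled";

    // Client side defaults to disabled: users must opt in consciously.
    static boolean clientEnabled(Properties clientConf) {
        return Boolean.parseBoolean(clientConf.getProperty(CLIENT_KEY, "false"));
    }

    // Server side defaults to enabled: an admin may opt out consciously.
    static boolean serverEnabled(Properties serverConf) {
        return Boolean.parseBoolean(serverConf.getProperty(SERVER_KEY, "true"));
    }

    // The call goes through only if both sides allow it.
    static boolean callAllowed(Properties clientConf, Properties serverConf) {
        return clientEnabled(clientConf) && serverEnabled(serverConf);
    }

    public static void main(String[] args) {
        Properties client = new Properties();
        Properties server = new Properties();

        // Out of the box: blocked by the client-side default.
        System.out.println("defaults: " + callAllowed(client, server));

        // The user opts in; no server change or restart is needed.
        client.setProperty(CLIENT_KEY, "true");
        System.out.println("client opt-in: " + callAllowed(client, server));

        // The admin opts out; the client setting no longer matters.
        server.setProperty(SERVER_KEY, "false");
        System.out.println("admin opt-out: " + callAllowed(client, server));
    }
}
```

With these defaults a user only has to change their own client config to try the API, while an admin who really wants to block it still can.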
> Expose disk-location information for blocks to enable better scheduling
> -----------------------------------------------------------------------
>
>                 Key: HDFS-3672
>                 URL: https://issues.apache.org/jira/browse/HDFS-3672
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch,
> hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch,
> hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch
>
>
> Currently, HDFS exposes on which datanodes a block resides, which allows
> clients to make scheduling decisions for locality and load balancing.
> Extending this to also expose on which disk on a datanode a block resides
> would enable even better scheduling, on a per-disk rather than coarse
> per-datanode basis.
> This API would likely look similar to Filesystem#getFileBlockLocations, but
> also involve a series of RPCs to the responsible datanodes to determine disk
> ids.