[
https://issues.apache.org/jira/browse/HDFS-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732054#action_12732054
]
dhruba borthakur commented on HDFS-487:
---------------------------------------
The API for pluggable block placement (HDFS-385) provides the pathname of the
file to the block placement policy. The block placement policy can use the
filename to determine what kind of placement algorithm to use for blocks in
that file. This works well in the current NN design. However, if in the future we
separate the Block Manager from the NN, the Block Manager might not know
the pathname to which a block belongs. In that case, the Block Manager
will not be able to provide the filename when invoking the
pluggable-block-placement-policy API. So, in some sense, using a fileid
(instead of a filename) is future-proofing the API.
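To illustrate the difference, here is a minimal Java sketch of selecting a
placement policy by fileid rather than by pathname. It is not the HDFS-385 API:
the names PlacementPolicy, PolicyRegistry, and Namespace are hypothetical and
exist only for this example. It assumes a fileid is a long that stays the same
when a file is renamed.

```java
import java.util.HashMap;
import java.util.Map;

public class FileIdPlacementSketch {

    /** Hypothetical stand-in for a placement policy; not the HDFS-385 interface. */
    interface PlacementPolicy {
        String name();
    }

    /** Hypothetical registry that selects a policy by fileid, never by pathname. */
    static final class PolicyRegistry {
        private final Map<Long, PlacementPolicy> byFileId = new HashMap<>();
        private final PlacementPolicy defaultPolicy;

        PolicyRegistry(PlacementPolicy defaultPolicy) {
            this.defaultPolicy = defaultPolicy;
        }

        void register(long fileId, PlacementPolicy policy) {
            byFileId.put(fileId, policy);
        }

        /** A caller that only knows the fileid (e.g. a separate Block Manager) can still ask. */
        PlacementPolicy policyFor(long fileId) {
            return byFileId.getOrDefault(fileId, defaultPolicy);
        }
    }

    /** Hypothetical namespace: assigns a fileid at create time and keeps it across renames. */
    static final class Namespace {
        private final Map<String, Long> pathToId = new HashMap<>();
        private long nextId = 1;

        long create(String path) {
            long id = nextId++;
            pathToId.put(path, id);
            return id;
        }

        void rename(String src, String dst) {
            Long id = pathToId.remove(src);
            if (id == null) {
                throw new IllegalArgumentException("no such file: " + src);
            }
            pathToId.put(dst, id);
        }

        long fileId(String path) {
            Long id = pathToId.get(path);
            if (id == null) {
                throw new IllegalArgumentException("no such file: " + path);
            }
            return id;
        }
    }

    public static void main(String[] args) {
        Namespace ns = new Namespace();
        PolicyRegistry registry = new PolicyRegistry(() -> "default");

        long id = ns.create("/warehouse/table1/part-0");
        registry.register(id, () -> "colocate");

        // Moving the file changes its pathname but not its fileid,
        // so the policy lookup is unaffected.
        ns.rename("/warehouse/table1/part-0", "/archive/table1/part-0");
        System.out.println(registry.policyFor(ns.fileId("/archive/table1/part-0")).name());
    }
}
```

In this sketch the registry never sees a pathname, so it would keep working if
the component that invokes it had no access to the namespace.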
To emphasize again, HDFS-385 does not strictly need fileids, although they are
good to have. The API designed in HDFS-385 should be marked as "experimental", and
we can change it if/when the Block Manager is separated from the NN. Which
option do you prefer?
> HDFS should expose a fileid to uniquely identify a file
> -------------------------------------------------------
>
> Key: HDFS-487
> URL: https://issues.apache.org/jira/browse/HDFS-487
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: fileid1.txt
>
>
> HDFS should expose an id that uniquely identifies a file. This helps in
> developing applications that work correctly even when files are moved from
> one directory to another. A typical use-case is to make the Pluggable Block
> Placement Policy (HDFS-385) use fileid instead of filename.