[ 
https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628197#comment-13628197
 ] 

Suresh Srinivas commented on HDFS-4489:
---------------------------------------

bq. But if you approach from the view point of owners of existing hardware that 
was spec'ed to hold certain size of namespace, it can be viewed as a decrease 
of capacity.
Again, I do not believe anyone runs with the NN very tightly configured, given 
the nature of garbage collection. That said, to make further progress, the 
following optimizations can be done:

# Initialize the map only when this feature is enabled. This should take away 
roughly 1/3 of the extra memory.
# Reuse existing bits in INodeId: 
https://issues.apache.org/jira/secure/EditComment!default.jspa?id=12618468&commentId=13508432 
This should take away roughly 1/3 of the extra memory.
# Use the first block ID of the file (after ensuring that even an empty file 
has an associated block) as the InodeID. This is very ugly and mixes two 
abstractions that should not be mixed. I am reluctant to make this optimization.
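
As a rough illustration of the first optimization only, here is a minimal 
sketch. The class name LazyInodeMap, the featureEnabled flag, and the 
id-to-path mapping are all hypothetical and are not taken from the patch; the 
point is just that the map is not allocated unless the feature is on.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the actual HDFS-4489 code. */
class LazyInodeMap {
    // Assumed configuration flag: true when the inode-id feature is enabled.
    private final boolean featureEnabled;

    // Stays null until the first insert, so a disabled feature costs no map memory.
    private Map<Long, String> idToPath;

    LazyInodeMap(boolean featureEnabled) {
        this.featureEnabled = featureEnabled;
    }

    /** Records the mapping only when the feature is enabled. */
    void put(long inodeId, String path) {
        if (!featureEnabled) {
            return; // feature off: never allocate the map
        }
        if (idToPath == null) {
            idToPath = new HashMap<>();
        }
        idToPath.put(inodeId, path);
    }

    /** Returns the recorded path, or null if absent or the map was never created. */
    String get(long inodeId) {
        return idToPath == null ? null : idToPath.get(inodeId);
    }

    /** True once the backing map has actually been allocated. */
    boolean isAllocated() {
        return idToPath != null;
    }
}
```

With the flag off, put() is a no-op and the backing map is never created; with 
the flag on, the map is created on the first put().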

My vote is to keep the code simple and the abstractions clean. If folks think 
the above optimizations are worth pursuing, I will update the patch.
                
> Use InodeID as an identifier of a file in HDFS protocols and APIs
> -----------------------------------------------------------------
>
>                 Key: HDFS-4489
>                 URL: https://issues.apache.org/jira/browse/HDFS-4489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>
> The benefits of using InodeID to uniquely identify a file are manifold. 
> Here are a few of them:
> 1. uniquely identify a file across renames; related JIRAs include HDFS-4258 
> and HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been 
> replaced or renamed to, the combination of file name and size is not 
> reliable, but the combination of file id and size is unique.
> 3. id-based protocol support (e.g., NFS)
> 4. to make the pluggable block placement policy use fileid instead of 
> filename (HDFS-385).

