[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs

Suresh Srinivas (JIRA) Mon, 29 Apr 2013 14:34:18 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644898#comment-13644898
 ]


Suresh Srinivas commented on HDFS-4489:
---------------------------------------

Summary of results in the tests:
# File create tests - perform additional reserved name processing, inode map 
addition, and a reserved name check. This is where the patch does the most 
additional work.
#* In the micro benchmark, which just calls the create-file-related methods, 
the time went from 19235.8 to 19789.2, a difference of roughly 2.8%. Turning 
off the inode map reduces this further, to 1.3%. The patch moves splitting 
paths into components outside the lock. Based on this, further optimizations 
are possible that improve throughput by reducing the synchronized sections. 
With those optimizations, running times could end up much smaller than they 
are today.
#* I would also point out that this is a micro benchmark. The % difference 
observed here will be dwarfed by RPC times, network round trip time, etc. 
The system will also spend time on other operations, which should not be 
affected by this patch.
# File delete tests - perform reserved name processing and only inode map 
deletion.
#* There is very little difference in the benchmark results.
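
To illustrate the point above about moving path splitting outside the lock, here is a minimal sketch. It is not the patch and not the actual NameNode code: the class NamespaceSketch, its methods, and the id values are all hypothetical, and a plain ReentrantReadWriteLock stands in for the namesystem lock. It only shows the general idea that string work which touches no shared state can be done before the lock is taken, so the synchronized section covers just the tree walk and the inode map update.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a namespace; NOT the actual NameNode code or the patch.
class NamespaceSketch {
    private static final class Node {
        final long id;
        final Map<String, Node> children = new HashMap<>();
        Node(long id) { this.id = id; }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Node root = new Node(1);                 // arbitrary root id
    private final Map<Long, Node> inodeMap = new HashMap<>(); // id -> inode
    private long nextId = 2;                               // arbitrary id sequence

    NamespaceSketch() {
        inodeMap.put(root.id, root);
    }

    // Pure string work with no shared state, so it is safe to run unlocked.
    // Paths are not validated beyond requiring a leading "/".
    static String[] splitComponents(String path) {
        if (path == null || !path.startsWith("/")) {
            throw new IllegalArgumentException("absolute path required: " + path);
        }
        String rest = path.substring(1);
        return rest.isEmpty() ? new String[0] : rest.split("/");
    }

    // Creates any missing nodes along the path and returns the id of the last one.
    long create(String path) {
        String[] components = splitComponents(path); // done before taking the lock
        lock.writeLock().lock();
        try {
            Node cur = root;
            for (String name : components) {
                Node child = cur.children.get(name);
                if (child == null) {
                    child = new Node(nextId++);
                    cur.children.put(name, child);
                    inodeMap.put(child.id, child);   // inode map addition
                }
                cur = child;
            }
            return cur.id;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Id-based lookup, served from the inode map under the read lock.
    boolean contains(long id) {
        lock.readLock().lock();
        try {
            return inodeMap.containsKey(id);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

In this sketch, calling create("/user/alice/file1") splits the path unlocked and then holds the write lock only for the walk and the map insertions; whether the same split helps as much in the real code depends on details not covered in this comment.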
                
> Use InodeID as as an identifier of a file in HDFS protocols and APIs
> --------------------------------------------------------------------
>
>                 Key: HDFS-4489
>                 URL: https://issues.apache.org/jira/browse/HDFS-4489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>             Fix For: 2.0.5-beta
>
>         Attachments: 4434.optimized.patch
>
>
> The benefits of using InodeID to uniquely identify a file are manifold. 
> Here are a few of them:
> 1. uniquely identify a file across renames; related JIRAs include HDFS-4258, 
> HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been 
> replaced or renamed to, the file name and size combination is not reliable, 
> but the combination of file id and size is unique.
> 3. id based protocol support (e.g., NFS)
> 4. to make the pluggable block placement policy use fileid instead of 
> filename (HDFS-385).

