[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15784285#comment-15784285 ]

Gang Xie commented on HDFS-7784:
--------------------------------

Found a potential issue while trying to backport this patch to 2.4. Please 
correct me if I'm wrong:

When ACL is enabled on a file, the loader calls addAclFeature to add the 
AclFeature to UNIQUE_ACL_FEATURES, which is a hashmap shared by all the 
files. Since this patch introduces multithreading, I think that could be a problem.
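
To make the concern concrete, here is a minimal sketch (not the actual AclStorage code; the types are simplified stand-ins) of why a plain HashMap behind addAclFeature is unsafe once several loader threads call it, plus one possible mitigation using ConcurrentHashMap:

{code:java}
// Hedged sketch, not the real HDFS code: models the hazard of several fsimage
// loader threads calling a de-duplication helper backed by a plain HashMap.
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AclDedupSketch {

  // Stand-in for the shared UNIQUE_ACL_FEATURES cache described above.
  private static final Map<String, Object> UNIQUE_ACL_FEATURES = new HashMap<>();

  // Unsafe when called from multiple loader threads: concurrent insert/resize
  // on HashMap can corrupt the table or, on older JDKs, loop forever, which
  // would match threads showing up as "busy" in a jstack.
  static Object addAclFeatureUnsafe(String key, Object feature) {
    Object existing = UNIQUE_ACL_FEATURES.get(key);   // read and write race
    if (existing == null) {
      UNIQUE_ACL_FEATURES.put(key, feature);
      existing = feature;
    }
    return existing;
  }

  // One possible mitigation (an assumption, not necessarily the project's fix):
  // a ConcurrentHashMap with an atomic insert-if-absent.
  private static final ConcurrentHashMap<String, Object> SAFE_FEATURES =
      new ConcurrentHashMap<>();

  static Object addAclFeatureSafe(String key, Object feature) {
    return SAFE_FEATURES.computeIfAbsent(key, k -> feature);
  }
}
{code}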

In my test, loading a 22G fsimage with 200M inodes did not finish after several 
hours (without the patch, it finishes in about 30 minutes), and jstack shows the 
threads busy in UNIQUE_ACL_FEATURES. I'm not sure whether the cache is corrupted: 
the image is huge and needs about 100G of memory to profile, so it's hard to open 
the dump, and I'm not 100% sure about this.

Did you hit a similar issue in your testing?

> load fsimage in parallel
> ------------------------
>
>                 Key: HDFS-7784
>                 URL: https://issues.apache.org/jira/browse/HDFS-7784
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Walter Su
>            Assignee: Walter Su
>            Priority: Minor
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-7784.001.patch, test-20150213.pdf
>
>
> When a single Namenode has a huge number of files and federation is not used, the 
> startup/restart speed is slow. The fsimage loading step takes most of the 
> time. fsimage loading can be separated into two parts: deserialization and object 
> construction (mostly map insertion). Deserialization takes most of the CPU 
> time, so we can do deserialization in parallel and add to the hashmap serially. 
> This will significantly reduce the NN start time.
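
For reference, a minimal sketch of the "deserialize in parallel, insert serially" idea described above (this is not the actual HDFS-7784 patch; all class and method names are hypothetical stand-ins):

{code:java}
// Deserialization is fanned out to a thread pool; map insertion stays on the
// calling thread, so the inode map never needs synchronization.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFsImageLoadSketch {

  /** Hypothetical stand-in for a deserialized inode. */
  static final class Inode {
    final long id;
    final String name;
    Inode(long id, String name) { this.id = id; this.name = name; }
  }

  /** Pretend parser: the real loader would decode a protobuf inode record. */
  static Inode parse(byte[] raw) {
    String text = new String(raw);
    int sep = text.indexOf(':');
    return new Inode(Long.parseLong(text.substring(0, sep)), text.substring(sep + 1));
  }

  static Map<Long, Inode> load(List<byte[]> rawInodes, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Parallel part: CPU-heavy deserialization.
      List<Callable<Inode>> tasks = new ArrayList<>();
      for (byte[] raw : rawInodes) {
        tasks.add(() -> parse(raw));
      }
      List<Future<Inode>> results = pool.invokeAll(tasks);

      // Serial part: only this thread touches the map, so a plain HashMap is fine.
      Map<Long, Inode> inodeMap = new HashMap<>();
      for (Future<Inode> f : results) {
        Inode inode = f.get();
        inodeMap.put(inode.id, inode);
      }
      return inodeMap;
    } finally {
      pool.shutdown();
    }
  }
}
{code}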


