[ 
https://issues.apache.org/jira/browse/HDFS-7454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229285#comment-14229285
 ] 

Haohui Mai commented on HDFS-7454:
----------------------------------

Thanks Chris for the clarification. Just took a skim at the patch -- it looks 
like it is deduplicating the {{AclFeature}} instead of {{AclEntry}}. It makes 
sense in Vinay's use case.

As Chris pointed out in HDFS-5620, it requires some complexity to make the 
feature really work. My feeling is that the optimization might be unnecessary 
if we are able to fit an {{AclEntry}} into an {{int}}. On a 64-bit JVM, the 
current implementation requires 48 + 8 = 56 bytes per {{AclEntry}} (the size 
of the {{AclEntry}} + a reference to the object). Fitting it into a 4-byte 
{{int}} would reduce the memory usage of ACLs by a factor of 14.
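
To make the idea concrete, here is a rough sketch of what packing an entry 
into an {{int}} could look like. The bit layout, the class name 
{{PackedAclEntry}}, and the notion of a {{nameId}} that indexes into some 
separate table of user/group names are all assumptions for illustration; none 
of them come from the patch or from the existing {{AclEntry}} class.

```java
// Sketch only. Hypothetical layout (an assumption, not taken from HDFS):
//   bit  0      scope      (0 = ACCESS, 1 = DEFAULT)
//   bits 1-2    type       (0 = USER, 1 = GROUP, 2 = MASK, 3 = OTHER)
//   bits 3-5    permission (rwx bits, 0..7)
//   bits 6-31   nameId     (index into a hypothetical name table)
final class PackedAclEntry {
    private static final int TYPE_SHIFT = 1;
    private static final int PERM_SHIFT = 3;
    private static final int NAME_SHIFT = 6;
    private static final int MAX_NAME_ID = (1 << 26) - 1;

    private PackedAclEntry() {}

    // Packs the four fields into one int, rejecting out-of-range values.
    static int pack(int scope, int type, int perm, int nameId) {
        if (scope < 0 || scope > 1 || type < 0 || type > 3
                || perm < 0 || perm > 7 || nameId < 0 || nameId > MAX_NAME_ID) {
            throw new IllegalArgumentException("field out of range");
        }
        return scope | (type << TYPE_SHIFT) | (perm << PERM_SHIFT)
                | (nameId << NAME_SHIFT);
    }

    static int scope(int entry)      { return entry & 0x1; }
    static int type(int entry)       { return (entry >>> TYPE_SHIFT) & 0x3; }
    static int permission(int entry) { return (entry >>> PERM_SHIFT) & 0x7; }
    // Unsigned shift so that a nameId using the top bit is still recovered.
    static int nameId(int entry)     { return entry >>> NAME_SHIFT; }

    public static void main(String[] args) {
        // A DEFAULT-scope GROUP entry with r-x (5) for the name with id 42.
        int e = pack(1, 1, 5, 42);
        System.out.println(scope(e) + " " + type(e) + " "
                + permission(e) + " " + nameId(e));
    }
}
```

With something like this, an ACL becomes an {{int[]}} rather than a list of 
object references, which is where the 56-to-4-byte saving per entry would come 
from. The cost is that entry names have to be mapped to ids somewhere else.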

Consider a large cluster that has a 192G heap to store 300M files. If all 
files have the default ACLs, which contain 3 entries, the scheme can support 
300M files with default ACLs using 4G of memory. With these numbers, the 
scheme seems a pretty good thing to have before we really think of getting 
into the mud of implementing an interner.
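
For anyone who wants to check the numbers, the back-of-the-envelope estimate 
above can be reproduced with a few lines. This is only a sketch: the class 
name {{AclMemoryEstimate}} is made up, and the calculation counts just the 
per-entry bytes quoted in this comment, ignoring array headers and any other 
per-file overhead.

```java
// Sketch only: reproduces the rough per-entry arithmetic from the comment.
final class AclMemoryEstimate {
    private AclMemoryEstimate() {}

    // Total bytes spent on ACL entries, ignoring array and object headers.
    static long totalBytes(long files, int entriesPerFile, int bytesPerEntry) {
        return files * entriesPerFile * bytesPerEntry;
    }

    public static void main(String[] args) {
        long files = 300_000_000L;   // 300M files
        int entriesPerFile = 3;      // default ACLs with 3 entries each

        long current = totalBytes(files, entriesPerFile, 56); // object + reference
        long packed = totalBytes(files, entriesPerFile, 4);   // one int per entry

        System.out.println("current bytes: " + current);
        System.out.println("packed bytes:  " + packed);
    }
}
```

That works out to about 3.6G for the packed form (rounded up to 4G above) 
against roughly 50G for the current representation under the same 
assumptions.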

> Implement Global ACL Set for memory optimization in NameNode
> ------------------------------------------------------------
>
>                 Key: HDFS-7454
>                 URL: https://issues.apache.org/jira/browse/HDFS-7454
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>         Attachments: HDFS-7454-001.patch
>
>
> HDFS-5620 indicated a GlobalAclSet containing unique {{AclFeature}} can be 
> de-duplicated to save the memory in NameNode. However, it was not implemented 
> at that time.
> This Jira re-proposes the same implementation, along with de-duplication of 
> unique {{AclEntry}} across all ACLs.
> One simple use case is:
> A mapreduce user's home directory with the set of default ACLs, under which 
> a lot of other files/directories could be created when jobs are run. Here all 
> the default ACLs of the parent directory will be duplicated till the explicit 
> delete of those ACLs. With de-duplication, only one object will be in memory 
> for the same Entry across all ACLs of all files/directories.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
