[ https://issues.apache.org/jira/browse/HDFS-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875282#action_12875282 ]
Suresh Srinivas commented on HDFS-1110:
---------------------------------------

Thanks Jakob.

bq. The NameDictionary.lookup(name, value) method seems a bit odd in its usage. Both times it's used via dictionary.lookup(name, name), which makes me wonder if this is the right API. Do we expect NameDictionary to be used elsewhere such that this abstraction is worth the odd API?

I see your point. I decided to use same key and value to reduce object count. I do not want to rule out using NameDictionary for other things and hence it is generic with the possibility of key and value being different. I can add a comment to indicate key and value are the same when doing lookup. Let me know if you think of other alternatives.

> Namenode heap optimization - reuse objects for commonly used file names
> -----------------------------------------------------------------------
>
>                 Key: HDFS-1110
>                 URL: https://issues.apache.org/jira/browse/HDFS-1110
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: 0.22.0
>
>         Attachments: hdfs-1110.2.patch, hdfs-1110.3.patch, hdfs-1110.patch
>
>
> There are a lot of common file names used in HDFS, mainly created by
> mapreduce, such as file names starting with "part". Reusing byte[]
> corresponding to these recurring file names will save significant heap space
> used for storing the file names in millions of INodeFile objects.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
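
For illustration, the dictionary.lookup(name, name) usage discussed in the comment could look roughly like the sketch below. This is not the code from the attached patches: the ByteArray wrapper, the shape of the generic NameDictionary class, its size() method, and the main method are all assumptions made up for this example. It only shows why passing the same object as key and value keeps the object count down.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class NameDictionarySketch {

    /** Hypothetical wrapper that gives a byte[] content-based equals/hashCode. */
    static final class ByteArray {
        private final byte[] bytes;

        ByteArray(byte[] bytes) { this.bytes = bytes; }

        byte[] getBytes() { return bytes; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ByteArray && Arrays.equals(bytes, ((ByteArray) o).bytes);
        }

        @Override
        public int hashCode() { return Arrays.hashCode(bytes); }
    }

    /** Hypothetical generic dictionary: key and value types may differ. */
    static final class NameDictionary<K, V> {
        private final Map<K, V> entries = new HashMap<>();

        /** Returns the stored value for key, storing the supplied value on first use. */
        V lookup(K key, V value) {
            V existing = entries.putIfAbsent(key, value);
            return existing != null ? existing : value;
        }

        int size() { return entries.size(); }
    }

    public static void main(String[] args) {
        NameDictionary<ByteArray, ByteArray> dictionary = new NameDictionary<>();

        // Two separate allocations with the same contents, as two files named alike would produce.
        ByteArray first = new ByteArray("part-00000".getBytes(StandardCharsets.UTF_8));
        ByteArray second = new ByteArray("part-00000".getBytes(StandardCharsets.UTF_8));

        // Key and value are the same object, so each distinct name costs one wrapper, not two.
        ByteArray a = dictionary.lookup(first, first);
        ByteArray b = dictionary.lookup(second, second);

        System.out.println(a == b);            // true: the second lookup reuses the first object
        System.out.println(dictionary.size()); // 1
    }
}
```

In this sketch the caller would keep a.getBytes() and let the second allocation be garbage collected. A dictionary with distinct key and value types would work the same way but would hold one extra object per distinct name.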