[jira] Commented: (HDFS-1110) Namenode heap optimization - reuse objects for commonly used file names

Suresh Srinivas (JIRA) Wed, 12 May 2010 23:15:31 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867064#action_12867064
 ]


Suresh Srinivas commented on HDFS-1110:
---------------------------------------

bq. Maybe you can have only one dictionary where you store the 
count-of-occurrences as well, i.e., move the count from the transient map to the 
dictionary...
I prefer having two different hashmaps for the following reasons:
# If a single hashmap is used, it grows to the size of all the names, not just 
the names stored in the dictionary. Either we have to shrink the map after 
purging entries (by making another copy), or we use more heap than necessary 
to retain a map much larger than it should be.
# In the dictionary, I do not intend to track the number of times a name is 
used. This information is unnecessary. All we care about is whether a name is 
used more than a certain threshold, not the exact number of times it is used.
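
To illustrate the two-map argument above, here is a minimal sketch. It is not the code from the attached patches: the class name NameDictionarySketch, its methods, and the use of String keys are all assumptions made for illustration. A transient map counts every name seen; the dictionary is then built with only the names used more than the threshold, and the counting map is dropped so its large table can be garbage collected.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the two-map approach; not the HDFS-1110 patch. */
final class NameDictionarySketch {
    private final int threshold;

    // Transient map: sees every name, so it can grow very large.
    private Map<String, Integer> counts = new HashMap<>();

    // Dictionary: holds only frequently used names, with no counts.
    private Map<String, byte[]> dictionary = new HashMap<>();

    NameDictionarySketch(int threshold) {
        this.threshold = threshold;
    }

    /** Record one use of a name in the transient counting map. */
    void count(String name) {
        counts.merge(name, 1, Integer::sum);
    }

    /** Keep names used more than the threshold, then drop the counts. */
    void buildDictionary() {
        Map<String, byte[]> dict = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > threshold) {
                dict.put(e.getKey(), e.getKey().getBytes(StandardCharsets.UTF_8));
            }
        }
        dictionary = dict;
        // Replace the large transient map so it becomes garbage.
        counts = new HashMap<>();
    }

    /** Return the shared byte[] for a common name, else a fresh copy. */
    byte[] bytesFor(String name) {
        byte[] shared = dictionary.get(name);
        return shared != null ? shared : name.getBytes(StandardCharsets.UTF_8);
    }

    int dictionarySize() {
        return dictionary.size();
    }
}
```

With a single map, the table backing the dictionary would stay sized for every name ever counted, since java.util.HashMap does not shrink its table when entries are removed. Building a separate dictionary sizes it for the retained names only.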


> Namenode heap optimization - reuse objects for commonly used file names
> -----------------------------------------------------------------------
>
>                 Key: HDFS-1110
>                 URL: https://issues.apache.org/jira/browse/HDFS-1110
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: 0.22.0
>
>         Attachments: hdfs-1110.2.patch, hdfs-1110.patch
>
>
> There are a lot of common file names used in HDFS, mainly created by 
> mapreduce, such as file names starting with "part". Reusing byte[] 
> corresponding to these recurring file names will save significant heap space 
> used for storing the file names in millions of INodeFile objects.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
