[ 
https://issues.apache.org/jira/browse/HADOOP-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467150
 ] 

Raghu Angadi commented on HADOOP-803:
-------------------------------------

A few more thoughts (these are not intended to be included in the patch for
this issue):

A big per-file consumer of memory is INode.name, which stores the full path.
We can save a hundred or more bytes per file if we store only the file name.
The full path name can always be constructed from the parent. --- (b)
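
To illustrate (b), here is a minimal sketch of a node that keeps only its
local name and rebuilds the full path on demand. PathNode and fullPath() are
hypothetical names chosen for this example; they are not the actual INode
API.

```java
// Hypothetical sketch: PathNode stands in for INode; it is not Hadoop code.
final class PathNode {
    final String name;      // local name only, e.g. "data.txt"
    final PathNode parent;  // null for the root directory

    PathNode(PathNode parent, String name) {
        this.parent = parent;
        this.name = name;
    }

    /** Rebuilds the absolute path by walking parent links up to the root. */
    String fullPath() {
        if (parent == null) {
            return "/";
        }
        StringBuilder sb = new StringBuilder();
        appendPath(sb);
        return sb.toString();
    }

    // Appends "/name" for every non-root ancestor, outermost first.
    private void appendPath(StringBuilder sb) {
        if (parent == null) {
            return; // the root contributes no component of its own
        }
        parent.appendPath(sb);
        sb.append('/').append(name);
    }
}
```

The cost is a walk of depth-many parent links whenever the full path is
needed, in exchange for not storing the repeated directory prefix in every
node.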

Each directory has 'activeBlocks', which is a HashMap from block to INode. We
already have a global blockMap (block to containingNodes). This also implies
that every call to getBlock(File) recurses from the root to the node, and
each step involves a TreeMap lookup in the children map. I think we should
have just one Map: block to { INode, self-ref, containingNodes ... }. This
will save a HashMap entry (30+ bytes) and a block object (20-30 bytes) for
each block. It also makes getFile() many times faster. This will also let us
use an ArrayList instead of a TreeMap for INode.children (30-40 bytes per
file). --- (c)
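
A rough sketch of the single map proposed in (c), under the assumption that
one entry per block can carry the owning file, the canonical block object
(the "self-ref") and the containing nodes. BlockKey, BlockInfo and BlockIndex
are hypothetical names for this example, and plain strings stand in for
INode and datanode objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Block, identified by its id only.
final class BlockKey {
    final long blockId;

    BlockKey(long blockId) {
        this.blockId = blockId;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BlockKey && ((BlockKey) o).blockId == blockId;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(blockId);
    }
}

// Everything known about one block, held in a single map entry.
final class BlockInfo {
    final BlockKey block;   // self-ref: the one canonical block instance
    final String file;      // stand-in for the owning INode
    final List<String> containingNodes = new ArrayList<>();

    BlockInfo(BlockKey block, String file) {
        this.block = block;
        this.file = file;
    }
}

// One global map replaces the per-directory activeBlocks maps.
final class BlockIndex {
    private final Map<BlockKey, BlockInfo> map = new HashMap<>();

    BlockInfo add(BlockKey block, String file) {
        BlockInfo info = map.get(block);
        if (info == null) {
            info = new BlockInfo(block, file);
            map.put(block, info);
        }
        return info;
    }

    void addReplica(BlockKey block, String datanode) {
        BlockInfo info = map.get(block);
        if (info != null) {
            info.containingNodes.add(datanode);
        }
    }

    /** Single hash lookup; no walk from the root of the directory tree. */
    String getFile(BlockKey block) {
        BlockInfo info = map.get(block);
        return info == null ? null : info.file;
    }

    BlockInfo get(BlockKey block) {
        return map.get(block);
    }
}
```

With this layout a block-to-file lookup no longer touches the directory
tree, which is what would allow INode.children to be a plain list.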

> Reducing memory consumption on Namenode : Part 1
> ------------------------------------------------
>
>                 Key: HADOOP-803
>                 URL: https://issues.apache.org/jira/browse/HADOOP-803
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Raghu Angadi
>         Assigned To: Raghu Angadi
>             Fix For: 0.11.0
>
>         Attachments: block-refs-2.patch, block-refs-3.patch, 
> block-refs-5.patch, HADOOP-803-2.patch, HADOOP-803.patch
>
>
> There appear to be some places in the Namenode that allow reducing memory 
> consumption without intrusive code or feature changes. This bug is an initial 
> attempt at making those changes. Please include your thoughts as well. 
> One change I am planning to make: 
> Currently one copy of each block exists for each of the replicas, plus one 
> copy for blockMap. I think they are all supposed to be the same.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
