[ 
http://issues.apache.org/jira/browse/HADOOP-764?page=comments#action_12456276 ] 
            
Raghu Angadi commented on HADOOP-764:
-------------------------------------


As a related note: since we do a blocks.clear() in DatanodeDescriptor when we 
update blocks from a block report, we end up with one separate copy of a block 
for each node that holds a replica, plus one copy in the namenode's blockMap. 
Ideally, the blocks in DatanodeDescriptor should be references to the blocks in 
the global blockMap.

This patch decreases the number of times blocks.clear() is invoked, but over 
time there will still be separate copies of blocks. The fix is not to call 
blocks.clear() at all, but to update the block list inline as new blocks are 
added or removed inside processReport().
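The difference can be sketched as below. This is a hypothetical, simplified model (the class names mirror the discussion, but the fields, signatures, and the getOrAdd helper are illustrative, not the actual Hadoop source): clearing and re-adding allocates a fresh Block per replica, while reconciling inline keeps each datanode's list pointing at the single shared instance in the global blockMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative stand-in for a block: identity matters here, not contents.
class Block {
    final long blockId;
    Block(long blockId) { this.blockId = blockId; }
}

// Illustrative stand-in for the namenode's global blockMap.
class BlockMap {
    private final Map<Long, Block> map = new HashMap<>();

    // Return the one shared instance for this id, creating it on first sight.
    Block getOrAdd(long blockId) {
        return map.computeIfAbsent(blockId, Block::new);
    }
}

class DatanodeDescriptor {
    final List<Block> blocks = new ArrayList<>();

    // Wasteful pattern: clear and re-add fresh Block objects, so every
    // replica list holds its own copy, distinct from the global map's.
    void processReportWithClear(long[] report) {
        blocks.clear();
        for (long id : report) blocks.add(new Block(id));
    }

    // Inline pattern: no blocks.clear(); drop only the blocks that are no
    // longer reported and add only the new ones, as references into the
    // shared global map.
    void processReportInline(long[] report, BlockMap global) {
        Set<Long> reported = new HashSet<>();
        for (long id : report) reported.add(id);
        blocks.removeIf(b -> !reported.contains(b.blockId));

        Set<Long> have = new HashSet<>();
        for (Block b : blocks) have.add(b.blockId);
        for (long id : report) {
            if (!have.contains(id)) blocks.add(global.getOrAdd(id));
        }
    }
}
```

With the inline variant, a block held by a descriptor is reference-identical to the entry in the global map, so N replicas cost one Block object instead of N+1.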
 


> The memory consumption of processReport() in the namenode can be reduced
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-764
>                 URL: http://issues.apache.org/jira/browse/HADOOP-764
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
>         Attachments: processBlockReport3.patch
>
>
> The FSNamesystem.processReport() method converts the blocklist for a datanode 
> into an array by calling node.getBlocks(). Although this memory allocation is 
> transient, it could possibly require the garbage-collector to work that much 
> harder. 
> The method Block.getBlocks() should be deprecated. Code that currently uses 
> this method should instead iterate over the Collection.
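The replacement suggested in the issue description can be sketched as follows. This is a simplified, hypothetical example (the method names and the use of block sizes are illustrative, not the actual dfs code): iterating the Collection directly avoids the transient array that toArray() allocates on every block report.

```java
import java.util.Collection;

class BlockIterationSketch {
    // Old pattern: materialize the block list into a throwaway array,
    // which the garbage collector must then reclaim after each report.
    static long totalViaArray(Collection<Long> blockSizes) {
        Long[] copy = blockSizes.toArray(new Long[0]);
        long total = 0;
        for (Long s : copy) total += s;
        return total;
    }

    // New pattern: the enhanced for loop walks the Collection in place,
    // with no allocation proportional to the number of blocks.
    static long totalViaIterator(Collection<Long> blockSizes) {
        long total = 0;
        for (long s : blockSizes) total += s;
        return total;
    }
}
```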

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
