[ https://issues.apache.org/jira/browse/HADOOP-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539751 ]

Raghu Angadi commented on HADOOP-2148:
--------------------------------------

> Another observation is that data-nodes do not need the blockMap at all. File 
> names can be derived from the block IDs,
> there is no need to hold Block to File mapping in memory.

The blockMap holds the full path, like "dir/subdir2/subdir63/ ....", which cannot be derived from the block ID alone. Also, the existence of a Block to File mapping indicates that the block is still valid.

Currently getBlockFile() is expected to throw an IOException when there is no mapping. Yes, removing the double lookup would be good.
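A minimal sketch of the double-lookup pattern being discussed and the single-lookup fix. This is not the actual FSDataset code; the Block class, the blockMap field, and the method names are simplified stand-ins assumed from the discussion above.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the blockMap lookup pattern; not Hadoop source.
public class BlockFileLookup {
    static class Block {
        final long id;
        Block(long id) { this.id = id; }
        @Override public int hashCode() { return Long.valueOf(id).hashCode(); }
        @Override public boolean equals(Object o) {
            return o instanceof Block && ((Block) o).id == id;
        }
    }

    private final Map<Block, File> blockMap = new HashMap<Block, File>();

    void addBlock(Block b, File f) { blockMap.put(b, f); }

    // Double-lookup version: the validity check and the retrieval
    // each perform a separate blockMap lookup.
    File getBlockFileTwoLookups(Block b) throws IOException {
        if (!blockMap.containsKey(b)) {          // lookup #1
            throw new IOException("Block not valid: " + b.id);
        }
        return blockMap.get(b);                  // lookup #2
    }

    // Single-lookup version: one get() serves as both the validity
    // check (null means no mapping, hence invalid) and the retrieval.
    File getBlockFile(Block b) throws IOException {
        File f = blockMap.get(b);                // the only lookup
        if (f == null) {
            throw new IOException("Block not valid: " + b.id);
        }
        return f;
    }
}
```

The single-lookup version preserves the existing contract (IOException when there is no mapping) while halving the number of hash lookups, which matters when the blockMap is large.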


> Inefficient FSDataset.getBlockFile()
> ------------------------------------
>
>                 Key: HADOOP-2148
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2148
>             Project: Hadoop
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Konstantin Shvachko
>             Fix For: 0.16.0
>
>
> FSDataset.getBlockFile() first verifies that the block is valid and then 
> returns the file name corresponding to the block.
> Doing that it performs the data-node blockMap lookup twice. Only one lookup 
> is needed here. 
> This is important since the data-node blockMap is big.
> Another observation is that data-nodes do not need the blockMap at all. File 
> names can be derived from the block IDs,
> there is no need to hold Block to File mapping in memory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
