[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list

Amir Langer (JIRA) Tue, 15 Jul 2014 11:54:19 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062498#comment-14062498
 ]


Amir Langer commented on HDFS-6658:
-----------------------------------

We explored the idea of using off-heap memory for the Namenode.
It makes sense for inodes, and some work on that has already been done at 
Hortonworks.
For blocks, however, there is a problem: block data has two very different 
access patterns.
Clients typically access a few blocks (from the same or similar files), 
mostly recent ones, while block reports can scan the entire block space.
This means there is no locality of reference, so caching is not going to work.
Without caching, we have to cope with the added latency of off-heap 
memory, which is, after all, backed by a file.
From our measurements, this cost seems too high: some block reports never 
seem to finish. (Just think of the cost of the off-heap memory management 
repeatedly loading pages from the file into memory while its page caching 
has no effect.)
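
To illustrate the caching argument, here is a toy sketch only, not our 
measurement setup. It replays the two access patterns against a small LRU 
cache. All names and sizes (BlockCacheSketch, 10000 blocks, a cache of 1000 
entries, 100 hot blocks) are made up for the example. In this setup the 
client-like pattern hits the cache almost every time, while repeated full 
scans larger than the cache never hit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model (assumption): block IDs are longs and the cache is a plain LRU.
final class BlockCacheSketch {

    /** Minimal LRU cache of block IDs built on an access-ordered LinkedHashMap. */
    public static final class LruCache extends LinkedHashMap<Long, Boolean> {
        private final int capacity;

        public LruCache(int capacity) {
            super(16, 0.75f, true); // true = order entries by access, not insertion
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Long, Boolean> eldest) {
            return size() > capacity; // evict the least recently used entry
        }
    }

    /** Replays the accesses and returns the fraction served from the cache. */
    public static double hitRate(LruCache cache, long[] accesses) {
        if (accesses.length == 0) {
            return 0.0;
        }
        int hits = 0;
        for (long id : accesses) {
            if (cache.get(id) != null) {
                hits++;
            } else {
                cache.put(id, Boolean.TRUE); // miss: load the block into the cache
            }
        }
        return (double) hits / accesses.length;
    }

    /** Client-like pattern: cycles over the hotBlocks most recent block IDs. */
    public static long[] recentAccesses(int totalBlocks, int hotBlocks, int count) {
        long[] accesses = new long[count];
        for (int i = 0; i < count; i++) {
            accesses[i] = totalBlocks - 1 - (i % hotBlocks);
        }
        return accesses;
    }

    /** Block-report-like pattern: sequential passes over the whole block space. */
    public static long[] fullScans(int totalBlocks, int passes) {
        long[] accesses = new long[totalBlocks * passes];
        for (int i = 0; i < accesses.length; i++) {
            accesses[i] = i % totalBlocks;
        }
        return accesses;
    }

    public static void main(String[] args) {
        double clients = hitRate(new LruCache(1000), recentAccesses(10000, 100, 10000));
        double reports = hitRate(new LruCache(1000), fullScans(10000, 2));
        System.out.println("client-like hit rate: " + clients);
        System.out.println("scan hit rate:        " + reports);
    }
}
```

A real off-heap store would differ in many ways (page granularity, 
concurrent clients and reports sharing one cache), so treat this purely as 
an illustration of why a scan over the whole block space defeats an LRU 
cache.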



> Namenode memory optimization - Block replicas list 
> ---------------------------------------------------
>
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>         Attachments: Namenode Memory Optimizations - Block replicas list.docx
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a 
> linked list of block references for every DatanodeStorageInfo (called 
> "triplets"). 
> We propose to change the way we store the list in memory. 
> Using primitive integer indexes instead of object references will reduce the 
> memory needed for every block replica (when compressed oops is disabled) and 
> in our new design the list overhead will be per DatanodeStorageInfo and not 
> per block replica.
> See the attached design doc for details and evaluation results.
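
As a rough illustration of the general idea of linking replicas with 
primitive int indexes instead of object references, here is a minimal 
sketch. It is NOT the design from the attached document: the class name 
IntLinkedReplicaLists, the slot layout and the single `next` array are all 
assumptions made for this example. The point it shows is that on a 64-bit 
JVM with compressed oops disabled a reference costs 8 bytes while an int 
costs 4, and that the only per-storage state is one head index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: per-storage singly linked lists stored in int arrays.
final class IntLinkedReplicaLists {
    private static final int NIL = -1;

    private int[] blockIndex; // replica slot -> index of the block it belongs to
    private int[] next;       // replica slot -> next slot on the same storage, or NIL
    private final int[] head; // storage index -> first replica slot, or NIL
    private int size;         // number of replica slots in use

    public IntLinkedReplicaLists(int storages, int initialCapacity) {
        head = new int[storages];
        Arrays.fill(head, NIL);
        blockIndex = new int[Math.max(1, initialCapacity)];
        next = new int[blockIndex.length];
    }

    /** Links a replica of the given block at the head of the storage's list. */
    public int addReplica(int storage, int blockIdx) {
        if (size == blockIndex.length) {
            // Grow both arrays together so slots stay aligned.
            blockIndex = Arrays.copyOf(blockIndex, size * 2);
            next = Arrays.copyOf(next, size * 2);
        }
        int slot = size++;
        blockIndex[slot] = blockIdx;
        next[slot] = head[storage];
        head[storage] = slot;
        return slot;
    }

    /** Walks the storage's list, most recently added replica first. */
    public List<Integer> blocksOn(int storage) {
        List<Integer> blocks = new ArrayList<>();
        for (int slot = head[storage]; slot != NIL; slot = next[slot]) {
            blocks.add(blockIndex[slot]);
        }
        return blocks;
    }
}
```

For example, after addReplica(0, 7), addReplica(0, 9) and addReplica(1, 7), 
blocksOn(0) walks blocks 9 then 7, and blocksOn(1) walks block 7 only. 
Removal and the reverse block-to-storages lookup are left out of this sketch.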



--
This message was sent by Atlassian JIRA
(v6.2#6252)
