[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list

Colin Patrick McCabe (JIRA) Mon, 14 Jul 2014 15:42:19 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061374#comment-14061374
 ]


Colin Patrick McCabe commented on HDFS-6658:
--------------------------------------------

bq. As opposed to block-management-as-a-service (HDFS-5477), this optimization 
is very scoped (data structure modification), and introduces minimal risk. The 
saving is about 20% of block management footprint, or about 10% of the total NN 
footprint.

I was not proposing block management as a service.  I was just proposing the 
use of off-heap memory, that is, memory that is not managed by the JVM and is 
not subject to garbage collection.  That is why I mentioned {{allocDirect}}, 
{{Unsafe.getInt}}, and {{Unsafe.getLong}}.  There is more information about 
off-heap memory here: 
http://stackoverflow.com/questions/6091615/difference-between-on-heap-and-off-heap
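
To make the off-heap idea concrete, here is a minimal sketch using the 
standard JDK call {{ByteBuffer.allocateDirect}}.  The class name 
{{OffHeapLongTable}} and its methods are hypothetical and exist only for this 
illustration; nothing here is taken from the NameNode code or from any patch 
on this issue.  It uses {{ByteBuffer}} rather than {{Unsafe}} because 
{{sun.misc.Unsafe}} is an internal API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration: a fixed-size table of long values whose backing
// storage lives outside the Java heap.
final class OffHeapLongTable {
    private final ByteBuffer buf;

    OffHeapLongTable(int capacity) {
        // allocateDirect reserves zero-filled memory outside the Java heap.
        // Only the small ByteBuffer wrapper object is a heap object; the
        // backing bytes are not scanned or moved by the garbage collector.
        int bytes = Math.multiplyExact(capacity, Long.BYTES);
        this.buf = ByteBuffer.allocateDirect(bytes)
                             .order(ByteOrder.nativeOrder());
    }

    // Absolute put: writes 8 bytes at the slot's byte offset.
    void set(int index, long value) {
        buf.putLong(index * Long.BYTES, value);
    }

    // Absolute get: reads 8 bytes at the slot's byte offset.
    long get(int index) {
        return buf.getLong(index * Long.BYTES);
    }

    boolean isOffHeap() {
        return buf.isDirect();
    }

    public static void main(String[] args) {
        OffHeapLongTable table = new OffHeapLongTable(1024);
        table.set(7, 1073741825L);
        System.out.println(table.get(7));
        System.out.println(table.isOffHeap());
    }
}
```

The values stay in RAM the whole time; no swap device or SSD is involved.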

bq. The design in HDFS-5477 details why off-heap swap space management is not 
an option in high-end settings (terabytes of metadata). If the off-heap memory 
is managed on SSD, this is still two orders of magnitude slower than DDR3. In 
this setting, block reports in large clusters cannot be sustained because they 
have no locality of reference.

I'm not proposing swap space management.  SSDs have nothing to do with what I'm 
proposing, which is just using memory not managed by the JVM.

> Namenode memory optimization - Block replicas list 
> ---------------------------------------------------
>
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>         Attachments: Namenode Memory Optimizations - Block replicas list.docx
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a 
> linked list of block references for every DatanodeStorageInfo (called 
> "triplets"). 
> We propose to change the way we store the list in memory. 
> Using primitive integer indexes instead of object references will reduce the 
> memory needed for every block replica (when compressed oops are disabled). In 
> our new design, the list overhead will be per DatanodeStorageInfo rather than 
> per block replica.
> See the attached design doc for details and evaluation results.
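
The idea in the issue description of linking list entries by primitive integer 
index rather than by object reference can be sketched roughly as follows.  
This is an assumption-laden illustration, not the layout from the attached 
design doc: the class {{IntIndexList}}, the slot numbering, and the singly 
linked form are all hypothetical.  On a 64-bit JVM with compressed oops 
disabled, each object reference takes 8 bytes, while an {{int}} link takes 4.

```java
import java.util.Arrays;

// Hypothetical sketch: a singly linked list whose links are int indexes into
// one shared array instead of object references.
final class IntIndexList {
    private static final int NIL = -1;
    private final int[] next;   // next[i] = slot that follows slot i, or NIL
    private int head = NIL;

    IntIndexList(int slots) {
        next = new int[slots];
        Arrays.fill(next, NIL);
    }

    // Links a slot at the front. The caller must ensure the slot is not
    // already in the list.
    void addFirst(int slot) {
        next[slot] = head;
        head = slot;
    }

    // Unlinks a slot with a linear scan; returns false if it was not linked.
    boolean remove(int slot) {
        int prev = NIL;
        for (int cur = head; cur != NIL; prev = cur, cur = next[cur]) {
            if (cur == slot) {
                if (prev == NIL) {
                    head = next[cur];
                } else {
                    next[prev] = next[cur];
                }
                next[cur] = NIL;
                return true;
            }
        }
        return false;
    }

    // Returns the linked slots in list order.
    int[] toArray() {
        int n = 0;
        for (int cur = head; cur != NIL; cur = next[cur]) {
            n++;
        }
        int[] out = new int[n];
        int i = 0;
        for (int cur = head; cur != NIL; cur = next[cur]) {
            out[i++] = cur;
        }
        return out;
    }

    public static void main(String[] args) {
        IntIndexList list = new IntIndexList(8);
        list.addFirst(2);
        list.addFirst(5);
        list.addFirst(7);
        list.remove(5);
        System.out.println(Arrays.toString(list.toArray()));
    }
}
```

Here the per-entry cost of the list is one {{int}} in a shared array, and the 
only per-list state is the {{head}} index.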



--
This message was sent by Atlassian JIRA
(v6.2#6252)
