[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061374#comment-14061374 ]
Colin Patrick McCabe commented on HDFS-6658:
--------------------------------------------

bq. As opposed to block-management-as-a-service (HDFS-5477), this optimization is very scoped (data structure modification), and introduces minimal risk. The saving is about 20% of block management footprint, or about 10% of the total NN footprint.

I was not proposing block management as a service. I was just proposing using off-heap memory. Memory that is not managed by the JVM and is not subject to garbage collection. That is why I mentioned {{allocDirect}}, {{Unsafe.getInt}}, {{Unsafe.getLong}}. There is more information about off-heap memory here: http://stackoverflow.com/questions/6091615/difference-between-on-heap-and-off-heap

bq. The design in HDFS-5477 details why off-heap swap space management is not an option in high-end settings (terabytes of metadata). If the off-heap memory is managed on SSD, this is still two orders of magnitude slower than DDR3. In this setting, block reports in large clusters cannot be sustained because they have no locality of reference.

I'm not proposing swap space management. SSDs have nothing to do with what I'm proposing, which is just using memory not managed by the JVM.

> Namenode memory optimization - Block replicas list
> ---------------------------------------------------
>
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>         Attachments: Namenode Memory Optimizations - Block replicas list.docx
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a
> linked list of block references for every DatanodeStorageInfo (called
> "triplets").
> We propose to change the way we store the list in memory.
> Using primitive integer indexes instead of object references will reduce the
> memory needed for every block replica (when compressed oops is disabled) and
> in our new design the list overhead will be per DatanodeStorageInfo and not
> per block replica.
> see attached design doc. for details and evaluation results.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
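
For readers unfamiliar with the two ideas discussed in this thread, here is a minimal Java sketch that combines them: replica lists linked by primitive int indexes rather than object references, with the link array held in memory outside the garbage-collected heap. It is only an illustration under stated assumptions. It is not the design from the attached document and not HDFS code. The class {{OffHeapReplicaLists}} and its methods are hypothetical names, and it assumes that {{allocDirect}} in the comment refers to the standard {{ByteBuffer.allocateDirect}} call.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch (not HDFS code): one singly linked list of replica
 * slots per storage, linked by int indexes instead of object references.
 */
final class OffHeapReplicaLists {
    private static final int NIL = -1;   // marks the end of a list

    // next.get(slot) is the slot that follows 'slot' in its list.
    // The backing memory is a direct buffer, so it lives outside the Java heap.
    private final IntBuffer next;

    // head[storage] is the first slot of that storage's list. This is the
    // only per-list overhead: one int per storage, not per replica.
    private final int[] head;

    private int used = 0;                // number of slots handed out so far

    OffHeapReplicaLists(int maxReplicas, int storages) {
        next = ByteBuffer.allocateDirect(maxReplicas * Integer.BYTES)
                         .order(ByteOrder.nativeOrder())
                         .asIntBuffer();
        head = new int[storages];
        Arrays.fill(head, NIL);
    }

    /** Takes a fresh slot for a replica and pushes it onto the storage's list. */
    int addReplica(int storage) {
        if (used == next.capacity()) {
            throw new IllegalStateException("no free replica slots");
        }
        int slot = used++;
        next.put(slot, head[storage]);   // new slot points at the old head
        head[storage] = slot;
        return slot;
    }

    /** Walks the storage's list, most recently added slot first. */
    List<Integer> replicasOf(int storage) {
        List<Integer> slots = new ArrayList<>();
        for (int s = head[storage]; s != NIL; s = next.get(s)) {
            slots.add(s);
        }
        return slots;
    }
}
```

Each link costs four bytes regardless of whether compressed oops is enabled, and the garbage collector never has to trace the link array. The sketch omits removal, slot reuse, and any mapping from slots to actual block objects, all of which a real implementation would need.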