[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064095#comment-14064095 ]

Nathan Roberts commented on HDFS-6658:
--------------------------------------

{quote}
I guess my argument is that (in the short or medium term) we don't actually 
need to reduce the amount of RAM the NameNode uses. I've seen machines with 300 
GB of RAM, and sizes continue to increase at a steady clip every year. We do 
need to reduce the amount of Java heap that the NameNode uses, since otherwise 
we get 10 minute long GC pauses.
{quote}
This is a pretty sizable improvement though so it seems well worth considering. 
* One thing I'm concerned about is the steady growth in the NN's RAM 
requirements. For example, moving from 0.23 releases to 2.x releases requires 
about 9% more RAM (I'm assuming it's something similar when going from 1.x to 
2.x). This is a pretty big deal and can cause some folks' upgrades to fail if 
they were living close to the edge. In my opinion we need to be very careful 
whenever we increase the RAM requirements of the NN. For every increase there 
should be a corresponding optimization so the net increase stays as close to 0 
as possible. Otherwise, some upgrades will certainly fail. 
* I'm not totally convinced by the long-GC argument. It's true that a 
worst-case full GC will be much longer. However, isn't it also the case that we 
should almost never be doing worst-case full GCs? On a large and busy NN, we 
see a GC longer than 2 seconds maybe once every couple of days. Usually the 
big outliers are the result of a very large application doing something bad - 
in which case even if you solve the GC problem, something else is liable to 
make the NN unresponsive. 


> Namenode memory optimization - Block replicas list 
> ---------------------------------------------------
>
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>         Attachments: Namenode Memory Optimizations - Block replicas list.docx
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a 
> linked list of block references for every DatanodeStorageInfo (called 
> "triplets"). 
> We propose to change the way we store the list in memory. 
> Using primitive integer indexes instead of object references will reduce the 
> memory needed for every block replica (when compressed oops is disabled) and 
> in our new design the list overhead will be per DatanodeStorageInfo and not 
> per block replica.
> See the attached design doc for details and evaluation results.
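
The change described above can be sketched roughly as follows. This is a 
hypothetical illustration under my own class and method names, not the actual 
HDFS-6658 patch or the real BlockInfo/DatanodeStorageInfo classes: instead of 
every block replica carrying prev/next object references ("triplets") per 
storage, each storage keeps one growable primitive int array of block indexes, 
so the list overhead is paid once per storage rather than once per replica.

```java
import java.util.Arrays;

// Hypothetical per-storage replica list using primitive int indexes.
// One array per storage replaces per-replica linked-list node references,
// so memory overhead is per DatanodeStorageInfo, not per block replica.
class StorageReplicaList {
    private int[] blockIndexes = new int[4]; // primitive indexes, no node objects
    private int size = 0;

    // Append a block index; grow the backing array geometrically
    // so insertion cost stays amortized O(1).
    void add(int blockIndex) {
        if (size == blockIndexes.length) {
            blockIndexes = Arrays.copyOf(blockIndexes, size * 2);
        }
        blockIndexes[size++] = blockIndex;
    }

    // Remove a block index by swapping the last element into its slot;
    // replica order within a storage does not matter, so this is O(1)
    // once the element is found.
    boolean remove(int blockIndex) {
        for (int i = 0; i < size; i++) {
            if (blockIndexes[i] == blockIndex) {
                blockIndexes[i] = blockIndexes[--size];
                return true;
            }
        }
        return false;
    }

    int size() {
        return size;
    }
}
```

A plain int costs 4 bytes regardless of JVM mode, whereas an object reference 
costs 8 bytes when compressed oops is disabled, which is where the quoted 
per-replica saving comes from.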



--
This message was sent by Atlassian JIRA
(v6.2#6252)