[ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060023#comment-14060023
 ] 

Amir Langer commented on HDFS-6658:
-----------------------------------

Hi [~kihwal] - In response to the scenario of massive block deletes with no 
block adds following them, which leaves a lot of empty array references:
Yes - you're right, and there is currently nothing in the code that takes care 
of it.
We could introduce a check that removes a whole chunk once it becomes empty, 
or copies some references around in order to use less memory (an algorithm 
similar to defragmentation).
However, this will either add a lot of latency (if done as part of a client 
call), or will require a monitor thread, which would force us to make 
everything thread-safe and so add latency back to all calls. In short, the 
cost of any solution is high.
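
To make the tradeoff concrete, here is a rough sketch of what such an 
empty-chunk sweep could look like (all names here are hypothetical - it 
assumes chunked int[] storage per DatanodeStorageInfo, as in the design doc):

{code:java}
// Hypothetical sketch, not the actual patch - names and layout are assumed.
class ReplicaChunkList {
  private static final int EMPTY = -1; // marks a slot freed by a block delete

  // Chunked storage of primitive block indexes for one DatanodeStorageInfo.
  private int[][] chunks = new int[16][];

  // Null out any chunk whose slots are all EMPTY so the GC can reclaim it.
  // Chunk positions stay stable, so stored (chunk, slot) indexes remain valid.
  void compact() {
    for (int c = 0; c < chunks.length; c++) {
      if (chunks[c] != null && isAllEmpty(chunks[c])) {
        chunks[c] = null;
      }
    }
  }

  private static boolean isAllEmpty(int[] chunk) {
    for (int slot : chunk) {
      if (slot != EMPTY) {
        return false;
      }
    }
    return true;
  }
}
{code}

Even this simple sweep touches every chunk of every storage, so running it 
inline in a client call adds latency, while running it from a monitor thread 
requires synchronizing all other access to the chunks.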

The reason I was reluctant to pay that cost is the scenario in which it 
happens: once we have deleted a lot of blocks, we shouldn't really have a big 
memory shortage (even if the sparse arrays remain, we cleared all those block 
instances, which is far more memory).
We're actually fine as long as we don't need to add blocks (i.e. there isn't 
much benefit in reclaiming the slots).
And once we do need to add blocks, the problem of sparse arrays goes away 
anyway, because new entries fill the empty slots.
In short, yes, the issue is there - but I believe the cost does not justify 
the benefit.
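
For a rough sense of scale (illustrative numbers only, assuming a 64-bit JVM 
with compressed oops disabled, i.e. 8 bytes per reference): the triplets 
scheme spends 3 references, i.e. 24 bytes, per replica inside each BlockInfo, 
on top of the BlockInfo object itself, and all of that is freed by the delete. 
The new scheme leaves behind only a 4-byte int slot per replica, so even a 
fully sparse array retains a small fraction of the memory the delete just 
released.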


> Namenode memory optimization - Block replicas list 
> ---------------------------------------------------
>
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>         Attachments: Namenode Memory Optimizations - Block replicas list.docx
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a 
> linked list of block references for every DatanodeStorageInfo (called 
> "triplets"). 
> We propose to change the way we store the list in memory. 
> Using primitive integer indexes instead of object references will reduce the 
> memory needed for every block replica (when compressed oops is disabled), 
> and in our new design the list overhead will be per DatanodeStorageInfo 
> rather than per block replica.
> See the attached design doc for details and evaluation results.
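
To make the quoted proposal concrete, here is a minimal sketch of the idea 
(illustrative only - the names are hypothetical and not taken from the patch):

{code:java}
// Instead of each BlockInfo carrying three object references per replica
// (the "triplets" linked list), each DatanodeStorageInfo keeps a growable
// array of primitive int indexes into a global block table, so the list
// overhead is paid once per storage rather than once per replica.
class StorageReplicaList {
  private int[] blockIndexes = new int[64]; // indexes into the block table
  private int size;

  void add(int blockIndex) {
    if (size == blockIndexes.length) {
      blockIndexes = java.util.Arrays.copyOf(blockIndexes, size * 2);
    }
    blockIndexes[size++] = blockIndex;
  }
}
{code}

A growable primitive array keeps the per-replica cost at one int and moves 
the list bookkeeping (size, capacity) to the storage level.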



--
This message was sent by Atlassian JIRA
(v6.2#6252)
