[ 
https://issues.apache.org/jira/browse/HDFS-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121339#comment-14121339
 ] 

Amir Langer commented on HDFS-6659:
-----------------------------------

Hi [~vinayrpet]

1. It will be used by the patch I'm preparing now for the third subtask 
(HDFS-6661).
2. Yes, you are right. As [~nroberts] pointed out earlier on the umbrella 
JIRA (HDFS-6658), the case where many blocks are deleted and then no new 
blocks arrive for a long duration will leave gaps that need to be cleaned 
up. 
We haven't addressed this cleanup yet, as we see it as an edge case to be 
dealt with separately. 
We left it for a future JIRA, where the cost of this cleanup and how to 
approach it need to be decided.


> Create a Block List
> -------------------
>
>                 Key: HDFS-6659
>                 URL: https://issues.apache.org/jira/browse/HDFS-6659
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>              Labels: performance
>         Attachments: HDFS-6659.patch
>
>
> BlockList - An efficient array-based list that can extend its capacity, 
> with two main features:
> 1. Gaps (the result of remove operations) are managed internally without 
> the need for extra memory - we create a linked list of gaps by using the 
> array indices as references, plus an int pointing to the head of the gaps 
> list. On every insert operation, we first reuse any available gap before 
> extending the array.
> 2. Array extension is done by chaining separate arrays, not by allocating 
> a larger array and copying all the data across. This greatly reduces the 
> latency of that particular call. It also avoids requiring a large amount 
> of contiguous heap space, and so behaves more nicely with garbage 
> collection.
>  
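For illustration, the two features described above (an internal free list 
of gaps, plus growth by chaining fixed-size chunks) can be sketched as 
follows. This is not the actual HDFS-6659 patch: the class name, constant 
names, and chunk size are hypothetical, and the element type is simplified 
to long so a gap slot can store the index of the next gap directly.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockListSketch {
    private static final int CHUNK_SIZE = 4; // tiny, to make chaining visible
    private static final int NO_GAP = -1;    // sentinel: empty gap list

    private final List<long[]> chunks = new ArrayList<>();
    private int gapHead = NO_GAP; // head of the linked list of gaps
    private int size = 0;         // next never-used slot index

    // Insert a value, reusing a gap if one exists; returns the slot used.
    public int insert(long value) {
        int slot;
        if (gapHead != NO_GAP) {
            // A removed slot stores the index of the next gap, so popping
            // the head of the gap list costs one array read.
            slot = gapHead;
            gapHead = (int) get(slot);
        } else {
            slot = size++;
            if (slot / CHUNK_SIZE >= chunks.size()) {
                // Chain a new chunk; existing data is never copied.
                chunks.add(new long[CHUNK_SIZE]);
            }
        }
        set(slot, value);
        return slot;
    }

    // Remove the value at slot by pushing the slot onto the gap list.
    public void remove(int slot) {
        set(slot, gapHead);
        gapHead = slot;
    }

    public long get(int slot) {
        return chunks.get(slot / CHUNK_SIZE)[slot % CHUNK_SIZE];
    }

    private void set(int slot, long value) {
        chunks.get(slot / CHUNK_SIZE)[slot % CHUNK_SIZE] = value;
    }

    public static void main(String[] args) {
        BlockListSketch list = new BlockListSketch();
        int a = list.insert(100);
        int b = list.insert(200);
        list.insert(300);
        list.remove(b);           // slot b becomes the head of the gap list
        int d = list.insert(400); // reuses slot b before extending the array
        System.out.println(d == b);      // prints "true"
        System.out.println(list.get(a)); // prints "100"
        System.out.println(list.get(d)); // prints "400"
    }
}
```

Note the design point the description makes: because growth appends a new 
chunk rather than reallocating, no insert ever pays an O(n) copy, and the 
heap never needs one large contiguous allocation.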



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
