[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

Yi Liu (JIRA) Tue, 22 Sep 2015 01:37:27 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902232#comment-14902232
 ]


Yi Liu commented on HDFS-9053:
------------------------------

Update the patch to address the comments, also cleanup the 
javac/checkstyle/whitespace.
{quote}
4. Optional: in insertElement maybe we can copy elements only once if we need 
to expand the array.
{quote}
We still need to have two copy even if we handle the insertElement and expand 
together, and the steps are: 1) allocate new heap memory, 2) copy the elements 
before insertion point to the new memory, 3) set inserted element and copy the 
elements after insertion point to the new memory. It also makes logic a bit 
complex.  The current insertElement is the same behavior as insertion of 
ArrayList. So maybe we can keep current behavior, how do you think?

Thanks Jing.

> Support large directories efficiently using B-Tree
> --------------------------------------------------
>
>                 Key: HDFS-9053
>                 URL: https://issues.apache.org/jira/browse/HDFS-9053
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>            Priority: Critical
>         Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch
>
>
> This is a long standing issue, we were trying to improve this in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are ordered in the list, for insert/delete/search, the time 
> complexity is O(log n), but insertion/deleting causes re-allocations and 
> copies of big arrays, so the operations are costly.  For example, if the 
> children grow to 1M size, the ArrayList will resize to > 1M capacity, so need 
> > 1M * 4bytes = 4M continuous heap memory, it easily causes full GC in HDFS 
> cluster where namenode heap memory is already highly used.  I recap the 3 
> main issues:
> # Insertion/deletion operations in large directories are expensive because 
> re-allocations and copies of big arrays.
> # Dynamically allocate several MB continuous heap memory which will be 
> long-lived can easily cause full GC problem.
> # Even most children are removed later, but the directory INode still 
> occupies same size heap memory, since the ArrayList will never shrink.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but use B-Tree to 
> solve the problem suggested by [~shv]. 
> So the target of this JIRA is to implement a low memory footprint B-Tree and 
> use it to replace ArrayList. 
> If the elements size is not large (less than the maximum degree of B-Tree 
> node), the B-Tree only has one root node which contains an array for the 
> elements. And if the size grows large enough, it will split automatically, 
> and if elements are removed, then B-Tree nodes can merge automatically (see 
> more: https://en.wikipedia.org/wiki/B-tree).  It will solve the above 3 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

Reply via email to