[ 
https://issues.apache.org/jira/browse/HDFS-8286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529524#comment-14529524
 ] 

Konstantin Shvachko commented on HDFS-8286:
-------------------------------------------

Hey guys, I read the design doc, and am wondering _what is the exact goal of 
this jira?_
From the design and the descriptions it is not quite clear whether you propose to 
rebase a single NameNode on LevelDB, by replacing say {{FSDirectory}} with the 
KV store, or target building a distributed namespace service.
I am asking because I've always been interested in evolving HDFS towards 
distributing its namespace in general, and using KV stores for it, in 
particular. [The Giraffa project|https://github.com/GiraffaFS/giraffa] has been 
dedicated to this goal for a few years now, [as most of you are probably well 
aware 
of|http://www.slideshare.net/Hadoop_Summit/dynamic-namespace-partitioning-with-giraffa-file-system].

Notes on the design document:
# You probably want _support for a more generic notion of a {{Key}}_.
Your definition of {{key = <parentId, fileName>}} is well understood, and was 
probably first introduced around 1995 in treeFS, the predecessor of reiserFS, 
the predecessor of Btrfs, with the latter mentioned in your design. It keeps 
files of the same directory close to each other (locality).
But in larger storage systems more flexibility in defining locality may be 
needed. For example, one could use two-level keys {{<ppid, pid, file>}} (which 
also capture the locality of adjacent directories), or three-level keys, or 
full-path keys as in Ceph.
E.g., Giraffa introduces a generic Key interface, which allows different 
implementations including the one you describe.
And your design of KV-implementation of snapshots seems to go along these lines.
# _What motivates the choice of LevelDB?_
It is a well-recognized KV storage library, but it is not a distributed 
KV store. So, what is the plan here?
In Giraffa the KV store is designed to be pluggable, and we currently use an HBase 
implementation. We also considered LevelDB, [mapDB|http://www.mapdb.org], 
[Redis|https://github.com/GiraffaFS/giraffa/wiki/Redis:-applicability-to-Giraffa],
 GemFire aka [Apache incubator 
Geode|https://wiki.apache.org/incubator/GeodeProposal], [Apache incubator 
Ignite|http://ignite.incubator.apache.org], [Prevayler|http://prevayler.org/], 
among a few others.
# The HA support paragraph talks about a single active NN and a standby NN. It 
is not clear _what is proposed for a distributed namespace, if anything?_
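
To make the point about a generic {{Key}} concrete, here is a minimal sketch of how one key interface could admit several locality schemes. All names in it ({{NamespaceKey}}, {{ParentIdKey}}, {{TwoLevelKey}}, {{SortedStore}}) are hypothetical and are not taken from Giraffa, HDFS, or the design document. A {{TreeMap}} stands in for a sorted KV store, and ids are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical interface: each implementation picks its own notion of locality.
interface NamespaceKey {
    // Entries whose encodings share a prefix end up next to each other in a sorted store.
    String encode();
}

// key = <parentId, fileName>: files of the same directory are adjacent.
final class ParentIdKey implements NamespaceKey {
    private final long parentId;
    private final String fileName;

    ParentIdKey(long parentId, String fileName) {
        this.parentId = parentId;
        this.fileName = fileName;
    }

    // Fixed-width hex keeps string order equal to numeric order (non-negative ids assumed).
    static String prefix(long parentId) {
        return String.format("%016x/", parentId);
    }

    @Override
    public String encode() {
        return prefix(parentId) + fileName;
    }
}

// key = <ppid, pid, file>: files of sibling directories are adjacent as well.
final class TwoLevelKey implements NamespaceKey {
    private final long ppid;
    private final long pid;
    private final String fileName;

    TwoLevelKey(long ppid, long pid, String fileName) {
        this.ppid = ppid;
        this.pid = pid;
        this.fileName = fileName;
    }

    static String prefix(long ppid) {
        return String.format("%016x/", ppid);
    }

    static String prefix(long ppid, long pid) {
        return prefix(ppid) + String.format("%016x/", pid);
    }

    @Override
    public String encode() {
        return prefix(ppid, pid) + fileName;
    }
}

// Stand-in for a sorted KV store; this is an assumption, not a real store client.
final class SortedStore {
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(NamespaceKey key, String value) {
        rows.put(key.encode(), value);
    }

    // Range scan over rows whose encoded key starts with the given prefix
    // (assumes no name begins with U+FFFF).
    List<String> scan(String prefix) {
        return new ArrayList<>(rows.subMap(prefix, prefix + Character.MAX_VALUE).values());
    }
}
```

With the first scheme, {{scan(ParentIdKey.prefix(1))}} is a single range read that lists directory 1. With the second, {{scan(TwoLevelKey.prefix(1))}} also returns the files of every directory under grandparent 1, which is the extra locality a two-level key buys.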

So, back to the starting question - what is the main goal for the issue? We may 
find some forms of collaboration between the projects.

> Scaling out the namespace using KV store
> ----------------------------------------
>
>                 Key: HDFS-8286
>                 URL: https://issues.apache.org/jira/browse/HDFS-8286
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: hdfs-kv-design.pdf
>
>
> Currently the NN keeps the namespace in memory. To improve the 
> scalability of the namespace, users can scale up by using more RAM or scale 
> out using Federation (i.e., statically partitioning the namespace).
> We would like to remove the limitation of scaling the global namespace. Our 
> vision is that HDFS should adopt a scalable underlying architecture that 
> allows the global namespace to scale linearly.
> We propose to implement the HDFS namespace on top of a key-value (KV) store. 
> Adopting the KV store interfaces allows HDFS to leverage the capabilities of 
> modern KV stores and to become much easier to scale. Going forward, the 
> architecture allows distributing the namespace across multiple machines, or 
> storing only the working set in memory (HDFS-5389), both of which allow 
> HDFS to manage billions of files using the commodity hardware available today.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
