[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225532#comment-16225532
 ] 

Jitendra Nath Pandey commented on HDFS-7240:
--------------------------------------------

[~shv] Thank you for taking the time to review Ozone. I appreciate your 
comments and questions.

{quote} 
There are two main limitations in HDFS
a) The throughput of Namespace operations. Which is limited by the number of 
RPCs the NameNode can handle
b) The number of objects (files + blocks) the system can maintain. Which is 
limited by the memory size of the NameNode.
{quote}

   I agree completely. We believe Ozone attempts to address both of these 
issues for HDFS.
   
   Let us look at the number-of-objects problem first. Ozone directly addresses 
the scalability of the number of blocks by introducing storage containers that 
can hold multiple blocks together. Earlier efforts on this were complicated by 
the fact that the block manager and the namespace are intertwined in the HDFS 
NameNode. There have been efforts in the past to separate the block manager 
from the namespace, e.g. HDFS-5477. Ozone addresses this problem by cleanly 
separating the block layer. Separating the block layer also helps 
file/directory scalability because it moves the block map out of the NameNode.
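
   As a rough illustration of why grouping blocks into containers shrinks the 
number of objects the block layer has to track, here is a minimal sketch. It 
is not Ozone code: the class name, the method name, and the block and 
container sizes are all hypothetical, chosen only to show the arithmetic.

```java
// Hypothetical illustration only; not taken from the Ozone code base.
public class ContainerSketch {

    /**
     * Number of containers needed to hold the given number of blocks,
     * assuming every container holds up to blocksPerContainer blocks
     * (ceiling division).
     */
    static long containersNeeded(long blocks, long blocksPerContainer) {
        if (blocks < 0 || blocksPerContainer <= 0) {
            throw new IllegalArgumentException(
                "blocks must be >= 0 and blocksPerContainer must be > 0");
        }
        return (blocks + blocksPerContainer - 1) / blocksPerContainer;
    }

    public static void main(String[] args) {
        long blocks = 1_000_000_000L;      // assumed number of blocks
        long blocksPerContainer = 1_000L;  // assumed container capacity
        // The block layer tracks containers instead of individual blocks.
        System.out.println("blocks tracked individually: " + blocks);
        System.out.println("containers tracked instead:  "
            + containersNeeded(blocks, blocksPerContainer));
    }
}
```

   Under these assumed numbers, the count of tracked objects drops by the 
container capacity factor; the real ratio depends on how containers are sized.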
   
   A separate block layer relieves the NameNode of handling block reports, 
IBRs, heartbeats, the replication monitor, etc. This reduces contention on the 
FSNamesystem lock and significantly reduces GC pressure on the NameNode. 
These improvements will greatly help the RPC performance of the NameNode.

bq. Ozone is probably just the first step in rebuilding HDFS under a new 
architecture. With the next steps presumably being HDFS-10419 and HDFS-11118. 
The design doc for the new architecture has never been published. 
   We do believe that the NameNode can leverage Ozone's storage container 
layer; however, that is also a big effort. We would like to first have the 
block layer stabilized in Ozone before taking that up. That said, we would 
certainly support any community effort on that, and in fact it was brought up 
in the last BoF session at the summit.

   Big data is evolving rapidly. We see our customers needing scalable file 
systems, object stores (like S3), and block stores (for Docker and VMs). Ozone 
improves HDFS in two ways: it addresses the throughput and scale issues of 
HDFS, and it enriches HDFS with newer capabilities.


bq. Ozone is a big enough system to deserve its own project.

I took a quick look at the core code in Ozone, and the cloc command reports 
22,511 lines of functionality changes in Java.

This patch also brings in web framework code such as Angular.js, which adds a 
number of CSS and JS files that contribute to the size of the patch. The rest 
are test and documentation changes.

I hope this addresses your concerns.


> Object store in HDFS
> --------------------
>
>                 Key: HDFS-7240
>                 URL: https://issues.apache.org/jira/browse/HDFS-7240
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>         Attachments: HDFS-7240.001.patch, HDFS-7240.002.patch, 
> HDFS-7240.003.patch, HDFS-7240.003.patch, HDFS-7240.004.patch, 
> Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
