Hi Sanjay,

Read your doc. I clearly see the value of Ozone for your use cases, but I agree with Stack and others that it isn't clear why it should be a part of Hadoop. More details in the jira:
https://issues.apache.org/jira/browse/HDFS-7240?focusedCommentId=16239313&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16239313

Thanks,
--Konstantin

On Fri, Nov 3, 2017 at 1:56 PM, sanjay Radia <sanjayo...@gmail.com> wrote:

> Konstantin,
> Thanks for your comments, questions and feedback. I have attached a
> document to the HDFS-7240 jira that explains a design for scaling HDFS
> and how Ozone paves the way towards the full solution.
>
> https://issues.apache.org/jira/secure/attachment/12895963/HDFS%20Scalability%20and%20Ozone.pdf
>
> sanjay
>
> On Oct 28, 2017, at 2:00 PM, Konstantin Shvachko <shv.had...@gmail.com>
> wrote:
>
> Hey guys,
>
> It is an interesting question whether Ozone should be a part of Hadoop.
> There are two main reasons why I think it should not.
>
> 1. With close to 500 sub-tasks, with 6 MB of code changes, and with a
> sizable community behind it, it looks to me like a whole new project.
> It is essentially a new storage system, with a different architecture
> (than HDFS) and separate S3-like APIs. This is really great - the world
> sure needs more distributed file systems. But it is not clear why Ozone
> should co-exist with HDFS under the same roof.
>
> 2. Ozone is probably just the first step in rebuilding HDFS under a new
> architecture, with the next steps presumably being HDFS-10419 and
> HDFS-11118.
> The design doc for the new architecture has never been published. I can
> only assume, based on some presentations and personal communications,
> that the idea is to use Ozone as a block storage and re-implement the
> NameNode so that it stores only a partial namespace in memory, while
> the bulk of it (cold data) is persisted to a local storage.
> Such an architecture makes me wonder if it solves Hadoop's main
> problems. There are two main limitations in HDFS:
> a. The throughput of namespace operations, which is limited by the
> number of RPCs the NameNode can handle.
> b. The number of objects (files + blocks) the system can maintain,
> which is limited by the memory size of the NameNode.
> The RPC performance (a) is more important for Hadoop scalability than
> the object count (b), with the read RPCs being the main priority.
> The new architecture targets the object count problem, but at the
> expense of the RPC throughput, which seems to be the wrong resolution
> of the tradeoff.
> Also, based on the use patterns on our large clusters, we read up to
> 90% of the data we write, so cold data is a small fraction and most of
> it must be cached.
>
> To summarize:
> - Ozone is a big enough system to deserve its own project.
> - The architecture that Ozone leads to does not seem to solve the
> intrinsic problems of current HDFS.
>
> I will post my opinion in the Ozone jira. It should be more convenient
> to discuss it there for further reference.
>
> Thanks,
> --Konstantin
>
> On Wed, Oct 18, 2017 at 6:54 PM, Yang Weiwei <cheersy...@hotmail.com>
> wrote:
>
> Hello everyone,
>
> I would like to start this thread to discuss merging Ozone (HDFS-7240)
> to trunk. This feature implements an object store which can co-exist
> with HDFS. Ozone is disabled by default. We have tested Ozone with
> cluster sizes varying from 1 to 100 data nodes.
>
> The merge payload includes the following:
>
> 1. All services, management scripts
> 2. Object store APIs, exposed via both REST and RPC
> 3. Master service UIs, command line interfaces
> 4. Pluggable pipeline integration
> 5. Ozone File System (Hadoop compatible file system implementation,
> passes all FileSystem contract tests)
> 6. Corona - a load generator for Ozone.
> 7. Essential documentation added to Hadoop site.
> 8. Version specific Ozone documentation, accessible via service UI.
> 9. Docker support for Ozone, which enables faster development cycles.
>
> To build Ozone and run Ozone using docker, please follow the
> instructions in this wiki page.
> https://cwiki.apache.org/confluence/display/HADOOP/Dev+cluster+with+docker
>
> We have built a passionate and diverse community to drive this feature
> development. As a team, we have achieved significant progress in the
> past 3 years since the first JIRA for HDFS-7240 was opened in Oct 2014.
> So far, we have resolved almost 400 JIRAs by 20+
> contributors/committers from different countries and affiliations. We
> also want to thank the large number of community members who were
> supportive of our efforts, contributed ideas, and participated in the
> design of Ozone.
>
> Please share your thoughts, thanks!
>
> -- Weiwei Yang