Hi Rahul, welcome to the Hadoop world.
Apart from what Gautham mentioned, you can also check the following:
https://livebook.manning.com/book/hadoop-in-action/part-1/

For contributions, go through the following wiki:
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute

Please subscribe to the Hadoop mailing list [1] and send your queries there from next time.

[1] https://hadoop.apache.org/mailing_lists.html

On Sun, Jun 12, 2022 at 10:42 PM Gautham Banasandra <gaur...@apache.org> wrote:

> Hi Rahul,
>
> > I was looking for something more detailed and low-level like how the code
> > for the various services in HDFS is organized, entrypoints etc.
>
> I found this book useful to get a good idea of Hadoop in general - Apache
> Hadoop™ YARN: Moving beyond MapReduce and Batch Processing with Apache
> Hadoop™ 2 [Book] (oreilly.com)
> <https://www.oreilly.com/library/view/apache-hadooptm-yarn/9780133441925/>.
>
> In my opinion, you get into Open Source contributions by just doing so. You
> don't have to know HDFS in detail to start contributing to it. Now that
> you've gone through the Hadoop documentation, try setting up Hadoop in
> pseudo-distributed mode. If you notice any glitch, try fixing it and send
> out a PR. You never know what issue you'll find. I ran into this when I
> tried compiling Hadoop on Windows - [HDFS-15385] Upgrade boost library to
> 1.72 - ASF JIRA (apache.org)
> <https://issues.apache.org/jira/browse/HDFS-15385> (And yes, this was my
> first PR to Hadoop). Then use Docker and set up the Hadoop cluster with
> multiple nodes. Once you're able to do this, try browsing issues.apache.org
> and you'll find tons of issues that you can work on. There's always so much
> work to do in Open Source and the thing that I like the most is that
> "there's no deadline on anything" :) So, you can really work on some
> awesome stuff, own it, perfect it and share it with the world.
>
> Best of luck.
>
> Thanks,
> --Gautham
>
> On Sun, 12 Jun 2022 at 16:34, Rahul Bhardwaj <rahul265...@gmail.com> wrote:
>
> > Hi all,
> >
> > I am a newbie wanting to start contributing to the hadoop ecosystem. I want
> > to start by contributing to HDFS and was looking for resources to
> > understand the architecture and I just found this -
> >
> > https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> >
> > which is a fairly high level documentation. I was looking for something
> > more detailed and low-level like how the code for the various services in
> > HDFS is organized, entrypoints etc. Can someone point me to such resources?
> >
> > Also is there a slack workspace for such discussions? Not sure if this
> > mailing list is the right forum for such doubts.

--
--Brahma Reddy Battula