Re: Resources for understanding Hadoop

Brahma Reddy Battula Thu, 23 Jun 2022 10:05:14 -0700

Please go through the following link.
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html


*Use the following commands to start the cluster.*

$HADOOP_HOME/bin/hdfs --daemon start namenode

$HADOOP_HOME/bin/hdfs --daemon start datanode
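
In case it helps, here is a rough sketch (not taken from the Hadoop docs) of
running these commands against a build made from source with the mvn command
quoted further down. It assumes the unpacked distribution ends up under
hadoop-dist/target/hadoop-<version>/ in the source tree, and that
etc/hadoop/core-site.xml and hdfs-site.xml inside it are already configured as
in the single-node setup guide. The function names find_hadoop_dist and
start_local_hdfs are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: locate the distribution produced by
#   mvn package -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
# and start a single-node HDFS from it.

# Print the unpacked distribution directory under the source root ($1).
# Assumption: the build leaves it at hadoop-dist/target/hadoop-<version>/.
find_hadoop_dist() {
  local src_root="$1" dir
  for dir in "$src_root"/hadoop-dist/target/hadoop-*/; do
    # The trailing slash matches directories only; an unmatched glob or a
    # directory without bin/hdfs fails this check and is skipped.
    if [ -x "${dir}bin/hdfs" ]; then
      printf '%s\n' "${dir%/}"
      return 0
    fi
  done
  echo "no built distribution under $src_root/hadoop-dist/target" >&2
  return 1
}

# Start the NameNode and DataNode from the distribution directory in $1.
start_local_hdfs() {
  export HADOOP_HOME="$1"
  # First run only (erases existing HDFS metadata), left commented out:
  # "$HADOOP_HOME/bin/hdfs" namenode -format
  "$HADOOP_HOME/bin/hdfs" --daemon start namenode &&
    "$HADOOP_HOME/bin/hdfs" --daemon start datanode
}
```

For example, from the root of the source tree:
start_local_hdfs "$(find_hadoop_dist "$PWD")". On a first run the NameNode
storage has to be formatted once with "$HADOOP_HOME/bin/hdfs namenode -format"
before the daemons are started; afterwards, jps should list NameNode and
DataNode if both came up. Scripts such as start-dfs.sh sit under sbin/ in the
same distribution directory, which is probably why running the copy inside the
source tree errors out.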


On Thu, Jun 23, 2022 at 10:26 PM Rahul Bhardwaj <rahul265...@gmail.com>
wrote:

> Yeah, I followed this and can see that the .class files have been
> generated, but I am not sure how to run it. In the wiki I shared,
> "start-dfs.sh" is used, so I tried running the same script from
> hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-dfs.sh, but it errors
> out. In BUILDING.txt I didn't find instructions on how to run the Hadoop
> daemons.
>
> On Thu, 23 Jun 2022 at 21:53, Brahma Reddy Battula <bra...@apache.org>
> wrote:
>
>>
>> Please go through the following:
>>
>> https://github.com/apache/hadoop/blob/trunk/BUILDING.txt
>>
>> Here is a specific command to generate the distribution, which you can
>> run after making your changes:
>> mvn package -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
>>
>> Hope this helps.
>>
>>
>>
>>
>> On Thu, Jun 23, 2022 at 9:41 PM Rahul Bhardwaj <rahul265...@gmail.com>
>> wrote:
>>
>>> I am following this wiki
>>> <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html>
>>> to build and run Hadoop locally in pseudo-distributed mode. But I am
>>> unable to figure out how to build my changes and generate similar binaries
>>> so that I can test my changes locally. Is there some documentation on how
>>> to do this?
>>>
>>> On Mon, 13 Jun 2022 at 00:26, Brahma Reddy Battula <bra...@apache.org>
>>> wrote:
>>>
>>>> Hi Rahul,
>>>>
>>>> Welcome to the Hadoop world.
>>>>
>>>> Apart from what Gautham mentioned, you can also check the following:
>>>> https://livebook.manning.com/book/hadoop-in-action/part-1/
>>>>
>>>> Go through the following wiki for contributions:
>>>> https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
>>>>
>>>>
>>>> Please subscribe to the Hadoop mailing list [1], and send your queries
>>>> there from next time.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 1. https://hadoop.apache.org/mailing_lists.html
>>>>
>>>> On Sun, Jun 12, 2022 at 10:42 PM Gautham Banasandra <gaur...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Rahul,
>>>>>
>>>>> > I was looking for something more detailed and low-level like how the
>>>>> > code for the various services in HDFS is organized, entrypoints etc.
>>>>>
>>>>> I found this book useful to get a good idea of Hadoop in general -
>>>>> Apache Hadoop™ YARN: Moving beyond MapReduce and Batch Processing with
>>>>> Apache Hadoop™ 2 [Book] (oreilly.com)
>>>>> <https://www.oreilly.com/library/view/apache-hadooptm-yarn/9780133441925/>.
>>>>>
>>>>> In my opinion, you get into Open Source contributions by just doing so.
>>>>> You don't have to know HDFS in detail to start contributing to it. Now
>>>>> that you've gone through the Hadoop documentation, try setting up Hadoop
>>>>> in pseudo-distributed mode. If you notice any glitch, try fixing it and
>>>>> send out a PR. You never know what issue you'll find. I ran into this
>>>>> when I tried compiling Hadoop on Windows - [HDFS-15385] Upgrade boost
>>>>> library to 1.72 - ASF JIRA (apache.org)
>>>>> <https://issues.apache.org/jira/browse/HDFS-15385> (And yes, this was my
>>>>> first PR to Hadoop). Then use Docker and set up the Hadoop cluster with
>>>>> multiple nodes. Once you're able to do this, try browsing
>>>>> issues.apache.org and you'll find tons of issues that you can work on.
>>>>> There's always so much work to do in Open Source and the thing that I
>>>>> like the most is that "there's no deadline on anything" :) So, you can
>>>>> really work on some awesome stuff, own it, perfect it and share it with
>>>>> the world.
>>>>>
>>>>> Best of luck.
>>>>>
>>>>> Thanks,
>>>>> --Gautham
>>>>>
>>>>> On Sun, 12 Jun 2022 at 16:34, Rahul Bhardwaj <rahul265...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi all,
>>>>> > I am a newbie wanting to start contributing to the hadoop ecosystem. I
>>>>> > want to start by contributing to HDFS and was looking for resources to
>>>>> > understand the architecture and I just found this -
>>>>> >
>>>>> > https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
>>>>> > which is a fairly high level documentation. I was looking for something
>>>>> > more detailed and low-level like how the code for the various services
>>>>> > in HDFS is organized, entrypoints etc. Can someone point me to such
>>>>> > resources? Also is there a slack workspace for such discussions? Not
>>>>> > sure if this mailing list is the right forum for such doubts.
>>>>> >
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>>
>>>> --Brahma Reddy Battula
>>>>
>>>
>>
>> --
>>
>>
>>
>> --Brahma Reddy Battula
>>
>

-- 



--Brahma Reddy Battula
