Re: how to use Hadoop echosystem

2019-11-13 Thread Jeff Hubbs

I should have included Scala in that language list.

On 11/13/19 10:10 AM, Jeff Hubbs wrote:

I'm assuming you mean "Hadoop ecosystem." :)

Here's what I know. Hadoop is a collection of daemons (a.k.a. 
"services") that are typically run across multiple machines to form a 
Hadoop cluster.


Central to the whole idea of Hadoop is the presence of an HDFS (Hadoop 
Distributed File System): a virtualized filesystem that occupies real 
disk space across multiple machines. The point of Hadoop is to 
parallelize computing operations such that the computing (which is 
easy to relocate) tends to take place where the data (which is hard to 
relocate) is.
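
To make the locality idea concrete, here is a toy Python sketch. It is not Hadoop's actual scheduler; the block map, the node names, and the `pick_node` function are all made up for illustration. It prefers an idle node that already stores the block and falls back to any idle node otherwise.

```python
from typing import Dict, List, Optional, Set

# Hypothetical block map: which nodes hold a replica of each HDFS block.
BLOCK_LOCATIONS: Dict[str, List[str]] = {
    "blk_1": ["node-a", "node-b"],
    "blk_2": ["node-b", "node-c"],
    "blk_3": ["node-c", "node-a"],
}


def pick_node(block: str, idle_nodes: Set[str]) -> Optional[str]:
    """Return an idle node for the task, preferring one that holds the block."""
    for node in BLOCK_LOCATIONS.get(block, []):
        if node in idle_nodes:
            return node  # data-local: the block is already on this machine
    # No idle replica holder: fall back to any idle node (remote read needed).
    return min(idle_nodes) if idle_nodes else None


if __name__ == "__main__":
    idle = {"node-b", "node-c"}
    for blk in sorted(BLOCK_LOCATIONS):
        print(blk, "->", pick_node(blk, idle))
```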


The basic form of a Hadoop cluster in the 3.x series is to have one 
NameNode daemon, one ResourceManager daemon, a JobHistoryServer 
daemon, a WebAppProxyServer daemon, and both a NodeManager daemon and 
a DataNode daemon for each machine that participates in HDFS. When I 
built Hadoop clusters, I placed a copy of the Hadoop distribution on 
each machine in the cluster and started the daemons on each one 
according to the cluster role or roles I'd assigned to them. As I 
recall, all nodes need to know, through the Hadoop config files 
(*-site.xml), which machine is running the NameNode daemon and which 
one is running the ResourceManager daemon. In general, all nodes need 
to have forward and reverse name resolution working.
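
As a rough illustration of the setup described above, the following Python sketch renders the two settings that point every node at the NameNode and the ResourceManager, and maps each daemon to its start command. The property names `fs.defaultFS` (core-site.xml) and `yarn.resourcemanager.hostname` (yarn-site.xml) and the `--daemon start` commands are the ones I understand Hadoop 3.x to use; the hostnames, the port, and the helper names `site_xml` and `commands_for` are assumptions for the example only.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Start command for each daemon, in the Hadoop 3.x "--daemon start" style.
START_COMMANDS: Dict[str, str] = {
    "NameNode": "hdfs --daemon start namenode",
    "DataNode": "hdfs --daemon start datanode",
    "ResourceManager": "yarn --daemon start resourcemanager",
    "NodeManager": "yarn --daemon start nodemanager",
    "WebAppProxyServer": "yarn --daemon start proxyserver",
    "JobHistoryServer": "mapred --daemon start historyserver",
}


def site_xml(properties: Dict[str, str]) -> str:
    """Render a *-site.xml document from name/value pairs."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def commands_for(roles: List[str]) -> List[str]:
    """Return the start commands for the roles assigned to one machine."""
    return [START_COMMANDS[role] for role in roles]


if __name__ == "__main__":
    # Hypothetical hostnames and port; substitute your own.
    print(site_xml({"fs.defaultFS": "hdfs://master1.example.com:8020"}))
    print(site_xml({"yarn.resourcemanager.hostname": "master2.example.com"}))
    # A typical worker machine runs both of these daemons.
    print(commands_for(["DataNode", "NodeManager"]))
```

The same generated files would be copied to every node, which is what lets each daemon find the NameNode and ResourceManager by hostname.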


Once you have everything up, my understanding is that code is written 
for Hadoop clusters in either Java or Python, or on top of Spark (which 
is a processing framework rather than a language).
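
For a sense of what such code can look like, here is a minimal word-count sketch in the style used with Hadoop Streaming, where the mapper and reducer are ordinary programs that read lines on stdin and write tab-separated key/value lines on stdout. The script name `wordcount.py`, the `map`/`reduce` argument, and the function names are assumptions for this example, not part of Hadoop.

```python
import sys
from itertools import groupby
from typing import Iterable, Iterator


def map_lines(lines: Iterable[str]) -> Iterator[str]:
    """Mapper: emit 'word<TAB>1' for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines: Iterable[str]) -> Iterator[str]:
    """Reducer: sum the counts per word; input must be sorted by word."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__":
    # Run as "wordcount.py map" or "wordcount.py reduce"; do nothing otherwise.
    if len(sys.argv) == 2 and sys.argv[1] in ("map", "reduce"):
        step = map_lines if sys.argv[1] == "map" else reduce_lines
        for out in step(sys.stdin):
            print(out)
```

The logic can be tried without a cluster by piping a text file through the map step, then `sort`, then the reduce step, which mimics the sort that happens between the two phases on a real cluster.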


On 11/13/19 9:04 AM, political science wrote:

Hi,
I am new to the Hadoop echo system. I have been able to build Hadoop 
from source, version 3.1.3:
https://hadoop.apache.org/releases.html
https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.1.3/hadoop-3.1.3-src.tar.gz

I read the instructions in the BUILDING.txt file and followed them. Now 
I see a shell like this:
https://drive.google.com/file/d/1egW0z_0CVNtkH-c9AoyohHYXjBjgk519/view?usp=sharing
I want to know how to use this.
What command can I use to get back to what I see in the screenshot?
How is programming done in this system?
This seems to be some kind of Hadoop environment, but I am new to 
this, so I do not know what to do next. I have some programming 
assignment questions to do, and I also want to learn Hadoop. From the 
shell I see above, what command should I type, and how do I do 
programming in it?


Thanks & Regards
TA






