RE: Running Hadoop v2 clustered mode MR on an NFS mounted filesystem

java8964 Fri, 20 Dec 2013 06:31:20 -0800

I believe "-fs local" should be removed too. Even if removing "-jt local" 
gives you a dedicated JobTracker, I believe that with "-fs local" all the 
mappers will still run sequentially.
"-fs local" forces the MapReduce job to run in "local" mode, which is really a 
test mode.
What you can do is remove both "-fs local" and "-jt local", but give the FULL 
URI of the input and output paths, to tell Hadoop that they are on the local 
filesystem instead of HDFS:
"hadoop jar 
/hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
wordcount file:///hduser/mount_point file:///results"
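
To make the "full URI" form concrete, here is a minimal shell sketch of how an absolute local path maps to a file:// URI. The helper name to_file_uri is hypothetical (it is not a Hadoop tool), and the script only prints the resulting command rather than running it.

```shell
# Turn an absolute local path into a file:// URI.
# "file://" followed by a path that starts with "/" yields the three slashes.
# to_file_uri is a hypothetical helper, not part of Hadoop.
to_file_uri() {
    case "$1" in
        /*) printf 'file://%s\n' "$1" ;;
        *)  echo "expected an absolute path: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the job command from this thread with explicit URIs.
echo hadoop jar \
    /hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
    wordcount "$(to_file_uri /hduser/mount_point)" "$(to_file_uri /results)"
```

The explicit file:// scheme is what tells Hadoop to resolve each path against the local filesystem instead of whatever default filesystem the cluster configuration names.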
Keep the following in mind:
1) The NFS mount needs to be available on all your task nodes, and mounted in 
the same way.
2) Even if you can do that, your shared storage will be your bottleneck. NFS 
won't scale well.
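
Point 1 can be checked on each node before submitting a job. The following is a rough sketch, assuming Linux nodes (where /proc/mounts lists active mounts) and assuming /hduser/mount_point is itself the NFS mount point. The function name is_nfs_mount is hypothetical.

```shell
# Succeed if the given path is listed as an nfs or nfs4 mount.
#   $1 = mount point to look for
#   $2 = mounts table to read (defaults to /proc/mounts, Linux only)
# In that table, field 2 is the mount point and field 3 is the filesystem type.
is_nfs_mount() {
    awk -v mp="$1" '
        $2 == mp && ($3 == "nfs" || $3 == "nfs4") { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Possible use on each task node (the path is assumed from this thread):
#   is_nfs_mount /hduser/mount_point && echo "mounted" || echo "MISSING"
```

Running the same check on every node, for example over ssh, would show whether the mount is present at the same path everywhere. It does not verify that all nodes point at the same NFS export.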
Yong

Date: Fri, 20 Dec 2013 09:01:32 -0500
Subject: Re: Running Hadoop v2 clustered mode MR on an NFS mounted filesystem
From: dsui...@rdx.com
To: user@hadoop.apache.org

I think most of your problem is coming from the options you are setting:
"hadoop jar 
/hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
wordcount -fs local -jt local /hduser/mount_point/  /results"

You appear to be directing Hadoop to run jobs in the LOCAL job runner 
and to read from the LOCAL filesystem. Drop the -jt argument and 
it should run in distributed mode if your cluster is set up right. You don't 
need to do anything special to point Hadoop towards an NFS location, other than 
set up the NFS location properly and, if you refer to it by name, make sure 
that the name resolves to the right address. Hadoop doesn't care where it 
is, as long as it can read from and write to it. The fact that you are telling 
it to read from and write to an NFS location that happens to be mounted as a 
local filesystem object doesn't matter. You could direct it to the local 
/hduser/ path and set the -fs local option, and it would end up on the NFS 
mount, because that's where the NFS mount actually exists. Or you could direct 
it to the absolute network location of the folder that you want. It shouldn't 
make a difference.
Devin Suiter
Jr. Data Solutions Software Engineer
100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
Google Voice: 412-256-8556 | www.rdx.com

On Fri, Dec 20, 2013 at 5:27 AM, Atish Kathpal <atish.kath...@gmail.com> wrote:

Hello 
The picture below describes the deployment architecture I am trying to achieve. 
However, when I run the wordcount example code with the below configuration, by 
issuing the command from the master node, I notice only the master node 
spawning map tasks and completing the submitted job. Below is the command I 
used:

hadoop jar 
/hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
wordcount -fs local -jt local /hduser/mount_point/  /results
Question: How can I leverage both Hadoop nodes for running MR, while 
serving my data from the common NFS mount point, with my filesystem running at 
the backend? Has anyone tried such a setup before?

Thanks!

<<inline: image.png>>
