Thank you for the file:/// tip, I was not including it in the paths. I'm running the example with this line:

bin/hadoop jar hadoop-*-examples.jar grep file:///home/slitz/warehouse/input file:///home/slitz/warehouse/output 'dfs[a-z.]+'
But I'm getting the same error as before:

org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : /home/slitz/hadoop-0.15.3/grep-temp-1030179831
        at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:154)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:508)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:753)
(...stack continues...)

I think the problem may be the input path; it should be pointing to some path on the NFS share, right? The grep-temp-* dir is being created in the HADOOP_HOME of Box A (192.168.2.3).

slitz

On Fri, Apr 11, 2008 at 4:06 PM, Luca <[EMAIL PROTECTED]> wrote:
> slitz wrote:
> > I've read in the archive that it should be possible to use any
> > distributed filesystem since the data is available to all nodes, so it
> > should be possible to use NFS, right? I've also read somewhere in the
> > archive that this should be possible...
>
> As far as I know, you can refer to any file on a mounted file system
> (visible from all compute nodes) using the prefix file:// before the full
> path, unless another prefix has been specified.
>
> Cheers,
> Luca
>
> > slitz
> >
> > On Fri, Apr 11, 2008 at 1:43 PM, Peeyush Bishnoi <[EMAIL PROTECTED]>
> > wrote:
> > > Hello,
> > >
> > > To execute a Hadoop Map-Reduce job, input data should be on HDFS, not
> > > on NFS.
> > >
> > > Thanks
> > >
> > > ---
> > > Peeyush
> > >
> > > On Fri, 2008-04-11 at 12:40 +0100, slitz wrote:
> > > > Hello,
> > > > I'm trying to assemble a simple setup of 3 nodes using NFS as a
> > > > Distributed Filesystem.
> > > > Box A: 192.168.2.3, this box is both the NFS server and a slave node
> > > > Box B: 192.168.2.30, this box is only the JobTracker
> > > > Box C: 192.168.2.31, this box is only a slave
> > > >
> > > > Obviously all three nodes can access the NFS share, and the path to
> > > > the share is /home/slitz/warehouse on all three.
> > > >
> > > > My hadoop-site.xml file was copied over all nodes and looks like
> > > > this:
> > > >
> > > > <configuration>
> > > >   <property>
> > > >     <name>fs.default.name</name>
> > > >     <value>local</value>
> > > >     <description>
> > > >       The name of the default file system. Either the literal string
> > > >       "local" or a host:port for NDFS.
> > > >     </description>
> > > >   </property>
> > > >   <property>
> > > >     <name>mapred.job.tracker</name>
> > > >     <value>192.168.2.30:9001</value>
> > > >     <description>
> > > >       The host and port that the MapReduce job tracker runs at. If
> > > >       "local", then jobs are run in-process as a single map and
> > > >       reduce task.
> > > >     </description>
> > > >   </property>
> > > >   <property>
> > > >     <name>mapred.system.dir</name>
> > > >     <value>/home/slitz/warehouse/hadoop_service/system</value>
> > > >     <description>omgrotfcopterlol.</description>
> > > >   </property>
> > > > </configuration>
> > > >
> > > > As one can see, I'm not using HDFS at all. (Because all the free
> > > > space I have is located on only one node, using HDFS would be
> > > > unnecessary overhead.)
> > > >
> > > > I've copied the input folder from hadoop to
> > > > /home/slitz/warehouse/input.
> > > > When I try to run the example line
> > > >
> > > > bin/hadoop jar hadoop-*-examples.jar grep /home/slitz/warehouse/input/
> > > > /home/slitz/warehouse/output 'dfs[a-z.]+'
> > > >
> > > > the job starts and finishes okay, but at the end I get this error:
> > > >
> > > > org.apache.hadoop.mapred.InvalidInputException: Input path doesn't exist :
> > > > /home/slitz/hadoop-0.15.3/grep-temp-141595661
> > > >         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:154)
> > > >         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:508)
> > > >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:753)
> > > > (...the error stack continues...)
> > > >
> > > > I don't know why the input path being looked up is the local path
> > > > /home/slitz/hadoop(...) instead of /home/slitz/warehouse/(...)
> > > >
> > > > Maybe something is missing in my hadoop-site.xml?
> > > >
> > > > slitz
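The grep-temp-* path in both stack traces points at a detail worth noting: the grep example writes its intermediate output to a relative path, which Hadoop resolves against the default filesystem's working directory. With fs.default.name set to "local", that resolves to a plain local directory on the submitting box, which the other nodes cannot see. One possible adjustment (a sketch only, assuming Hadoop 0.15 accepts a file:/// URI for this property, and that /home/slitz/warehouse is the shared mount) is to point the default filesystem at the NFS share instead:

```xml
<!-- Sketch, not a verified fix: make the shared NFS mount the default
     filesystem so relative paths (like grep-temp-*) resolve under it
     and are visible from every node. -->
<property>
  <name>fs.default.name</name>
  <value>file:///home/slitz/warehouse/</value>
</property>
```

With this in place, relative job paths would be created under /home/slitz/warehouse rather than under HADOOP_HOME on the submitting machine; whether that fully resolves the error on this version is untested here.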