I've read in the archive that it should be possible to use any distributed
filesystem, since the data is available to all nodes, so it should be
possible to use NFS, right?
I've also read somewhere in the archive that this should be possible...
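
One way to double-check that the share really is mounted at the same path on
every node (just a rough sketch, assuming passwordless ssh as user slitz and
the hosts from the setup below):

  # verify the NFS mount is visible at the identical path on all three boxes
  for host in 192.168.2.3 192.168.2.30 192.168.2.31; do
      ssh slitz@$host "ls -ld /home/slitz/warehouse/input"
  done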


slitz


On Fri, Apr 11, 2008 at 1:43 PM, Peeyush Bishnoi <[EMAIL PROTECTED]>
wrote:

> Hello ,
>
> To execute a Hadoop Map-Reduce job, the input data should be on HDFS, not
> on NFS.
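>
> For example, something like this (just a sketch, assuming HDFS is already
> formatted and running, and using the stock grep example):
>
>   # copy the local input into HDFS, run the job there, and view the result
>   bin/hadoop dfs -put /home/slitz/warehouse/input input
>   bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>   bin/hadoop dfs -cat output/*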
>
> Thanks
>
> ---
> Peeyush
>
>
>
> On Fri, 2008-04-11 at 12:40 +0100, slitz wrote:
>
> > Hello,
> > I'm trying to assemble a simple setup of 3 nodes using NFS as the
> > distributed filesystem.
> >
> > Box A: 192.168.2.3, this box is both the NFS server and a slave node
> > Box B: 192.168.2.30, this box is only the JobTracker
> > Box C: 192.168.2.31, this box is only a slave
> >
> > Obviously all three nodes can access the NFS share, and the path to the
> > share is /home/slitz/warehouse on all three.
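> >
> > For reference, on boxes B and C the share is mounted as a plain NFS mount,
> > roughly like this (illustrative only, the exact mount options on my boxes
> > may differ):
> >
> >   mount -t nfs 192.168.2.3:/home/slitz/warehouse /home/slitz/warehouse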
> >
> > My hadoop-site.xml file was copied to all nodes and looks like this:
> >
> > <configuration>
> >   <property>
> >     <name>fs.default.name</name>
> >     <value>local</value>
> >     <description>
> >       The name of the default file system. Either the literal string
> >       "local" or a host:port for NDFS.
> >     </description>
> >   </property>
> >   <property>
> >     <name>mapred.job.tracker</name>
> >     <value>192.168.2.30:9001</value>
> >     <description>
> >       The host and port that the MapReduce job tracker runs at. If
> >       "local", then jobs are run in-process as a single map and reduce
> >       task.
> >     </description>
> >   </property>
> >   <property>
> >     <name>mapred.system.dir</name>
> >     <value>/home/slitz/warehouse/hadoop_service/system</value>
> >     <description>omgrotfcopterlol.</description>
> >   </property>
> > </configuration>
> >
> >
> > As one can see, I'm not using HDFS at all (because all the free space I
> > have is located on only one node, so using HDFS would be unnecessary
> > overhead).
> >
> > I've copied the input folder from Hadoop to /home/slitz/warehouse/input.
> > When I try to run the example with this line:
> >
> > bin/hadoop jar hadoop-*-examples.jar grep /home/slitz/warehouse/input/
> > /home/slitz/warehouse/output 'dfs[a-z.]+'
> >
> > the job starts and finishes okay, but at the end I get this error:
> >
> > org.apache.hadoop.mapred.InvalidInputException: Input path doesn't exist:
> > /home/slitz/hadoop-0.15.3/grep-temp-141595661
> >   at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:154)
> >   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:508)
> >   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:753)
> > (...the error stack continues...)
> >
> > I don't know why the input path being looked up is the local path
> > /home/slitz/hadoop(...) instead of /home/slitz/warehouse/(...)
> >
> > Maybe something is missing in my hadoop-site.xml?
> >
> >
> >
> > slitz
>
