I will try this. For HDFS: the M/R Admin web UI on port 50030 (from the Apache example) shows 2 nodes registered, but all of the jobs show as completed on only one of the nodes. I will package up a set of clean logs.
Thanks,
-s

On Mon, Jul 23, 2012 at 2:08 PM, Harsh J <ha...@cloudera.com> wrote:
> Steve,
>
> If you're going to use NFS, make sure your "hadoop.tmp.dir" property
> points to the mount point that is NFS. Can you change that property
> and restart the cluster and retry?
>
> Regarding the HDFS issue, it's hard to tell without logs. Did you see
> two nodes alive in the web UI after configuring HDFS for two nodes and
> configuring MR to use HDFS?
>
> On Mon, Jul 23, 2012 at 11:23 PM, Steve Sonnenberg <steveis...@gmail.com> wrote:
> > Thanks Harsh,
> >
> > 1) I was using NFS
> > 2) I don't believe that anything under /tmp is distributed, even when running
> > 3) When I use HDFS, it doesn't attempt to send ANY jobs to my second node
> >
> > Any clues?
> >
> > -steve
> >
> > On Fri, Jul 20, 2012 at 11:52 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> A 2-node cluster is a fully distributed cluster and cannot use a
> >> file:/// FileSystem, as that is not a distributed filesystem (unless it's
> >> an NFS mount). This explains why some of your tasks aren't able to
> >> locate an earlier-written file in the /tmp dir that is probably
> >> available on the JT node alone, not the TT nodes.
> >>
> >> Use the hdfs:// FS for fully distributed operation.
> >>
> >> On Fri, Jul 20, 2012 at 10:06 PM, Steve Sonnenberg <steveis...@gmail.com> wrote:
> >> > I have a 2-node Fedora system, and in cluster mode I have the
> >> > following issue that I can't resolve.
> >> >
> >> > Hadoop 1.0.3
> >> > I'm running with the filesystem file:/// and invoking the simple 'grep'
> >> > example:
> >> >
> >> > hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern
> >> >
> >> > The initiator displays:
> >> >
> >> > Error initializing attempt_201207201103_0003_m_000004_0:
> >> > java.io.FileNotFoundException: File
> >> > file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken
> >> > does not exist.
> >> >         getFileStatus(RawLocalFileSystem.java)
> >> >         localizeJobTokenFile(TaskTracker.java:4268)
> >> >         initializeJob(TaskTracker.java:1177)
> >> >         localizeJob
> >> >         run
> >> >
> >> > The /tmp/hadoop-hadoop/mapred/system directory only contains a
> >> > 'jobtracker.info' file (on all systems).
> >> >
> >> > On the target system, in the tasktracker log file, I get the following:
> >> >
> >> > 2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker: Got
> >> > heartbeatResponse from JobTracker with responseId: 641 and 1 actions
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker:
> >> > LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0
> >> > task's state:UNASSIGNED
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker:
> >> > Trying to launch : attempt_201207201103_0003_m_000006_0 which needs 1 slots
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In
> >> > TaskLauncher, current free slots : 2 and trying to launch
> >> > attempt_201207201103_0003_m_000006_0 which needs 1 slots
> >> > 2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker: Error
> >> > initializing attempt_201207201103_0003_m_000006_0:
> >> > java.io.FileNotFoundException: File
> >> > file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken
> >> > does not exist.
> >> >         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
> >> >         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> >> >         at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
> >> >         at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
> >> >         at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
> >> >         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
> >> >         at java.lang.Thread.run(Thread.java:636)
> >> >
> >> > 2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus:
> >> > Trying to set finish time for task attempt_201207201103_0003_m_000006_0
> >> > when no start time is set, stackTrace is : java.lang.Exception
> >> >         at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
> >> >         at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
> >> >         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
> >> >         at java.lang.Thread.run(Thread.java:636)
> >> >
> >> > On both systems, ownership of all files and directories under
> >> > /tmp/hadoop-hadoop is the user/group hadoop/hadoop.
> >> >
> >> > Any ideas?
> >> >
> >> > Thanks
> >> >
> >> > --
> >> > Steve Sonnenberg
> >>
> >> --
> >> Harsh J
> >
> > --
> > Steve Sonnenberg

> --
> Harsh J

--
Steve Sonnenberg
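For reference, the two fixes suggested in the thread above (make HDFS the default filesystem instead of file:///, or, if staying on file:///, keep hadoop.tmp.dir on the shared NFS mount rather than the node-local /tmp) would look roughly like this in a Hadoop 1.x conf/core-site.xml. This is a sketch, not a verbatim config from the thread: the hostname, port, and paths below are placeholders.

```xml
<!-- conf/core-site.xml (sketch; hostname, port, and paths are placeholders) -->
<configuration>
  <!-- Use HDFS as the default filesystem, so the JobTracker's
       mapred/system directory (including each job's jobToken file)
       is visible to every TaskTracker, not only to the node that
       wrote it. -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9000</value>
  </property>

  <!-- Alternatively, if staying on file:/// backed by NFS, this must
       point at the shared NFS mount; the default
       (/tmp/hadoop-${user.name}) is local to each node, which is why
       the TaskTrackers cannot find the jobToken written by the JT. -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/mnt/nfs/hadoop-tmp</value>
  </property>
</configuration>
```

After changing either property, restart the daemons and confirm that both nodes register, e.g. with `hadoop dfsadmin -report` and the TaskTracker count on the JobTracker web UI (port 50030).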