Re: Could Not Find file.out.index (Help starting Hadoop!)
If I replace the mapred.job.tracker in hadoop-site with local, then the job seems to work:

[EMAIL PROTECTED] hadoop-0.18.1]$ bin/hadoop jar hadoop-0.18.1-examples.jar wordcount books booksOutput
08/11/14 12:06:13 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
08/11/14 12:06:13 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/14 12:06:13 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/14 12:06:13 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/14 12:06:13 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/14 12:06:14 INFO mapred.JobClient: Running job: job_local_0001
08/11/14 12:06:14 INFO mapred.MapTask: numReduceTasks: 1
08/11/14 12:06:14 INFO mapred.MapTask: io.sort.mb = 100
08/11/14 12:06:14 INFO mapred.MapTask: data buffer = 79691776/99614720
08/11/14 12:06:14 INFO mapred.MapTask: record buffer = 262144/327680
08/11/14 12:06:14 INFO mapred.MapTask: Starting flush of map output
08/11/14 12:06:14 INFO mapred.MapTask: bufstart = 0; bufend = 1086784; bufvoid = 99614720
08/11/14 12:06:14 INFO mapred.MapTask: kvstart = 0; kvend = 109855; length = 327680
08/11/14 12:06:14 INFO mapred.MapTask: Index: (0, 267034, 267034)
08/11/14 12:06:14 INFO mapred.MapTask: Finished spill 0
08/11/14 12:06:15 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hadoop/books/one.txt:0+662001
08/11/14 12:06:15 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_00_0' done.
08/11/14 12:06:15 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_00_0' to hdfs://localhost:54310/user/hadoop/booksOutput
08/11/14 12:06:15 INFO mapred.JobClient: map 100% reduce 0%
08/11/14 12:06:15 INFO mapred.MapTask: numReduceTasks: 1
08/11/14 12:06:15 INFO mapred.MapTask: io.sort.mb = 100
08/11/14 12:06:15 INFO mapred.MapTask: data buffer = 79691776/99614720
08/11/14 12:06:15 INFO mapred.MapTask: record buffer = 262144/327680
08/11/14 12:06:15 INFO mapred.MapTask: Spilling map output: buffer full = false and record full = true
08/11/14 12:06:15 INFO mapred.MapTask: bufstart = 0; bufend = 2545957; bufvoid = 99614720
08/11/14 12:06:15 INFO mapred.MapTask: kvstart = 0; kvend = 262144; length = 327680
08/11/14 12:06:15 INFO mapred.MapTask: Starting flush of map output
08/11/14 12:06:16 INFO mapred.MapTask: Index: (0, 717078, 717078)
08/11/14 12:06:16 INFO mapred.MapTask: Finished spill 0
08/11/14 12:06:16 INFO mapred.MapTask: bufstart = 2545957; bufend = 2601773; bufvoid = 99614720
08/11/14 12:06:16 INFO mapred.MapTask: kvstart = 262144; kvend = 267975; length = 327680
08/11/14 12:06:16 INFO mapred.MapTask: Index: (0, 23156, 23156)
08/11/14 12:06:16 INFO mapred.MapTask: Finished spill 1
08/11/14 12:06:16 INFO mapred.Merger: Merging 2 sorted segments
08/11/14 12:06:16 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 740234 bytes
08/11/14 12:06:16 INFO mapred.MapTask: Index: (0, 740232, 740232)
08/11/14 12:06:16 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hadoop/books/three.txt:0+1539989
08/11/14 12:06:16 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_01_0' done.
08/11/14 12:06:16 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_01_0' to hdfs://localhost:54310/user/hadoop/booksOutput
08/11/14 12:06:17 INFO mapred.MapTask: numReduceTasks: 1
08/11/14 12:06:17 INFO mapred.MapTask: io.sort.mb = 100
08/11/14 12:06:17 INFO mapred.MapTask: data buffer = 79691776/99614720
08/11/14 12:06:17 INFO mapred.MapTask: record buffer = 262144/327680
08/11/14 12:06:17 INFO mapred.MapTask: Starting flush of map output
08/11/14 12:06:17 INFO mapred.MapTask: bufstart = 0; bufend = 2387689; bufvoid = 99614720
08/11/14 12:06:17 INFO mapred.MapTask: kvstart = 0; kvend = 251356; length = 327680
08/11/14 12:06:18 INFO mapred.MapTask: Index: (0, 466648, 466648)
08/11/14 12:06:18 INFO mapred.MapTask: Finished spill 0
08/11/14 12:06:18 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hadoop/books/two.txt:0+1391690
08/11/14 12:06:18 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_02_0' done.
08/11/14 12:06:18 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_02_0' to hdfs://localhost:54310/user/hadoop/booksOutput
08/11/14 12:06:18 INFO mapred.ReduceTask: Initiating final on-disk merge with 3 files
08/11/14 12:06:18 INFO mapred.Merger: Merging 3 sorted segments
08/11/14 12:06:18 INFO mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 1473914 bytes
08/11/14 12:06:18 INFO mapred.LocalJobRunner: reduce reduce
08/11/14 12:06:18 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_00_0' done.
08/11/14 12:06:18 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_r_00_0' to hdfs://localhost:54310/user/hadoop/booksOutput
08/11/14 12:06:19 INFO mapred.JobClient: Job complete: job_local_0001
08/11/14 12:06:19 INFO mapred.JobClient: Counters: 13
08/11/14 12:06:19 INFO mapred.JobClient: File
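For reference, the only thing I changed in hadoop-site was this one property (just a sketch of the relevant entry; everything else stayed the same as in the config quoted below):

<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
  <!-- with "local", jobs run in-process via the LocalJobRunner as a single
       map and reduce task, instead of going through the JobTracker at
       localhost:54311 -->
</property>

So the problem only seems to show up when the job actually goes through the JobTracker/TaskTracker on localhost:54311.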
Could Not Find file.out.index (Help starting Hadoop!)
Hello everybody, I’m sorry if this has already been covered somewhere else, but I’ve been searching the web for weeks to no avail. :( Anyway, I am attempting to set up a single-node cluster following the directions here: http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster) . I get everything set up fine and attempt to run the first example program (the wordcount job). I format the namenode, start-all, and copy the files from local. I then try to execute the example job, but this is the output:

[EMAIL PROTECTED] hadoop-0.18.1]$ bin/hadoop jar hadoop-0.18.1-examples.jar wordcount books booksOutput
08/11/13 18:21:41 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/13 18:21:41 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/13 18:21:42 INFO mapred.JobClient: Running job: job_200811131821_0001
08/11/13 18:21:43 INFO mapred.JobClient: map 0% reduce 0%
08/11/13 18:21:49 INFO mapred.JobClient: map 66% reduce 0%
08/11/13 18:21:52 INFO mapred.JobClient: map 100% reduce 0%
08/11/13 18:21:52 INFO mapred.JobClient: Task Id : attempt_200811131821_0001_m_01_0, Status : FAILED
Map output lost, rescheduling: getMapOutput(attempt_200811131821_0001_m_01_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200811131821_0001/attempt_200811131821_0001_m_01_0/output/file.out.index in any of the configured local directories
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2402)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
        at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
        at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
        at org.mortbay.http.HttpServer.service(HttpServer.java:954)
        at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
        at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
        at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
        at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
        at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
        at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

So apparently the map output is being lost, and reduce cannot find it.
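In case it matters, these are roughly the commands I ran beforehand (a sketch of the steps from the tutorial; the local source path is just a placeholder for wherever the three text files live):

  bin/hadoop namenode -format
  bin/start-all.sh
  # /path/to/books is a placeholder for the local directory holding the input files
  bin/hadoop dfs -copyFromLocal /path/to/books books
  bin/hadoop jar hadoop-0.18.1-examples.jar wordcount books booksOutput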
Upon viewing the logs, I also find this error in the secondary namenode log file:

2008-11-13 17:41:40,518 WARN org.mortbay.jetty.servlet.WebApplicationContext: Configuration error on file:/home/staff/hadoop/hadoop-0.18.1/webapps/secondary
java.io.FileNotFoundException: file:/home/staff/hadoop/hadoop-0.18.1/webapps/secondary
        at org.mortbay.jetty.servlet.WebApplicationContext.resolveWebApp(WebApplicationContext.java:266)
        at org.mortbay.jetty.servlet.WebApplicationContext.doStart(WebApplicationContext.java:449)
        at org.mortbay.util.Container.start(Container.java:72)
        at org.mortbay.http.HttpServer.doStart(HttpServer.java:753)
        at org.mortbay.util.Container.start(Container.java:72)
        at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:207)
        at org.apache.hadoop.dfs.SecondaryNameNode.initialize(SecondaryNameNode.java:156)
        at org.apache.hadoop.dfs.SecondaryNameNode.init(SecondaryNameNode.java:108)
        at org.apache.hadoop.dfs.SecondaryNameNode.main(SecondaryNameNode.java:460)

I have the following defined in hadoop-site:

<property>
  <name>hadoop.tmp.dir</name>
  <value>tmp</value>
  <description>A base for other temporary directories</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs at. If local, then jobs are run in-process as a single map and reduce task.</description>
</property>

<property>
  <name>dfs.replication</name>