Re: Could Not Find file.out.index (Help starting Hadoop!)
08/11/14 12:06:19 INFO mapred.JobClient: HDFS bytes read=10104027
08/11/14 12:06:19 INFO mapred.JobClient: HDFS bytes written=1299408
08/11/14 12:06:19 INFO mapred.JobClient: Local bytes read=4082248
08/11/14 12:06:19 INFO mapred.JobClient: Local bytes written=6548037
08/11/14 12:06:19 INFO mapred.JobClient: Map-Reduce Framework
08/11/14 12:06:19 INFO mapred.JobClient: Reduce input groups=82301
08/11/14 12:06:19 INFO mapred.JobClient: Combine output records=102297
08/11/14 12:06:19 INFO mapred.JobClient: Map input records=77934
08/11/14 12:06:19 INFO mapred.JobClient: Reduce output records=82301
08/11/14 12:06:19 INFO mapred.JobClient: Map output bytes=6076246
08/11/14 12:06:19 INFO mapred.JobClient: Map input bytes=3593680
08/11/14 12:06:19 INFO mapred.JobClient: Combine input records=629186
08/11/14 12:06:19 INFO mapred.JobClient: Map output records=629186
08/11/14 12:06:19 INFO mapred.JobClient: Reduce input records=102297

However, I do get the following error in the jobtracker log:

2008-11-14 12:05:55,663 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.RuntimeException: Not a host:port pair: local

And this in the tasktracker log:

2008-11-14 12:05:56,042 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.RuntimeException: Not a host:port pair: local

KevinAWorkman wrote:
[...]
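For what it's worth, "Not a host:port pair: local" is the message the JobTracker and TaskTracker raise when mapred.job.tracker resolves to its default value of "local" (in-process job execution) instead of a host:port address, which suggests the daemons are not picking up the hadoop-site.xml shown below. A minimal sketch of the entry the standalone daemons expect (the port is simply the one quoted elsewhere in this thread):

```xml
<!-- hadoop-site.xml sketch: mapred.job.tracker must be a host:port pair
     for a standalone JobTracker/TaskTracker; the default "local" runs
     jobs in-process and triggers "Not a host:port pair: local". -->
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
</property>
```

If this entry already exists, it may be worth checking that the daemons are actually reading that configuration directory when they start.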
Could Not Find file.out.index (Help starting Hadoop!)
Hello everybody,

I'm sorry if this has already been covered somewhere else, but I've been searching the web for weeks to no avail. :(

Anyway, I am attempting to set up a single-node cluster following the directions here: http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster) . I get everything set up fine and attempt to start the first example program (running the wordcount job). I format the namenode, run start-all, and copy the files from local. I then try to execute the example job, but this is the output:

[EMAIL PROTECTED] hadoop-0.18.1]$ bin/hadoop jar hadoop-0.18.1-examples.jar wordcount books booksOutput
08/11/13 18:21:41 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/13 18:21:41 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/13 18:21:42 INFO mapred.JobClient: Running job: job_200811131821_0001
08/11/13 18:21:43 INFO mapred.JobClient: map 0% reduce 0%
08/11/13 18:21:49 INFO mapred.JobClient: map 66% reduce 0%
08/11/13 18:21:52 INFO mapred.JobClient: map 100% reduce 0%
08/11/13 18:21:52 INFO mapred.JobClient: Task Id : attempt_200811131821_0001_m_01_0, Status : FAILED
Map output lost, rescheduling: getMapOutput(attempt_200811131821_0001_m_01_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200811131821_0001/attempt_200811131821_0001_m_01_0/output/file.out.index in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
    at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2402)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
    at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
    at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
    at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
    at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
    at org.mortbay.http.HttpServer.service(HttpServer.java:954)
    at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
    at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
    at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
    at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
    at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
    at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

So apparently the map output is being lost, and reduce cannot find it. Upon viewing the logs, I also find this error in the secondary namenode log file:

2008-11-13 17:41:40,518 WARN org.mortbay.jetty.servlet.WebApplicationContext: Configuration error on file:/home/staff/hadoop/hadoop-0.18.1/webapps/secondary
java.io.FileNotFoundException: file:/home/staff/hadoop/hadoop-0.18.1/webapps/secondary
    at org.mortbay.jetty.servlet.WebApplicationContext.resolveWebApp(WebApplicationContext.java:266)
    at org.mortbay.jetty.servlet.WebApplicationContext.doStart(WebApplicationContext.java:449)
    at org.mortbay.util.Container.start(Container.java:72)
    at org.mortbay.http.HttpServer.doStart(HttpServer.java:753)
    at org.mortbay.util.Container.start(Container.java:72)
    at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:207)
    at org.apache.hadoop.dfs.SecondaryNameNode.initialize(SecondaryNameNode.java:156)
    at org.apache.hadoop.dfs.SecondaryNameNode.init(SecondaryNameNode.java:108)
    at org.apache.hadoop.dfs.SecondaryNameNode.main(SecondaryNameNode.java:460)

I have the following defined in hadoop-site:

<property>
  <name>hadoop.tmp.dir</name>
  <value>tmp</value>
  <description>A base for other temporary directories</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs at. If local, then jobs are run in-process as a single map and reduce task.</description>
</property>
<property>
  <name>dfs.replication</name>
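One detail in the config above may be relevant to the file.out.index lookup: the TaskTracker searches for map output under the directories in mapred.local.dir, which by default derives from hadoop.tmp.dir (${hadoop.tmp.dir}/mapred/local). A relative value like "tmp" is resolved against whatever working directory each daemon happens to start in, so different daemons can end up looking in different places. A hedged sketch using an absolute base directory (the path itself is hypothetical, taken from the style of the tutorial linked above):

```xml
<!-- Sketch only: absolute base directory; /app/hadoop/tmp is a
     hypothetical path. mapred.local.dir defaults to
     ${hadoop.tmp.dir}/mapred/local, which is where the TaskTracker
     looks for file.out.index. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories</description>
</property>
```

The directory should exist and be writable by the user running the Hadoop daemons.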