Re: Could Not Find file.out.index (Help starting Hadoop!)

2008-11-14 Thread KevinAWorkman
08/11/14 12:06:19 INFO mapred.JobClient:   File Systems
08/11/14 12:06:19 INFO mapred.JobClient:     HDFS bytes read=10104027
08/11/14 12:06:19 INFO mapred.JobClient:     HDFS bytes written=1299408
08/11/14 12:06:19 INFO mapred.JobClient:     Local bytes read=4082248
08/11/14 12:06:19 INFO mapred.JobClient:     Local bytes written=6548037
08/11/14 12:06:19 INFO mapred.JobClient:   Map-Reduce Framework
08/11/14 12:06:19 INFO mapred.JobClient:     Reduce input groups=82301
08/11/14 12:06:19 INFO mapred.JobClient:     Combine output records=102297
08/11/14 12:06:19 INFO mapred.JobClient:     Map input records=77934
08/11/14 12:06:19 INFO mapred.JobClient:     Reduce output records=82301
08/11/14 12:06:19 INFO mapred.JobClient:     Map output bytes=6076246
08/11/14 12:06:19 INFO mapred.JobClient:     Map input bytes=3593680
08/11/14 12:06:19 INFO mapred.JobClient:     Combine input records=629186
08/11/14 12:06:19 INFO mapred.JobClient:     Map output records=629186
08/11/14 12:06:19 INFO mapred.JobClient:     Reduce input records=102297

However, I do get the following error in the jobtracker log:

2008-11-14 12:05:55,663 FATAL org.apache.hadoop.mapred.JobTracker:
java.lang.RuntimeException: Not a host:port pair: local

And this in the tasktracker log:

2008-11-14 12:05:56,042 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.RuntimeException: Not a host:port pair: local
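
A note for anyone who lands here, offered tentatively from the logs alone:
in 0.18.x the shipped default for mapred.job.tracker (in hadoop-default.xml)
is the literal string "local", and both daemons parse that property as a
host:port pair at startup.  "Not a host:port pair: local" therefore usually
means the JobTracker and TaskTracker were started without seeing the
hadoop-site.xml override (localhost:54311 in the config quoted below).  It
would also explain why the wordcount output above still appeared: with
"local", the job runs in-process via the LocalJobRunner, no daemons needed.
A quick check, assuming the stock tarball layout:

# Sketch only: which conf dir do the start scripts read, and what value
# of mapred.job.tracker will the daemons see there?
cd /home/staff/hadoop/hadoop-0.18.1        # install dir from the logs
CONF_DIR="${HADOOP_CONF_DIR:-$PWD/conf}"   # conf dir the daemons will use
grep -A 1 'mapred.job.tracker' "$CONF_DIR/hadoop-site.xml"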




KevinAWorkman wrote:
 
 [original message snipped; it is quoted in full below]
 
Could Not Find file.out.index (Help starting Hadoop!)

2008-11-13 Thread KevinAWorkman

Hello everybody,

I’m sorry if this has already been covered somewhere else, but I’ve been
searching the web for weeks to no avail. :(

Anyway, I am attempting to set up a single-node cluster following the
directions here:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster).
I get everything set up fine and then attempt to run the first example (the
wordcount job).  I format the namenode, run start-all.sh, and copy the input
files from the local filesystem.  I then try to execute the example job, but
this is the output:

[EMAIL PROTECTED] hadoop-0.18.1]$ bin/hadoop jar hadoop-0.18.1-examples.jar wordcount books booksOutput
08/11/13 18:21:41 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/13 18:21:41 INFO mapred.FileInputFormat: Total input paths to process : 3
08/11/13 18:21:42 INFO mapred.JobClient: Running job: job_200811131821_0001
08/11/13 18:21:43 INFO mapred.JobClient:  map 0% reduce 0%
08/11/13 18:21:49 INFO mapred.JobClient:  map 66% reduce 0%
08/11/13 18:21:52 INFO mapred.JobClient:  map 100% reduce 0%
08/11/13 18:21:52 INFO mapred.JobClient: Task Id : attempt_200811131821_0001_m_01_0, Status : FAILED
Map output lost, rescheduling:
getMapOutput(attempt_200811131821_0001_m_01_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200811131821_0001/attempt_200811131821_0001_m_01_0/output/file.out.index in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
    at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2402)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
    at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
    at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
    at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
    at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
    at org.mortbay.http.HttpServer.service(HttpServer.java:954)
    at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
    at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
    at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
    at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
    at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
    at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)



So apparently the map output is being lost, and the reduce cannot find it.
Upon viewing the logs, I also find this error in the secondary namenode log
file:

2008-11-13 17:41:40,518 WARN org.mortbay.jetty.servlet.WebApplicationContext: Configuration error on file:/home/staff/hadoop/hadoop-0.18.1/webapps/secondary
java.io.FileNotFoundException: file:/home/staff/hadoop/hadoop-0.18.1/webapps/secondary
    at org.mortbay.jetty.servlet.WebApplicationContext.resolveWebApp(WebApplicationContext.java:266)
    at org.mortbay.jetty.servlet.WebApplicationContext.doStart(WebApplicationContext.java:449)
    at org.mortbay.util.Container.start(Container.java:72)
    at org.mortbay.http.HttpServer.doStart(HttpServer.java:753)
    at org.mortbay.util.Container.start(Container.java:72)
    at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:207)
    at org.apache.hadoop.dfs.SecondaryNameNode.initialize(SecondaryNameNode.java:156)
    at org.apache.hadoop.dfs.SecondaryNameNode.init(SecondaryNameNode.java:108)
    at org.apache.hadoop.dfs.SecondaryNameNode.main(SecondaryNameNode.java:460)
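
A tentative note on this last warning: Jetty is looking for the secondary
namenode's status web app in webapps/secondary under the Hadoop install and
not finding it.  As far as I can tell this only breaks the secondary
namenode's web page, not job execution, so it is probably a separate issue
from the lost map output.  A quick check (path taken from the trace):

# Does the secondary namenode's web app directory exist in the install?
ls -d /home/staff/hadoop/hadoop-0.18.1/webapps/*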

I have the following defined in hadoop-site:

<property>
  <name>hadoop.tmp.dir</name>
  <value>tmp</value>
  <description>A base for other temporary directories</description>
</property>
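
One observation on that first property, flagged as a possibility rather
than a confirmed diagnosis: the value tmp is a relative path, so each
daemon resolves it against whatever working directory it was started from.
If the daemons are not all started from the same directory, map output and
the servlet that serves it can end up looking under different tmp/ trees,
which would produce exactly the file.out.index error above.  A small
illustration (paths are examples only):

# A relative hadoop.tmp.dir resolves differently per working directory:
cd /home/staff/hadoop/hadoop-0.18.1 && readlink -f tmp   # one location
cd /home/staff && readlink -f tmp                        # a different one
# An absolute value (example path only) removes the ambiguity, e.g.:
#   <value>/home/staff/hadoop/hadoop-datastore</value>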

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
 
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>
 
<property>
  <name>dfs.replication</name>