Re: Is there any way to debug the hadoop job in eclipse

2009-06-06 Thread Guanghua
This link may help you.
http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms

2009/6/7 jason hadoop 

> The chapters are available for download now.
>
> On Sat, Jun 6, 2009 at 3:33 AM, zhang jianfeng  wrote:
>
> > Is there any resource on the internet that I can get as soon as possible?
> >
> >
> >
> > On Fri, Jun 5, 2009 at 6:43 PM, jason hadoop 
> > wrote:
> >
> > > Chapter 7 of my book goes into the details of how to debug with Eclipse
> > >
> > > On Fri, Jun 5, 2009 at 3:40 AM, zhang jianfeng 
> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Some jobs I submit to Hadoop failed, but I cannot see what the
> > > > problem is. Is there any way to debug the Hadoop job in Eclipse,
> > > > such as remote debugging?
> > > >
> > > > Or are there other ways to find the reason the job failed? I did not
> > > > find enough information in the JobTracker.
> > > >
> > > > Thank you.
> > > >
> > > > Jeff Zhang
> > > >
> > >
> > >
> > >
> > > --
> > > Alpha Chapters of my book on Hadoop are available
> > > http://www.apress.com/book/view/9781430219422
> > > www.prohadoopbook.com a community for Hadoop Professionals
> > >
> >
>
>
>
> --
>  Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
> www.prohadoopbook.com a community for Hadoop Professionals
>
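[Editor's note] One common approach, sketched here under the assumption of 0.18-era configuration names: pass JDWP options to the child task JVM via mapred.child.java.opts, then attach Eclipse's "Remote Java Application" debug configuration to the task node. The port (8000) is arbitrary.

```xml
<!-- hadoop-site.xml (or set per-job via JobConf.set): start each task's
     child JVM suspended, listening for a debugger on port 8000 -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000</value>
</property>
```

Caveat: with suspend=y every task JVM blocks until a debugger attaches, so this is only practical when the job runs a single task at a time. For whole-job debugging inside Eclipse, setting mapred.job.tracker to "local" runs the entire job in one JVM that can be launched directly under the Eclipse debugger.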


Every time the mapping phase finishes I see this

2009-06-06 Thread Mayuran Yogarajah
There are always a few 'Failed/Killed Task Attempts', and when I view the
logs for these I see:

- some that are empty, i.e. the stdout/stderr/syslog logs are all blank
- several that say:

2009-06-06 20:47:15,309 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: Filesystem closed
    at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:195)
    at org.apache.hadoop.dfs.DFSClient.access$600(DFSClient.java:59)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.close(DFSClient.java:1359)
    at java.io.FilterInputStream.close(FilterInputStream.java:159)
    at org.apache.hadoop.mapred.LineRecordReader$LineReader.close(LineRecordReader.java:103)
    at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:301)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.close(MapTask.java:173)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:231)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2198)



Any idea why this happens? I don't understand why I'd be seeing these
only as the mappers get to 100%.

thanks
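[Editor's note] A frequent cause of this exact stack, with the failure surfacing only when LineRecordReader.close() runs at map 100%, is user code calling FileSystem.close() on the instance returned by FileSystem.get(): that instance is cached and shared JVM-wide, so closing it also closes it for the framework's record reader. The class below (CachedFs) is a toy model of that cache behavior, not the Hadoop API; a minimal sketch:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

public class FsCloseDemo {
    // Toy stand-in for Hadoop's FileSystem cache: FileSystem.get() hands
    // every caller in the JVM the SAME cached instance per filesystem URI.
    static class CachedFs {
        private static final Map<String, CachedFs> CACHE =
                new HashMap<String, CachedFs>();
        private boolean open = true;

        static synchronized CachedFs get(String uri) {
            CachedFs fs = CACHE.get(uri);
            if (fs == null) { fs = new CachedFs(); CACHE.put(uri, fs); }
            return fs;
        }

        void close() { open = false; } // closes the shared instance for everyone

        InputStream open() throws IOException {
            // cf. DFSClient.checkOpen in the stack trace above
            if (!open) throw new IOException("Filesystem closed");
            return new ByteArrayInputStream(new byte[0]);
        }
    }

    public static void main(String[] args) {
        CachedFs mine = CachedFs.get("hdfs://namenode:8020");    // user code in the mapper
        CachedFs readers = CachedFs.get("hdfs://namenode:8020"); // the record reader's handle
        mine.close(); // mapper cleanup closes the shared instance...
        try {
            readers.open(); // ...so the reader's close-time I/O fails
            System.out.println("ok");
        } catch (IOException e) {
            System.out.println(e.getMessage()); // "Filesystem closed"
        }
    }
}
```

If this is the cause, the fix is simply not to close the FileSystem obtained from the cache; the framework closes it at task teardown.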


Re: Is there any way to debug the hadoop job in eclipse

2009-06-06 Thread jason hadoop
The chapters are available for download now.

On Sat, Jun 6, 2009 at 3:33 AM, zhang jianfeng  wrote:

> Is there any resource on the internet that I can get as soon as possible?
>
>
>
> On Fri, Jun 5, 2009 at 6:43 PM, jason hadoop 
> wrote:
>
> > Chapter 7 of my book goes into the details of how to debug with Eclipse
> >
> > On Fri, Jun 5, 2009 at 3:40 AM, zhang jianfeng  wrote:
> >
> > > Hi all,
> > >
> > > Some jobs I submit to Hadoop failed, but I cannot see what the
> > > problem is. Is there any way to debug the Hadoop job in Eclipse,
> > > such as remote debugging?
> > >
> > > Or are there other ways to find the reason the job failed? I did not
> > > find enough information in the JobTracker.
> > >
> > > Thank you.
> > >
> > > Jeff Zhang
> > >
> >
> >
> >
> > --
> > Alpha Chapters of my book on Hadoop are available
> > http://www.apress.com/book/view/9781430219422
> > www.prohadoopbook.com a community for Hadoop Professionals
> >
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals


Re: Is there any way to debug the hadoop job in eclipse

2009-06-06 Thread zhang jianfeng
Is there any resource on the internet that I can get as soon as possible?



On Fri, Jun 5, 2009 at 6:43 PM, jason hadoop  wrote:

> Chapter 7 of my book goes into the details of how to debug with Eclipse
>
> On Fri, Jun 5, 2009 at 3:40 AM, zhang jianfeng  wrote:
>
> > Hi all,
> >
> > Some jobs I submit to Hadoop failed, but I cannot see what the
> > problem is. Is there any way to debug the Hadoop job in Eclipse,
> > such as remote debugging?
> >
> > Or are there other ways to find the reason the job failed? I did not
> > find enough information in the JobTracker.
> >
> > Thank you.
> >
> > Jeff Zhang
> >
>
>
>
> --
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
> www.prohadoopbook.com a community for Hadoop Professionals
>


ReduceTask: No Route To Host

2009-06-06 Thread asif md
Now there's this error showing up.
When I run a job on my 2-node cluster, it hangs at:

[ ~]$ hadoop jar $HADOOP_HOME/hadoop-0.18.3-examples.jar wordcount gutenberg
gutenberg-output

09/06/06 01:50:54 INFO mapred.FileInputFormat: Total input paths to process
: 6
09/06/06 01:50:54 INFO mapred.FileInputFormat: Total input paths to process
: 6
09/06/06 01:50:54 INFO mapred.JobClient: Running job: job_200906060149_0001
09/06/06 01:50:55 INFO mapred.JobClient:  map 0% reduce 0%
09/06/06 01:51:01 INFO mapred.JobClient:  map 33% reduce 0%
09/06/06 01:51:02 INFO mapred.JobClient:  map 66% reduce 0%
09/06/06 01:51:03 INFO mapred.JobClient:  map 100% reduce 0%
***
SYSLOG of reduce:

[~]$ cat
/home/utdhadoop1/Hadoop/hadoop-0.18.3/logs/userlogs/attempt_200906060156_0001_r_00_0/syslog
2009-06-06 01:56:52,838 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=SHUFFLE, sessionId=
2009-06-06 01:56:53,254 INFO org.apache.hadoop.mapred.ReduceTask:
ShuffleRamManager: MemoryLimit=78643200, MaxSingleShuffleLimit=19660800
2009-06-06 01:56:53,259 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200906060156_0001_r_00_0 Thread started: Thread for merging
on-disk files
2009-06-06 01:56:53,259 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200906060156_0001_r_00_0 Thread waiting: Thread for merging
on-disk files
2009-06-06 01:56:53,260 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200906060156_0001_r_00_0 Thread started: Thread for merging in
memory files
2009-06-06 01:56:53,260 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200906060156_0001_r_00_0 Need another 6 map output(s) where 0 is
already in progress
2009-06-06 01:56:53,264 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200906060156_0001_r_00_0: Got 4 new map-outputs & number of
known map outputs is 4
2009-06-06 01:56:53,265 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200906060156_0001_r_00_0 Scheduled 2 of 4 known outputs (0 slow
hosts and 2 dup hosts)
2009-06-06 01:56:53,428 WARN org.apache.hadoop.mapred.ReduceTask:
attempt_200906060156_0001_r_00_0 copy failed:
attempt_200906060156_0001_m_00_0 from **
2009-06-06 01:56:53,428 WARN org.apache.hadoop.mapred.ReduceTask:
java.net.NoRouteToHostException: No route to host
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1360)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1354)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1008)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1143)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1084)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:997)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:946)
Caused by: java.net.NoRouteToHostException: No route to host
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:519)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:158)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
    at sun.net.www.http.HttpClient.New(HttpClient.java:306)
    at sun.net.www.http.HttpClient.New(HttpClient.java:323)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:852)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:793)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:718)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1041)
    ... 4 more

2009-06-06 01:56:54,264 INFO org.apache.hadoop.mapred.ReduceTask: Task
attempt_200906060156_0001_r_00_0: Failed fetch #1 from
attempt_200906060156_0001_m_00_0
2009-06-06 01:56:54,264 WARN org.apache.hadoop.mapred.ReduceTask:
attempt_200906060156_0001_r_00_0 adding host ** to penalty box,
n
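
[Editor's note] In this trace the shuffle is fetching map output over HTTP from the other node's TaskTracker, and NoRouteToHostException at socket-connect time almost always indicates a firewall (e.g. iptables) or routing/hosts-file problem between the slave nodes rather than a Hadoop bug. A minimal probe sketch; PortProbe is a hypothetical helper, and 50060 is the assumed default TaskTracker HTTP port for this Hadoop version:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class PortProbe {
    // Try a plain TCP connect, the same first step the shuffle's HTTP
    // fetch performs; NoRouteToHostException here points at the network
    // configuration, not at Hadoop.
    static String probe(String host, int port, int timeoutMs) {
        Socket s = new Socket();
        try {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return "reachable";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        } finally {
            try { s.close(); } catch (IOException ignored) { }
        }
    }

    public static void main(String[] args) throws IOException {
        // On a real cluster: java PortProbe <mapper-host> 50060,
        // run from the reducer's node. Self-test here against a
        // listener we open ourselves:
        ServerSocket ss = new ServerSocket(0);
        try {
            System.out.println(probe("127.0.0.1", ss.getLocalPort(), 1000));
        } finally {
            ss.close();
        }
    }
}
```

If the probe fails between nodes, the usual suspects are iptables rules blocking the port and /etc/hosts entries that resolve a node's hostname to 127.0.0.1, so the address advertised to the other node is unreachable.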