Hi,

This problem is a killer! I've been struggling with it for about a month.
It doesn't happen all the time, but because of it the largest crawl I have
ever completed is about 1 million pages. I have three machines: 3
datanodes, 1 data replica, 1 job tracker. Here is what I get:

nameserver tasktracker log file:

060219 142405 task_r_125kgt 0.14583334% reduce > copy >
060219 142406 task_r_125kgt 0.14583334% reduce > copy >
060219 142407 task_m_grycae  Error running child
060219 142407 task_m_grycae java.io.IOException: timed out waiting for response
060219 142407 task_m_grycae     at org.apache.hadoop.ipc.Client.call(Client.java:303)
060219 142407 task_m_grycae     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
060219 142407 task_m_grycae     at org.apache.hadoop.mapred.$Proxy0.progress(Unknown Source)
060219 142407 task_m_grycae     at org.apache.hadoop.mapred.Task.reportProgress(Task.java:112)
060219 142407 task_m_grycae     at org.apache.hadoop.mapred.Task$1.setStatus(Task.java:93)
060219 142407 task_m_grycae     at org.apache.nutch.fetcher.Fetcher.reportStatus(Fetcher.java:276)
060219 142407 task_m_grycae     at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:325)
060219 142407 task_m_grycae     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:129)
060219 142407 task_m_grycae     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:637)
060219 142407 task_m_grycae 0.825607% 108745 pages, 5259 errors, 15.6pages/s, 2418 kb/s,
060219 142407 task_r_125kgt 0.14583334% reduce > copy >
060219 142408 task_m_grycae  Parent died.  Exiting task_m_grycae
060219 142408 task_r_125kgt 0.14583334% reduce > copy >
060219 142408 Server connection on port 50050 from xxxxxx: exiting
060219 142408 Server connection on port 50050 from xxxxxx: exiting
060219 142408 task_m_grycae Child Error
java.io.IOException: Task process exit with nonzero status.
        at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:144)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:97)
060219 142411 task_m_grycae done; removing files.
060219 142413 task_r_125kgt 0.14583334% reduce > copy >


Tasktracker log file from one of the datanodes:

060219 142611 task_m_2yfbgf  fetching http://codex.wordpress.org/Managing_Plugins
060219 142611 task_m_2yfbgf  fetching http://www.scubaboard.com/cms/search.php
060219 142611 task_m_2yfbgf Error reading child output
java.io.IOException: Bad file descriptor
        at java.io.FileInputStream.readBytes(Native Method)
        at java.io.FileInputStream.read(FileInputStream.java:194)
        at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java:411)
        at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java:453)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:183)
        at java.io.InputStreamReader.read(InputStreamReader.java:167)
        at java.io.BufferedReader.fill(BufferedReader.java:136)
        at java.io.BufferedReader.readLine(BufferedReader.java:299)
        at java.io.BufferedReader.readLine(BufferedReader.java:362)
        at org.apache.hadoop.mapred.TaskRunner.logStream(TaskRunner.java:170)
        at org.apache.hadoop.mapred.TaskRunner.access$100(TaskRunner.java:29)
        at org.apache.hadoop.mapred.TaskRunner$1.run(TaskRunner.java:137)
060219 142611 task_m_2yfbgf 0.019530244% 2170 pages, 61 errors, 12.3pages/s, 1975 kb/s,
060219 142612 Server connection on port 50051 from xxxxxx: exiting
060219 142612 Server connection on port 50051 from xxxxxx: exiting
060219 142612 task_m_2yfbgf Child Error
java.io.IOException: Task process exit with nonzero status.
        at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:144)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:97)
060219 142615 task_m_2yfbgf done; removing files.

The other datanode looks fine.
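
In case it helps anyone compare their own logs, here is a rough sketch of how
the lines logged just before each "Child Error" could be pulled out of a
tasktracker log. It is a standalone Python script, not part of Nutch or
Hadoop; the function name and the line pattern are assumptions based only on
the excerpts above, so the regular expression may need adjusting for other
log layouts.

```python
import re
from collections import defaultdict, deque

# Assumed line shape, taken from the excerpts above:
#   "YYMMDD HHMMSS task_<m|r>_<id> <message>"
TASK_LINE = re.compile(r"^(\d{6}) (\d{6}) (task_[mr]_\w+)\s*(.*)$")


def context_before_child_error(lines, keep=5):
    """Map each failed task id to the last `keep` messages logged before its 'Child Error'."""
    recent = defaultdict(lambda: deque(maxlen=keep))
    found = {}
    for line in lines:
        match = TASK_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # stack-trace continuation or a line without a task id
        _date, _time, task, message = match.groups()
        message = message.strip()
        if message == "Child Error":
            found[task] = list(recent[task])
        else:
            recent[task].append(message)
    return found


# Small sample built from the first log excerpt above.
sample = [
    "060219 142407 task_m_grycae  Error running child",
    "060219 142407 task_m_grycae java.io.IOException: timed out waiting for response",
    "060219 142407 task_r_125kgt 0.14583334% reduce > copy >",
    "060219 142408 task_m_grycae  Parent died.  Exiting task_m_grycae",
    "060219 142408 task_m_grycae Child Error",
]
result = context_before_child_error(sample)
for task, messages in result.items():
    print(task)
    for message in messages:
        print("   ", message)
```

To run it against a real log, pass an open file instead of the sample list,
e.g. `context_before_child_error(open(path))` where `path` is wherever your
tasktracker log lives.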


Thanks, Mike


On 2/16/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
>
> Gal Nitzan wrote:
> > During fetch all tasktrackers aborting the fetch with:
> >
> > task_m_b45ma2 Child Error
> > java.io.IOException: Task process exit with nonzero status.
> >         at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:144)
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:97)
> >
>
> What's reported just before this in this tasktracker's log?
>
> What's reported around this time in the jobtracker's log?
>
> Doug
>
