[ 
https://issues.apache.org/jira/browse/HADOOP-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy resolved HADOOP-2167.
-----------------------------------

    Resolution: Cannot Reproduce

We haven't seen this, nor can we reproduce it. Also, HADOOP-2216 led us 
astray...

I'm closing this for now; please re-open if required.

> Reduce tips complete 100%, but job does not complete saying reduces still 
> running.
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-2167
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2167
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amareshwari Sri Ramadasu
>            Assignee: Arun C Murthy
>            Priority: Critical
>             Fix For: 0.16.0
>
>
> The job's reduces are stuck at 99.43% progress with 2 reduces in the running 
> state, and the job does not complete. 
> However, the reduce task list on the job tracker shows them as 100% complete 
> and marked SUCCEEDED, and a finish time is available in jobtasks.jsp and in 
> the job history as well.
> With ipc.client.timeout = 600000, the TTs running the reduces log the 
> exceptions below.
> On one of the TTs, the logs show the following:
> 2007-11-07 08:34:16,092 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200711070637_0001_r_000150_0 is done.
> 2007-11-07 08:35:34,013 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200711070637_0001_r_000156_0 is done.
> 2007-11-07 08:42:44,751 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.SocketTimeoutException: timedout waiting for rpc response
>         at org.apache.hadoop.ipc.Client.call(Client.java:484)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>         at org.apache.hadoop.mapred.$Proxy0.heartbeat(Unknown Source)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:897)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:799)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1193)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2055)
> 2007-11-07 08:42:44,767 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to .................
> On the other TT,
> 2007-11-07 08:40:30,484 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200711070637_0001_r_000160_0 is done.
> 2007-11-07 08:42:45,508 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.SocketTimeoutException: timedout waiting for rpc response
>         at org.apache.hadoop.ipc.Client.call(Client.java:484)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>         at org.apache.hadoop.mapred.$Proxy0.heartbeat(Unknown Source)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:897)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:799)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1193)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2055)
> 2007-11-07 08:42:45,508 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to ..........
> In the JT logs, the reduce tasks complete successfully:
> 2007-11-07 06:39:09,151 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200711070637_0001_r_000160_0' to tip tip_200711070637_0001_r_000160, for tracker 'x'
> 2007-11-07 08:42:45,708 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200711070637_0001_r_000160_0' to 'y'
> 2007-11-07 08:42:45,708 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200711070637_0001_r_000160_0' has completed tip_200711070637_0001_r_000160 successfully.
> This suggests that when tasks finish before the timeout, the problem occurs 
> in the progress update. The behavior is not consistent, however, since other 
> reduce tasks in the same situation complete successfully.
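
The correlation made in the quoted report (tasks logged as done, followed by an RPC timeout on the same TaskTracker) can be checked mechanically. The following is a minimal sketch only: the class HeartbeatGapScan and its scan method are hypothetical and not part of Hadoop, and it assumes each log entry sits on a single, unwrapped line in the format shown in the excerpts above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Hadoop) for reading TaskTracker logs. */
final class HeartbeatGapScan {
    // Timestamp layout as it appears in the quoted log lines.
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");
    private static final Pattern DONE = Pattern.compile(
            "^(\\S+ \\S+) INFO org\\.apache\\.hadoop\\.mapred\\.TaskTracker: "
            + "Task (\\S+) is done\\.");
    private static final Pattern TIMEOUT = Pattern.compile(
            "^(\\S+ \\S+) ERROR org\\.apache\\.hadoop\\.mapred\\.TaskTracker: "
            + "Caught exception: java\\.net\\.SocketTimeoutException");

    /** For each task reported done, measure the gap to the next RPC timeout. */
    static List<String> scan(List<String> lines) {
        List<String> pendingIds = new ArrayList<>();
        List<LocalDateTime> pendingTimes = new ArrayList<>();
        List<String> report = new ArrayList<>();
        for (String line : lines) {
            Matcher done = DONE.matcher(line);
            if (done.find()) {
                // Remember the task until a timeout line shows up.
                pendingTimes.add(LocalDateTime.parse(done.group(1), TS));
                pendingIds.add(done.group(2));
                continue;
            }
            Matcher timeout = TIMEOUT.matcher(line);
            if (timeout.find()) {
                LocalDateTime at = LocalDateTime.parse(timeout.group(1), TS);
                for (int i = 0; i < pendingIds.size(); i++) {
                    long secs = Duration.between(pendingTimes.get(i), at).getSeconds();
                    report.add(pendingIds.get(i) + " finished " + secs
                            + "s before an RPC timeout");
                }
                pendingIds.clear();
                pendingTimes.clear();
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Sample input copied from the first TaskTracker excerpt in the report.
        List<String> sample = Arrays.asList(
            "2007-11-07 08:34:16,092 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200711070637_0001_r_000150_0 is done.",
            "2007-11-07 08:35:34,013 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200711070637_0001_r_000156_0 is done.",
            "2007-11-07 08:42:44,751 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.SocketTimeoutException: timedout waiting for rpc response");
        for (String entry : scan(sample)) {
            System.out.println(entry);
        }
    }
}
```

The sketch only lists tasks whose completion was logged shortly before a timed-out RPC. It does not determine whether the JobTracker actually received the final status, which is the open question in this issue.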

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
