[ 
https://issues.apache.org/jira/browse/HADOOP-6498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802343#action_12802343
 ] 

Vinod K V commented on HADOOP-6498:
-----------------------------------

Though the fix is a step in the right direction for avoiding client hangs, we 
should first fix the original problem. A hanging heartbeat suggests some kind 
of uncaught exception, such as an NPE, on the JobTracker side, which, if left 
unhandled, may leave it in an inconsistent state. Can you dig through the JT 
logs and see whether any such issue occurred?
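
To make the suspected failure mode concrete, here is a minimal sketch. It is 
not Hadoop's IPC code: HeartbeatSketch, handle() and call() are hypothetical 
names, and the timeout is an assumption added only so the example terminates. 
It shows that a handler which dies on an exception without replying leaves the 
caller waiting, while a handler that reports the exception unblocks it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical stand-in for an RPC client/server pair; not Hadoop's API.
final class HeartbeatSketch {

    // Runs the "server-side" work on its own thread and returns the pending response.
    static CompletableFuture<String> handle(Supplier<String> work, boolean reportErrors) {
        CompletableFuture<String> response = new CompletableFuture<>();
        Thread handler = new Thread(() -> {
            try {
                response.complete(work.get());
            } catch (RuntimeException e) {
                if (reportErrors) {
                    // The failure is sent back, so the caller wakes up with an error.
                    response.completeExceptionally(e);
                }
                // Otherwise the handler ends without replying: the response never completes.
            }
        });
        handler.setDaemon(true);
        handler.start();
        return response;
    }

    // Returns "ok:<value>", "error:<exception class>", "timeout" or "interrupted".
    static String call(Supplier<String> work, boolean reportErrors, long timeoutMs) {
        try {
            // A bounded wait; an unbounded get() would block forever in the silent case.
            return "ok:" + handle(work, reportErrors).get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            return "error:" + e.getCause().getClass().getSimpleName();
        } catch (TimeoutException e) {
            return "timeout";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(call(() -> "heartbeat-response", true, 500));
        System.out.println(call(() -> { throw new NullPointerException(); }, true, 500));
        System.out.println(call(() -> { throw new NullPointerException(); }, false, 200));
    }
}
```

In this sketch the bounded wait only hides the symptom on the caller's side. 
The third call still reflects a handler that failed silently, which is why the 
server-side logs are worth checking first.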

> IPC client  bug may cause rpc call hang
> ---------------------------------------
>
>                 Key: HADOOP-6498
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6498
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 0.18.3, 0.19.0, 0.19.1, 0.19.2, 0.20.0, 0.20.1
>            Reporter: Ruyue Ma
>            Assignee: Ruyue Ma
>            Priority: Critical
>             Fix For: 0.21.0
>
>         Attachments: hadoop-6498.patch
>
>
> I can reproduce an RPC call hang when the connection thread of the IPC client 
> receives the response for an outstanding call. 
> The stack when the hang occurs (TaskTracker):
>   Waiting on org.apache.hadoop.ipc.client$c...@1c3cbb4b
>   Stack:
>     java.lang.Object.wait(Native Method)
>     java.lang.Object.wait(Object.java:485)
>     org.apache.hadoop.ipc.Client.call(Client.java:691)
>     org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>     org.apache.hadoop.mapred.$Proxy4.heartbeat(Unknown Source)
>     org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1250)
>     org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1082)
>     org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1785)
>     org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2796)

