[ https://issues.apache.org/jira/browse/MAPREDUCE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040755#comment-13040755 ]
Binglin Chang commented on MAPREDUCE-2364:
------------------------------------------

We encountered the same problem: when the TaskTracker downloads and unjars a very large job.jar in localizeJob(), it stops sending heartbeats and its web service hangs as well.

Our solution is to add a new lock, called localizing, to the RunningJob class. Instead of holding the whole rjob lock during localization, only rjob.localizing is locked.

> Shouldn't hold lock on rjob while localizing resources.
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-2364
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2364
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.203.0
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>             Fix For: 0.20.203.0
>
>
> There is a deadlock while localizing resources on the TaskTracker.
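
The locking change described in the comment can be sketched roughly as follows. This is only an illustration, not the actual patch: the RunningJob class here is a hypothetical stand-in for the TaskTracker's class, the localized and heartbeatsSeen fields and the heartbeat() method are assumptions made for the demo, and the slow download and unjar step is simulated with a latch.

```java
import java.util.concurrent.CountDownLatch;

public class LocalizeLockSketch {

    /** Hypothetical stand-in for the TaskTracker's per-job state. */
    static class RunningJob {
        final Object localizing = new Object(); // guards only the localization step
        boolean localized = false;              // guarded by the RunningJob monitor
        int heartbeatsSeen = 0;                 // guarded by the RunningJob monitor
    }

    /** Holds rjob.localizing for the slow work and takes the rjob lock only briefly. */
    static void localizeJob(RunningJob rjob, CountDownLatch started, CountDownLatch release)
            throws InterruptedException {
        synchronized (rjob.localizing) {
            synchronized (rjob) {
                if (rjob.localized) {
                    return; // another thread already localized this job
                }
            }
            started.countDown();
            release.await(); // stands in for downloading and unjarring job.jar
            synchronized (rjob) {
                rjob.localized = true;
            }
        }
    }

    /** Stand-in for any other code path that needs the rjob lock, such as a heartbeat. */
    static void heartbeat(RunningJob rjob) {
        synchronized (rjob) {
            rjob.heartbeatsSeen++;
        }
    }

    /** Returns true if a heartbeat completed while localization was still in progress. */
    static boolean heartbeatRunsDuringLocalization() {
        RunningJob rjob = new RunningJob();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread localizer = new Thread(() -> {
            try {
                localizeJob(rjob, started, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            localizer.start();
            started.await();   // localization is now in its slow phase
            heartbeat(rjob);   // does not block: only rjob.localizing is held
            boolean duringLocalization;
            synchronized (rjob) {
                duringLocalization = !rjob.localized && rjob.heartbeatsSeen == 1;
            }
            release.countDown(); // let the simulated download finish
            localizer.join();
            synchronized (rjob) {
                return duringLocalization && rjob.localized;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(heartbeatRunsDuringLocalization());
    }
}
```

If localizeJob() held the rjob monitor for the whole slow phase instead, the heartbeat() call in this demo would block until localization finished, which is the hang described above.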