[ https://issues.apache.org/jira/browse/MAPREDUCE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron T. Myers moved HADOOP-9970 to MAPREDUCE-5512:
---------------------------------------------------

    Affects Version/s:     (was: 1.3.0)
                       1.3.0
                  Key: MAPREDUCE-5512  (was: HADOOP-9970)
              Project: Hadoop Map/Reduce  (was: Hadoop Common)

> TaskTracker hung after failed reconnect to the JobTracker
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-5512
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5512
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1.3.0
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>         Attachments: hadoop-tasktracker-RD00155DD09100.log, tt_Hung.txt
>
>
> TaskTracker hung after failed reconnect to the JobTracker.
> This is the problematic piece of code:
> {code}
>     this.distributedCacheManager = new TrackerDistributedCacheManager(
>         this.fConf, taskController);
>     this.distributedCacheManager.startCleanupThread();
>
>     this.jobClient = (InterTrackerProtocol)
>     UserGroupInformation.getLoginUser().doAs(
>         new PrivilegedExceptionAction<Object>() {
>       public Object run() throws IOException {
>         return RPC.waitForProxy(InterTrackerProtocol.class,
>             InterTrackerProtocol.versionID,
>             jobTrackAddr, fConf);
>       }
>     });
> {code}
> If RPC.waitForProxy() throws, the TrackerDistributedCacheManager cleanup
> thread is never stopped, and because it is a non-daemon thread it keeps the
> TT process up forever.
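
The failure mode described in the report (a non-daemon thread started before a call that may throw) can be reproduced and guarded against without any Hadoop classes. The following is a minimal sketch, not the patch attached to this issue: `CleanupWorker`, `ReconnectSketch` and `initialize` are hypothetical names, and the `Callable` stands in for the `RPC.waitForProxy()` call.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical stand-in for the cache manager's cleanup thread (not a Hadoop class). */
class CleanupWorker {
    private final Thread thread;
    private volatile boolean running = true;

    CleanupWorker() {
        thread = new Thread(() -> {
            while (running) {
                try {
                    Thread.sleep(50); // pretend to do periodic cleanup work
                } catch (InterruptedException e) {
                    return; // interrupted by stop()
                }
            }
        }, "cleanup");
        // Deliberately left non-daemon, as in the report: while it runs, the JVM cannot exit.
    }

    void start() { thread.start(); }

    void stop() throws InterruptedException {
        running = false;
        thread.interrupt();
        thread.join(); // wait until the thread has really finished
    }

    boolean isAlive() { return thread.isAlive(); }
}

class ReconnectSketch {
    /** Starts the worker, then connects; stops the worker again if connecting fails. */
    static <T> T initialize(CleanupWorker worker, Callable<T> connect) throws Exception {
        worker.start();
        try {
            return connect.call();
        } catch (Exception e) {
            worker.stop(); // without this, the non-daemon thread outlives the failure
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        CleanupWorker worker = new CleanupWorker();
        try {
            // Simulated connection failure (assumption: the real call throws IOException).
            initialize(worker, () -> { throw new IOException("JobTracker unreachable"); });
        } catch (IOException e) {
            System.out.println("connect failed, cleanup thread alive: " + worker.isAlive());
        }
    }
}
```

With the `worker.stop()` call in place, `main` returns and the JVM exits. Removing that call leaves the `cleanup` thread running, which is the hang described above. Other plausible remedies, again only as assumptions about what a fix could look like, are starting the cleanup thread after the connection succeeds or marking the thread as a daemon with `Thread.setDaemon(true)`.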