[ https://issues.apache.org/jira/browse/MAPREDUCE-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved MAPREDUCE-1144.
-----------------------------------------
    Resolution: Won't Fix

> JT should not hold lock while writing user history logs to DFS
> --------------------------------------------------------------
>
>                 Key: MAPREDUCE-1144
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1144
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.20.1
>            Reporter: Todd Lipcon
>         Attachments: MAPREDUCE-1144-branch-1.2.patch
>
>
> I've seen behavior a few times now where the DFS is being slow for one reason
> or another, and the JT essentially locks up waiting on it while one thread
> tries for a long time to write history files out. The stack trace blocking
> everything is:
> Thread 210 (IPC Server handler 10 on 7277):
>   State: WAITING
>   Blocked count: 171424
>   Waited count: 1209604
>   Waiting on java.util.LinkedList@407dd154
>   Stack:
>     java.lang.Object.wait(Native Method)
>     java.lang.Object.wait(Object.java:485)
>     org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(DFSClient.java:3122)
>     org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3202)
>     org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3151)
>     org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:67)
>     org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
>     sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:301)
>     sun.nio.cs.StreamEncoder.close(StreamEncoder.java:130)
>     java.io.OutputStreamWriter.close(OutputStreamWriter.java:216)
>     java.io.BufferedWriter.close(BufferedWriter.java:248)
>     java.io.PrintWriter.close(PrintWriter.java:295)
>     org.apache.hadoop.mapred.JobHistory$JobInfo.logFinished(JobHistory.java:1349)
>     org.apache.hadoop.mapred.JobInProgress.jobComplete(JobInProgress.java:2167)
>     org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:2111)
>     org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:873)
>     org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:3598)
>     org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:2792)
>     org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2581)
>     sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> We should try not to do external IO while holding the JT lock, and instead
> write the data to an in-memory buffer, drop the lock, and then write.
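
The approach suggested in the report (buffer in memory, drop the lock, then
write) could look roughly like the sketch below. This is not the attached
patch and does not use Hadoop's JobTracker or JobHistory classes. The class
BufferedHistoryWriter, the trackerLock field, the slowSink stream, and the
"JOB_FINISHED" record format are all hypothetical stand-ins chosen for
illustration; slowSink plays the role of a DFS output stream that may block.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only: formats a history record under a lock, writes it after unlocking. */
public class BufferedHistoryWriter {
    // Hypothetical stand-in for the global JT lock.
    private final ReentrantLock trackerLock = new ReentrantLock();
    // Hypothetical stand-in for a stream backed by a possibly slow DFS.
    private final OutputStream slowSink;
    // Shared state that must only be read or changed under trackerLock.
    private String jobState = "RUNNING";

    public BufferedHistoryWriter(OutputStream slowSink) {
        this.slowSink = slowSink;
    }

    public void jobComplete(String jobId) throws IOException {
        byte[] record;
        trackerLock.lock();
        try {
            // Update shared state and snapshot it into memory.
            // Nothing in this block touches external I/O.
            jobState = "SUCCEEDED";
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (PrintWriter writer = new PrintWriter(
                    new OutputStreamWriter(buffer, StandardCharsets.UTF_8))) {
                writer.print("JOB_FINISHED " + jobId + " " + jobState + "\n");
            }
            record = buffer.toByteArray();
        } finally {
            trackerLock.unlock();
        }
        // The tracker lock is released here, so a slow sink only delays
        // this caller. A separate monitor keeps each record contiguous.
        synchronized (slowSink) {
            slowSink.write(record);
            slowSink.flush();
        }
    }

    /** Exposed so callers and tests can check the lock is not held during I/O. */
    public boolean holdsTrackerLock() {
        return trackerLock.isHeldByCurrentThread();
    }
}
```

One trade-off of this sketch: once the lock is dropped before the write,
records from different threads may reach the sink in a different order than
the state changes happened under the lock. A real fix would need to decide
whether that ordering matters for history files.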