[ https://issues.apache.org/jira/browse/MAPREDUCE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13916964#comment-13916964 ]
Ming Ma commented on MAPREDUCE-5465:
------------------------------------

I discussed this with Ravi offline and will provide a patch for review soon. The basic approach is to define a new state, FINISHING_CONTAINER, in TaskAttemptStateInternal. A TaskAttempt transitions to this state after it receives the done notification from the task JVM over TaskUmbilicalProtocol, which gives the container a chance to exit on its own. Normally the attempt receives the container exit notification via the NM -> RM -> AM route; if the notification does not arrive in time, the attempt times out and cleans up the container via stopContainer.

> Container killed before hprof dumps profile.out
> -----------------------------------------------
>
>                 Key: MAPREDUCE-5465
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5465
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>    Affects Versions: 2.0.3-alpha
>            Reporter: Radim Kolar
>            Assignee: Ming Ma
>         Attachments: MAPREDUCE-5465.patch
>
>
> If profiling is enabled for a mapper or reducer, hprof dumps profile.out at
> process exit, after the task has signaled to the AM that its work is
> finished.
> The AM kills a container whose work is finished without waiting for hprof to
> finish the dump. If hprof is dumping larger output (for example with depth=4,
> while depth=3 works), it cannot finish the dump before being killed, which
> makes the entire dump unusable because the CPU and heap stats are missing.
> There needs to be a better delay before the container is killed if profiling
> is enabled.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
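
For readers following the comment above, the proposed FINISHING_CONTAINER behavior can be sketched roughly as a small event-driven state machine. This is only an illustration under stated assumptions, not the patch itself: the class TaskAttemptSketch, the event names TA_DONE, CONTAINER_EXITED and FINISHING_TIMEOUT, the SUCCEEDED end state, and the recorded action strings are all hypothetical stand-ins for the real MapReduce AM classes, timers and stopContainer call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the approach in the comment.
// None of these names come from the actual Hadoop patch.
class TaskAttemptSketch {
    enum State { RUNNING, FINISHING_CONTAINER, SUCCEEDED }
    enum Event { TA_DONE, CONTAINER_EXITED, FINISHING_TIMEOUT }

    private State state = State.RUNNING;
    // Records side effects instead of performing them, so the flow is easy to inspect.
    private final List<String> actions = new ArrayList<>();

    State state() { return state; }
    List<String> actions() { return actions; }

    void handle(Event event) {
        switch (state) {
            case RUNNING:
                if (event == Event.TA_DONE) {
                    // Task JVM reported done: do not kill the container yet,
                    // start a timer and wait for it to exit by itself.
                    actions.add("startTimer");
                    state = State.FINISHING_CONTAINER;
                }
                break;
            case FINISHING_CONTAINER:
                if (event == Event.CONTAINER_EXITED) {
                    // Exit notification arrived in time (NM -> RM -> AM).
                    actions.add("cancelTimer");
                    state = State.SUCCEEDED;
                } else if (event == Event.FINISHING_TIMEOUT) {
                    // No notification in time: fall back to explicit cleanup.
                    actions.add("stopContainer");
                    state = State.SUCCEEDED;
                }
                break;
            default:
                break; // terminal state: ignore further events
        }
    }

    public static void main(String[] args) {
        TaskAttemptSketch attempt = new TaskAttemptSketch();
        attempt.handle(Event.TA_DONE);
        attempt.handle(Event.FINISHING_TIMEOUT);
        System.out.println(attempt.state() + " " + attempt.actions());
    }
}
```

The point of the intermediate state is that stopContainer is issued only on the timeout path, so a container that is still writing profile.out is not killed as soon as the task reports done.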