[ https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687346#comment-13687346 ]
Xi Fang commented on MAPREDUCE-5330:
------------------------------------

If Signal.TERM is sent to a process, we then wait for a delay. On Windows, however, the signal kind is ignored and the process group is simply killed (see Shell#getSignalKillProcessGroupCommand()):
{code}
public static String[] getSignalKillProcessGroupCommand(int code, String groupId) {
  if (WINDOWS) {
    return new String[] { Shell.WINUTILS, "task", "kill", groupId };
  } else {
    return new String[] { "kill", "-" + code , "-" + groupId };
  }
}
{code}
Here is a fix. If the OS is Windows and the signal is TERM, we return immediately and let the delayed process killer actually kill the process group. This gives the process group a grace period in which to clean up after itself.

> Killing M/R JVM's leads to metrics not being uploaded
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-5330
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1-win
>         Environment: Windows
>            Reporter: Xi Fang
>            Assignee: Xi Fang
>         Attachments: MAPREDUCE-5330.patch
>
>
> In MapReduce, we sometimes kill a task's JVM before it shuts down naturally
> if we want to launch other tasks (see
> JvmManager$JvmManagerForType.reapJvm). This means that if the map task
> process is in the middle of cleanup/finalization after the task is done, it
> may be interrupted/killed without being given a chance to finish.
> In Microsoft's Hadoop Service, after a Map/Reduce task is done and while
> file systems are being closed in a special shutdown hook, we typically upload
> storage (ASV in our context) usage metrics to Microsoft Azure Tables. If
> this kill happens, these metrics are lost. The impact is that for many MR jobs
> we don't see accurate metrics reported most of the time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
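
The fix described in the comment above could be sketched roughly as follows. This is only an illustration of the idea, not the contents of MAPREDUCE-5330.patch: the class name GracefulKillSketch, the method killCommand, the WINUTILS stand-in string, and the use of Optional are all assumptions made for the example. The signal numbers 15 (TERM) and 9 (KILL) are the usual POSIX values.

```java
import java.util.Arrays;
import java.util.Optional;

/**
 * Hypothetical sketch of the Windows/TERM behaviour described in the comment.
 * None of these names come from the Hadoop code base or the attached patch.
 */
class GracefulKillSketch {
    static final int SIGTERM = 15;
    static final int SIGKILL = 9;

    /** Stand-in for Shell.WINUTILS, whose real value is resolved by Hadoop. */
    static final String WINUTILS = "winutils.exe";

    /**
     * Returns the command that would signal the process group, or an empty
     * Optional when nothing should be run yet.
     */
    static Optional<String[]> killCommand(boolean windows, int signal, String groupId) {
        if (windows) {
            if (signal == SIGTERM) {
                // On Windows the signal kind is ignored and the group is killed
                // outright, so a TERM is skipped here. The delayed process
                // killer is expected to issue the real kill later, which
                // leaves the group time to run its shutdown hooks.
                return Optional.empty();
            }
            return Optional.of(new String[] { WINUTILS, "task", "kill", groupId });
        }
        // On other platforms the signal number is passed through to kill(1).
        return Optional.of(new String[] { "kill", "-" + signal, "-" + groupId });
    }

    public static void main(String[] args) {
        System.out.println(killCommand(true, SIGTERM, "1234").isPresent());
        System.out.println(Arrays.toString(killCommand(true, SIGKILL, "1234").get()));
        System.out.println(Arrays.toString(killCommand(false, SIGTERM, "1234").get()));
    }
}
```

In this sketch the caller treats an empty result as "do nothing now". The later KILL from the delayed killer is what actually ends the process group on Windows.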