[ https://issues.apache.org/jira/browse/MAPREDUCE-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hitesh Shah updated MAPREDUCE-3084:
-----------------------------------

    Attachment: MR-3084.wip.patch

Attaching a more-or-less working version that fixes the issue. Handling the launched event in the KILLING state is effectively a no-op, as the container cleanup event is always handled after a container launch event. The patch effectively ensures that the container either does not come up if it has not yet started, or is killed if it has. This requires changes in hadoop-common to get around the async nature of the launches.

Sid/Vinod, please take a look and let me know if you see something wrong or missing. Given the slightly complex nature of this change, I decided not to incorporate the other missing state transitions into this patch and will instead open a separate JIRA for those.

> race when KILL_CONTAINER is received for a LOCALIZED container
> --------------------------------------------------------------
>
>                 Key: MAPREDUCE-3084
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3084
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
>            Assignee: Hitesh Shah
>            Priority: Blocker
>         Attachments: MR-3084.wip.patch
>
>
> Depending on when ContainersLaunch starts a container, {{KILL_CONTAINER}}
> when container state is {{LOCALIZED}} ({{LAUNCH_CONTAINER}} event already
> sent) can end up generating a {{CONTAINER_LAUNCHED}} event - which isn't
> handled by ContainerState: {{KILLING}}. Also, the launched container won't be
> killed since {{CLEANUP_CONTAINER}} would have already been processed.
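
For readers following the thread, the race and the approach described in the comment can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the code in MR-3084.wip.patch: the class KillRaceSketch, its Container model, the launchCancelled/processStarted/processKilled flags, and all method names are hypothetical, and the real NodeManager dispatches these events asynchronously and would need proper synchronization. The sketch replays the two possible orderings and shows the container either never coming up or being killed, with CONTAINER_LAUNCHED treated as a no-op in KILLING.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the race; these are NOT the real NodeManager classes.
class KillRaceSketch {

    enum State { LOCALIZED, RUNNING, KILLING }

    static class Container {
        State state = State.LOCALIZED;
        boolean launchCancelled = false; // set by cleanup when the process is not up yet
        boolean processStarted = false;  // set by the launcher once the process exists
        boolean processKilled = false;   // set by cleanup when it kills a live process
        final List<String> trace = new ArrayList<>();

        // KILL_CONTAINER: enter KILLING, then run cleanup.
        void onKillContainer() {
            state = State.KILLING;
            onCleanupContainer();
        }

        // CLEANUP_CONTAINER: kill the process if it is up, otherwise block the launch.
        void onCleanupContainer() {
            if (processStarted) {
                processKilled = true;
                trace.add("cleanup: killed running process");
            } else {
                launchCancelled = true;
                trace.add("cleanup: cancelled pending launch");
            }
        }

        // Launcher side: start the process unless cleanup got there first.
        // Returns true if the process was started.
        boolean startProcess() {
            if (launchCancelled) {
                trace.add("launcher: launch skipped");
                return false;
            }
            processStarted = true;
            trace.add("launcher: process started");
            return true;
        }

        // CONTAINER_LAUNCHED: a no-op in KILLING, since cleanup deals with the process.
        void onContainerLaunched() {
            if (state == State.KILLING) {
                trace.add("CONTAINER_LAUNCHED ignored in KILLING");
                return;
            }
            state = State.RUNNING;
        }
    }

    // Ordering A: the kill is processed before the launcher starts the process.
    static Container killBeforeLaunch() {
        Container c = new Container();
        c.onKillContainer();
        if (c.startProcess()) {
            c.onContainerLaunched();
        }
        return c;
    }

    // Ordering B: the process starts, but the kill is processed
    // before CONTAINER_LAUNCHED is delivered.
    static Container killAfterLaunch() {
        Container c = new Container();
        boolean started = c.startProcess();
        c.onKillContainer();
        if (started) {
            c.onContainerLaunched();
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println(killBeforeLaunch().trace);
        System.out.println(killAfterLaunch().trace);
    }
}
```

In both orderings the container ends in KILLING with no live process left behind, which is the property the comment describes. The cancellation flag is just one way to model "does not come up if it has not yet"; the actual patch may achieve this differently.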