[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3932:
---
Status: Open (was: Patch Available)

Thanks for the review Sid. I will update the comments, and try to remove the unneeded token stuff. I was lazy and did a copy and paste.

MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
---
Key: MAPREDUCE-3932
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Robert Joseph Evans
Priority: Critical
Fix For: 0.23.2
Attachments: MR-3932.txt

[~karams] reported this offline. One reduce task gets preempted because of zero headRoom and crashes the AM.

{code}
2012-02-23 11:30:15,956 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 availableResources(headroom):memory: 44544
2012-02-23 11:30:16,959 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 availableResources(headroom):memory: 44544
2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 availableResources(headroom):memory: 0
2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Assign: PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 availableResources(headroom):memory: 0
2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 3
2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1329995034628_0983_01_06 to attempt_1329995034628_0983_r_00_0
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1329995034628_0983_01_07 to attempt_1329995034628_0983_r_01_0
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1329995034628_0983_01_08 to attempt_1329995034628_0983_r_02_0
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Assign: PendingReduces:377 ScheduledMaps:6 ScheduledReduces:20 AssignedMaps:0 AssignedReduces:3 completedMaps:4 completedReduces:0 containersAllocated:7 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 availableResources(headroom):memory: 0
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all scheduled reduces:20
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Going to preempt 2
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Preempting attempt_1329995034628_0983_r_02_0
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Preempting attempt_1329995034628_0983_r_01_0
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule...
2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator]
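
For anyone trying to find the same transition in their own AM logs, here is a minimal sketch (not part of the patch) that scans RMContainerAllocator statistics entries and reports where the headroom goes from a positive value to zero. It assumes one log entry per line in the layout quoted above, with a 23-character timestamp at the start; the names `parse_stats`, `headroom_drops`, and `SAMPLE` are made up for this illustration.

```python
import re

# One "name:number" counter, e.g. "ScheduledReduces:23".
COUNTER = re.compile(r"(\w+):(\d+)")
# The headroom field as it appears in the quoted log.
HEADROOM = re.compile(r"availableResources\(headroom\):memory:\s*(\d+)")
# The phase label that precedes the counters.
PHASE = re.compile(r"RMContainerAllocator: ((?:Before|After) (?:Scheduling|Assign)):")


def parse_stats(line):
    """Return (phase, counters, headroom) for a statistics entry, else None."""
    phase = PHASE.search(line)
    headroom = HEADROOM.search(line)
    if not phase or not headroom:
        return None
    body = line[phase.end():headroom.start()]
    counters = {name: int(value) for name, value in COUNTER.findall(body)}
    return phase.group(1), counters, int(headroom.group(1))


def headroom_drops(lines):
    """Yield (timestamp, phase) whenever headroom falls from > 0 to 0."""
    previous = None
    for line in lines:
        parsed = parse_stats(line)
        if parsed is None:
            continue  # not a statistics entry
        phase, _, headroom = parsed
        if previous is not None and previous > 0 and headroom == 0:
            yield line[:23], phase  # assumes the entry starts with the timestamp
        previous = headroom


# Three entries copied from the log in this issue.
PREFIX = ("INFO [RMCommunicator Allocator] "
          "org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ")
COUNTS = ("PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 "
          "AssignedReduces:0 completedMaps:4 completedReduces:0 "
          "containersAllocated:4 containersReleased:0 hostLocalAssigned:0 "
          "rackLocalAssigned:4 ")
SAMPLE = [
    "2012-02-23 11:30:16,959 " + PREFIX + "Before Scheduling: " + COUNTS
    + "availableResources(headroom):memory: 44544",
    "2012-02-23 11:30:16,965 " + PREFIX + "After Scheduling: " + COUNTS
    + "availableResources(headroom):memory: 0",
    "2012-02-23 11:30:16,965 " + PREFIX + "Got allocated containers 3",
]

for stamp, phase in headroom_drops(SAMPLE):
    print(stamp, phase)
```

The sketch only looks at the headroom value, so it flags the moment the allocator saw zero headroom; the preemption entries that follow still have to be read by hand.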
[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3932:
---
Attachment: MR-3932.txt

This patch addresses Sid's comments.

MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
---
Key: MAPREDUCE-3932
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Robert Joseph Evans
Priority: Critical
Fix For: 0.23.2
Attachments: MR-3932.txt, MR-3932.txt
[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3932:
--
Resolution: Fixed
Fix Version/s: 0.23.3 (was: 0.23.2)
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Committed to trunk, branch-2 and branch-0.23.

MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
---
Key: MAPREDUCE-3932
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Robert Joseph Evans
Priority: Critical
Fix For: 0.23.3
Attachments: MR-3932.txt, MR-3932.txt
[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3932:
---
Attachment: MR-3932.txt

I traced out the state machine for TaskAttemptImpl and verified that all states after ASSIGNED can handle TA_CONTAINER_LAUNCHED and/or TA_CONTAINER_LAUNCH_FAILED, depending on how they are returned. I also looked at the JobImpl state machine and did a similar thing for the counter updates.

MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
---
Key: MAPREDUCE-3932
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
Fix For: 0.23.2
Attachments: MR-3932.txt
[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3932:
---
Assignee: Robert Joseph Evans (was: Vinod Kumar Vavilapalli)
Target Version/s: 0.23.3
Status: Patch Available (was: Open)

MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
---
Key: MAPREDUCE-3932
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Robert Joseph Evans
Priority: Critical
Fix For: 0.23.2
Attachments: MR-3932.txt
[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amol Kekre updated MAPREDUCE-3932:
--
Priority: Critical (was: Major)

Making it critical

MR tasks failing and crashing the AM when available-resources/headRoom becomes zero
---
Key: MAPREDUCE-3932
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
Fix For: 0.23.2