[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero

2012-04-11 Thread Robert Joseph Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3932:
---

Status: Open  (was: Patch Available)

Thanks for the review, Sid.  I will update the comments and try to remove the 
unneeded token stuff.  I was lazy and did a copy and paste. 

 MR tasks failing and crashing the AM when available-resources/headRoom 
 becomes zero
 ---

 Key: MAPREDUCE-3932
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Robert Joseph Evans
Priority: Critical
 Fix For: 0.23.2

 Attachments: MR-3932.txt


 [~karams] reported this offline. One reduce task gets preempted because of 
 zero headRoom and crashes the AM.
 {code}
 2012-02-23 11:30:15,956 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: 
 PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 
 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 
 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 
 availableResources(headroom):memory: 44544
 2012-02-23 11:30:16,959 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
 Scheduling: PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 
 AssignedMaps:0 AssignedReduces:0 completedMaps:4 completedReduces:0 
 containersAllocated:4 containersReleased:0 hostLocalAssigned:0 
 rackLocalAssigned:4 availableResources(headroom):memory: 44544
 2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: 
 PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 
 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 
 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 
 availableResources(headroom):memory: 0
 2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Assign: 
 PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 
 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 
 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 
 availableResources(headroom):memory: 0
 2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated 
 containers 3
 2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned 
 container container_1329995034628_0983_01_06 to 
 attempt_1329995034628_0983_r_00_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned 
 container container_1329995034628_0983_01_07 to 
 attempt_1329995034628_0983_r_01_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned 
 container container_1329995034628_0983_01_08 to 
 attempt_1329995034628_0983_r_02_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Assign: 
 PendingReduces:377 ScheduledMaps:6 ScheduledReduces:20 AssignedMaps:0 
 AssignedReduces:3 completedMaps:4 completedReduces:0 containersAllocated:7 
 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 
 availableResources(headroom):memory: 0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all 
 scheduled reduces:20
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Going to preempt 2
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Preempting 
 attempt_1329995034628_0983_r_02_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Preempting 
 attempt_1329995034628_0983_r_01_0
 2012-02-23 11:30:16,966 INFO 

[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero

2012-04-11 Thread Robert Joseph Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3932:
---

Attachment: MR-3932.txt

This patch addresses Sid's comments.

 MR tasks failing and crashing the AM when available-resources/headRoom 
 becomes zero
 ---

 Key: MAPREDUCE-3932
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Robert Joseph Evans
Priority: Critical
 Fix For: 0.23.2

 Attachments: MR-3932.txt, MR-3932.txt



[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero

2012-04-11 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3932:
--

   Resolution: Fixed
Fix Version/s: (was: 0.23.2)
   0.23.3
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2 and branch-0.23.

 MR tasks failing and crashing the AM when available-resources/headRoom 
 becomes zero
 ---

 Key: MAPREDUCE-3932
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Robert Joseph Evans
Priority: Critical
 Fix For: 0.23.3

 Attachments: MR-3932.txt, MR-3932.txt



[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero

2012-04-10 Thread Robert Joseph Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3932:
---

Attachment: MR-3932.txt

I traced out the state machine for TaskAttemptImpl and verified that all 
states after ASSIGNED can handle TA_CONTAINER_LAUNCHED and/or 
TA_CONTAINER_LAUNCH_FAILED, depending on how they are returned.  I also looked 
at the JobImpl state machine and did a similar thing for the counter updates.
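
The kind of check described above can be sketched mechanically. The following is 
only an illustration under stated assumptions, not the code in the patch: the 
transition table is a hand-written toy, the state names other than ASSIGNED and 
the TA_KILL event are hypothetical, and the real TaskAttemptImpl builds its 
state machine differently. It reports, for each state being checked, which of 
the required events have no registered transition.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class TransitionCoverageCheck {

    // Hypothetical, reduced state set. Only ASSIGNED is named in the comment above.
    enum State { UNASSIGNED, ASSIGNED, RUNNING, KILLED }

    // The first two names come from the comment above; TA_KILL is illustrative.
    enum Event { TA_CONTAINER_LAUNCHED, TA_CONTAINER_LAUNCH_FAILED, TA_KILL }

    /**
     * For every state in statesToCheck, collects the required events that the
     * table does not list for that state. States with no gaps are left out.
     */
    static Map<State, Set<Event>> unhandled(Map<State, Set<Event>> table,
                                            Set<State> statesToCheck,
                                            Set<Event> required) {
        Map<State, Set<Event>> gaps = new EnumMap<>(State.class);
        for (State state : statesToCheck) {
            Set<Event> missing = EnumSet.noneOf(Event.class);
            missing.addAll(required);
            Set<Event> handled = table.get(state);
            if (handled != null) {
                missing.removeAll(handled);
            }
            if (!missing.isEmpty()) {
                gaps.put(state, missing);
            }
        }
        return gaps;
    }

    // Toy table (an assumption for the demo): KILLED deliberately lacks the launch events.
    static Map<State, Set<Event>> demoTable() {
        Map<State, Set<Event>> table = new EnumMap<>(State.class);
        table.put(State.UNASSIGNED, EnumSet.of(Event.TA_KILL));
        table.put(State.ASSIGNED, EnumSet.allOf(Event.class));
        table.put(State.RUNNING, EnumSet.allOf(Event.class));
        table.put(State.KILLED, EnumSet.of(Event.TA_KILL));
        return table;
    }

    public static void main(String[] args) {
        // States declared after ASSIGNED in the toy enum.
        Set<State> afterAssigned = EnumSet.range(State.RUNNING, State.KILLED);
        Set<Event> required = EnumSet.of(Event.TA_CONTAINER_LAUNCHED,
                                         Event.TA_CONTAINER_LAUNCH_FAILED);
        System.out.println(unhandled(demoTable(), afterAssigned, required));
    }
}
```

With the toy table, only KILLED is reported, because it is the one checked state 
with no transition for either launch event. An empty result would correspond to 
the outcome described in the comment above.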

 MR tasks failing and crashing the AM when available-resources/headRoom 
 becomes zero
 ---

 Key: MAPREDUCE-3932
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Fix For: 0.23.2

 Attachments: MR-3932.txt



[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero

2012-04-10 Thread Robert Joseph Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3932:
---

Assignee: Robert Joseph Evans  (was: Vinod Kumar Vavilapalli)
Target Version/s: 0.23.3
  Status: Patch Available  (was: Open)

 MR tasks failing and crashing the AM when available-resources/headRoom 
 becomes zero
 ---

 Key: MAPREDUCE-3932
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Robert Joseph Evans
Priority: Critical
 Fix For: 0.23.2

 Attachments: MR-3932.txt



[jira] [Updated] (MAPREDUCE-3932) MR tasks failing and crashing the AM when available-resources/headRoom becomes zero

2012-03-02 Thread Amol Kekre (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amol Kekre updated MAPREDUCE-3932:
--

Priority: Critical  (was: Major)

Making it critical

 MR tasks failing and crashing the AM when available-resources/headRoom 
 becomes zero
 ---

 Key: MAPREDUCE-3932
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3932
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Fix For: 0.23.2


 [~karams] reported this offline. One reduce task gets preempted because of 
 zero headRoom and crashes the AM.
 {code}
 2012-02-23 11:30:15,956 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: 
 PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 
 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 
 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 
 availableResources(headroom):memory: 44544
 2012-02-23 11:30:16,959 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
 Scheduling: PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 
 AssignedMaps:0 AssignedReduces:0 completedMaps:4 completedReduces:0 
 containersAllocated:4 containersReleased:0 hostLocalAssigned:0 
 rackLocalAssigned:4 availableResources(headroom):memory: 44544
 2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: 
 PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 
 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 
 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 
 availableResources(headroom):memory: 0
 2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Assign: 
 PendingReduces:377 ScheduledMaps:6 ScheduledReduces:23 AssignedMaps:0 
 AssignedReduces:0 completedMaps:4 completedReduces:0 containersAllocated:4 
 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 
 availableResources(headroom):memory: 0
 2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated 
 containers 3
 2012-02-23 11:30:16,965 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned 
 container container_1329995034628_0983_01_06 to 
 attempt_1329995034628_0983_r_00_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned 
 container container_1329995034628_0983_01_07 to 
 attempt_1329995034628_0983_r_01_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned 
 container container_1329995034628_0983_01_08 to 
 attempt_1329995034628_0983_r_02_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Assign: 
 PendingReduces:377 ScheduledMaps:6 ScheduledReduces:20 AssignedMaps:0 
 AssignedReduces:3 completedMaps:4 completedReduces:0 containersAllocated:7 
 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:4 
 availableResources(headroom):memory: 0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all 
 scheduled reduces:20
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Going to preempt 2
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Preempting 
 attempt_1329995034628_0983_r_02_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Preempting 
 attempt_1329995034628_0983_r_01_0
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating 
 schedule...
 2012-02-23 11:30:16,966 INFO [RMCommunicator Allocator]