[ https://issues.apache.org/jira/browse/MAPREDUCE-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Varun Saxena updated MAPREDUCE-6514:
------------------------------------
    Target Version/s: 2.7.2

> Update ask to indicate to RM that it need not allocate for ramped down
> reducers
> -------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6514
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6514
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 2.7.1
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>            Priority: Critical
>
> In RMContainerAllocator#preemptReducesIfNeeded, we simply clear the scheduled
> reduces map and move these reducers to pending. This change is not reflected
> in the ask, so the RM keeps assigning containers and the AM is not able to
> assign them because no reducer is scheduled (see the logs below the code).
> If the ask is updated immediately, the RM will be able to schedule mappers
> immediately, which is the intention when we ramp down reducers.
> If this is not handled, it can lead to map starvation, as pointed out in
> MAPREDUCE-6513.
> {code}
>       LOG.info("Ramping down all scheduled reduces:"
>           + scheduledRequests.reduces.size());
>       for (ContainerRequest req : scheduledRequests.reduces.values()) {
>         pendingReduces.add(req);
>       }
>       scheduledRequests.reduces.clear();
> {code}
> {noformat}
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not
> assigned : container_1437451211867_1485_01_000215
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign
> container Container: [ContainerId: container_1437451211867_1485_01_000216,
> NodeId: hdszzdcxdat6g06u04p:26009, NodeHttpAddress:
> hdszzdcxdat6g06u04p:26010, Resource: <memory:4096, vCores:1>, Priority: 10,
> Token: Token { kind: ContainerToken, service: 10.2.33.236:26009 }, ] for a
> reduce as either container memory less than required 4096 or no pending
> reduce tasks - reduces.isEmpty=true
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not
> assigned : container_1437451211867_1485_01_000216
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign
> container Container: [ContainerId: container_1437451211867_1485_01_000217,
> NodeId: hdszzdcxdat6g06u06p:26009, NodeHttpAddress:
> hdszzdcxdat6g06u06p:26010, Resource: <memory:4096, vCores:1>, Priority: 10,
> Token: Token { kind: ContainerToken, service: 10.2.33.239:26009 }, ] for a
> reduce as either container memory less than required 4096 or no pending
> reduce tasks - reduces.isEmpty=true
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
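
The issue description says the scheduled reduces are cleared on the AM side while the ask sent to the RM is left unchanged. The following is a minimal Java sketch of that difference. It is a toy model, not Hadoop code and not the committed fix: ToyReduceAllocator, its fields, and its method names are all hypothetical, and the ask is reduced to a single counter instead of the resource request structure the real allocator maintains.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: every name in this class is hypothetical, not a Hadoop API.
class ToyReduceAllocator {
    // Number of reduce containers the RM still believes this AM wants.
    private int outstandingReduceAsk = 0;
    private final List<String> scheduledReduces = new ArrayList<>();
    private final Deque<String> pendingReduces = new ArrayDeque<>();

    // Scheduling a reduce records it locally and raises the ask by one.
    void scheduleReduce(String taskId) {
        scheduledReduces.add(taskId);
        outstandingReduceAsk++;
    }

    // Behaviour described in the report: reducers go back to pending,
    // but the ask is untouched, so the RM keeps allocating for them.
    void rampDownWithoutAskUpdate() {
        pendingReduces.addAll(scheduledReduces);
        scheduledReduces.clear();
    }

    // Behaviour the report asks for: each ramped-down reducer also
    // lowers the ask, so the RM stops allocating reduce containers.
    void rampDownWithAskUpdate() {
        for (String taskId : scheduledReduces) {
            pendingReduces.add(taskId);
            outstandingReduceAsk--;
        }
        scheduledReduces.clear();
    }

    int outstandingReduceAsk() { return outstandingReduceAsk; }
    int scheduledReduceCount() { return scheduledReduces.size(); }
    int pendingReduceCount() { return pendingReduces.size(); }

    public static void main(String[] args) {
        ToyReduceAllocator stale = new ToyReduceAllocator();
        stale.scheduleReduce("reduce_0");
        stale.scheduleReduce("reduce_1");
        stale.rampDownWithoutAskUpdate();
        System.out.println("ask after ramp down without update: "
            + stale.outstandingReduceAsk());

        ToyReduceAllocator updated = new ToyReduceAllocator();
        updated.scheduleReduce("reduce_0");
        updated.scheduleReduce("reduce_1");
        updated.rampDownWithAskUpdate();
        System.out.println("ask after ramp down with update: "
            + updated.outstandingReduceAsk());
    }
}
```

In the first case the toy ask stays at the number of reducers that were scheduled even though none remain scheduled, which corresponds to the "reduces.isEmpty=true" situation in the logs above. In the second case the ask drops to zero together with the scheduled set.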