[ https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275375#comment-16275375 ]

Jason Lowe commented on MAPREDUCE-5124:
---------------------------------------

Sorry, Konstantin, I should have checked with you first.  Please feel free to 
revert it from 2.7.5, and apologies for the omitted CHANGES.txt.  I'll correct 
it when it's recommitted for 2.7.6.

I am pretty confident this is a safe change for 2.7 since it only changes the 
status update code path and doesn't affect the task start, killed, failed, or 
succeeded paths.  However, I totally understand the logic behind punting this to 
2.7.6.  The patch will get a chance to run on our 2.8 clusters at scale in the 
interim, which will increase our confidence in it by then.

> AM lacks flow control for task events
> -------------------------------------
>
>                 Key: MAPREDUCE-5124
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.3-alpha, 0.23.5
>            Reporter: Jason Lowe
>            Assignee: Peter Bacsko
>             Fix For: 2.7.5, 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4
>
>         Attachments: MAPREDUCE-5124-001.patch, MAPREDUCE-5124-002.patch, 
> MAPREDUCE-5124-003.patch, MAPREDUCE-5124-CoalescingPOC-1.patch, 
> MAPREDUCE-5124-CoalescingPOC2.patch, MAPREDUCE-5124-CoalescingPOC3.patch, 
> MAPREDUCE-5124-branch-2.001.patch, MAPREDUCE-5124-branch-2.002.patch, 
> MAPREDUCE-5124-branch-2.7.001.patch, MAPREDUCE-5124-branch-2.7.002.patch, 
> MAPREDUCE-5124-branch-2.7.002.patch, MAPREDUCE-5124-branch-2.8.001.patch, 
> MAPREDUCE-5124-branch-2.8.001.patch, MAPREDUCE-5124-branch-2.8.002.patch, 
> MAPREDUCE-5124-branch-2.8.002.patch, MAPREDUCE-5124-branch-2.9.001.patch, 
> MAPREDUCE-5124-branch-2.9.002.patch, MAPREDUCE-5124-proto.2.txt, 
> MAPREDUCE-5124-prototype.txt
>
>
> The AM does not have any flow control to limit the incoming rate of events 
> from tasks.  If the AM is unable to keep pace with the rate of incoming 
> events for a sufficient period of time then it will eventually exhaust the 
> heap and crash.  MAPREDUCE-5043 addressed a major bottleneck for event 
> processing, but the AM could still get behind if it's starved for CPU and/or 
> handling a very large job with tens of thousands of active tasks.



