[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261050#comment-16261050
 ] 

Jason Lowe commented on MAPREDUCE-5124:
---------------------------------------

Thanks for updating the patch!  Sorry for the delay.

bq. We already know that counters are always sent from MR tasks, but what about 
Tez?

Sorry for the confusion.  I was referring to Tez as an example of a framework 
that has counters but doesn't send them on every task update.  I didn't mean to 
imply Tez is calling the MR code directly.  We can remove the code that checks 
if the counters are null to simplify things since you found they can't be null 
in practice for other reasons.

This earlier comment was not addressed.  Addressing it would cover the case of 
a rogue task trying to heartbeat:
{quote}
The listener should check if we fail to look up the ref for the specified 
attempt ID and throw an exception with a useful message.
{quote}
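
For illustration, the check could look roughly like the sketch below.  The 
class, the map, and the String-based attempt IDs are hypothetical stand-ins 
rather than the actual TaskAttemptListenerImpl code, and IOException is an 
assumed choice of exception type.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the listener; not the real TaskAttemptListenerImpl.
class StatusListenerSketch {
  // Assumed structure: one status reference per registered attempt ID.
  private final Map<String, AtomicReference<String>> attemptIdToStatus =
      new ConcurrentHashMap<>();

  void registerAttempt(String attemptId) {
    attemptIdToStatus.put(attemptId, new AtomicReference<>());
  }

  void statusUpdate(String attemptId, String status) throws IOException {
    AtomicReference<String> ref = attemptIdToStatus.get(attemptId);
    if (ref == null) {
      // Unknown attempt (e.g. a rogue task): fail with a message that
      // names the offending ID instead of hitting an NPE later.
      throw new IOException(
          "Status update received for unknown task attempt: " + attemptId);
    }
    ref.set(status);
  }

  // Returns the last status stored for the attempt, or null if none/unknown.
  String lastStatus(String attemptId) {
    AtomicReference<String> ref = attemptIdToStatus.get(attemptId);
    return ref == null ? null : ref.get();
  }

  public static void main(String[] args) {
    StatusListenerSketch listener = new StatusListenerSketch();
    listener.registerAttempt("attempt_1");
    try {
      listener.statusUpdate("attempt_1", "RUNNING");
      listener.statusUpdate("attempt_rogue", "RUNNING");
    } catch (IOException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

The point is only that the failed lookup is detected at the top of the update 
path and reported with the attempt ID in the message.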

The EventHandler import added in TaskAttemptListenerImpl is unused.

Nit: This whitespace change is more harmful than helpful to readability, IMO.
{noformat}
@@ -427,6 +437,8 @@ public AMFeedback statusUpdate(TaskAttemptID taskAttemptID,
       }
     }
 
+
+
  // Task sends the information about the nextRecordRange to the TT
     
 //    TODO: The following are not needed here, but needed to be set somewhere inside AppMaster.
{noformat}

Nit: In StatusUpdater#transition, newReportedStatus does not need a separate 
declaration, as the code can declare it where it is assigned.
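
A generic illustration of the nit, not the actual StatusUpdater code: the 
String type and the Supplier parameter below are hypothetical placeholders 
for whatever the transition really reads the status from.

```java
import java.util.function.Supplier;

// Hypothetical illustration only; types and method names are placeholders.
class DeclarationSketch {
  // Separate declaration followed by assignment.
  static String separateDeclaration(Supplier<String> statusSource) {
    String newReportedStatus;
    newReportedStatus = statusSource.get();
    return newReportedStatus;
  }

  // Declaration folded into the assignment; same behavior, one line fewer.
  static String declaredAtAssignment(Supplier<String> statusSource) {
    String newReportedStatus = statusSource.get();
    return newReportedStatus;
  }

  public static void main(String[] args) {
    Supplier<String> source = () -> "RUNNING";
    System.out.println(
        separateDeclaration(source).equals(declaredAtAssignment(source)));
  }
}
```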

Normally listener would be set to null after closing to avoid a potential 
double-close:
{code}
  public void after() throws IOException {
    if (listener != null) {
      listener.close();
    }
  }
{code}
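
A sketch of the teardown with the reference cleared after closing.  The 
surrounding class and the CountingListener type are hypothetical test 
scaffolding, assumed here only to make the example self-contained.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical test fixture; only the shape of after() matters here.
class ListenerTeardownSketch {
  // Stand-in for the real listener; counts how often close() is called.
  static class CountingListener implements Closeable {
    int closeCalls = 0;

    @Override
    public void close() {
      closeCalls++;
    }
  }

  private CountingListener listener;

  ListenerTeardownSketch(CountingListener listener) {
    this.listener = listener;
  }

  public void after() throws IOException {
    if (listener != null) {
      listener.close();
      // Clear the reference so a second call to after() is a no-op.
      listener = null;
    }
  }

  public static void main(String[] args) throws IOException {
    CountingListener counting = new CountingListener();
    ListenerTeardownSketch fixture = new ListenerTeardownSketch(counting);
    fixture.after();
    fixture.after();
    System.out.println("close() calls: " + counting.closeCalls);
  }
}
```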



> AM lacks flow control for task events
> -------------------------------------
>
>                 Key: MAPREDUCE-5124
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.3-alpha, 0.23.5
>            Reporter: Jason Lowe
>            Assignee: Peter Bacsko
>         Attachments: MAPREDUCE-5124-001.patch, 
> MAPREDUCE-5124-CoalescingPOC-1.patch, MAPREDUCE-5124-CoalescingPOC2.patch, 
> MAPREDUCE-5124-CoalescingPOC3.patch, MAPREDUCE-5124-proto.2.txt, 
> MAPREDUCE-5124-prototype.txt
>
>
> The AM does not have any flow control to limit the incoming rate of events 
> from tasks.  If the AM is unable to keep pace with the rate of incoming 
> events for a sufficient period of time then it will eventually exhaust the 
> heap and crash.  MAPREDUCE-5043 addressed a major bottleneck for event 
> processing, but the AM could still get behind if it's starved for CPU and/or 
> handling a very large job with tens of thousands of active tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

