[jira] [Commented] (MAPREDUCE-5124) AM lacks flow control for task events

Tsuyoshi OZAWA (JIRA) Tue, 14 May 2013 09:41:17 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657192#comment-13657192
 ]


Tsuyoshi OZAWA commented on MAPREDUCE-5124:
-------------------------------------------

Reply

> When you say it should have the capacity for handling event per second what 
> does that mean? 

It means that AsyncDispatcher#dispatch limits the number of events it consumes 
per second, so that processing a large number of events does not starve the 
CPU. It only limits the number of events processed. If an error should be 
returned to the tasks, that should be done where the producers enqueue events, 
or at the RPC layer as Robert described.
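
To make the "events per second" idea concrete, here is a minimal sketch of a consumer-side limit. It is not the attached prototype and not part of AsyncDispatcher; the class name EventRateLimiterSketch, the fixed one-second window, and the injectable clock are all assumptions made for illustration. The consumer thread asks the limiter before handling each event and sleeps when the budget for the current window is used up.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical fixed-window limiter; not part of Hadoop's AsyncDispatcher. */
final class EventRateLimiterSketch {
  private static final long WINDOW_NANOS = 1_000_000_000L;

  private final int maxEventsPerSecond;
  private final LongSupplier nanoClock; // injectable so the logic can be tested
  private long windowStart;
  private int handledInWindow;

  EventRateLimiterSketch(int maxEventsPerSecond, LongSupplier nanoClock) {
    if (maxEventsPerSecond <= 0) {
      throw new IllegalArgumentException("maxEventsPerSecond must be positive");
    }
    this.maxEventsPerSecond = maxEventsPerSecond;
    this.nanoClock = nanoClock;
    this.windowStart = nanoClock.getAsLong();
  }

  /**
   * Returns 0 if one more event may be handled now (and counts it),
   * otherwise the nanoseconds left until the current window ends.
   */
  synchronized long nanosUntilPermit() {
    long now = nanoClock.getAsLong();
    if (now - windowStart >= WINDOW_NANOS) {
      // A new one-second window starts: reset the budget.
      windowStart = now;
      handledInWindow = 0;
    }
    if (handledInWindow < maxEventsPerSecond) {
      handledInWindow++;
      return 0L;
    }
    return WINDOW_NANOS - (now - windowStart);
  }

  public static void main(String[] args) throws InterruptedException {
    BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    for (int i = 0; i < 5; i++) {
      final int id = i;
      queue.add(() -> System.out.println("handled event " + id));
    }
    EventRateLimiterSketch limiter =
        new EventRateLimiterSketch(2, System::nanoTime);
    // Consumer loop: only this thread is delayed; producers are never blocked.
    while (!queue.isEmpty()) {
      long waitNanos = limiter.nanosUntilPermit();
      if (waitNanos > 0) {
        TimeUnit.NANOSECONDS.sleep(waitNanos);
        continue;
      }
      queue.take().run();
    }
  }
}
```

Note that a limit like this only slows the consumer. The queue can still grow without bound while the consumer sleeps, so it does not by itself address the heap exhaustion described in the issue.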

> Also I am very nervous about the possibilities of deadlocks if the 
> AsyncDispatcher can block. 

I understand the possibility of deadlock you pointed out. I will change the 
design to deal with this problem at the RPC layer. The following is concept 
code. Is it enough to throttle on the server side and retry on the client side? 
If so, I'll prototype it at the RPC layer.

{code:java}
// client side
try {
  // synchronous RPC call
  rpcMethod1();
} catch (TemporaryBuzyException ise) {
  // handling retry
} catch (IOException ioe) {
  // handling SocketTimeoutException or
  // some connection-related error
}
{code}

{code:java}
// server side (org.apache.hadoop.ipc.Server)
void processData() {
  if (isBusy()) {
    // reject the call so that the client can retry later
    setupResponse(responseBuffer, readParamsFailedCall, RPCProto.Error,
        TemporaryBuzyException.class, …);
  } else {
    // process the RPC as usual
  }
}
{code}
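
The two concept snippets above can be combined into one small runnable sketch, written under several assumptions. ServerBusyException is a hypothetical stand-in for TemporaryBuzyException and is assumed to extend IOException. ThrottledServer is a toy stub, not org.apache.hadoop.ipc.Server, and it treats "busy" as "the event backlog has reached a fixed limit". The doubling backoff in callWithRetry is one possible retry policy, not something the proposal specifies.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BusyRetrySketch {

  /** Hypothetical stand-in for the TemporaryBuzyException in the concept code. */
  static final class ServerBusyException extends IOException {
    ServerBusyException(String message) {
      super(message);
    }
  }

  /** A call that may fail with an IOException, like a synchronous RPC. */
  interface RpcCall<T> {
    T invoke() throws IOException;
  }

  /** Toy server stub: rejects requests while its event backlog is full. */
  static final class ThrottledServer {
    private final int maxQueued;
    private final AtomicInteger queued = new AtomicInteger();

    ThrottledServer(int maxQueued) {
      this.maxQueued = maxQueued;
    }

    String handle(String request) throws ServerBusyException {
      // Reserve a backlog slot atomically; undo it if over the limit.
      if (queued.incrementAndGet() > maxQueued) {
        queued.decrementAndGet();
        throw new ServerBusyException("server busy, retry later");
      }
      return "ack:" + request;
    }

    /** Called when the (imaginary) dispatcher has processed one event. */
    void eventProcessed() {
      queued.decrementAndGet();
    }
  }

  /** Retries only on ServerBusyException; other IOExceptions propagate. */
  static <T> T callWithRetry(RpcCall<T> call, int maxAttempts,
      long initialBackoffMillis) throws IOException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long backoff = initialBackoffMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.invoke();
      } catch (ServerBusyException busy) {
        if (attempt >= maxAttempts) {
          throw busy; // give up and let the caller decide
        }
        TimeUnit.MILLISECONDS.sleep(backoff);
        backoff *= 2; // back off so retries do not add to the overload
      }
    }
  }

  public static void main(String[] args) throws Exception {
    ThrottledServer server = new ThrottledServer(1);
    server.handle("event-0"); // fills the backlog
    AtomicInteger attempts = new AtomicInteger();
    String reply = callWithRetry(() -> {
      if (attempts.incrementAndGet() == 2) {
        server.eventProcessed(); // pretend the dispatcher caught up
      }
      return server.handle("event-1");
    }, 3, 10L);
    System.out.println(reply + " after " + attempts.get() + " attempts");
  }
}
```

Because the server rejects the call instead of blocking, no server thread waits on the event queue, which is the property that avoids the deadlock concern raised above.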

                
> AM lacks flow control for task events
> -------------------------------------
>
>                 Key: MAPREDUCE-5124
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.3-alpha, 0.23.5
>            Reporter: Jason Lowe
>         Attachments: MAPREDUCE-5124-prototype.txt
>
>
> The AM does not have any flow control to limit the incoming rate of events 
> from tasks.  If the AM is unable to keep pace with the rate of incoming 
> events for a sufficient period of time then it will eventually exhaust the 
> heap and crash.  MAPREDUCE-5043 addressed a major bottleneck for event 
> processing, but the AM could still get behind if it's starved for CPU and/or 
> handling a very large job with tens of thousands of active tasks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
