[ https://issues.apache.org/jira/browse/MAPREDUCE-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Min Zhou updated MAPREDUCE-6129:
--------------------------------
    Affects Version/s: 3.0.0
                       2.3.0
                       2.5.0
                       2.4.1
                       2.5.1

> Job failed due to counter limit exceeded in MRAppMaster
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-6129
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6129
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 2.3.0, 2.5.0, 2.4.1, 2.5.1
>            Reporter: Min Zhou
>
> Many of our cluster's jobs use more than 120 counters, and those jobs fail 
> with an exception like the one below:
> {noformat}
> 2014-10-15 22:55:43,742 WARN [Socket Reader #1 for port 45673] 
> org.apache.hadoop.ipc.Server: Unable to read call parameters for client 
> 10.180.216.12 on connection protocol 
> org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind RPC_WRITABLE
> org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many 
> counters: 121 max=120
>       at 
> org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:103)
>       at 
> org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:110)
>       at 
> org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:175)
>       at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:324)
>       at 
> org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:314)
>       at org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:489)
>       at 
> org.apache.hadoop.mapred.ReduceTaskStatus.readFields(ReduceTaskStatus.java:140)
>       at 
> org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
>       at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:157)
>       at 
> org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1802)
>       at 
> org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1734)
>       at 
> org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1494)
>       at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:732)
>       at 
> org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:606)
>       at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:577)
> {noformat}
> The class org.apache.hadoop.mapreduce.counters.Limits loads mapred-site.xml 
> from the NodeManager node into a JobConf if it hasn't been initialized yet. 
> If mapred-site.xml does not exist on the NodeManager node, or 
> mapreduce.job.counters.max is not defined in that file, 
> org.apache.hadoop.mapreduce.counters.Limits falls back to the default value 
> of 120. 
> Instead, we should read the user job's configuration file, rather than the 
> config files on the NodeManager, when checking counter limits. 
> I will submit a patch later.
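For readers unfamiliar with the failure mode, the check that throws in the stack trace above behaves roughly as follows. This is a minimal self-contained sketch, not Hadoop's actual Limits code; the class name CounterLimitSketch, the exception type, and the hard-coded default of 120 are illustrative assumptions based on the log message, and the real limit comes from whichever configuration Limits happened to be initialized with.

```java
// Minimal sketch (NOT Hadoop's actual implementation) of the counter-limit
// check that produces "Too many counters: 121 max=120" in the trace above.
public class CounterLimitSketch {
    // Assumed default, matching the value reported in the bug.
    static final int DEFAULT_MAX_COUNTERS = 120;

    // Throws when the counter count exceeds the configured maximum,
    // mirroring the message format seen in the exception above.
    static void checkCounters(int size, int maxCounters) {
        if (size > maxCounters) {
            throw new IllegalStateException(
                "Too many counters: " + size + " max=" + maxCounters);
        }
    }

    public static void main(String[] args) {
        checkCounters(120, DEFAULT_MAX_COUNTERS); // within limit: no exception
        try {
            checkCounters(121, DEFAULT_MAX_COUNTERS);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the bug report is that the maximum is read from the NodeManager's local mapred-site.xml rather than from the submitted job's configuration, so a user-side mapreduce.job.counters.max setting is silently ignored during deserialization on the AM/NM side.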



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)