[ https://issues.apache.org/jira/browse/MAPREDUCE-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Min Zhou updated MAPREDUCE-6129:
--------------------------------
    Affects Version/s: 3.0.0
                       2.3.0
                       2.5.0
                       2.4.1
                       2.5.1

> Job failed due to counter limit exceeded in MRAppMaster
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-6129
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6129
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 2.3.0, 2.5.0, 2.4.1, 2.5.1
>            Reporter: Min Zhou
>
> Many of our cluster's jobs use more than 120 counters; those jobs fail with an exception like the one below:
> {noformat}
> 2014-10-15 22:55:43,742 WARN [Socket Reader #1 for port 45673] org.apache.hadoop.ipc.Server: Unable to read call parameters for client 10.180.216.12 on connection protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind RPC_WRITABLE
> org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 121 max=120
> 	at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:103)
> 	at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:110)
> 	at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:175)
> 	at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:324)
> 	at org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:314)
> 	at org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:489)
> 	at org.apache.hadoop.mapred.ReduceTaskStatus.readFields(ReduceTaskStatus.java:140)
> 	at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
> 	at org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:157)
> 	at org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1802)
> 	at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1734)
> 	at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1494)
> 	at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:732)
> 	at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:606)
> 	at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:577)
> {noformat}
> The class org.apache.hadoop.mapreduce.counters.Limits loads the mapred-site.xml on the nodemanager node for its JobConf if it hasn't been initialized. If mapred-site.xml does not exist on the nodemanager node, or mapreduce.job.counters.max is not defined in that file, org.apache.hadoop.mapreduce.counters.Limits simply uses the default value of 120.
> Instead, we should read the user job's conf file, rather than the config files on the nodemanager, when checking counter limits.
> I will submit a patch later.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
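The failure mode described in the report can be sketched as follows. This is a simplified, self-contained illustration of the limit check performed by org.apache.hadoop.mapreduce.counters.Limits, not the actual Hadoop class: the class name, constructor, and method below are hypothetical, and it only models the behavior that, when mapreduce.job.counters.max is absent from the locally loaded config, the hard-coded default of 120 applies no matter what the user's job conf requested.

```java
// Simplified sketch (assumed names) of the counter-limit check that the real
// org.apache.hadoop.mapreduce.counters.Limits performs. Illustrative only.
public class CounterLimitSketch {
    // Default ceiling used when mapreduce.job.counters.max is not configured,
    // mirroring the default of 120 mentioned in the report.
    static final int DEFAULT_MAX_COUNTERS = 120;

    private final int maxCounters;
    private int totalCounters;

    // configuredMax == null models the case where mapred-site.xml is missing
    // on the nodemanager, or mapreduce.job.counters.max is not set in it.
    public CounterLimitSketch(Integer configuredMax) {
        this.maxCounters =
            (configuredMax != null) ? configuredMax : DEFAULT_MAX_COUNTERS;
    }

    // Analogous to Limits.incrCounters(): count one more counter and fail
    // once the total exceeds the ceiling.
    public void incrCounters() {
        totalCounters++;
        if (totalCounters > maxCounters) {
            throw new IllegalStateException(
                "Too many counters: " + totalCounters + " max=" + maxCounters);
        }
    }

    public static void main(String[] args) {
        // Nodemanager side: no config value visible, so the default of 120
        // applies even if the user's job requested a higher limit.
        CounterLimitSketch limits = new CounterLimitSketch(null);
        try {
            for (int i = 0; i < 121; i++) {
                limits.incrCounters(); // the 121st counter trips the check
            }
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Registering the 121st counter throws, reproducing the "Too many counters: 121 max=120" message from the stack trace above even when the submitting job's own configuration allowed more.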