[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846428#comment-13846428
 ] 

Ashutosh Chauhan commented on MAPREDUCE-5680:
---------------------------------------------

A couple of limits that I have encountered while helping users:
1. Number of block locations for a given split.
2. Number of counters, number of counter groups, and length of counter names.

1. The default for this is currently 10. In the CombineSplits case, which is 
very common in practice, the number of block locations may easily exceed 10.
2. The default for this is a total of 50 counters across all counter groups, 
with the maximum length of a counter name limited to 128.

IMO, neither of these limits is required any longer: if a job is submitted 
with an absurdly high number for these values, it will crash only its own AM 
and nothing else. With the limits still in place, a job whose 
getBlockLocations() reports 11 locations for a combined split of 4 files 
fails to run, as does a job that needs 51 counters.
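
To make the two failure cases concrete, here is a minimal sketch of the kind of 
check involved. It is not Hadoop's actual code: the class, method, and 
constant names are all hypothetical, and the limits are simply the defaults 
quoted above.

```java
import java.util.ArrayList;
import java.util.List;

public class LimitSketch {
    // Defaults quoted in the comment above; these constants are illustrative,
    // not fields taken from Hadoop itself.
    static final int MAX_BLOCK_LOCATIONS = 10;
    static final int MAX_COUNTERS = 50;

    // True when the split reports no more locations than the limit allows.
    static boolean withinLocationLimit(List<String> locations, int limit) {
        return locations.size() <= limit;
    }

    // True when the job uses no more counters than the limit allows.
    static boolean withinCounterLimit(int counterCount, int limit) {
        return counterCount <= limit;
    }

    // Builds n hypothetical host names to stand in for block locations.
    static List<String> hosts(int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add("host" + i);
        }
        return out;
    }

    public static void main(String[] args) {
        // A combined split reporting 11 locations is one over the limit.
        System.out.println(withinLocationLimit(hosts(11), MAX_BLOCK_LOCATIONS)); // false
        // A job needing 51 counters is one over the limit.
        System.out.println(withinCounterLimit(51, MAX_COUNTERS)); // false
    }
}
```

In both cases the job is rejected for being one over a fixed threshold, even 
though the only process at risk is the job's own AM.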

There may be more such limits that can be removed; I haven't looked 
extensively.

> Reconsider limits
> -----------------
>
>                 Key: MAPREDUCE-5680
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5680
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster
>    Affects Versions: 2.2.0
>            Reporter: Ashutosh Chauhan
>
> Limits were first introduced in the 0.20.2xx line with the main goal of 
> protecting the jobtracker from rogue jobs. That problem no longer exists in 
> YARN, where each job gets its own MR AM. So it is a good time to revisit the 
> limits and see which of them still make sense.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
