[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-17 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425190#comment-15425190
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Thanks [~jlowe] for the review and commit!

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Reporter: Chris Trezzo
> Assignee: Chris Trezzo
> Fix For: 2.9.0
>
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch, MAPREDUCE-6690-trunk-v3.patch, 
> MAPREDUCE-6690-trunk-v4.patch, MAPREDUCE-6690-trunk-v5.patch, 
> MAPREDUCE-6690-trunk-v6.patch, MAPREDUCE-6690-trunk-v7.patch
>
>
> Users will sometimes submit a large number of resources to be localized as 
> part of a single map reduce job. This can cause issues with YARN localization 
> that destabilize the cluster and potentially impact other users' jobs. These 
> resources are specified via the files, libjars, archives and jobjar command 
> line arguments or directly through the configuration (i.e. the distributed 
> cache API). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (e.g. a large fat jar)
> We would like to encourage good behavior on the client side by having the 
> option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the 
> server side, but this jira only covers the map reduce layer on the 
> client side. In practice, having these client side limits will get us a long 
> way towards preventing these localization anti-patterns.
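
The three dimensions listed in the description can be illustrated with a small client-side check. This is only a sketch under stated assumptions, not the code from the attached patches: the class name ResourceLimitSketch, its constructor parameters, the message wording, and the convention that a value of 0 or less disables a limit are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: a client-side limit check, not the code from the patch. */
final class ResourceLimitSketch {
  // Hypothetical convention: a value of 0 or less disables that limit.
  private final int maxResources;
  private final long maxTotalBytes;
  private final long maxSingleBytes;

  ResourceLimitSketch(int maxResources, long maxTotalBytes, long maxSingleBytes) {
    this.maxResources = maxResources;
    this.maxTotalBytes = maxTotalBytes;
    this.maxSingleBytes = maxSingleBytes;
  }

  /** Returns a description of the first limit exceeded, or empty if all limits hold. */
  Optional<String> firstViolation(Map<String, Long> sizeByResource) {
    int count = 0;
    long totalBytes = 0;
    for (Map.Entry<String, Long> entry : sizeByResource.entrySet()) {
      long size = entry.getValue();
      count++;
      totalBytes += size;
      // Dimension 3: size of an individual resource.
      if (maxSingleBytes > 0 && size > maxSingleBytes) {
        return Optional.of("single resource limit exceeded by " + entry.getKey());
      }
      // Dimension 2: number of resources.
      if (maxResources > 0 && count > maxResources) {
        return Optional.of("resource count limit exceeded: " + count + " > " + maxResources);
      }
      // Dimension 1: total size.
      if (maxTotalBytes > 0 && totalBytes > maxTotalBytes) {
        return Optional.of("total size limit exceeded: " + totalBytes + " > " + maxTotalBytes);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    ResourceLimitSketch limits = new ResourceLimitSketch(2, 100L, 60L);
    Map<String, Long> resources = new LinkedHashMap<>();
    resources.put("a.jar", 40L);
    resources.put("b.txt", 50L);
    System.out.println(limits.firstViolation(resources).orElse("within limits"));
  }
}
```

Returning a description instead of throwing keeps the sketch easy to exercise; a real submission path would more likely fail the job with an exception carrying the same information.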



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424864#comment-15424864
 ] 

Hudson commented on MAPREDUCE-6690:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10291 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10291/])
MAPREDUCE-6690. Limit the number of resources a single map reduce job (jlowe: 
rev f80a7298325a4626638ee24467e2012442e480d4)
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/filecache/ClientDistributedCacheManager.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobResourceUploader.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* (add) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/TestJobResourceUploader.java





[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-17 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424828#comment-15424828
 ] 

Jason Lowe commented on MAPREDUCE-6690:
---

+1 lgtm.  Committing this.




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423747#comment-15423747
 ] 

Hadoop QA commented on MAPREDUCE-6690:
--

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 7s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 41s | trunk passed |
| +1 | compile | 1m 35s | trunk passed |
| +1 | checkstyle | 0m 33s | trunk passed |
| +1 | mvnsite | 0m 58s | trunk passed |
| +1 | mvneclipse | 0m 29s | trunk passed |
| +1 | findbugs | 1m 11s | trunk passed |
| +1 | javadoc | 0m 34s | trunk passed |
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 44s | the patch passed |
| +1 | compile | 1m 33s | the patch passed |
| +1 | javac | 1m 33s | the patch passed |
| +1 | checkstyle | 0m 31s | the patch passed |
| +1 | mvnsite | 0m 53s | the patch passed |
| +1 | mvneclipse | 0m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 1m 19s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
| +1 | unit | 2m 11s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | unit | 118m 49s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 140m 41s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12824023/MAPREDUCE-6690-trunk-v7.patch |
| JIRA Issue | MAPREDUCE-6690 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 2bad52d01f13 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 27a6e09 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6673/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client |
| Console output | 
h

[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-16 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423593#comment-15423593
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Thanks for the review [~jlowe]! Attached is a v7 patch. Here are the major 
changes:
# Changes to address your comments about getStringCollection and 
totalConfigSize*, and to ensure the tests fail for the intended reason.
# Changes to make the usage of the words resource and file consistent 
throughout the patch (i.e. a file is a type of resource).




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-09 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414184#comment-15414184
 ] 

Jason Lowe commented on MAPREDUCE-6690:
---

Thanks for updating the patch!  Looks good overall with just a few nits:

I think the code would be cleaner if we leveraged 
Configuration#getStringCollection to get the conf values rather than checking 
for null and splitting on comma directly.  That method will return an empty 
collection if there are no values for the property, so then we can just remove 
some of the null checks and just loop over the items for each property.  Some 
of the null checks would change to !isEmpty checks to avoid doing unnecessary 
mkdirs, etc., during upload methods, but they could be completely removed in 
the limit checking code.
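
As a rough illustration of the suggestion above, the sketch below contrasts a null-check-and-split loop with a loop over a collection that is never null. It runs without Hadoop on the classpath, so getStringCollection here is a stand-in that mimics only the empty-when-unset behavior described in this comment; the class StringCollectionSketch, the allResources helper, and the "example.*" property names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in; this is not Hadoop's Configuration class. */
final class StringCollectionSketch {
  private final Map<String, String> props = new HashMap<>();

  void set(String name, String value) {
    props.put(name, value);
  }

  /** Before: every caller null-checks and splits the raw value itself. */
  int countWithNullCheck(String name) {
    String raw = props.get(name);
    int count = 0;
    if (raw != null) {
      for (String item : raw.split(",")) {
        if (!item.isEmpty()) {
          count++;
        }
      }
    }
    return count;
  }

  /** Stand-in accessor: never returns null, and is empty when the property is unset. */
  Collection<String> getStringCollection(String name) {
    Collection<String> values = new ArrayList<>();
    String raw = props.get(name);
    if (raw != null) {
      for (String item : raw.split(",")) {
        if (!item.isEmpty()) {
          values.add(item);
        }
      }
    }
    return values;
  }

  /** After: callers just loop; an unset property contributes zero iterations. */
  List<String> allResources(String... propertyNames) {
    List<String> resources = new ArrayList<>();
    for (String name : propertyNames) {
      for (String item : getStringCollection(name)) {
        resources.add(item);
      }
    }
    return resources;
  }
}
```

Upload-style code that must avoid side effects such as creating directories would still guard on isEmpty(), as noted above, while pure limit checking needs no guard at all.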

The totalConfigSize* variables are essentially loop-invariants, so they should 
be computed once in the Limits constructor rather than each addFile call.
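
A minimal sketch of that refactoring, assuming the invariant in question is a configured size converted to bytes: the conversion happens once in the constructor and addFile only compares against the stored value. The class LimitsSketch, its single megabyte parameter, and the boolean return value are hypothetical and do not reflect the actual Limits class in the patch.

```java
/** Illustrative only; not the Limits class from the patch. */
final class LimitsSketch {
  // Derived once from the configured value; it cannot change between addFile calls.
  private final long maxTotalBytes;
  private long totalBytes;

  LimitsSketch(long maxTotalMb) {
    // Hypothetical convention: 0 or less means the limit is disabled.
    this.maxTotalBytes = maxTotalMb * 1024L * 1024L;
  }

  /** Adds one file; returns false (and adds nothing) if the total would exceed the limit. */
  boolean addFile(long sizeBytes) {
    if (maxTotalBytes > 0 && totalBytes + sizeBytes > maxTotalBytes) {
      return false;
    }
    totalBytes += sizeBytes;
    return true;
  }

  long totalBytes() {
    return totalBytes;
  }
}
```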

Both TestMRJobs and TestJobResourceUploader assume any IOException is OK if the 
job submission is supposed to fail.  The unit tests should verify that the 
expected exception that failed the job submission was related to limits, 
otherwise we could be failing the job submission for the wrong reasons and the 
test would still pass.  I'm thinking something along the lines of checking the 
exception message for limits-related wording, but maybe there's a cleaner way.
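
One way to express the message check suggested above, sketched without JUnit so it stands alone: the helper reports success only when the submission throws an IOException whose message mentions limits. The Submission interface, the helper name, and the marker text "limit" are hypothetical; a real test would match whatever wording the limit-checking code actually emits.

```java
import java.io.IOException;

/** Illustrative only; not taken from TestMRJobs or TestJobResourceUploader. */
final class LimitFailureCheckSketch {
  /** Hypothetical stand-in for a job submission that may fail. */
  interface Submission {
    void run() throws IOException;
  }

  /** Hypothetical marker text expected in limit-related failures. */
  static final String LIMIT_WORDING = "limit";

  /** True only if the submission failed AND the failure was about limits. */
  static boolean failedBecauseOfLimits(Submission submission) {
    try {
      submission.run();
      return false; // the submission unexpectedly succeeded
    } catch (IOException e) {
      String message = e.getMessage();
      // An unrelated IOException (for example a permissions error) must not count.
      return message != null && message.contains(LIMIT_WORDING);
    }
  }
}
```

A dedicated exception subclass would be a cleaner alternative to string matching, at the cost of adding a new type to the client API.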






[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409997#comment-15409997
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Filed jira to fix unrelated test failure: MAPREDUCE-6747




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409982#comment-15409982
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

TestMapReduceJobControl#testJobControlWithKillJob times out in trunk without 
this patch. The broken test is unrelated. I will file another jira to fix the 
test, but this patch should be ready for review. Thanks!




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-08-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408744#comment-15408744
 ] 

Hadoop QA commented on MAPREDUCE-6690:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 11s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 28s | trunk passed |
| +1 | compile | 2m 4s | trunk passed |
| +1 | checkstyle | 0m 36s | trunk passed |
| +1 | mvnsite | 1m 9s | trunk passed |
| +1 | mvneclipse | 0m 33s | trunk passed |
| +1 | findbugs | 1m 24s | trunk passed |
| +1 | javadoc | 0m 38s | trunk passed |
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 52s | the patch passed |
| +1 | compile | 1m 45s | the patch passed |
| +1 | javac | 1m 45s | the patch passed |
| +1 | checkstyle | 0m 31s | the patch passed |
| +1 | mvnsite | 1m 4s | the patch passed |
| +1 | mvneclipse | 0m 29s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 1m 42s | the patch passed |
| +1 | javadoc | 0m 34s | the patch passed |
| +1 | unit | 2m 15s | hadoop-mapreduce-client-core in the patch passed. |
| -1 | unit | 130m 39s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 156m 39s | |

|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.mapreduce.lib.jobcontrol.TestMapReduceJobControl |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822196/MAPREDUCE-6690-trunk-v6.patch |
| JIRA Issue | MAPREDUCE-6690 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux b7ef7637fc03 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 438a9f0 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6659/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt |
| unit test logs |  
https://builds.apache

[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-10 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325223#comment-15325223
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

The whitespace errors were for lines that this patch did not touch. I am not 
sure why they appeared during the run. [~jlowe] the patch should be good as is, 
unless you have additional comments. Thanks!




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323669#comment-15323669
 ] 

Hadoop QA commented on MAPREDUCE-6690:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 7s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 47s | trunk passed |
| +1 | compile | 1m 52s | trunk passed |
| +1 | checkstyle | 0m 33s | trunk passed |
| +1 | mvnsite | 0m 59s | trunk passed |
| +1 | mvneclipse | 0m 25s | trunk passed |
| +1 | findbugs | 1m 31s | trunk passed |
| +1 | javadoc | 0m 35s | trunk passed |
| 0 | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 1m 45s | the patch passed |
| +1 | javac | 1m 45s | the patch passed |
| +1 | checkstyle | 0m 30s | the patch passed |
| +1 | mvnsite | 0m 55s | the patch passed |
| +1 | mvneclipse | 0m 22s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 20 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | findbugs | 1m 25s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
| +1 | unit | 2m 1s | hadoop-mapreduce-client-core in the patch passed. |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 122m 4s 
{color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
26s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 145m 48s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12809313/MAPREDUCE-6690-trunk-v5.patch
 |
| JIRA Issue | MAPREDUCE-6690 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 22ced402a5f8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 9581fb7 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| whitespace | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6546/artifact/patchprocess/whitespace-eol.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6546/testReport/ |
| modules | C: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoo

[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321929#comment-15321929
 ] 

Hadoop QA commented on MAPREDUCE-6690:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 8s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 19s {color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core generated 2 new + 2508 unchanged - 1 fixed = 2510 total (was 2509) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 1s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 114m 22s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 135m 26s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.mapreduce.tools.TestCLI |
|   | hadoop.mapred.TestMRCJCFileOutputCommitter |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12809100/MAPREDUCE-6690-trunk-v4.patch |
| JIRA Issue | MAPREDUCE-6690 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux f6e7eb5194a4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1500a0a |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| javadoc | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6544/artifact/patchprocess/diff-javadoc-jav

[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321825#comment-15321825
 ] 

Hadoop QA commented on MAPREDUCE-6690:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 4s {color} | {color:red} Docker failed to build yetus/hadoop:2c91fd8. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12809100/MAPREDUCE-6690-trunk-v4.patch |
| JIRA Issue | MAPREDUCE-6690 |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6543/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321330#comment-15321330
 ] 

Jason Lowe commented on MAPREDUCE-6690:
---

bq. I assumed that YARN-5192 would implement the check as part of the submit 
call so that the client gets immediate feedback.

Note that YARN-5192 cannot do the check on application submit.  An application 
submit only requires the resources necessary to get the ApplicationMaster 
localized.  Subsequent containers for the application could have a completely 
different set of resources, and they won't be available in the application 
submission context for validation at submit time.  MapReduce is an app 
framework that happens to localize all resources for all containers, but other 
application frameworks do not always do this.

bq.  I would like to find a way, however, to try to keep the two settings in 
sync if possible.

Agreed it would be annoying for admins to have to keep these in sync, assuming 
nobody would ever want to configure the YARN limit higher than the MapReduce 
limit.

bq. What about having the RM offer up its resource limits through a call?

That would be one way to tackle it.  There have been cases in the past where it 
would have been nice for clients to be able to query config settings via the 
central daemons (e.g., namenode, resourcemanager) rather than assume the 
local settings in hdfs-site.xml or yarn-site.xml are the same as what the 
central daemon is using.  That's a somewhat open-ended API change for YARN with 
backwards-compatibility concerns going forward, but maybe it's time we hammered 
out whether or not we're going to do it on a YARN JIRA and if not, what 
clients/users are supposed to do to better keep the client and the server in 
sync.



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-08 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321316#comment-15321316
 ] 

Daniel Templeton commented on MAPREDUCE-6690:
-

Thanks for the clarification, [~jlowe].  I assumed that YARN-5192 would 
implement the check as part of the submit call so that the client gets 
immediate feedback.  The point that I forgot about, though, is that, regardless, 
the submit only happens after the resources have been uploaded to HDFS.  Given 
that this check specifically targets wide loads, the cases where the 
server-side check would reject the submit are exactly the ones that will waste 
the most time with the upload.

I now see the light.  I would like to find a way, however, to try to keep the 
two settings in sync if possible.  I've seen cases, such as the number of 
concurrent moves in the HDFS mover, where the limit is set on both the client 
and server sides, and it ends up confusing customers.  What about having the RM 
offer up its resource limits through a call?  The client could then query the 
RM's limits and apply those.
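
To make the proposal concrete, here is a minimal sketch of a client that
prefers a limit advertised by the RM and falls back to its local configuration
when none is offered.  Every name below (LimitSyncSketch, ServerLimitSource,
maxResourcesPerJob) is hypothetical; no such YARN call is confirmed anywhere in
this thread, and the fallback policy is an assumption.

```java
import java.util.OptionalInt;

/** Illustrative only: models "ask the RM for its limit, else use local config". */
public class LimitSyncSketch {

    /** Hypothetical stand-in for an RM query; not an existing YARN API. */
    public interface ServerLimitSource {
        /** Empty when the server does not advertise a limit. */
        OptionalInt maxResourcesPerJob();
    }

    /** Returns the server's limit if it offers one, otherwise the locally configured value. */
    public static int effectiveMaxResources(ServerLimitSource server, int localConfigured) {
        return server.maxResourcesPerJob().orElse(localConfigured);
    }

    public static void main(String[] args) {
        ServerLimitSource advertising = () -> OptionalInt.of(100);
        ServerLimitSource silent = () -> OptionalInt.empty();
        System.out.println(effectiveMaxResources(advertising, 500));
        System.out.println(effectiveMaxResources(silent, 500));
    }
}
```

With a single source of truth on the server, an admin would only have one
value to maintain, at the cost of an extra round trip before submission.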




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321305#comment-15321305
 ] 

Jason Lowe commented on MAPREDUCE-6690:
---

Implementing the check in MapReduce allows for fast-failure and more 
accurate/informative errors to the client.  The check in MapReduce can prevent 
an unnecessary upload of one or more resources to the staging area in HDFS 
because the client knows the job is going to fail anyway.  Also YARN-5192 will 
only be able to detect the error when a container starts to localize on a node 
that asks for a resource set that violates the limits.  Since MapReduce 
localizes everything for all containers (including the AM) it will fail under 
YARN-5192 as soon as the AM tries to run on a node, but it might take a while 
for the AM to get scheduled.  As for error reporting, if the violation comes 
from one or more files that were submitted locally then the paths via a 
YARN-5192 check will be for HDFS staging directories rather than the local path 
the client originally specified.  The error also will not be reported to the 
job client submitting the job unless it hangs around to monitor the job after 
submission.  With this check the job client will get the error directly when it 
tries to submit.

If we don't care much about these differences then we can just go with the 
YARN-5192 implementation.
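
The fast-failure argument can be sketched roughly as follows: the client
checks the resources it is about to ship against the three dimensions from the
description (count, total size, single-resource size) before uploading
anything, and reports violations using the paths the user originally gave.
This is an illustration, not the code in the attached patches; the class,
method, and field names are hypothetical, and treating a limit of 0 or less as
"not enforced" is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical client-side limit check; none of these names come from the patch. */
public class LocalizationLimitSketch {

    /** One limit per dimension; a value of 0 or less means the limit is not enforced (assumption). */
    public static final class Limits {
        final int maxResources;
        final long maxTotalBytes;
        final long maxSingleBytes;

        public Limits(int maxResources, long maxTotalBytes, long maxSingleBytes) {
            this.maxResources = maxResources;
            this.maxTotalBytes = maxTotalBytes;
            this.maxSingleBytes = maxSingleBytes;
        }
    }

    /**
     * Returns one message per violation. Keys are the paths exactly as the
     * user specified them, so errors name the original local path.
     */
    public static List<String> findViolations(Map<String, Long> sizesByPath, Limits limits) {
        List<String> violations = new ArrayList<>();
        long total = 0;
        for (Map.Entry<String, Long> e : sizesByPath.entrySet()) {
            long size = e.getValue();
            total += size;
            if (limits.maxSingleBytes > 0 && size > limits.maxSingleBytes) {
                violations.add("Resource " + e.getKey() + " is " + size
                    + " bytes; single-resource limit is " + limits.maxSingleBytes);
            }
        }
        if (limits.maxResources > 0 && sizesByPath.size() > limits.maxResources) {
            violations.add("Job submits " + sizesByPath.size()
                + " resources; limit is " + limits.maxResources);
        }
        if (limits.maxTotalBytes > 0 && total > limits.maxTotalBytes) {
            violations.add("Job submits " + total
                + " bytes in total; limit is " + limits.maxTotalBytes);
        }
        return violations;
    }

    public static void main(String[] args) {
        Map<String, Long> resources = new LinkedHashMap<>();
        resources.put("/home/alice/lib/fat.jar", 900L);
        resources.put("/home/alice/conf/lookup.txt", 50L);
        // The check runs before anything is copied to the staging area.
        for (String v : findViolations(resources, new Limits(10, 1000L, 500L))) {
            System.err.println(v);
        }
    }
}
```

Because the check sees local paths and runs in the submitting process, the
user gets the error immediately and in terms of what they typed.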




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-08 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321189#comment-15321189
 ] 

Daniel Templeton commented on MAPREDUCE-6690:
-

Please bear with a dumb question: assuming that YARN-5192 is implemented, why 
do we also need this JIRA?  Doesn't having two settings to do the same thing 
from different ends make the system needlessly confusing?




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311736#comment-15311736
 ] 

Hadoop QA commented on MAPREDUCE-6690:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s {color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client: The patch generated 1 new + 581 unchanged - 0 fixed = 582 total (was 581) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 19s {color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core generated 1 new + 2509 unchanged - 0 fixed = 2510 total (was 2509) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 56s {color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 124m 23s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 144m 49s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.mapred.TestMRCJCFileOutputCommitter |
|   | hadoop.mapreduce.v2.TestMRJobs |
|   | hadoop.mapred.TestMiniMRChildTask |
|   | hadoop.mapreduce.v2.TestUberAM |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12807595/MAPREDUCE-6690-trunk-v3.patch |
| JIRA Issue | MAPREDUCE-6690 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux e93aef851bf8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git 

[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311624#comment-15311624
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

I filed YARN-5192 to address the server-side YARN feature.




[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311613#comment-15311613
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Note: I was getting some InvocationTargetException failures during TestMRJobs 
on my local machine. I am submitting the patch anyway to get a QA run.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311608#comment-15311608
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

bq. should there be a corresponding YARN feature to reject applications that 
are asking for too much localization?

Yes, totally agree. As per the description, I was thinking of creating a 
follow-up JIRA for a more complete server-side YARN solution. I am thinking of 
something that leverages the container launch context so that the node manager 
can avoid launching containers that would cause too much localization. I 
haven't thought much about this yet, but I will definitely file the JIRA.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch
>



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-06-01 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311307#comment-15311307
 ] 

Jason Lowe commented on MAPREDUCE-6690:
---

bq.  I will add in the DC items to the check.

This reminds me: should there be a corresponding YARN feature to reject 
applications that are asking for too much localization?  Admins could then 
configure a cluster so it still rejects bad apps from frameworks that do not 
support this type of self-checking or from users who are overriding configs.  
This check in MapReduce would still be useful from the standpoint of avoiding 
large copies to HDFS for staging if we know it's not going to work anyway, but 
the YARN check could catch any type of application before the distributed cache 
bomb hits the cluster nodes when the bad app runs.


> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch
>



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-05-31 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308934#comment-15308934
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Thanks for the review [~jlowe]!

bq. Is this intended to apply to all distributed cache items or only those that 
need to be uploaded during job submission?

It is intended to apply to all distributed cache items, not only those that 
are uploaded during job submission. Good catch! I will add the DC items to the 
check. As a side note: the reasoning for including DC items is that even though 
they are already in an accessible place, they could still cause a significant 
amount of localization to the YARN local cache. The actual amount of 
localization depends on the local cache size and the cache hit rate, but I 
chose to go with the most conservative approach.
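
To illustrate the conservative choice described above, here is a minimal sketch in plain Java. The class and method names (LocalizationEstimate, worstCaseBytes, expectedBytes) and the single hitRate figure are hypothetical and only model the idea; they are not taken from the patch or from YARN.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not code from the patch.
public class LocalizationEstimate {

  // Conservative bound: assume nothing is in the node's local cache yet,
  // so every resource has to be localized in full.
  public static long worstCaseBytes(List<Long> resourceSizes) {
    long total = 0;
    for (long size : resourceSizes) {
      total += size;
    }
    return total;
  }

  // Optimistic model: a fraction hitRate (0.0 to 1.0) of the bytes is
  // assumed to be present in the local cache already.
  public static long expectedBytes(List<Long> resourceSizes, double hitRate) {
    return Math.round(worstCaseBytes(resourceSizes) * (1.0 - hitRate));
  }

  public static void main(String[] args) {
    List<Long> sizes = Arrays.asList(100L, 200L, 700L);
    System.out.println("worst case: " + worstCaseBytes(sizes));
    System.out.println("with 50% hit rate: " + expectedBytes(sizes, 0.5));
  }
}
```

Checking a limit against the worst-case figure means the outcome does not depend on the cache state of any particular node, which is presumably why it is the safest option for a client-side check.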

I will also address your other comments.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch
>



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-05-24 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298990#comment-15298990
 ] 

Jason Lowe commented on MAPREDUCE-6690:
---

Thanks for the patch, Chris!  Initial comments:

Is this intended to apply to all distributed cache items or only those that 
need to be uploaded during job submission?  Some comments in the JIRA and the 
property descriptions imply it also should apply to items in the distributed 
cache that already reside in HDFS, but it doesn't look like the patch does 
that.  The changes are to JobResourceUploader, which AFAIK only gets involved 
with files that potentially need to be copied to the staging area before job 
submission.  I'm not seeing how this affects items already in HDFS elsewhere 
before job submission (i.e. items already in mapreduce.job.cache.*).

Speaking of mapreduce.job.cache.*, it would be nice if the properties used that 
same prefix since it's related to the distributed cache.  Also I'd personally 
prefer something like mapreduce.job.cache.limit.max-files, 
mapreduce.job.cache.limit.max-file-mb, and 
mapreduce.job.cache.limit.max-total-mb if it's supposed to apply to the entire 
distributed cache.

The TotalNumberOfFilesAndSize API is verbose and error-prone -- is there ever a 
valid reason to call incrementTotalSize without also calling 
incrementTotalNumberOfFiles and findMaxFileSize?  Probably does the wrong thing 
if the client doesn't call them all for each file.  IMHO there should just be 
two APIs, addFile(long filesize) and checkLimit().  Or maybe just one if it's 
OK to throw during addFile() directly.
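
A minimal sketch of the two-call shape suggested above, in plain Java. The class name FileLimitTracker, the constructor arguments, and the convention that a non-positive limit means "no limit" are assumptions made for illustration; this is not the code from the patch.

```java
// Hypothetical sketch of the suggested addFile()/checkLimit() API.
public class FileLimitTracker {
  private final long maxFiles;
  private final long maxSingleFileBytes;
  private final long maxTotalBytes;

  private long fileCount = 0;
  private long totalBytes = 0;
  private long largestFileBytes = 0;

  // A limit of 0 or less means "no limit" in this sketch.
  public FileLimitTracker(long maxFiles, long maxSingleFileBytes,
      long maxTotalBytes) {
    this.maxFiles = maxFiles;
    this.maxSingleFileBytes = maxSingleFileBytes;
    this.maxTotalBytes = maxTotalBytes;
  }

  // One call per file updates all three counters together, so a caller
  // cannot update one of them and forget the others.
  public void addFile(long fileSize) {
    fileCount++;
    totalBytes += fileSize;
    largestFileBytes = Math.max(largestFileBytes, fileSize);
  }

  // Throws if any configured limit has been exceeded so far.
  public void checkLimit() {
    if (maxFiles > 0 && fileCount > maxFiles) {
      throw new IllegalStateException(
          "Too many files: " + fileCount + " > " + maxFiles);
    }
    if (maxSingleFileBytes > 0 && largestFileBytes > maxSingleFileBytes) {
      throw new IllegalStateException(
          "File too large: " + largestFileBytes + " > " + maxSingleFileBytes);
    }
    if (maxTotalBytes > 0 && totalBytes > maxTotalBytes) {
      throw new IllegalStateException(
          "Total size too large: " + totalBytes + " > " + maxTotalBytes);
    }
  }
}
```

A caller would invoke addFile(size) once per resource and checkLimit() either after each file or once at the end.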

Suggestion: TotalNumberOfFilesAndSize might be easier to comprehend (and type) 
if named something like LimitsChecker.  Also its constructor can just be passed 
a Configuration.  Then it can hide all the confs and other implementation 
details related to the dist cache limits, and a predicate function like 
hasLimits() can be used to do the early-out checks.  Or maybe we just pass it 
the files directly and it can decide internally whether to visit the paths or 
early-out.

I think it would be very helpful if the file path was shown in the error 
message when something exceeds the single-file limit, otherwise the user has to 
manually track it down among all the files involved.
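
The two suggestions above (a LimitsChecker built from the configuration, and an error message that names the offending file) could look roughly like the following sketch. A java.util.Map stands in for Hadoop's Configuration so the example is self-contained, the property names are the ones proposed earlier in this comment rather than confirmed final names, and throwing directly from addFile is just one of the options mentioned.

```java
import java.util.Map;

// Hypothetical sketch; a Map stands in for Hadoop's Configuration.
public class LimitsChecker {
  static final String MAX_FILES = "mapreduce.job.cache.limit.max-files";
  static final String MAX_FILE_MB = "mapreduce.job.cache.limit.max-file-mb";
  static final String MAX_TOTAL_MB = "mapreduce.job.cache.limit.max-total-mb";
  private static final long MB = 1024L * 1024L;

  private final long maxFiles;
  private final long maxFileBytes;
  private final long maxTotalBytes;
  private long fileCount = 0;
  private long totalBytes = 0;

  // Missing or non-positive values mean "no limit" in this sketch.
  public LimitsChecker(Map<String, String> conf) {
    maxFiles = Long.parseLong(conf.getOrDefault(MAX_FILES, "0"));
    maxFileBytes = Long.parseLong(conf.getOrDefault(MAX_FILE_MB, "0")) * MB;
    maxTotalBytes = Long.parseLong(conf.getOrDefault(MAX_TOTAL_MB, "0")) * MB;
  }

  // Lets callers skip visiting the paths when no limit is configured.
  public boolean hasLimits() {
    return maxFiles > 0 || maxFileBytes > 0 || maxTotalBytes > 0;
  }

  // Records one file and throws as soon as a limit is exceeded.
  public void addFile(String path, long sizeBytes) {
    if (!hasLimits()) {
      return;
    }
    if (maxFileBytes > 0 && sizeBytes > maxFileBytes) {
      // Naming the path saves the user from hunting for the large file.
      throw new IllegalArgumentException("File " + path + " is " + sizeBytes
          + " bytes, which exceeds the single-file limit of "
          + maxFileBytes + " bytes");
    }
    fileCount++;
    totalBytes += sizeBytes;
    if (maxFiles > 0 && fileCount > maxFiles) {
      throw new IllegalArgumentException("Adding " + path
          + " exceeds the limit of " + maxFiles + " files");
    }
    if (maxTotalBytes > 0 && totalBytes > maxTotalBytes) {
      throw new IllegalArgumentException("Adding " + path
          + " exceeds the total size limit of " + maxTotalBytes + " bytes");
    }
  }
}
```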

Nit: Javadocs that list a method's parameters without describing any of them 
aren't useful.


> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch, 
> MAPREDUCE-6690-trunk-v2.patch
>



[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279424#comment-15279424
 ] 

Hadoop QA commented on MAPREDUCE-6690:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 37s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s {color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client: patch generated 1 new + 577 unchanged - 0 fixed = 578 total (was 577) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 55s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 109m 9s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 36s {color} | {color:green} hadoop-mapreduce-client-core in the patch passed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 130m 41s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.7.0_95. {color} |
| {color:gr

[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-05-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275015#comment-15275015
 ] 

Hadoop QA commented on MAPREDUCE-6690:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 41s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 41s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s {color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client: patch generated 8 new + 537 unchanged - 0 fixed = 545 total (was 537) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 22s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 18s {color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 25 unchanged - 0 fixed = 26 total (was 25) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 9s {color} | {color:green} hadoop-mapreduce-client-core in the patch passed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {co

[jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization

2016-05-06 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274729#comment-15274729
 ] 

Chris Trezzo commented on MAPREDUCE-6690:
-

Also note that this isn't perfect: jobs can be of varying sizes, and this 
really only limits the amount of localization per container. At least in the 
MapReduce case, all of the containers localize the same amount, ignoring 
differences in what is already in the YARN local cache on the various node 
managers.

> Limit the number of resources a single map reduce job can submit for 
> localization
> -
>
> Key: MAPREDUCE-6690
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: MAPREDUCE-6690-trunk-v1.patch
>