[jira] [Commented] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787516#comment-13787516
 ] 

Hadoop QA commented on YARN-1277:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12607057/YARN-1277.20131005.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2129//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2129//console

This message is automatically generated.

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.20131005.2.patch, 
> YARN-1277.20131005.3.patch, YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1278) New AM does not start after rm restart

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787512#comment-13787512
 ] 

Hadoop QA commented on YARN-1278:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12607059/YARN-1278.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2130//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2130//console

This message is automatically generated.

> New AM does not start after rm restart
> --
>
> Key: YARN-1278
> URL: https://issues.apache.org/jira/browse/YARN-1278
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Yesha Vora
>Assignee: Hitesh Shah
>Priority: Blocker
> Attachments: YARN-1278.1.patch
>
>
> The new AM fails to start after the RM restarts. It fails to start a new 
> ApplicationMaster, and the job fails with the error below.
>  /usr/bin/mapred job -status job_1380985373054_0001
> 13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname
> Job: job_1380985373054_0001
> Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
> Job Tracking URL : 
> http://hostname:8088/cluster/app/application_1380985373054_0001
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: FAILED
> retired: false
> reason for failure: There are no failed tasks for the job. Job is failed due 
> to some other reason and reason can be found in the logs.
> Counters: 0





[jira] [Assigned] (YARN-1278) New AM does not start after rm restart

2013-10-05 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reassigned YARN-1278:
-

Assignee: Hitesh Shah

> New AM does not start after rm restart
> --
>
> Key: YARN-1278
> URL: https://issues.apache.org/jira/browse/YARN-1278
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Yesha Vora
>Assignee: Hitesh Shah
>Priority: Blocker
>
> The new AM fails to start after the RM restarts. It fails to start a new 
> ApplicationMaster, and the job fails with the error below.
>  /usr/bin/mapred job -status job_1380985373054_0001
> 13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname
> Job: job_1380985373054_0001
> Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
> Job Tracking URL : 
> http://hostname:8088/cluster/app/application_1380985373054_0001
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: FAILED
> retired: false
> reason for failure: There are no failed tasks for the job. Job is failed due 
> to some other reason and reason can be found in the logs.
> Counters: 0





[jira] [Updated] (YARN-1278) New AM does not start after rm restart

2013-10-05 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-1278:
--

Attachment: YARN-1278.1.patch

> New AM does not start after rm restart
> --
>
> Key: YARN-1278
> URL: https://issues.apache.org/jira/browse/YARN-1278
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Yesha Vora
>Assignee: Hitesh Shah
>Priority: Blocker
> Attachments: YARN-1278.1.patch
>
>
> The new AM fails to start after the RM restarts. It fails to start a new 
> ApplicationMaster, and the job fails with the error below.
>  /usr/bin/mapred job -status job_1380985373054_0001
> 13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname
> Job: job_1380985373054_0001
> Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
> Job Tracking URL : 
> http://hostname:8088/cluster/app/application_1380985373054_0001
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: FAILED
> retired: false
> reason for failure: There are no failed tasks for the job. Job is failed due 
> to some other reason and reason can be found in the logs.
> Counters: 0





[jira] [Updated] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-1277:


Attachment: YARN-1277.20131005.3.patch

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.20131005.2.patch, 
> YARN-1277.20131005.3.patch, YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787497#comment-13787497
 ] 

Omkar Vinit Joshi commented on YARN-1277:
-

Thanks [~vinodkv]
bq. We aren't removing hadoop.ssl.enabled yet?
no.. that will become the default for HttpServer in general, but for 
YARN/JHS/MR we will override it based on the policies. From what I understood 
after discussing with [~sureshms], this is for supporting backward 
compatibility.

bq. Why two methods getResolvedRMWebAppURLWithoutScheme* in YARN's 
WebAppUtils.java? Looks like only one is enough.
It is mainly because we call them from two separate modules (YARN and MR). For 
MR we disable SSL for its own server, but we may still want https links if it 
is enabled for YARN.
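
A rough illustration of that split (a sketch with hypothetical method bodies; 
the actual WebAppUtils code may differ):

{code}
// Sketch only: two resolution variants, not the real WebAppUtils implementation.
import org.apache.hadoop.conf.Configuration;

public class RMWebAppURLSketch {
  // YARN callers: honor yarn.http.policy when picking the RM web app address.
  public static String getResolvedRMWebAppURLWithoutScheme(Configuration conf) {
    boolean https =
        "HTTPS_ONLY".equals(conf.get("yarn.http.policy", "HTTP_ONLY"));
    return https
        ? conf.get("yarn.resourcemanager.webapp.https.address")
        : conf.get("yarn.resourcemanager.webapp.address");
  }

  // MR callers: force the plain-http address for MR's own http-only server,
  // even when YARN itself runs HTTPS_ONLY.
  public static String getResolvedRMWebAppURLWithoutScheme(Configuration conf,
      boolean httpOnly) {
    return httpOnly
        ? conf.get("yarn.resourcemanager.webapp.address")
        : getResolvedRMWebAppURLWithoutScheme(conf);
  }
}
{code}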

bq. AmFilterInitializer changes are import only and can be skipped.
removed..

bq. Earlier we punted on enabling https for proxy-server, but it should be easy 
to add now?
yes.

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.20131005.2.patch, 
> YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1278) New AM does not start after rm restart

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787491#comment-13787491
 ] 

Vinod Kumar Vavilapalli commented on YARN-1278:
---

Here's what happened:
{code}
2013-10-05 15:03:57,154 INFO  nodemanager.DefaultContainerExecutor 
(DefaultContainerExecutor.java:startLocalizer(105)) - CWD set to 
/grid/0/hdp/yarn/local/usercache/hrt_qa/appcache/application_1380985373054_0001 
= 
file:/grid/0/hdp/yarn/local/usercache/hrt_qa/appcache/application_1380985373054_0001
2013-10-05 15:03:57,251 INFO  localizer.ResourceLocalizationService 
(ResourceLocalizationService.java:update(910)) - DEBUG: FAILED { 
hdfs://HDFS:8020/user/hrt_qa/.staging/job_1380985373054_0001/job.jar, 
1380985387452, PATTERN, (?:classes/|lib/).* }, Rename cannot overwrite non 
empty destination directory 
/grid/4/hdp/yarn/local/usercache/hrt_qa/appcache/application_1380985373054_0001/filecache/10
2013-10-05 15:03:57,252 INFO  localizer.LocalizedResource 
(LocalizedResource.java:handle(196)) - Resource 
hdfs://HDFS:8020/user/hrt_qa/.staging/job_1380985373054_0001/job.jar 
transitioned from DOWNLOADING to FAILED
2013-10-05 15:03:57,253 INFO  container.Container 
(ContainerImpl.java:handle(871)) - Container 
container_1380985373054_0001_02_01 transitioned from LOCALIZING to 
LOCALIZATION_FAILED
2013-10-05 15:03:57,253 INFO  localizer.LocalResourcesTrackerImpl 
(LocalResourcesTrackerImpl.java:handle(137)) - Container 
container_1380985373054_0001_02_01 sent RELEASE event on a resource request 
{ hdfs://HDFS:8020/user/hrt_qa/.staging/job_1380985373054_0001/job.jar, 
1380985387452, PATTERN, (?:classes/|lib/).* } not present in cache.
2013-10-05 15:03:57,254 INFO  localizer.ResourceLocalizationService 
(ResourceLocalizationService.java:processHeartbeat(553)) - Unknown localizer 
with localizerId container_1380985373054_0001_02_01 is sending heartbeat. 
Ordering it to DIE
{code}

Basically, the RM restarted and all NMs were forced to resync. Because of 
YARN-1149, all applications are now removed from the NM, but deletion of app 
resources is asynchronous. When the new AM starts, it tries to download the 
resources all over again, but we generate the local destination path based on 
sequence numbers tracked via LocalResourcesTracker.nextUniqueNumber. Because 
the original apps are removed, those sequence numbers are lost, so the same 
app tries to relocalize and conflicts on the local paths.

I think we shouldn't destroy app resources on resync. That is desirable anyway, 
as there is no need to relocalize everything just because of an RM resync.
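
To make the conflict concrete, here is a simplified model of the 
sequence-number allocation (a sketch; names and directory layout are only 
approximations of the actual LocalResourcesTracker code):

{code}
// Sketch: each app's tracker hands out increasing numbers that become
// .../appcache/<appId>/filecache/<n> destination directories.
import java.util.concurrent.atomic.AtomicLong;

class LocalPathSketch {
  private final AtomicLong nextUniqueNumber = new AtomicLong(0);

  String nextLocalPath(String appCacheDir) {
    return appCacheDir + "/filecache/" + nextUniqueNumber.incrementAndGet();
  }
}

// On resync the NM drops the application and, with it, this counter, while the
// on-disk filecache/<n> directories are deleted only asynchronously. A fresh
// tracker for the same app starts again at 1, 2, ..., and the rename into a
// still-present, non-empty filecache/10 fails exactly as in the log above.
{code}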

> New AM does not start after rm restart
> --
>
> Key: YARN-1278
> URL: https://issues.apache.org/jira/browse/YARN-1278
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Yesha Vora
>Priority: Blocker
>
> The new AM fails to start after the RM restarts. It fails to start a new 
> ApplicationMaster, and the job fails with the error below.
>  /usr/bin/mapred job -status job_1380985373054_0001
> 13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname
> Job: job_1380985373054_0001
> Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
> Job Tracking URL : 
> http://hostname:8088/cluster/app/application_1380985373054_0001
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: FAILED
> retired: false
> reason for failure: There are no failed tasks for the job. Job is failed due 
> to some other reason and reason can be found in the logs.
> Counters: 0





[jira] [Commented] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787490#comment-13787490
 ] 

Vinod Kumar Vavilapalli commented on YARN-1277:
---

I'm happy I spent time pushing for polished patches previously at YARN-1280. 
The code is so much cleaner, making it easy to accommodate the changes from 
this patch.

Patch looks good overall. Quick comments:
 - We aren't removing hadoop.ssl.enabled yet?
 - Why two methods getResolvedRMWebAppURLWithoutScheme* in YARN's 
WebAppUtils.java? Looks like only one is enough.
 - AmFilterInitializer changes are import only and can be skipped.
 - Earlier we punted on enabling https for proxy-server, but it should be easy 
to add now?

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.20131005.2.patch, 
> YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1278) New AM does not start after rm restart

2013-10-05 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787481#comment-13787481
 ] 

Xuan Gong commented on YARN-1278:
-

[~vinodkv] If that is the case, instead of directly using the DeletionService 
to delete the files, we can rename them first and then call the DeletionService 
to delete them, just like we delete the previous files when the NM restarts.
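
A minimal sketch of that rename-then-delete idea (hypothetical helper; the 
NM's real DeletionService API is approximated here by an ExecutorService):

{code}
import java.io.File;
import java.io.IOException;
import java.util.concurrent.ExecutorService;

class RenameThenDeleteSketch {
  // Rename the directory aside synchronously so new localizations never
  // collide with it, then let the asynchronous deletion do the real work.
  static void removeAppDir(File appDir, ExecutorService deletionService)
      throws IOException {
    final File tombstone = new File(appDir.getParentFile(),
        appDir.getName() + "_DEL_" + System.currentTimeMillis());
    if (!appDir.renameTo(tombstone)) {
      throw new IOException("Failed to rename " + appDir + " for deletion");
    }
    deletionService.submit(new Runnable() {
      public void run() {
        deleteRecursively(tombstone);
      }
    });
  }

  static void deleteRecursively(File f) {
    File[] children = f.listFiles();
    if (children != null) {
      for (File c : children) {
        deleteRecursively(c);
      }
    }
    f.delete();
  }
}
{code}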

> New AM does not start after rm restart
> --
>
> Key: YARN-1278
> URL: https://issues.apache.org/jira/browse/YARN-1278
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Yesha Vora
>Priority: Blocker
>
> The new AM fails to start after the RM restarts. It fails to start a new 
> ApplicationMaster, and the job fails with the error below.
>  /usr/bin/mapred job -status job_1380985373054_0001
> 13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname
> Job: job_1380985373054_0001
> Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
> Job Tracking URL : 
> http://hostname:8088/cluster/app/application_1380985373054_0001
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: FAILED
> retired: false
> reason for failure: There are no failed tasks for the job. Job is failed due 
> to some other reason and reason can be found in the logs.
> Counters: 0





[jira] [Commented] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787477#comment-13787477
 ] 

Hadoop QA commented on YARN-1277:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12607047/YARN-1277.20131005.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2128//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2128//console

This message is automatically generated.

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.20131005.2.patch, 
> YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787461#comment-13787461
 ] 

Omkar Vinit Joshi commented on YARN-1277:
-

Summary:
* To enable https:
** In the ResourceManager and NodeManager, set yarn.http.policy=HTTPS_ONLY. By 
default it is HTTP_ONLY.
** In the job history server, set mapreduce.jobhistory.http.policy=HTTPS_ONLY. 
By default it is HTTP_ONLY.
** The MR AM's configuration to turn on SSL is removed, as it is not supported 
anyway; the AM will be http only. Note: all of the MR AM's web links are 
served via the proxy server, which is still controlled by the YARN http policy.
* Tested RM, NM, JHS and MR with the combinations below (a configuration 
sketch follows this list):
** Everything default (no policies): everything is http only, and ports are 
selected accordingly.
** yarn.http.policy set to HTTPS_ONLY: RM and NM start on their HTTPS ports, 
and all links to them follow the https scheme. However, the JHS scheme and 
port are still http.
** Both yarn.http.policy and mapreduce.jobhistory.http.policy set to 
HTTPS_ONLY: now all links to the job history server, NodeManager and 
ResourceManager use the https scheme and the https port.
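
As a configuration sketch, the equivalent programmatic settings (illustration 
only; in a real deployment these would go in yarn-site.xml and mapred-site.xml):

{code}
import org.apache.hadoop.conf.Configuration;

public class HttpsPolicyExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // RM and NM web apps: HTTPS_ONLY (the default is HTTP_ONLY).
    conf.set("yarn.http.policy", "HTTPS_ONLY");
    // Job history server web app: HTTPS_ONLY (the default is HTTP_ONLY).
    conf.set("mapreduce.jobhistory.http.policy", "HTTPS_ONLY");
    System.out.println("yarn.http.policy = " + conf.get("yarn.http.policy"));
  }
}
{code}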

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.20131005.2.patch, 
> YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787456#comment-13787456
 ] 

Omkar Vinit Joshi commented on YARN-1277:
-

Updated the patch with the common changes.

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.20131005.2.patch, 
> YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Updated] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-1277:


Attachment: YARN-1277.20131005.2.patch

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.20131005.2.patch, 
> YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787454#comment-13787454
 ] 

Suresh Srinivas commented on YARN-1277:
---

[~ojoshi], why are you removing the common part of the patch?

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1278) New AM does not start after rm restart

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787450#comment-13787450
 ] 

Vinod Kumar Vavilapalli commented on YARN-1278:
---

I debugged this with [~jianhe] directly on the cluster. This was caused by the 
patch for YARN-1149. Will upload logs and the analysis.

> New AM does not start after rm restart
> --
>
> Key: YARN-1278
> URL: https://issues.apache.org/jira/browse/YARN-1278
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Yesha Vora
>Priority: Blocker
>
> The new AM fails to start after the RM restarts. It fails to start a new 
> ApplicationMaster, and the job fails with the error below.
>  /usr/bin/mapred job -status job_1380985373054_0001
> 13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname
> Job: job_1380985373054_0001
> Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
> Job Tracking URL : 
> http://hostname:8088/cluster/app/application_1380985373054_0001
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: FAILED
> retired: false
> reason for failure: There are no failed tasks for the job. Job is failed due 
> to some other reason and reason can be found in the logs.
> Counters: 0





[jira] [Updated] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-1277:


Attachment: YARN-1277.20131005.1.patch

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787447#comment-13787447
 ] 

Omkar Vinit Joshi commented on YARN-1277:
-

Updated the YARN- and MapReduce-only patch.

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.20131005.1.patch, YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1068) Add admin support for HA operations

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787440#comment-13787440
 ] 

Hadoop QA commented on YARN-1068:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12607039/yarn-1068-8.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.client.api.impl.TestNMClient

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2127//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2127//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2127//console

This message is automatically generated.

> Add admin support for HA operations
> ---
>
> Key: YARN-1068
> URL: https://issues.apache.org/jira/browse/YARN-1068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>  Labels: ha
> Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
> yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, yarn-1068-7.patch, 
> yarn-1068-8.patch, yarn-1068-prelim.patch
>
>
> Support HA admin operations to facilitate transitioning the RM to Active and 
> Standby states.





[jira] [Commented] (YARN-1229) Define constraints on Auxiliary Service names. Change ShuffleHandler service name from mapreduce.shuffle to mapreduce_shuffle.

2013-10-05 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787433#comment-13787433
 ] 

Karthik Kambatla commented on YARN-1229:


Updated the JIRA summary to reflect the commit message; it explains the issue 
and the solution better.

> Define constraints on Auxiliary Service names. Change ShuffleHandler service 
> name from mapreduce.shuffle to mapreduce_shuffle.
> --
>
> Key: YARN-1229
> URL: https://issues.apache.org/jira/browse/YARN-1229
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.1.1-beta
>Reporter: Tassapol Athiapinya
>Assignee: Xuan Gong
>Priority: Blocker
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
> YARN-1229.4.patch, YARN-1229.5.patch, YARN-1229.6.patch
>
>
> I ran a sleep job. If the AM fails to start, this exception can occur:
> 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
> state FAILED due to: Application application_1379673267098_0020 failed 1 
> times due to AM Container for appattempt_1379673267098_0020_01 exited 
> with  exitCode: 1 due to: Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException: 
> /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
>  line 12: export: 
> `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
> ': not a valid identifier
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
> at org.apache.hadoop.util.Shell.run(Shell.java:379)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> .Failing this attempt.. Failing the application.
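
The failure follows from shell identifier rules: the NM exports each auxiliary 
service as an environment variable named NM_AUX_SERVICE_<serviceName>, so a 
dot in the service name produces an invalid identifier, as in the export line 
above. A small sketch of the kind of name constraint the new summary implies 
(the NM's actual validation may differ):

{code}
import java.util.regex.Pattern;

public class AuxServiceNameCheck {
  // Letters, digits and underscores only, so that NM_AUX_SERVICE_<name>
  // is a valid environment variable name.
  private static final Pattern VALID_NAME =
      Pattern.compile("^[A-Za-z_][A-Za-z0-9_]*$");

  public static void main(String[] args) {
    System.out.println(VALID_NAME.matcher("mapreduce.shuffle").matches());  // false
    System.out.println(VALID_NAME.matcher("mapreduce_shuffle").matches()); // true
  }
}
{code}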





[jira] [Updated] (YARN-1229) Define constraints on Auxiliary Service names. Change ShuffleHandler service name from mapreduce.shuffle to mapreduce_shuffle.

2013-10-05 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1229:
---

Summary: Define constraints on Auxiliary Service names. Change 
ShuffleHandler service name from mapreduce.shuffle to mapreduce_shuffle.  (was: 
Shell$ExitCodeException could happen if AM fails to start)

> Define constraints on Auxiliary Service names. Change ShuffleHandler service 
> name from mapreduce.shuffle to mapreduce_shuffle.
> --
>
> Key: YARN-1229
> URL: https://issues.apache.org/jira/browse/YARN-1229
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.1.1-beta
>Reporter: Tassapol Athiapinya
>Assignee: Xuan Gong
>Priority: Blocker
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
> YARN-1229.4.patch, YARN-1229.5.patch, YARN-1229.6.patch
>
>
> I ran a sleep job. If the AM fails to start, this exception can occur:
> 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
> state FAILED due to: Application application_1379673267098_0020 failed 1 
> times due to AM Container for appattempt_1379673267098_0020_01 exited 
> with  exitCode: 1 due to: Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException: 
> /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
>  line 12: export: 
> `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
> ': not a valid identifier
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
> at org.apache.hadoop.util.Shell.run(Shell.java:379)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> .Failing this attempt.. Failing the application.





[jira] [Updated] (YARN-1068) Add admin support for HA operations

2013-10-05 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1068:
---

Attachment: yarn-1068-8.patch

Patch on top of YARN-1232; it also addresses Bikas' earlier comments about ACL 
checking.

> Add admin support for HA operations
> ---
>
> Key: YARN-1068
> URL: https://issues.apache.org/jira/browse/YARN-1068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>  Labels: ha
> Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
> yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, yarn-1068-7.patch, 
> yarn-1068-8.patch, yarn-1068-prelim.patch
>
>
> Support HA admin operations to facilitate transitioning the RM to Active and 
> Standby states.





[jira] [Commented] (YARN-1274) LCE fails to run containers that don't have resources to localize

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787426#comment-13787426
 ] 

Hudson commented on YARN-1274:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4550 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4550/])
YARN-1274. Fixed NodeManager's LinuxContainerExecutor to create user, app-dir 
and log-dirs correctly even when there are no resources to localize for the 
container. Contributed by Siddharth Seth. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529555)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


> LCE fails to run containers that don't have resources to localize
> -
>
> Key: YARN-1274
> URL: https://issues.apache.org/jira/browse/YARN-1274
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.1.1-beta
>Reporter: Alejandro Abdelnur
>Assignee: Siddharth Seth
>Priority: Blocker
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1274.1.txt, YARN-1274.trunk.1.txt, 
> YARN-1274.trunk.2.txt
>
>
> LCE container launch assumes the usercache/USER directory exists and is 
> owned by the user running the container process.
> But the directory is created only by the LCE localization command, and only 
> if there are resources to localize. If there are no resources to localize, 
> LCE localization never executes, launching fails with a 255 exit code, and 
> the NM logs have something like:
> {code}
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : command 
> provided 1
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : user is 
> llama
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Can't create 
> directory llama in 
> /yarn/nm/usercache/llama/appcache/application_1380853306301_0004/container_1380853306301_0004_01_04
>  - Permission denied
> {code}
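
The commit message above points at the fix direction: create the user, app 
and log directories even when there is nothing to localize. A Java-level 
sketch of that idea (illustration only; the actual change is in the native 
container-executor.c, and these names are hypothetical):

{code}
import java.io.File;
import java.io.IOException;

public class EnsureUserDirsSketch {
  // Make sure usercache/<user>/appcache/<appId> exists before launch,
  // instead of relying on localization to create it as a side effect.
  static void ensureUserDirs(File localDir, String user, String appId)
      throws IOException {
    File appDir =
        new File(localDir, "usercache/" + user + "/appcache/" + appId);
    if (!appDir.isDirectory() && !appDir.mkdirs()) {
      throw new IOException("Cannot create " + appDir);
    }
    // Ownership and permission handling omitted; the LCE does this natively.
  }
}
{code}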





[jira] [Updated] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1277:
--

Issue Type: Sub-task  (was: Bug)
Parent: YARN-1280

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.





[jira] [Commented] (YARN-1274) LCE fails to run containers that don't have resources to localize

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787410#comment-13787410
 ] 

Hadoop QA commented on YARN-1274:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12607036/YARN-1274.trunk.2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2126//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2126//console

This message is automatically generated.

> LCE fails to run containers that don't have resources to localize
> -
>
> Key: YARN-1274
> URL: https://issues.apache.org/jira/browse/YARN-1274
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.1.1-beta
>Reporter: Alejandro Abdelnur
>Assignee: Siddharth Seth
>Priority: Blocker
> Attachments: YARN-1274.1.txt, YARN-1274.trunk.1.txt, 
> YARN-1274.trunk.2.txt
>
>
> LCE container launch assumes the usercache/USER directory exists and is 
> owned by the user running the container process.
> But the directory is created only by the LCE localization command, and only 
> if there are resources to localize. If there are no resources to localize, 
> LCE localization never executes, launching fails with a 255 exit code, and 
> the NM logs have something like:
> {code}
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : command 
> provided 1
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : user is 
> llama
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Can't create 
> directory llama in 
> /yarn/nm/usercache/llama/appcache/application_1380853306301_0004/container_1380853306301_0004_01_04
>  - Permission denied
> {code}





[jira] [Updated] (YARN-1095) historyserver webapp config is missing yarn. prefix

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1095:
--

Issue Type: Sub-task  (was: Bug)
Parent: YARN-1280

> historyserver webapp config is missing yarn. prefix
> ---
>
> Key: YARN-1095
> URL: https://issues.apache.org/jira/browse/YARN-1095
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ramya Sunil
>Assignee: Omkar Vinit Joshi
>Priority: Blocker
> Fix For: 2.1.1-beta
>
>
> The historyserver SPNEGO webapp config is missing the yarn. prefix, and it 
> should read historyserver or applicationhistoryserver instead of 
> jobhistoryserver.





[jira] [Updated] (YARN-1208) One of the WebUI Links redirected to http instead https protocol with ssl enabled

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1208:
--

Issue Type: Sub-task  (was: Bug)
Parent: YARN-1280

> One of the WebUI Links redirected to http instead https protocol with ssl 
> enabled
> -
>
> Key: YARN-1208
> URL: https://issues.apache.org/jira/browse/YARN-1208
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Yesha Vora
>Assignee: Omkar Vinit Joshi
> Fix For: 2.1.1-beta
>
>
> One of the web UI links redirects to the http URL when https is enabled.
> Open the NodeManager UI (https://nodemanager:50060/node/allContainers) and 
> click on the RM HOME link. This link redirects to 
> "http://resourcemanager:port" instead of "https://resourcemanager:port".





[jira] [Updated] (YARN-1203) Application Manager UI does not appear with Https enabled

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1203:
--

Issue Type: Sub-task  (was: Bug)
Parent: YARN-1280

> Application Manager UI does not appear with Https enabled
> -
>
> Key: YARN-1203
> URL: https://issues.apache.org/jira/browse/YARN-1203
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Yesha Vora
>Assignee: Omkar Vinit Joshi
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1203.20131017.1.patch, YARN-1203.20131017.2.patch, 
> YARN-1203.20131017.3.patch, YARN-1203.20131018.1.patch, 
> YARN-1203.20131018.2.patch, YARN-1203.20131019.1.patch
>
>
> Need to add support to disable 'hadoop.ssl.enabled' for MR jobs.
> A job should be able to run over the http protocol by setting the 
> 'hadoop.ssl.enabled' property at the job level.





[jira] [Updated] (YARN-1204) Need to add https port related property in Yarn

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1204:
--

Issue Type: Sub-task  (was: Bug)
Parent: YARN-1280

> Need to add https port related property in Yarn
> ---
>
> Key: YARN-1204
> URL: https://issues.apache.org/jira/browse/YARN-1204
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Yesha Vora
>Assignee: Omkar Vinit Joshi
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, 
> YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, 
> YARN-1204.20131020.4.patch, YARN-1204.20131023.1.patch
>
>
> There is no YARN property available to configure the https port for the 
> ResourceManager, NodeManager and history server. Currently, YARN services 
> use the ports defined for http [defined by 
> 'mapreduce.jobhistory.webapp.address', 'yarn.nodemanager.webapp.address', 
> 'yarn.resourcemanager.webapp.address'] for running services over the https 
> protocol.
> YARN should have a list of properties to assign https ports for the RM, NM 
> and JHS. It could be like below:
> yarn.nodemanager.webapp.https.address
> yarn.resourcemanager.webapp.https.address
> mapreduce.jobhistory.webapp.https.address 
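
A short example of how such properties would be consumed (property names as 
proposed above; the fallback values here are placeholders, not confirmed 
defaults):

{code}
import org.apache.hadoop.conf.Configuration;

public class HttpsAddressExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    String rm = conf.get("yarn.resourcemanager.webapp.https.address",
        "0.0.0.0:8090");
    String nm = conf.get("yarn.nodemanager.webapp.https.address",
        "0.0.0.0:8044");
    String jhs = conf.get("mapreduce.jobhistory.webapp.https.address",
        "0.0.0.0:19890");
    System.out.println(rm + "\n" + nm + "\n" + jhs);
  }
}
{code}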





[jira] [Updated] (YARN-1260) RM_HOME link breaks when webapp.https.address related properties are not specified

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1260:
--

Issue Type: Sub-task  (was: Bug)
Parent: YARN-1280

> RM_HOME link breaks when webapp.https.address related properties are not 
> specified
> --
>
> Key: YARN-1260
> URL: https://issues.apache.org/jira/browse/YARN-1260
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.1.1-beta, 2.1.2-beta
>Reporter: Yesha Vora
>Assignee: Omkar Vinit Joshi
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1260.20131030.1.patch
>
>
> This issue happens in a multi-node cluster where the resource manager and 
> node manager are running on different machines.
> Steps to reproduce:
> 1) Set yarn.resourcemanager.hostname =  in yarn-site.xml
> 2) Set hadoop.ssl.enabled = true in core-site.xml
> 3) Do not specify the below properties in yarn-site.xml:
> yarn.nodemanager.webapp.https.address and 
> yarn.resourcemanager.webapp.https.address
> Here, the default values of the above two properties will be used.
> 4) Go to the nodemanager web UI "https://:8044/node"
> 5) Click on the RM_HOME link.
> This link redirects to "https://:8090/cluster" instead of 
> "https://:8090/cluster"





[jira] [Created] (YARN-1280) [Umbrella] Ensure YARN+MR works with https on the web interfaces

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-1280:
-

 Summary: [Umbrella] Ensure YARN+MR works with https on the web 
interfaces
 Key: YARN-1280
 URL: https://issues.apache.org/jira/browse/YARN-1280
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi


Even though https is already supposed to work, it doesn't, and we have fixed, 
and have been fixing, a bunch of things related to enabling https in the 
various daemons.

This is an umbrella JIRA for tracking this work.





[jira] [Commented] (YARN-1278) New AM does not start after rm restart

2013-10-05 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787404#comment-13787404
 ] 

Bikas Saha commented on YARN-1278:
--

It's hard to debug this without any AM/RM logs. Can some logs be uploaded to 
the JIRA, please?

> New AM does not start after rm restart
> --
>
> Key: YARN-1278
> URL: https://issues.apache.org/jira/browse/YARN-1278
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
>Reporter: Yesha Vora
>Priority: Blocker
>
> The new AM fails to start after the RM restarts. It fails to start a new 
> ApplicationMaster, and the job fails with the error below.
>  /usr/bin/mapred job -status job_1380985373054_0001
> 13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname
> Job: job_1380985373054_0001
> Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
> Job Tracking URL : 
> http://hostname:8088/cluster/app/application_1380985373054_0001
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: FAILED
> retired: false
> reason for failure: There are no failed tasks for the job. Job is failed due 
> to some other reason and reason can be found in the logs.
> Counters: 0





[jira] [Updated] (YARN-1274) LCE fails to run containers that don't have resources to localize

2013-10-05 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1274:
--

Attachment: YARN-1274.trunk.2.txt

Same patch with a corrected code comment.

Ran the native test-container-executor test, which passes. Based on Sid's 
testing, I will check this in once Jenkins says okay.

> LCE fails to run containers that don't have resources to localize
> -
>
> Key: YARN-1274
> URL: https://issues.apache.org/jira/browse/YARN-1274
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.1.1-beta
>Reporter: Alejandro Abdelnur
>Assignee: Siddharth Seth
>Priority: Blocker
> Attachments: YARN-1274.1.txt, YARN-1274.trunk.1.txt, 
> YARN-1274.trunk.2.txt
>
>
> LCE container launch assumes the usercache/USER directory exists and is 
> owned by the user running the container process.
> But the directory is created only by the LCE localization command, and only 
> if there are resources to localize. If there are no resources to localize, 
> LCE localization never executes, launching fails with a 255 exit code, and 
> the NM logs have something like:
> {code}
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : command 
> provided 1
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : user is 
> llama
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Can't create 
> directory llama in 
> /yarn/nm/usercache/llama/appcache/application_1380853306301_0004/container_1380853306301_0004_01_04
>  - Permission denied
> {code}





[jira] [Created] (YARN-1279) Expose a client API to allow clients to figure if log aggregation is complete

2013-10-05 Thread Arun C Murthy (JIRA)
Arun C Murthy created YARN-1279:
---

 Summary: Expose a client API to allow clients to figure if log 
aggregation is complete
 Key: YARN-1279
 URL: https://issues.apache.org/jira/browse/YARN-1279
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.2-beta
Reporter: Arun C Murthy
Assignee: Arun C Murthy


Expose a client API to allow clients to figure if log aggregation is complete





[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-10-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787396#comment-13787396
 ] 

Arun C Murthy commented on YARN-451:


Thinking aloud... we could track past allocations (#containers) per 
application in the RM and also track future requests (the sum of numContainers 
for *) per application. Would showing those two help?
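
For instance, a toy per-application tracker for those two numbers (a 
hypothetical sketch, not RM code):

{code}
import java.util.concurrent.atomic.AtomicLong;

// Sketch: the two per-application numbers suggested above.
class AppResourceStats {
  private final AtomicLong allocatedContainers = new AtomicLong();   // past
  private final AtomicLong outstandingContainers = new AtomicLong(); // future

  void onRequested(int numContainers) {
    outstandingContainers.addAndGet(numContainers);
  }

  void onAllocated(int numContainers) {
    allocatedContainers.addAndGet(numContainers);
    outstandingContainers.addAndGet(-numContainers);
  }

  long getAllocated()   { return allocatedContainers.get(); }
  long getOutstanding() { return Math.max(0, outstandingContainers.get()); }
}
{code}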

> Add more metrics to RM page
> ---
>
> Key: YARN-451
> URL: https://issues.apache.org/jira/browse/YARN-451
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Lohit Vijayarenu
>Assignee: Sangjin Lee
>Priority: Blocker
> Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch
>
>
> The ResourceManager web UI shows the list of RUNNING applications, but it 
> does not tell which applications are requesting more resources than others. 
> With a cluster running hundreds of applications at once, it would be useful 
> to have some kind of metric to show high-resource-usage applications vs. 
> low-resource-usage ones. At a minimum, showing the number of containers is a 
> good option.





[jira] [Created] (YARN-1278) New AM does not start after rm restart

2013-10-05 Thread Yesha Vora (JIRA)
Yesha Vora created YARN-1278:


 Summary: New AM does not start after rm restart
 Key: YARN-1278
 URL: https://issues.apache.org/jira/browse/YARN-1278
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Yesha Vora
Priority: Blocker


The new AM fails to start after the RM restarts. It fails to start a new 
ApplicationMaster, and the job fails with the error below.

 /usr/bin/mapred job -status job_1380985373054_0001
13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at hostname
Job: job_1380985373054_0001
Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
Job Tracking URL : 
http://hostname:8088/cluster/app/application_1380985373054_0001
Uber job : false
Number of maps: 0
Number of reduces: 0
map() completion: 0.0
reduce() completion: 0.0
Job state: FAILED
retired: false
reason for failure: There are no failed tasks for the job. Job is failed due to 
some other reason and reason can be found in the logs.
Counters: 0





[jira] [Commented] (YARN-1090) Job does not get into Pending State

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787388#comment-13787388
 ] 

Hudson commented on YARN-1090:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4549 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4549/])
YARN-1090. Fixed CS UI to better reflect applications as non-schedulable and 
not as pending. Contributed by Jian He. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529538)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java


> Job does not get into Pending State
> ---
>
> Key: YARN-1090
> URL: https://issues.apache.org/jira/browse/YARN-1090
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Jian He
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1090.1.patch, YARN-1090.2.patch, YARN-1090.3.patch, 
> YARN-1090.patch
>
>
> When there is no resource available to run a job, the next job should go 
> into the pending state. The RM UI should show the next job as a pending app, 
> and the pending-app counter should be incremented.
> Currently, however, the next job stays in the ACCEPTED state, no AM is 
> assigned to it, and the pending-app count is not incremented.
> Running 'job status' shows job state=PREP.
> $ mapred job -status job_1377122233385_0002
> 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at 
> host1/ip1
> Job: job_1377122233385_0002
> Job File: /ABC/.staging/job_1377122233385_0002/job.xml
> Job Tracking URL : http://host1:port1/application_1377122233385_0002/
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: PREP
> retired: false
> reason for failure:



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1032) NPE in RackResolve

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787385#comment-13787385
 ] 

Hudson commented on YARN-1032:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4548 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4548/])
YARN-1032. Fixed NPE in RackResolver. Contributed by Lohit Vijayarenu. 
(acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529534)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/RackResolver.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestRackResolver.java


> NPE in RackResolve
> --
>
> Key: YARN-1032
> URL: https://issues.apache.org/jira/browse/YARN-1032
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
> Environment: linux
>Reporter: Lohit Vijayarenu
>Assignee: Lohit Vijayarenu
>Priority: Critical
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1032.1.patch, YARN-1032.2.patch, YARN-1032.3.patch
>
>
> We found a case where our rack resolve script was not returning rack due to 
> problem with resolving host address. This exception was see in 
> RackResolver.java as NPE, ultimately caught in RMContainerAllocator. 
> {noformat}
> 2013-08-01 07:11:37,708 ERROR [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
> CONTACTING RM. 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:99)
>   at 
> org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:92)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assignMapsWithLocality(RMContainerAllocator.java:1039)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assignContainers(RMContainerAllocator.java:925)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:861)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$400(RMContainerAllocator.java:681)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:243)
>   at java.lang.Thread.run(Thread.java:722)
> {noformat}
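
The guard the fix needs amounts to falling back to the default rack when the 
topology mapping returns nothing for a host. A minimal sketch of that idea 
(not the committed patch; only DNSToSwitchMapping and NetworkTopology are real 
Hadoop APIs here, the wrapper class is invented):

{code}
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.net.DNSToSwitchMapping;
import org.apache.hadoop.net.NetworkTopology;

// Sketch: never call get(0) on a null/empty resolution result, which is the
// NPE path seen in coreResolve when the rack script fails for a host.
public final class SafeRackResolve {
  private SafeRackResolve() {}

  public static String resolve(DNSToSwitchMapping mapping, String hostName) {
    List<String> racks = mapping.resolve(Collections.singletonList(hostName));
    if (racks == null || racks.isEmpty() || racks.get(0) == null) {
      return NetworkTopology.DEFAULT_RACK;  // "/default-rack"
    }
    return racks.get(0);
  }
}
{code}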



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1268) TestFairScheduler.testContinuousScheduling is flaky

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787379#comment-13787379
 ] 

Hudson commented on YARN-1268:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4547 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4547/])
Fix location of YARN-1268 in CHANGES.txt (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529531)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
YARN-1268. TestFairScheduer.testContinuousScheduling is flaky (Sandy Ryza) 
(sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529529)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


> TestFairScheduler.testContinuousScheduling is flaky
> ---
>
> Key: YARN-1268
> URL: https://issues.apache.org/jira/browse/YARN-1268
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.3.0
>
> Attachments: YARN-1268-1.patch, YARN-1268.patch
>
>
> It looks like there's a timeout in it that's causing it to be flaky.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1268) TestFairScheduler.testContinuousScheduling is flaky

2013-10-05 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1268:
-

Hadoop Flags: Reviewed

> TestFairScheduler.testContinuousScheduling is flaky
> ---
>
> Key: YARN-1268
> URL: https://issues.apache.org/jira/browse/YARN-1268
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.3.0
>
> Attachments: YARN-1268-1.patch, YARN-1268.patch
>
>
> It looks like there's a timeout in it that's causing it to be flaky.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1268) TestFairScheduler.testContinuousScheduling is flaky

2013-10-05 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787377#comment-13787377
 ] 

Sandy Ryza commented on YARN-1268:
--

I just committed this to trunk and branch-2

> TestFairScheduler.testContinuousScheduling is flaky
> ---
>
> Key: YARN-1268
> URL: https://issues.apache.org/jira/browse/YARN-1268
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.3.0
>
> Attachments: YARN-1268-1.patch, YARN-1268.patch
>
>
> It looks like there's a timeout in it that's causing it to be flaky.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1090) Job does not get into Pending State

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787373#comment-13787373
 ] 

Hadoop QA commented on YARN-1090:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12607021/YARN-1090.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2125//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2125//console

This message is automatically generated.

> Job does not get into Pending State
> ---
>
> Key: YARN-1090
> URL: https://issues.apache.org/jira/browse/YARN-1090
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Jian He
> Attachments: YARN-1090.1.patch, YARN-1090.2.patch, YARN-1090.3.patch, 
> YARN-1090.patch
>
>
> When there is no resource available to run a job, the next job should go 
> into the pending state. The RM UI should show the next job as a pending app, 
> and the pending-app counter should be incremented.
> Currently, however, the next job stays in the ACCEPTED state, no AM is 
> assigned to it, and the pending-app count is not incremented.
> Running 'job status' shows job state=PREP.
> $ mapred job -status job_1377122233385_0002
> 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at 
> host1/ip1
> Job: job_1377122233385_0002
> Job File: /ABC/.staging/job_1377122233385_0002/job.xml
> Job Tracking URL : http://host1:port1/application_1377122233385_0002/
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: PREP
> retired: false
> reason for failure:



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1130) Improve the log flushing for tasks when mapred.userlog.limit.kb is set

2013-10-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787370#comment-13787370
 ] 

Arun C Murthy commented on YARN-1130:
-

I'm reviewing this; meanwhile, [~paulhan], can you please look at the unit test 
failures? Tx.

> Improve the log flushing for tasks when mapred.userlog.limit.kb is set
> --
>
> Key: YARN-1130
> URL: https://issues.apache.org/jira/browse/YARN-1130
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.0.5-alpha
>Reporter: Paul Han
>Assignee: Paul Han
> Fix For: 2.0.5-alpha
>
> Attachments: YARN-1130.patch
>
>
> When userlog limit is set with something like this:
> {code}
> <property>
>   <name>mapred.userlog.limit.kb</name>
>   <value>2048</value>
>   <description>The maximum size of user-logs of each task in KB. 0 disables
>   the cap.</description>
> </property>
> {code}
> log entries will be truncated randomly for jobs, leaving the log size 
> between 1.2MB and 1.6MB. Since the log is already capped, avoiding this 
> truncation is crucial for users.
> The other issue with the current impl 
> (org.apache.hadoop.yarn.ContainerLogAppender) is that log entries are not 
> flushed to file until the container shuts down and the log manager closes 
> all appenders. If a user wants to see the log during task execution, that is 
> not supported.
> Will propose a patch that adds a flush mechanism and also flushes the log 
> when the task is done.
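
To make the requested behavior concrete, here is a self-contained sketch of a 
writer that flushes on every write and enforces a hard cap up front instead of 
truncating after the fact. It is illustrative only; the class is invented and 
this is not ContainerLogAppender's code:

{code}
import java.io.IOException;
import java.io.Writer;

// Sketch: cap is counted in chars (roughly bytes for ASCII logs); writes past
// the cap are dropped rather than truncating already-written lines later.
public class CappedFlushingWriter extends Writer {
  private final Writer out;
  private final long maxChars;  // e.g. mapred.userlog.limit.kb * 1024
  private long written;

  public CappedFlushingWriter(Writer out, long maxChars) {
    this.out = out;
    this.maxChars = maxChars;
  }

  @Override
  public void write(char[] buf, int off, int len) throws IOException {
    if (maxChars > 0 && written + len > maxChars) {
      len = (int) Math.max(0, maxChars - written);  // hard cap, no buffering
    }
    if (len > 0) {
      out.write(buf, off, len);
      out.flush();  // make partial logs visible while the task is running
      written += len;
    }
  }

  @Override public void flush() throws IOException { out.flush(); }
  @Override public void close() throws IOException { out.close(); }
}
{code}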



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1130) Improve the log flushing for tasks when mapred.userlog.limit.kb is set

2013-10-05 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1130:


Assignee: Paul Han

> Improve the log flushing for tasks when mapred.userlog.limit.kb is set
> --
>
> Key: YARN-1130
> URL: https://issues.apache.org/jira/browse/YARN-1130
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.0.5-alpha
>Reporter: Paul Han
>Assignee: Paul Han
> Fix For: 2.0.5-alpha
>
> Attachments: YARN-1130.patch
>
>
> When userlog limit is set with something like this:
> {code}
> <property>
>   <name>mapred.userlog.limit.kb</name>
>   <value>2048</value>
>   <description>The maximum size of user-logs of each task in KB. 0 disables
>   the cap.</description>
> </property>
> {code}
> log entries will be truncated randomly for jobs, leaving the log size 
> between 1.2MB and 1.6MB. Since the log is already capped, avoiding this 
> truncation is crucial for users.
> The other issue with the current impl 
> (org.apache.hadoop.yarn.ContainerLogAppender) is that log entries are not 
> flushed to file until the container shuts down and the log manager closes 
> all appenders. If a user wants to see the log during task execution, that is 
> not supported.
> Will propose a patch that adds a flush mechanism and also flushes the log 
> when the task is done.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1090) Job does not get into Pending State

2013-10-05 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1090:
--

Attachment: YARN-1090.3.patch

Uploaded a new patch that adds the missed renames.

> Job does not get into Pending State
> ---
>
> Key: YARN-1090
> URL: https://issues.apache.org/jira/browse/YARN-1090
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Jian He
> Attachments: YARN-1090.1.patch, YARN-1090.2.patch, YARN-1090.3.patch, 
> YARN-1090.patch
>
>
> When there is no resource available to run a job, the next job should go 
> into the pending state. The RM UI should show the next job as a pending app, 
> and the pending-app counter should be incremented.
> Currently, however, the next job stays in the ACCEPTED state, no AM is 
> assigned to it, and the pending-app count is not incremented.
> Running 'job status' shows job state=PREP.
> $ mapred job -status job_1377122233385_0002
> 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at 
> host1/ip1
> Job: job_1377122233385_0002
> Job File: /ABC/.staging/job_1377122233385_0002/job.xml
> Job Tracking URL : http://host1:port1/application_1377122233385_0002/
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: PREP
> retired: false
> reason for failure:



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1090) Job does not get into Pending State

2013-10-05 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1090:
--

Attachment: YARN-1090.2.patch

Uploaded a patch that only renames:
Num Pending Apps -> Num Non-schedulable apps
Num Active Apps -> Num Schedulable apps

Also added a missing entry for Absolute Used Capacity in the UI and fixed a 
minor comment in QueueMetrics.
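
For readers following the rename, the intended counter semantics are roughly 
the following (a sketch with invented names, not the actual QueueMetrics 
code): an app accepted by the RM but not yet admitted by its queue is 
non-schedulable; once the queue admits it, it counts as schedulable even 
before its AM container is allocated.

{code}
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative counter semantics only; names invented for this sketch.
public class AppAdmissionCounters {
  private final AtomicInteger nonSchedulableApps = new AtomicInteger();
  private final AtomicInteger schedulableApps = new AtomicInteger();

  public void onAppAccepted() { nonSchedulableApps.incrementAndGet(); }

  public void onAppAdmittedByQueue() {
    nonSchedulableApps.decrementAndGet();
    schedulableApps.incrementAndGet();
  }

  public void onAppFinished() { schedulableApps.decrementAndGet(); }
}
{code}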

> Job does not get into Pending State
> ---
>
> Key: YARN-1090
> URL: https://issues.apache.org/jira/browse/YARN-1090
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Jian He
> Attachments: YARN-1090.1.patch, YARN-1090.2.patch, YARN-1090.patch
>
>
> When there is no resource available to run a job, the next job should go 
> into the pending state. The RM UI should show the next job as a pending app, 
> and the pending-app counter should be incremented.
> Currently, however, the next job stays in the ACCEPTED state, no AM is 
> assigned to it, and the pending-app count is not incremented.
> Running 'job status' shows job state=PREP.
> $ mapred job -status job_1377122233385_0002
> 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at 
> host1/ip1
> Job: job_1377122233385_0002
> Job File: /ABC/.staging/job_1377122233385_0002/job.xml
> Job Tracking URL : http://host1:port1/application_1377122233385_0002/
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: PREP
> retired: false
> reason for failure:



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi reassigned YARN-1277:
---

Assignee: Omkar Vinit Joshi

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Omkar Vinit Joshi
> Attachments: YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1090) Job does not get into Pending State

2013-10-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787339#comment-13787339
 ] 

Arun C Murthy commented on YARN-1090:
-

@[~jianhe] - Suggestion: can you please provide a simpler patch which only 
tweaks the UI for now? We can rename variables etc. later... thanks!

> Job does not get into Pending State
> ---
>
> Key: YARN-1090
> URL: https://issues.apache.org/jira/browse/YARN-1090
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Jian He
> Attachments: YARN-1090.1.patch, YARN-1090.patch
>
>
> When there is no resource available to run a job, the next job should go 
> into the pending state. The RM UI should show the next job as a pending app, 
> and the pending-app counter should be incremented.
> Currently, however, the next job stays in the ACCEPTED state, no AM is 
> assigned to it, and the pending-app count is not incremented.
> Running 'job status' shows job state=PREP.
> $ mapred job -status job_1377122233385_0002
> 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at 
> host1/ip1
> Job: job_1377122233385_0002
> Job File: /ABC/.staging/job_1377122233385_0002/job.xml
> Job Tracking URL : http://host1:port1/application_1377122233385_0002/
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: PREP
> retired: false
> reason for failure:



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated YARN-1277:
--

Attachment: YARN-1277.patch

Here is an initial patch. [~ojoshi], can you please address one of the TODOs in 
the patch?

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
> Attachments: YARN-1277.patch
>
>
> This is the YARN part of HADOOP-10022.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Moved] (YARN-1277) Add http policy support for YARN daemons

2013-10-05 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas moved HDFS-5310 to YARN-1277:
-

Affects Version/s: (was: 2.0.0-alpha)
   2.0.0-alpha
  Key: YARN-1277  (was: HDFS-5310)
  Project: Hadoop YARN  (was: Hadoop HDFS)

> Add http policy support for YARN daemons
> 
>
> Key: YARN-1277
> URL: https://issues.apache.org/jira/browse/YARN-1277
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Suresh Srinivas
>
> This is the YARN part of HADOOP-10022.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (YARN-1276) Secure cluster can have random failure to launch a container.

2013-10-05 Thread Tassapol Athiapinya (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tassapol Athiapinya resolved YARN-1276.
---

Resolution: Duplicate

Correction: closed as duplicate of YARN-1274

> Secure cluster can have random failure to launch a container.
> -
>
> Key: YARN-1276
> URL: https://issues.apache.org/jira/browse/YARN-1276
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Tassapol Athiapinya
> Fix For: 2.1.2-beta
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (YARN-1276) Secure cluster can have random failure to launch a container.

2013-10-05 Thread Tassapol Athiapinya (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tassapol Athiapinya resolved YARN-1276.
---

Resolution: Invalid

> Secure cluster can have random failure to launch a container.
> -
>
> Key: YARN-1276
> URL: https://issues.apache.org/jira/browse/YARN-1276
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Tassapol Athiapinya
> Fix For: 2.1.2-beta
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Reopened] (YARN-1276) Secure cluster can have random failure to launch a container.

2013-10-05 Thread Tassapol Athiapinya (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tassapol Athiapinya reopened YARN-1276:
---


> Secure cluster can have random failure to launch a container.
> -
>
> Key: YARN-1276
> URL: https://issues.apache.org/jira/browse/YARN-1276
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Tassapol Athiapinya
> Fix For: 2.1.2-beta
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1090) Job does not get into Pending State

2013-10-05 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1090:
--

Attachment: YARN-1090.1.patch

Uploaded a new patch that reorganizes the scheduler UI page.

> Job does not get into Pending State
> ---
>
> Key: YARN-1090
> URL: https://issues.apache.org/jira/browse/YARN-1090
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Jian He
> Attachments: YARN-1090.1.patch, YARN-1090.patch
>
>
> When there is no resource available to run a job, the next job should go 
> into the pending state. The RM UI should show the next job as a pending app, 
> and the pending-app counter should be incremented.
> Currently, however, the next job stays in the ACCEPTED state, no AM is 
> assigned to it, and the pending-app count is not incremented.
> Running 'job status' shows job state=PREP.
> $ mapred job -status job_1377122233385_0002
> 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at 
> host1/ip1
> Job: job_1377122233385_0002
> Job File: /ABC/.staging/job_1377122233385_0002/job.xml
> Job Tracking URL : http://host1:port1/application_1377122233385_0002/
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: PREP
> retired: false
> reason for failure:



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1274) LCE fails to run containers that don't have resources to localize

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787295#comment-13787295
 ] 

Hadoop QA commented on YARN-1274:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12607008/YARN-1274.trunk.1.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2124//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2124//console

This message is automatically generated.

> LCE fails to run containers that don't have resources to localize
> -
>
> Key: YARN-1274
> URL: https://issues.apache.org/jira/browse/YARN-1274
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.1.1-beta
>Reporter: Alejandro Abdelnur
>Assignee: Siddharth Seth
>Priority: Blocker
> Attachments: YARN-1274.1.txt, YARN-1274.trunk.1.txt
>
>
> LCE container launch assumes the usercache/USER directory exists and is 
> owned by the user running the container process.
> But the directory is created only by the LCE localization command when there 
> are resources to localize. If there are no resources to localize, LCE 
> localization never runs, launching fails with exit code 255, and the NM logs 
> have something like:
> {code}
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : command 
> provided 1
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : user is 
> llama
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Can't create 
> directory llama in 
> /yarn/nm/usercache/llama/appcache/application_1380853306301_0004/container_1380853306301_0004_01_04
>  - Permission denied
> {code}
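
The shape of the fix under discussion is essentially an existence guard before 
launch. A minimal sketch under the assumption that the launcher may create the 
directory itself (paths, names, and the guard itself are illustrative, not 
necessarily how the committed patch resolves it):

{code}
import java.io.File;
import java.io.IOException;

// Sketch: ensure <local-dir>/usercache/<user> exists before the container
// launch script runs, since LCE localization may never have created it.
public final class UserCacheDirs {
  private UserCacheDirs() {}

  public static File ensureUserCacheDir(File localDir, String user)
      throws IOException {
    File userCache = new File(new File(localDir, "usercache"), user);
    if (!userCache.mkdirs() && !userCache.isDirectory()) {
      throw new IOException("Cannot create " + userCache);
    }
    return userCache;
  }
}
{code}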



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1274) LCE fails to run containers that don't have resources to localize

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787293#comment-13787293
 ] 

Hadoop QA commented on YARN-1274:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12607008/YARN-1274.trunk.1.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2123//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2123//console

This message is automatically generated.

> LCE fails to run containers that don't have resources to localize
> -
>
> Key: YARN-1274
> URL: https://issues.apache.org/jira/browse/YARN-1274
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.1.1-beta
>Reporter: Alejandro Abdelnur
>Assignee: Siddharth Seth
>Priority: Blocker
> Attachments: YARN-1274.1.txt, YARN-1274.trunk.1.txt
>
>
> LCE container launch assumes the usercache/USER directory exists and is 
> owned by the user running the container process.
> But the directory is created only by the LCE localization command when there 
> are resources to localize. If there are no resources to localize, LCE 
> localization never runs, launching fails with exit code 255, and the NM logs 
> have something like:
> {code}
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : command 
> provided 1
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : user is 
> llama
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Can't create 
> directory llama in 
> /yarn/nm/usercache/llama/appcache/application_1380853306301_0004/container_1380853306301_0004_01_04
>  - Permission denied
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1274) LCE fails to run containers that don't have resources to localize

2013-10-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-1274:
-

Attachment: YARN-1274.trunk.1.txt

Patch for trunk and branch-2. The previous patch applies to branch-2.1.

> LCE fails to run containers that don't have resources to localize
> -
>
> Key: YARN-1274
> URL: https://issues.apache.org/jira/browse/YARN-1274
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.1.1-beta
>Reporter: Alejandro Abdelnur
>Assignee: Siddharth Seth
>Priority: Blocker
> Attachments: YARN-1274.1.txt, YARN-1274.trunk.1.txt
>
>
> LCE container launch assumes the usercache/USER directory exists and is 
> owned by the user running the container process.
> But the directory is created only by the LCE localization command when there 
> are resources to localize. If there are no resources to localize, LCE 
> localization never runs, launching fails with exit code 255, and the NM logs 
> have something like:
> {code}
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : command 
> provided 1
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : user is 
> llama
> 2013-10-04 14:07:56,425 INFO 
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Can't create 
> directory llama in 
> /yarn/nm/usercache/llama/appcache/application_1380853306301_0004/container_1380853306301_0004_01_04
>  - Permission denied
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1254) NM is polluting container's credentials

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787231#comment-13787231
 ] 

Hudson commented on YARN-1254:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1569 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1569/])
YARN-1254. Fixed NodeManager to not pollute container's credentials. 
Contributed by Omkar Vinit Joshi. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529382)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestContainerLocalizer.java


> NM is polluting container's credentials
> ---
>
> Key: YARN-1254
> URL: https://issues.apache.org/jira/browse/YARN-1254
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Omkar Vinit Joshi
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1254.20131004.1.patch, YARN-1254.20131004.2.patch, 
> YARN-1254.20131030.1.patch
>
>
> Before launching the container, the NM is using the same credentials object 
> and so is polluting what the container should see. We should fix this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1273) Distributed shell does not account for start container failures reported asynchronously.

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787234#comment-13787234
 ] 

Hudson commented on YARN-1273:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1569 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1569/])
YARN-1273. Fixed Distributed-shell to account for containers that failed to 
start. Contributed by Hitesh Shah. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529389)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/ContainerLaunchFailAppMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java


> Distributed shell does not account for start container failures reported 
> asynchronously.
> 
>
> Key: YARN-1273
> URL: https://issues.apache.org/jira/browse/YARN-1273
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1273.1_1.patch, YARN-1273.2.patch, YARN-1273.3.patch
>
>
> 2013-10-04 22:09:15,234 ERROR 
> [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #1] 
> distributedshell.ApplicationMaster 
> (ApplicationMaster.java:onStartContainerError(719)) - Failed to start 
> Container container_1380920347574_0018_01_06
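
The callback in question is part of the real NMClientAsync.CallbackHandler 
interface. A minimal sketch (not the attached patch) of accounting for these 
asynchronous start failures so the AM can react instead of waiting on the 
containers forever:

{code}
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.client.api.async.NMClientAsync;

// Sketch: count start failures reported asynchronously; the AM can then
// re-request or fail those containers in its allocation loop.
public class CountingCallbackHandler implements NMClientAsync.CallbackHandler {
  private final AtomicInteger failedToStart = new AtomicInteger();

  @Override
  public void onStartContainerError(ContainerId containerId, Throwable t) {
    // The asynchronous failure path the issue says was unaccounted for.
    failedToStart.incrementAndGet();
  }

  public int failedToStartCount() { return failedToStart.get(); }

  // Remaining callbacks are no-ops in this sketch.
  @Override public void onContainerStarted(ContainerId id,
      Map<String, ByteBuffer> allServiceResponse) {}
  @Override public void onContainerStatusReceived(ContainerId id,
      ContainerStatus status) {}
  @Override public void onContainerStopped(ContainerId id) {}
  @Override public void onGetContainerStatusError(ContainerId id, Throwable t) {}
  @Override public void onStopContainerError(ContainerId id, Throwable t) {}
}
{code}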



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787233#comment-13787233
 ] 

Hudson commented on YARN-1167:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1569 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1569/])
YARN-1167. Fixed Distributed Shell to not incorrectly show empty hostname on RM 
UI. Contributed by Xuan Gong. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529376)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/RegisterApplicationMasterRequest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.java


> Submitted distributed shell application shows appMasterHost = empty
> ---
>
> Key: YARN-1167
> URL: https://issues.apache.org/jira/browse/YARN-1167
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Tassapol Athiapinya
>Assignee: Xuan Gong
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, 
> YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch, 
> YARN-1167.8.patch, YARN-1167.9.patch
>
>
> Submit a distributed shell application. Once the application reaches the 
> RUNNING state, the app master host should not be empty. In reality, it is 
> empty.
> ==console logs==
> distributedshell.Client: Got application report from ASM for, appId=12, 
> clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, 
> appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, 
> distributedFinalState=UNDEFINED, 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1251) TestDistributedShell#TestDSShell failed with timeout

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787226#comment-13787226
 ] 

Hudson commented on YARN-1251:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1569 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1569/])
YARN-1251. TestDistributedShell#TestDSShell failed with timeout. Contributed by 
Xuan Gong. (hitesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529369)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java


> TestDistributedShell#TestDSShell failed with timeout
> 
>
> Key: YARN-1251
> URL: https://issues.apache.org/jira/browse/YARN-1251
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Junping Du
>Assignee: Xuan Gong
> Fix For: 2.1.2-beta
>
> Attachments: error.log, YARN-1225-kickOffTestDS.patch, 
> YARN-1251.1.patch
>
>
> TestDistributedShell#TestDSShell has been failing consistently on trunk 
> Jenkins recently.
> The Stacktrace is:
> {code}
> java.lang.Exception: test timed out after 9 milliseconds
>   at 
> com.google.protobuf.LiteralByteString.<init>(LiteralByteString.java:234)
>   at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255)
>   at 
> org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286)
>   at 
> org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462)
>   at 
> com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302)
>   at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1377)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1357)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at $Proxy70.getApplicationReport(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at $Proxy71.getApplicationReport(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125)
> {code}
> For details, please refer:
> https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1232) Configuration to support multiple RMs

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787229#comment-13787229
 ] 

Hudson commented on YARN-1232:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1569 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1569/])
YARN-1232. Configuration to support multiple RMs (Karthik Kambatla via bikas) 
(bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529251)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestHAUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ServerRMProxy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java


> Configuration to support multiple RMs
> -
>
> Key: YARN-1232
> URL: https://issues.apache.org/jira/browse/YARN-1232
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>  Labels: ha
> Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, 
> yarn-1232-4.patch, yarn-1232-5.patch, yarn-1232-6.patch, yarn-1232-7.patch, 
> yarn-1232-7.patch
>
>
> We should augment the configuration to allow users to specify two RMs and 
> the individual RPC addresses for each.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787230#comment-13787230
 ] 

Hudson commented on YARN-1253:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1569 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1569/])
YARN-1253. Changes to LinuxContainerExecutor to run containers as a single 
dedicated user in non-secure mode. (rvs via tucu) (tucu: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529325)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/ClusterSetup.apt.vm
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java


> Changes to LinuxContainerExecutor to run containers as a single dedicated 
> user in non-secure mode
> -
>
> Key: YARN-1253
> URL: https://issues.apache.org/jira/browse/YARN-1253
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Roman Shaposhnik
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: YARN-1253.patch.txt
>
>
> When using cgroups we require LCE to be configured in the cluster to start 
> containers. 
> LCE starts containers as the user that submitted the job. While this works 
> correctly in a secure setup, in a non-secure setup this presents a couple of 
> issues:
> * LCE requires all Hadoop users submitting jobs to be Unix users on all nodes
> * Because users can impersonate other users, any user would have access to 
> any local file of other users
> The second issue in particular is undesirable, as a user could get access to 
> the ssh keys of other users on the nodes or, if there are NFS mounts, to 
> other users' data outside the cluster.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1251) TestDistributedShell#TestDSShell failed with timeout

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787210#comment-13787210
 ] 

Hudson commented on YARN-1251:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1543 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1543/])
YARN-1251. TestDistributedShell#TestDSShell failed with timeout. Contributed by 
Xuan Gong. (hitesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529369)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java


> TestDistributedShell#TestDSShell failed with timeout
> 
>
> Key: YARN-1251
> URL: https://issues.apache.org/jira/browse/YARN-1251
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Junping Du
>Assignee: Xuan Gong
> Fix For: 2.1.2-beta
>
> Attachments: error.log, YARN-1225-kickOffTestDS.patch, 
> YARN-1251.1.patch
>
>
> TestDistributedShell#TestDSShell has been failing consistently on trunk 
> Jenkins recently.
> The Stacktrace is:
> {code}
> java.lang.Exception: test timed out after 9 milliseconds
>   at 
> com.google.protobuf.LiteralByteString.<init>(LiteralByteString.java:234)
>   at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255)
>   at 
> org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286)
>   at 
> org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462)
>   at 
> com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302)
>   at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1377)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1357)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at $Proxy70.getApplicationReport(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at $Proxy71.getApplicationReport(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125)
> {code}
> For details, please refer:
> https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1254) NM is polluting container's credentials

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787215#comment-13787215
 ] 

Hudson commented on YARN-1254:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1543 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1543/])
YARN-1254. Fixed NodeManager to not pollute container's credentials. 
Contributed by Omkar Vinit Joshi. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529382)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestContainerLocalizer.java


> NM is polluting container's credentials
> ---
>
> Key: YARN-1254
> URL: https://issues.apache.org/jira/browse/YARN-1254
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Omkar Vinit Joshi
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1254.20131004.1.patch, YARN-1254.20131004.2.patch, 
> YARN-1254.20131030.1.patch
>
>
> Before launching the container, the NM is using the same credentials object 
> and so is polluting what the container should see. We should fix this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1273) Distributed shell does not account for start container failures reported asynchronously.

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787218#comment-13787218
 ] 

Hudson commented on YARN-1273:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1543 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1543/])
YARN-1273. Fixed Distributed-shell to account for containers that failed to 
start. Contributed by Hitesh Shah. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529389)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/ContainerLaunchFailAppMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java


> Distributed shell does not account for start container failures reported 
> asynchronously.
> 
>
> Key: YARN-1273
> URL: https://issues.apache.org/jira/browse/YARN-1273
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1273.1_1.patch, YARN-1273.2.patch, YARN-1273.3.patch
>
>
> 2013-10-04 22:09:15,234 ERROR 
> [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #1] 
> distributedshell.ApplicationMaster 
> (ApplicationMaster.java:onStartContainerError(719)) - Failed to start 
> Container container_1380920347574_0018_01_06



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1232) Configuration to support multiple RMs

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787213#comment-13787213
 ] 

Hudson commented on YARN-1232:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1543 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1543/])
YARN-1232. Configuration to support multiple RMs (Karthik Kambatla via bikas) 
(bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529251)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestHAUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ServerRMProxy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java


> Configuration to support multiple RMs
> -------------------------------------
>
> Key: YARN-1232
> URL: https://issues.apache.org/jira/browse/YARN-1232
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>  Labels: ha
> Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, 
> yarn-1232-4.patch, yarn-1232-5.patch, yarn-1232-6.patch, yarn-1232-7.patch, 
> yarn-1232-7.patch
>
>
> We should augment the configuration to allow users to specify two RMs and
> the individual RPC addresses for each.
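> 
> A minimal sketch of what that could look like programmatically, assuming the
> property names from the released YARN ResourceManager HA documentation
> (yarn.resourcemanager.ha.*); the keys in this patch revision may differ, and
> the host names are placeholders.
> {code}
> // Property names follow the released RM HA docs; hosts are placeholders.
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.yarn.conf.YarnConfiguration;
> 
> public class TwoRmConfSketch {
>   public static Configuration build() {
>     Configuration conf = new YarnConfiguration();
>     conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
>     // Logical ids for the two RMs; per-RM addresses carry the id as a suffix.
>     conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
>     conf.set("yarn.resourcemanager.address.rm1", "rm1.example.com:8032");
>     conf.set("yarn.resourcemanager.address.rm2", "rm2.example.com:8032");
>     return conf;
>   }
> }
> {code}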



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787217#comment-13787217
 ] 

Hudson commented on YARN-1167:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1543 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1543/])
YARN-1167. Fixed Distributed Shell to not incorrectly show empty hostname on RM 
UI. Contributed by Xuan Gong. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529376)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/RegisterApplicationMasterRequest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.java


> Submitted distributed shell application shows appMasterHost = empty
> -------------------------------------------------------------------
>
> Key: YARN-1167
> URL: https://issues.apache.org/jira/browse/YARN-1167
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Tassapol Athiapinya
>Assignee: Xuan Gong
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, 
> YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch, 
> YARN-1167.8.patch, YARN-1167.9.patch
>
>
> Submit a distributed shell application. Once the application reaches the
> RUNNING state, the app master host should not be empty. In reality, it is empty.
> ==console logs==
> distributedshell.Client: Got application report from ASM for, appId=12, 
> clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, 
> appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, 
> distributedFinalState=UNDEFINED, 
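> 
> The usual AM-side remedy, sketched under stated assumptions (the hostname
> lookup, port 0, and empty tracking URL are illustrative, not the committed
> fix): register with the real host so the report carries it.
> {code}
> // Illustrative; not the committed YARN-1167 fix.
> import java.net.InetAddress;
> 
> import org.apache.hadoop.yarn.client.api.AMRMClient;
> import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
> import org.apache.hadoop.yarn.conf.YarnConfiguration;
> 
> public class RegisterWithHostSketch {
>   public static void main(String[] args) throws Exception {
>     AMRMClient<ContainerRequest> amrm = AMRMClient.createAMRMClient();
>     amrm.init(new YarnConfiguration());
>     amrm.start();
>     // Registering a real host (instead of "") is what lets the application
>     // report carry a non-empty appMasterHost.
>     String host = InetAddress.getLocalHost().getHostName();
>     amrm.registerApplicationMaster(host, 0, "");
>     // ... run the application, then unregister and stop the client ...
>   }
> }
> {code}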



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787214#comment-13787214
 ] 

Hudson commented on YARN-1253:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1543 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1543/])
YARN-1253. Changes to LinuxContainerExecutor to run containers as a single 
dedicated user in non-secure mode. (rvs via tucu) (tucu: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529325)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/ClusterSetup.apt.vm
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java


> Changes to LinuxContainerExecutor to run containers as a single dedicated 
> user in non-secure mode
> ---------------------------------------------------------------------------
>
> Key: YARN-1253
> URL: https://issues.apache.org/jira/browse/YARN-1253
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Roman Shaposhnik
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: YARN-1253.patch.txt
>
>
> When using cgroups, we require LCE to be configured in the cluster to start
> containers. LCE starts containers as the user that submitted the job. While
> this works correctly in a secure setup, in a non-secure setup it presents a
> couple of issues:
> * LCE requires all Hadoop users submitting jobs to be Unix users on all nodes
> * Because users can impersonate other users, any user would have access to
> any local file of other users
> The second issue in particular is undesirable, as a user could get access to
> the ssh keys of other users on the nodes or, if there are NFS mounts, reach
> other users' data outside of the cluster.
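> 
> A sketch of the NodeManager-side settings this feature introduces, using the
> property names that appear in the released yarn-default.xml; "nobody" is the
> documented default value, shown here only as an example.
> {code}
> // Sketch only; property names as in the released yarn-default.xml.
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.yarn.conf.YarnConfiguration;
> 
> public class NonSecureLceConfSketch {
>   public static Configuration build() {
>     Configuration conf = new YarnConfiguration();
>     conf.set("yarn.nodemanager.container-executor.class",
>         "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor");
>     // In non-secure mode, run every container as one dedicated local user
>     // rather than as the (unauthenticated) submitting user.
>     conf.set(
>         "yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user",
>         "nobody");
>     return conf;
>   }
> }
> {code}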



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1225) FinishApplicationMasterRequest should also have a final IPC/RPC address.

2013-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787182#comment-13787182
 ] 

Hadoop QA commented on YARN-1225:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12606985/YARN-1225-v4.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2122//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2122//console

This message is automatically generated.

> FinishApplicationMasterRequest should also have a final IPC/RPC address.
> ------------------------------------------------------------------------
>
> Key: YARN-1225
> URL: https://issues.apache.org/jira/browse/YARN-1225
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Junping Du
> Attachments: YARN-1225-kickOffTestDS.patch, YARN-1225-v1.patch, 
> YARN-1225-v2.patch, YARN-1225-v3.patch, YARN-1225-v4.1.patch, 
> YARN-1225-v4.patch
>
>
> AMs can already report a final HTTP URL via FinishApplicationMasterRequest,
> but there is no field to report an IPC/RPC address.
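> 
> For context, a sketch of what the request can carry today; the final IPC/RPC
> address this issue proposes is called out in the comment as hypothetical.
> {code}
> import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
> import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
> 
> public class FinishRequestSketch {
>   static FinishApplicationMasterRequest build() {
>     // The existing factory takes a final status, diagnostics, and an HTTP
>     // tracking URL. A final IPC/RPC address would be an additional field
>     // here (hypothetical until this JIRA lands).
>     return FinishApplicationMasterRequest.newInstance(
>         FinalApplicationStatus.SUCCEEDED,
>         "all containers completed",
>         "http://history.example.com/app");
>   }
> }
> {code}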



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1254) NM is polluting container's credentials

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787171#comment-13787171
 ] 

Hudson commented on YARN-1254:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #353 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/353/])
YARN-1254. Fixed NodeManager to not pollute container's credentials. 
Contributed by Omkar Vinit Joshi. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529382)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestContainerLocalizer.java


> NM is polluting container's credentials
> ---------------------------------------
>
> Key: YARN-1254
> URL: https://issues.apache.org/jira/browse/YARN-1254
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Omkar Vinit Joshi
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1254.20131004.1.patch, YARN-1254.20131004.2.patch, 
> YARN-1254.20131030.1.patch
>
>
> Before launching the container, the NM uses the same credentials object and
> so pollutes what the container should see. We should fix this.
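> 
> The general shape of such a fix, sketched with illustrative names (not the
> committed patch): hand the NM a defensive copy so NM-side tokens never leak
> into the credentials the container sees.
> {code}
> // Names are illustrative; this is not the committed patch.
> import org.apache.hadoop.security.Credentials;
> 
> public class CredentialIsolationSketch {
>   static Credentials copyForNmUse(Credentials containerCreds) {
>     // Copy first, then mutate only the copy; the submitter's credentials
>     // object stays exactly as it was handed over to the container.
>     Credentials nmCopy = new Credentials(containerCreds);
>     // nmCopy.addToken(...) would add NM-internal (e.g. localizer) tokens.
>     return nmCopy;
>   }
> }
> {code}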



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1273) Distributed shell does not account for start container failures reported asynchronously.

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787174#comment-13787174
 ] 

Hudson commented on YARN-1273:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #353 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/353/])
YARN-1273. Fixed Distributed-shell to account for containers that failed to 
start. Contributed by Hitesh Shah. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529389)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/ContainerLaunchFailAppMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java


> Distributed shell does not account for start container failures reported 
> asynchronously.
> -------------------------------------------------------------------------
>
> Key: YARN-1273
> URL: https://issues.apache.org/jira/browse/YARN-1273
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1273.1_1.patch, YARN-1273.2.patch, YARN-1273.3.patch
>
>
> 2013-10-04 22:09:15,234 ERROR 
> [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #1] 
> distributedshell.ApplicationMaster 
> (ApplicationMaster.java:onStartContainerError(719)) - Failed to start 
> Container container_1380920347574_0018_01_06



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787173#comment-13787173
 ] 

Hudson commented on YARN-1167:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #353 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/353/])
YARN-1167. Fixed Distributed Shell to not incorrectly show empty hostname on RM 
UI. Contributed by Xuan Gong. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529376)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/RegisterApplicationMasterRequest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.java


> Submitted distributed shell application shows appMasterHost = empty
> -------------------------------------------------------------------
>
> Key: YARN-1167
> URL: https://issues.apache.org/jira/browse/YARN-1167
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Tassapol Athiapinya
>Assignee: Xuan Gong
> Fix For: 2.1.2-beta
>
> Attachments: YARN-1167.1.patch, YARN-1167.2.patch, YARN-1167.3.patch, 
> YARN-1167.4.patch, YARN-1167.5.patch, YARN-1167.6.patch, YARN-1167.7.patch, 
> YARN-1167.8.patch, YARN-1167.9.patch
>
>
> Submit distributed shell application. Once the application turns to be 
> RUNNING state, app master host should not be empty. In reality, it is empty.
> ==console logs==
> distributedshell.Client: Got application report from ASM for, appId=12, 
> clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, 
> appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, 
> distributedFinalState=UNDEFINED, 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1251) TestDistributedShell#TestDSShell failed with timeout

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787166#comment-13787166
 ] 

Hudson commented on YARN-1251:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #353 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/353/])
YARN-1251. TestDistributedShell#TestDSShell failed with timeout. Contributed by 
Xuan Gong. (hitesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529369)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java


> TestDistributedShell#TestDSShell failed with timeout
> ----------------------------------------------------
>
> Key: YARN-1251
> URL: https://issues.apache.org/jira/browse/YARN-1251
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Reporter: Junping Du
>Assignee: Xuan Gong
> Fix For: 2.1.2-beta
>
> Attachments: error.log, YARN-1225-kickOffTestDS.patch, 
> YARN-1251.1.patch
>
>
> TestDistributedShell#TestDSShell has been failing consistently on trunk
> Jenkins recently.
> The stack trace is:
> {code}
> java.lang.Exception: test timed out after 9 milliseconds
>   at 
> com.google.protobuf.LiteralByteString.<init>(LiteralByteString.java:234)
>   at com.google.protobuf.ByteString.copyFromUtf8(ByteString.java:255)
>   at 
> org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getMethodNameBytes(ProtobufRpcEngineProtos.java:286)
>   at 
> org.apache.hadoop.ipc.protobuf.ProtobufRpcEngineProtos$RequestHeaderProto.getSerializedSize(ProtobufRpcEngineProtos.java:462)
>   at 
> com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:84)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$RpcMessageWithHeader.write(ProtobufRpcEngine.java:302)
>   at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:989)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1377)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1357)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at $Proxy70.getApplicationReport(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:185)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at $Proxy71.getApplicationReport(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:195)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.monitorApplication(Client.java:622)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:597)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:125)
> {code}
> For details, see:
> https://builds.apache.org/job/PreCommit-YARN-Build/2039//testReport/
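> 
> The exception above is JUnit's own test-level timeout. A minimal sketch of
> how such a bound is declared; the 90-second figure is an assumption, since
> the digest truncates the number.
> {code}
> import org.junit.Test;
> 
> public class TimeoutSketch {
>   // JUnit runs the body in a watched thread and raises the java.lang.Exception
>   // seen in the stack trace above once the bound is exceeded.
>   @Test(timeout = 90000)  // the exact figure here is an assumption
>   public void testDSShell() throws Exception {
>     // submit the app, then poll YarnClient#getApplicationReport until done
>   }
> }
> {code}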



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1232) Configuration to support multiple RMs

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787169#comment-13787169
 ] 

Hudson commented on YARN-1232:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #353 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/353/])
YARN-1232. Configuration to support multiple RMs (Karthik Kambatla via bikas) 
(bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529251)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestHAUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ServerRMProxy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMHAProtocolService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java


> Configuration to support multiple RMs
> -------------------------------------
>
> Key: YARN-1232
> URL: https://issues.apache.org/jira/browse/YARN-1232
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>  Labels: ha
> Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, 
> yarn-1232-4.patch, yarn-1232-5.patch, yarn-1232-6.patch, yarn-1232-7.patch, 
> yarn-1232-7.patch
>
>
> We should augment the configuration to allow users to specify two RMs and
> the individual RPC addresses for each.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode

2013-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787170#comment-13787170
 ] 

Hudson commented on YARN-1253:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #353 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/353/])
YARN-1253. Changes to LinuxContainerExecutor to run containers as a single 
dedicated user in non-secure mode. (rvs via tucu) (tucu: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1529325)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/ClusterSetup.apt.vm
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java


> Changes to LinuxContainerExecutor to run containers as a single dedicated 
> user in non-secure mode
> ---------------------------------------------------------------------------
>
> Key: YARN-1253
> URL: https://issues.apache.org/jira/browse/YARN-1253
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Roman Shaposhnik
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: YARN-1253.patch.txt
>
>
> When using cgroups, we require LCE to be configured in the cluster to start
> containers. LCE starts containers as the user that submitted the job. While
> this works correctly in a secure setup, in a non-secure setup it presents a
> couple of issues:
> * LCE requires all Hadoop users submitting jobs to be Unix users on all nodes
> * Because users can impersonate other users, any user would have access to
> any local file of other users
> The second issue in particular is undesirable, as a user could get access to
> the ssh keys of other users on the nodes or, if there are NFS mounts, reach
> other users' data outside of the cluster.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1225) FinishApplicationMasterRequest should also have a final IPC/RPC address.

2013-10-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-1225:
-

Attachment: YARN-1225-v4.1.patch

Uploading YARN-1225-v4.1.patch.
The test failure above is caused by the v4 change to catch specific exceptions
instead of a blanket Exception in RMCommunicator (which also addresses an
existing Findbugs warning). Without catching the cast exception,
TestLocalContainerAllocator fails because the faked job class cannot be cast
to JobImpl when LocalContainerAllocator is stopped (which calls
RMCommunicator.unregister()).
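As a sketch of the change described above, with stand-in types (illustrative,
not the patch text): guard or specifically catch the downcast so a faked job
in tests cannot fail the shutdown path.
{code}
// Illustrative stand-ins; RMCommunicator internals are paraphrased, not quoted.
public class UnregisterSketch {
  interface Job { }
  static class JobImpl implements Job { }

  static void unregister(Job job) {
    // Guarding the downcast (equivalently, catching ClassCastException
    // specifically rather than a blanket Exception) keeps a mocked Job
    // from failing the shutdown path in tests.
    if (job instanceof JobImpl) {
      JobImpl impl = (JobImpl) job;
      // ... JobImpl-specific unregistration work using impl ...
    } else {
      System.err.println("Job is not a JobImpl; skipping JobImpl-only steps");
    }
  }
}
{code}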

> FinishApplicationMasterRequest should also have a final IPC/RPC address.
> ------------------------------------------------------------------------
>
> Key: YARN-1225
> URL: https://issues.apache.org/jira/browse/YARN-1225
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Junping Du
> Attachments: YARN-1225-kickOffTestDS.patch, YARN-1225-v1.patch, 
> YARN-1225-v2.patch, YARN-1225-v3.patch, YARN-1225-v4.1.patch, 
> YARN-1225-v4.patch
>
>
> AMs can already report a final HTTP URL via FinishApplicationMasterRequest,
> but there is no field to report an IPC/RPC address.



--
This message was sent by Atlassian JIRA
(v6.1#6144)