[jira] [Created] (YARN-5309) SSLFactory truststore reloader thread leak in TimelineClientImpl

2016-07-01 Thread Thomas Friedrich (JIRA)
Thomas Friedrich created YARN-5309:
--

 Summary: SSLFactory truststore reloader thread leak in 
TimelineClientImpl
 Key: YARN-5309
 URL: https://issues.apache.org/jira/browse/YARN-5309
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver, yarn
Affects Versions: 2.7.1
Reporter: Thomas Friedrich


We found an issue in TimelineClientImpl similar to HADOOP-11368. The class 
creates an instance of SSLFactory in newSslConnConfigurator, which in turn 
creates a ReloadingX509TrustManager instance that starts a truststore reloader 
thread. However, the SSLFactory is never destroyed, so the truststore reloader 
threads are never stopped.

This problem was observed by a customer who had SSL enabled in Hadoop and 
submitted many queries against HiveServer2. After a few days, the HS2 instance 
crashed, and in the Java thread dump we could see many (over 13000) threads 
like this:
"Truststore reloader thread" #126 daemon prio=5 os_prio=0 
tid=0x7f680d2e3000 nid=0x98fd waiting on 
condition [0x7f67e482c000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run
(ReloadingX509TrustManager.java:225)
at java.lang.Thread.run(Thread.java:745)

HiveServer2 uses the JobClient to submit a job:
Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at line 89 in ReloadingX509TrustManager))
	owns: Object  (id=464)
	owns: Object  (id=465)
	owns: Object  (id=466)
	owns: ServiceLoader  (id=210)
	ReloadingX509TrustManager.<init>(String, String, String, long) line: 89
	FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209
	SSLFactory.init() line: 131
	TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532
	TimelineClientImpl.newConnConfigurator(Configuration) line: 507
	TimelineClientImpl.serviceInit(Configuration) line: 269
	TimelineClientImpl(AbstractService).init(Configuration) line: 163
	YarnClientImpl.serviceInit(Configuration) line: 169
	YarnClientImpl(AbstractService).init(Configuration) line: 163
	ResourceMgrDelegate.serviceInit(Configuration) line: 102
	ResourceMgrDelegate(AbstractService).init(Configuration) line: 163
	ResourceMgrDelegate.<init>(YarnConfiguration) line: 96
	YARNRunner.<init>(Configuration) line: 112
	YarnClientProtocolProvider.create(Configuration) line: 34
	Cluster.initialize(InetSocketAddress, Configuration) line: 95
	Cluster.<init>(InetSocketAddress, Configuration) line: 82
	Cluster.<init>(Configuration) line: 75
	JobClient.init(JobConf) line: 475
	JobClient.<init>(JobConf) line: 454
	MapRedTask(ExecDriver).execute(DriverContext) line: 401
	MapRedTask.execute(DriverContext) line: 137
	MapRedTask(Task).executeTask() line: 160
	TaskRunner.runSequential() line: 88
	Driver.launchTask(Task, String, boolean, String, int, DriverContext) line: 1653
	Driver.execute() line: 1412

For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl is 
created. But because the HS2 process stays up for days, the old truststore 
reloader threads keep accumulating in the HS2 process and eventually exhaust 
the available resources.

It seems a fix similar to HADOOP-11368 is needed in TimelineClientImpl, but the 
class doesn't have a destroy method to begin with.
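
For illustration, a minimal sketch of the kind of cleanup HADOOP-11368 applied, 
adapted to a client-side service (this is only a sketch, not the actual patch; 
it assumes the service keeps a handle to the SSLFactory it creates):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.ssl.SSLFactory;
import org.apache.hadoop.service.AbstractService;

// Sketch only: a service that owns the SSLFactory it creates and destroys it on
// stop, so the "Truststore reloader thread" started by ReloadingX509TrustManager
// does not outlive the client.
public class SslOwningClientService extends AbstractService {

  private SSLFactory sslFactory;   // hypothetical field kept for cleanup

  public SslOwningClientService() {
    super(SslOwningClientService.class.getName());
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    sslFactory = new SSLFactory(SSLFactory.Mode.CLIENT, conf);
    sslFactory.init();             // starts the truststore reloader thread
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStop() throws Exception {
    if (sslFactory != null) {
      sslFactory.destroy();        // stops the reloader thread
      sslFactory = null;
    }
    super.serviceStop();
  }
}
{code}

In TimelineClientImpl the factory is created inside newSslConnConfigurator, so 
the class would presumably need to retain that reference and destroy it in 
serviceStop() in a similar way.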

One option to avoid this problem is to disable the YARN timeline service 
(yarn.timeline-service.enabled=false).






[jira] [Commented] (YARN-3662) Federation Membership State APIs

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359936#comment-15359936
 ] 

Hadoop QA commented on YARN-3662:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
23s {color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 2s 
{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
36s {color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s 
{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
33s {color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s 
{color} | {color:green} YARN-2915 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 57s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 10 
new + 26 unchanged - 2 fixed = 36 total (was 28) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 9s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s 
{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 45s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e2f6409 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815848/YARN-3662-YARN-2915-v2.patch
 |
| JIRA Issue | YARN-3662 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  cc  |
| uname | Linux 2dd04b7de53b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-2915 / 41b006e |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12173/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
 |
|  Test Results | 

[jira] [Updated] (YARN-5307) Federation Application State APIs

2016-07-01 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-5307:
-
Attachment: YARN-5307-YARN-2915-v1.patch

Uploading v1 of the *FederationApplicationState* APIs.

The patch is available for review but is not ready for CI, as it depends on 
YARN-3662. I'll add the tests for the app state protocol records once YARN-3662 
is committed.

> Federation Application State APIs
> -
>
> Key: YARN-5307
> URL: https://issues.apache.org/jira/browse/YARN-5307
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-5307-YARN-2915-v1.patch
>
>
> The Federation Application State encapsulates the mapping between an 
> application and its _home_ sub-cluster, i.e. the sub-cluster to which it is 
> submitted by the Router. Please refer to the design doc in the parent JIRA for 
> further details.






[jira] [Updated] (YARN-3662) Federation Membership State APIs

2016-07-01 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3662:
-
Attachment: YARN-3662-YARN-2915-v2.patch

Uploading v2 patch that contains only the *FederationMembershipState* APIs as 
mentioned above.

> Federation Membership State APIs
> 
>
> Key: YARN-3662
> URL: https://issues.apache.org/jira/browse/YARN-3662
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-3662-YARN-2915-v1.1.patch, 
> YARN-3662-YARN-2915-v1.patch, YARN-3662-YARN-2915-v2.patch
>
>
> The Federation Application State encapsulates the information about the 
> active RM of each sub-cluster that is participating in Federation. The 
> information includes addresses for ClientRM, ApplicationMaster and Admin 
> services along with the sub-cluster _capability_, which is currently defined 
> by *ClusterMetricsInfo*. Please refer to the design doc in parent JIRA for 
> further details.






[jira] [Created] (YARN-5308) FairScheduler: Move continuous scheduling related threads to TestContinuousScheduling

2016-07-01 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-5308:
--

 Summary: FairScheduler: Move continuous scheduling related threads 
to TestContinuousScheduling
 Key: YARN-5308
 URL: https://issues.apache.org/jira/browse/YARN-5308
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler, test
Affects Versions: 2.8.0
Reporter: Karthik Kambatla


TestFairScheduler still has some tests on continuous scheduling. We should move 
them to TestContinuousScheduling.






[jira] [Commented] (YARN-3662) Federation Membership State APIs

2016-07-01 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359895#comment-15359895
 ] 

Subru Krishnan commented on YARN-3662:
--

Splitting the JIRA into two to allow for a more manageable review/feedback cycle:
  * Current one which is for adding *FederationMembershipState*.
  * YARN-5307 for adding *FederationApplicationState*.

> Federation Membership State APIs
> 
>
> Key: YARN-3662
> URL: https://issues.apache.org/jira/browse/YARN-3662
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-3662-YARN-2915-v1.1.patch, 
> YARN-3662-YARN-2915-v1.patch
>
>
> The Federation Application State encapsulates the information about the 
> active RM of each sub-cluster that is participating in Federation. The 
> information includes addresses for ClientRM, ApplicationMaster and Admin 
> services along with the sub-cluster _capability_, which is currently defined 
> by *ClusterMetricsInfo*. Please refer to the design doc in parent JIRA for 
> further details.






[jira] [Updated] (YARN-3662) Federation Membership State APIs

2016-07-01 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3662:
-
Description: 
The Federation Application State encapsulates the information about the active 
RM of each sub-cluster that is participating in Federation. The information 
includes addresses for ClientRM, ApplicationMaster and Admin services along 
with the sub-cluster _capability_, which is currently defined by 
*ClusterMetricsInfo*. Please refer to the design doc in parent JIRA for further 
details.


  was:The Federation State defines the additional state that needs to be 
maintained to loosely couple multiple individual sub-clusters into a single 
large federated cluster
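
Purely as an illustration of the shape of the membership record in the updated 
description above (all names here are hypothetical and not taken from the 
attached patch), a sub-cluster entry might carry something like:

{code:java}
// Hypothetical illustration only: not the API proposed in the patch.
public final class SubClusterRecordSketch {
  private final String subClusterId;
  private final String clientRMServiceAddress;           // ClientRM service address
  private final String applicationMasterServiceAddress;  // ApplicationMaster service address
  private final String adminServiceAddress;              // Admin service address
  private final String capability;                       // serialized ClusterMetricsInfo

  public SubClusterRecordSketch(String subClusterId, String clientRMServiceAddress,
      String applicationMasterServiceAddress, String adminServiceAddress,
      String capability) {
    this.subClusterId = subClusterId;
    this.clientRMServiceAddress = clientRMServiceAddress;
    this.applicationMasterServiceAddress = applicationMasterServiceAddress;
    this.adminServiceAddress = adminServiceAddress;
    this.capability = capability;
  }

  public String getSubClusterId() { return subClusterId; }
  public String getClientRMServiceAddress() { return clientRMServiceAddress; }
  public String getApplicationMasterServiceAddress() { return applicationMasterServiceAddress; }
  public String getAdminServiceAddress() { return adminServiceAddress; }
  public String getCapability() { return capability; }
}
{code}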


> Federation Membership State APIs
> 
>
> Key: YARN-3662
> URL: https://issues.apache.org/jira/browse/YARN-3662
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-3662-YARN-2915-v1.1.patch, 
> YARN-3662-YARN-2915-v1.patch
>
>
> The Federation Application State encapsulates the information about the 
> active RM of each sub-cluster that is participating in Federation. The 
> information includes addresses for ClientRM, ApplicationMaster and Admin 
> services along with the sub_cluster _capability_ which is currently defined 
> by *ClusterMetricsInfo*. Please refer to the design doc in parent JIRA for 
> further details.






[jira] [Updated] (YARN-3662) Federation Membership State APIs

2016-07-01 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3662:
-
Summary: Federation Membership State APIs  (was: Federation StateStore APIs)

> Federation Membership State APIs
> 
>
> Key: YARN-3662
> URL: https://issues.apache.org/jira/browse/YARN-3662
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-3662-YARN-2915-v1.1.patch, 
> YARN-3662-YARN-2915-v1.patch
>
>
> The Federation State defines the additional state that needs to be maintained 
> to loosely couple multiple individual sub-clusters into a single large 
> federated cluster






[jira] [Updated] (YARN-5307) Federation Application State APIs

2016-07-01 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-5307:
-
Description: The Federation Application State encapsulates the mapping 
between an application and its _home_ sub-cluster, i.e. the sub-cluster to which 
it is submitted by the Router. Please refer to the design doc in the parent JIRA 
for further details.  (was: The Federation State defines the additional state 
that needs to be maintained to loosely couple multiple individual sub-clusters 
into a single large federated cluster)
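
As a purely hypothetical illustration of the application-to-home-sub-cluster 
mapping in the updated description above (names are made up for this sketch and 
not taken from the patch):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the Router records the "home" sub-cluster each
// application was submitted to, keyed by application id.
public final class HomeSubClusterMapSketch {

  private final Map<String, String> appToHomeSubCluster = new ConcurrentHashMap<>();

  public void recordHome(String applicationId, String subClusterId) {
    appToHomeSubCluster.put(applicationId, subClusterId);
  }

  public String getHome(String applicationId) {
    return appToHomeSubCluster.get(applicationId);
  }
}
{code}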

> Federation Application State APIs
> -
>
> Key: YARN-5307
> URL: https://issues.apache.org/jira/browse/YARN-5307
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>
> The Federation Application State encapsulates the mapping between an 
> application and its _home_ sub-cluster, i.e. the sub-cluster to which it is 
> submitted by the Router. Please refer to the design doc in the parent JIRA for 
> further details.






[jira] [Created] (YARN-5307) Federation Application State APIs

2016-07-01 Thread Subru Krishnan (JIRA)
Subru Krishnan created YARN-5307:


 Summary: Federation Application State APIs
 Key: YARN-5307
 URL: https://issues.apache.org/jira/browse/YARN-5307
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Subru Krishnan
Assignee: Subru Krishnan


The Federation State defines the additional state that needs to be maintained 
to loosely couple multiple individual sub-clusters into a single large 
federated cluster






[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers

2016-07-01 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359869#comment-15359869
 ] 

Ray Chiang commented on YARN-5047:
--

RE: No new tests

Pure code refactoring.

RE: Checkstyle

Complaining about no accessors for a protected member variable.  Not critical.

RE: Failing unit tests

Tests pass in my tree.



> Refactor nodeUpdate across schedulers
> -
>
> Key: YARN-5047
> URL: https://issues.apache.org/jira/browse/YARN-5047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, scheduler
>Affects Versions: 3.0.0-alpha1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
> Attachments: YARN-5047.001.patch, YARN-5047.002.patch, 
> YARN-5047.003.patch, YARN-5047.004.patch, YARN-5047.005.patch
>
>
> FairScheduler#nodeUpdate() and CapacityScheduler#nodeUpdate() have a lot of 
> commonality in their code.  See about refactoring the common parts into 
> AbstractYARNScheduler.
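
A rough sketch of the refactoring idea (hypothetical names, not the actual 
patch): the shared node-update bookkeeping moves into the abstract scheduler, 
and each concrete scheduler keeps only its allocation hook.

{code:java}
import java.util.List;

// Sketch only: common nodeUpdate() skeleton in the base class, with the
// scheduler-specific allocation step left as an abstract hook.
public abstract class AbstractSchedulerSketch {

  /** Minimal stand-in for a scheduler node, just enough for the sketch. */
  public interface NodeSketch {
    List<String> pullNewlyCompletedContainers();
  }

  // Steps FairScheduler#nodeUpdate() and CapacityScheduler#nodeUpdate() share:
  // drain completed containers, update bookkeeping, then let the concrete
  // scheduler try to allocate on the node.
  public void nodeUpdate(NodeSketch node) {
    for (String containerId : node.pullNewlyCompletedContainers()) {
      releaseContainer(containerId);
    }
    attemptSchedulingOnNode(node);
  }

  protected void releaseContainer(String containerId) {
    // shared release/bookkeeping logic would live here
  }

  /** Scheduler-specific allocation policy stays in the subclasses. */
  protected abstract void attemptSchedulingOnNode(NodeSketch node);
}
{code}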






[jira] [Commented] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359853#comment-15359853
 ] 

Hadoop QA commented on YARN-2664:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
0s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 8m 55s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
49s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 28s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 6m 27s {color} 
| {color:red} root generated 1 new + 708 unchanged - 0 fixed = 709 total (was 
708) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 22s 
{color} | {color:red} root: The patch generated 63 new + 186 unchanged - 0 
fixed = 249 total (was 186) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 8m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch 8 line(s) with tabs. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
53s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 4m 57s 
{color} | {color:red} root generated 19 new + 11566 unchanged - 0 fixed = 11585 
total (was 11566) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 47s {color} 
| {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 6m 26s 
{color} | {color:red} The patch generated 5 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 169m 19s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer 
|
|   | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebApp |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:85209cc |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815818/YARN-2664.12.patch |
| JIRA Issue | YARN-2664 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux ac4b52726efa 

[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers

2016-07-01 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359850#comment-15359850
 ] 

Karthik Kambatla commented on YARN-5047:


[~rchiang] - the latest patch looks good. Is the CS test failure related? 

> Refactor nodeUpdate across schedulers
> -
>
> Key: YARN-5047
> URL: https://issues.apache.org/jira/browse/YARN-5047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, scheduler
>Affects Versions: 3.0.0-alpha1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
> Attachments: YARN-5047.001.patch, YARN-5047.002.patch, 
> YARN-5047.003.patch, YARN-5047.004.patch, YARN-5047.005.patch
>
>
> FairScheduler#nodeUpdate() and CapacityScheduler#nodeUpdate() have a lot of 
> commonality in their code.  See about refactoring the common parts into 
> AbstractYARNScheduler.






[jira] [Commented] (YARN-4568) Fix message when NodeManager runs into errors initializing the recovery directory

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359814#comment-15359814
 ] 

Hudson commented on YARN-4568:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10046 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10046/])
YARN-4568. Fix message when NodeManager runs into errors initializing (rchiang: 
rev 0a5def155eff4564b5dc7685e7460952f51bbd24)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java


> Fix message when NodeManager runs into errors initializing the recovery 
> directory
> -
>
> Key: YARN-4568
> URL: https://issues.apache.org/jira/browse/YARN-4568
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 2.7.1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: YARN-4568.001.patch
>
>
> When the NodeManager tries to initialize the recovery directory, the method 
> NativeIO#chmod() can throw one of several Errno style exceptions.  This 
> propagates up to the top without any try/catch statement.
> It would be nice to have a cleaner error message in this situation (plus the 
> original exception) to give users an idea about what part of the system has 
> gone wrong.
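
A minimal sketch of the kind of wrapping being asked for (not the actual patch; 
the directory handling here is simplified): catch the low-level failure and 
rethrow it with a message that names the recovery directory.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.hadoop.yarn.exceptions.YarnRuntimeException;

public final class RecoveryDirInitSketch {

  // Instead of letting the raw Errno-style exception propagate to the top,
  // wrap it so the operator can see which directory could not be prepared.
  public static void initRecoveryDir(Path recoveryRoot) {
    try {
      Files.createDirectories(recoveryRoot);
      // permission tightening (e.g. NativeIO#chmod) would happen here
    } catch (IOException e) {
      throw new YarnRuntimeException(
          "Unable to initialize the NodeManager recovery directory " + recoveryRoot, e);
    }
  }
}
{code}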






[jira] [Commented] (YARN-4568) Fix message when NodeManager runs into errors initializing the recovery directory

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359812#comment-15359812
 ] 

Hadoop QA commented on YARN-4568:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 2s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 54s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:85209cc |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12781285/YARN-4568.001.patch |
| JIRA Issue | YARN-4568 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 302cfa591e46 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 36cd0bc |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12172/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/12172/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Fix message when NodeManager runs into errors initializing the recovery 
> directory
> -
>
> Key: YARN-4568
> URL: https://issues.apache.org/jira/browse/YARN-4568
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 2.7.1
>   

[jira] [Commented] (YARN-4359) Update LowCost agents logic to take advantage of YARN-4358

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359784#comment-15359784
 ] 

Hadoop QA commented on YARN-4359:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 50 unchanged - 10 fixed = 52 total (was 60) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 987 unchanged - 2 fixed = 987 total (was 989) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 36m 28s 
{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 16s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:85209cc |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815826/YARN-4359.10.patch |
| JIRA Issue | YARN-4359 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 3de963542041 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c35a5a7 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12171/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/12171/artifact/patchprocess/whitespace-eol.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12171/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/12171/console |
| 

[jira] [Commented] (YARN-4568) Fix message when NodeManager runs into errors initializing the recovery directory

2016-07-01 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359776#comment-15359776
 ] 

Ray Chiang commented on YARN-4568:
--

Checking this in.

> Fix message when NodeManager runs into errors initializing the recovery 
> directory
> -
>
> Key: YARN-4568
> URL: https://issues.apache.org/jira/browse/YARN-4568
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 2.7.1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: YARN-4568.001.patch
>
>
> When the NodeManager tries to initialize the recovery directory, the method 
> NativeIO#chmod() can throw one of several Errno style exceptions.  This 
> propagates up to the top without any try/catch statement.
> It would be nice to have a cleaner error message in this situation (plus the 
> original exception) to give users an idea about what part of the system has 
> gone wrong.






[jira] [Assigned] (YARN-5211) Supporting "priorities" in the ReservationSystem

2016-07-01 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po reassigned YARN-5211:
-

Assignee: Sean Po  (was: Carlo Curino)

> Supporting "priorities" in the ReservationSystem
> 
>
> Key: YARN-5211
> URL: https://issues.apache.org/jira/browse/YARN-5211
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
>
> The ReservationSystem currently has an implicit FIFO priority. This JIRA 
> tracks the effort to generalize this to arbitrary priorities. This is 
> non-trivial, as the greedy nature of our ReservationAgents might need to be 
> revisited if not enough space is found for late-arriving but higher-priority 
> reservations.






[jira] [Commented] (YARN-4568) Fix message when NodeManager runs into errors initializing the recovery directory

2016-07-01 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359769#comment-15359769
 ] 

Karthik Kambatla commented on YARN-4568:


+1

> Fix message when NodeManager runs into errors initializing the recovery 
> directory
> -
>
> Key: YARN-4568
> URL: https://issues.apache.org/jira/browse/YARN-4568
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 2.7.1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: YARN-4568.001.patch
>
>
> When the NodeManager tries to initialize the recovery directory, the method 
> NativeIO#chmod() can throw one of several Errno style exceptions.  This 
> propagates up to the top without any try/catch statement.
> It would be nice to have a cleaner error message in this situation (plus the 
> original exception) to give users an idea about what part of the system has 
> gone wrong.






[jira] [Commented] (YARN-5282) Fix typos in CapacityScheduler documentation

2016-07-01 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359738#comment-15359738
 ] 

Ray Chiang commented on YARN-5282:
--

Thanks [~varun_saxena] for the commit and [~ajisakaa] for the review!

> Fix typos in CapacityScheduler documentation
> 
>
> Key: YARN-5282
> URL: https://issues.apache.org/jira/browse/YARN-5282
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Trivial
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: YARN-5282.001.patch
>
>
> Found some minor typos while reading the CapacityScheduler documentation.






[jira] [Updated] (YARN-4359) Update LowCost agents logic to take advantage of YARN-4358

2016-07-01 Thread Ishai Menache (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishai Menache updated YARN-4359:

Attachment: YARN-4359.10.patch

> Update LowCost agents logic to take advantage of YARN-4358
> --
>
> Key: YARN-4359
> URL: https://issues.apache.org/jira/browse/YARN-4359
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Ishai Menache
> Attachments: YARN-4359.0.patch, YARN-4359.10.patch, 
> YARN-4359.3.patch, YARN-4359.4.patch, YARN-4359.5.patch, YARN-4359.6.patch, 
> YARN-4359.7.patch, YARN-4359.8.patch, YARN-4359.9.patch
>
>
> Given the improvements of YARN-4358, the LowCost agent should be improved to 
> leverage this, and operate on RLESparseResourceAllocation (ideally leveraging 
> the improvements of YARN-3454 to compute available resources)






[jira] [Commented] (YARN-5023) TestAMRestart#testShouldNotCountFailureToMaxAttemptRetry random failure

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359678#comment-15359678
 ] 

Hudson commented on YARN-5023:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10044 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10044/])
YARN-5023. TestAMRestart#testShouldNotCountFailureToMaxAttemptRetry (jianhe: 
rev c35a5a7a8d85b42498e6981a6b1f09f2bdd56459)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java


> TestAMRestart#testShouldNotCountFailureToMaxAttemptRetry random failure
> ---
>
> Key: YARN-5023
> URL: https://issues.apache.org/jira/browse/YARN-5023
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: Bibin A Chundatt
>Assignee: sandflee
> Fix For: 2.8.0
>
> Attachments: YARN-5023.01.patch
>
>
> https://builds.apache.org/job/PreCommit-YARN-Build/11296/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_91.txt
> {noformat}
> Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 96.482 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
> testShouldNotCountFailureToMaxAttemptRetry(org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart)
>   Time elapsed: 56.467 sec  <<< FAILURE!
> java.lang.AssertionError: Attempt state is not correct (timeout). 
> expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:266)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:225)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:207)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:955)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:942)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAndRegisterAM(MockRM.java:961)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForNewAMToLaunchAndRegister(MockRM.java:295)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart.testShouldNotCountFailureToMaxAttemptRetry(TestAMRestart.java:647)
> {noformat}






[jira] [Commented] (YARN-5296) NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl

2016-07-01 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359674#comment-15359674
 ] 

Jian He commented on YARN-5296:
---

No, this does not need to be in 2.7.3

> NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl
> ---
>
> Key: YARN-5296
> URL: https://issues.apache.org/jira/browse/YARN-5296
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 2.9.0
>Reporter: Karam Singh
>Assignee: Junping Du
> Attachments: YARN-5296.patch
>
>
> Ran tests in the following manner:
> 1. Ran a GridMix of 768 apps sequentially around 17 times, to execute about 
> 12.9K apps.
> 2. After 4-5 hrs, checked the NM heap using Memory Analyzer. It reported that 
> around 96% of the heap was being used by ContainerMetrics.
> 3. Ran 7 more GridMix runs to reach around 18.2K apps run in total. Checked the 
> NM heap using Memory Analyzer again; 96% of the heap was still being used by 
> ContainerMetrics.
> 4. Started one more GridMix run; while it was going on, NMs started going down 
> with OOM at around 18.7K+ apps run. On analysing the NM heap using Memory 
> Analyzer, the OOM was caused by ContainerMetrics.






[jira] [Commented] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-01 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359667#comment-15359667
 ] 

Carlo Curino commented on YARN-2664:


[~elgoiri], thanks for taking this over. Let's wait for Jenkins' "opinion" and we 
will re-review. If my memory serves me well, the patch was reasonably good but 
had some scalability issues when showing too many reservations. Please test it 
in a live cluster with a large number of reservations. 

Also it would be good to open a JIRA to port this magic to [~wangda]'s next 
version of UIs, using REST endpoints (we have a list API that [~seanpo03] 
contributed that should help there).  

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, 
> YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, YARN-2664.9.patch, 
> YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5296) NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl

2016-07-01 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359649#comment-15359649
 ] 

Vinod Kumar Vavilapalli commented on YARN-5296:
---

ContainerMetrics went into 2.7.0 though the histograms only went into 2.9.0. 
Does this patch need to go into 2.7.3?

> NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl
> ---
>
> Key: YARN-5296
> URL: https://issues.apache.org/jira/browse/YARN-5296
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 2.9.0
>Reporter: Karam Singh
>Assignee: Junping Du
> Attachments: YARN-5296.patch
>
>
> Ran tests in the following manner:
> 1. Ran a GridMix of 768 apps sequentially around 17 times, to execute about 
> 12.9K apps.
> 2. After 4-5 hrs, checked the NM heap using Memory Analyzer. It reported that 
> around 96% of the heap was being used by ContainerMetrics.
> 3. Ran 7 more GridMix runs to reach around 18.2K apps run in total. Checked the 
> NM heap using Memory Analyzer again; 96% of the heap was still being used by 
> ContainerMetrics.
> 4. Started one more GridMix run; while it was going on, NMs started going down 
> with OOM at around 18.7K+ apps run. On analysing the NM heap using Memory 
> Analyzer, the OOM was caused by ContainerMetrics.






[jira] [Commented] (YARN-5296) NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl

2016-07-01 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359636#comment-15359636
 ] 

Jian He commented on YARN-5296:
---

Yep, we'll run more testing to verify that the OOM is resolved.

> NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl
> ---
>
> Key: YARN-5296
> URL: https://issues.apache.org/jira/browse/YARN-5296
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 2.9.0
>Reporter: Karam Singh
>Assignee: Junping Du
> Attachments: YARN-5296.patch
>
>
> Ran tests in the following manner:
> 1. Ran a GridMix of 768 apps sequentially around 17 times, to execute about 
> 12.9K apps.
> 2. After 4-5 hrs, checked the NM heap using Memory Analyzer. It reported that 
> around 96% of the heap was being used by ContainerMetrics.
> 3. Ran 7 more GridMix runs to reach around 18.2K apps run in total. Checked the 
> NM heap using Memory Analyzer again; 96% of the heap was still being used by 
> ContainerMetrics.
> 4. Started one more GridMix run; while it was going on, NMs started going down 
> with OOM at around 18.7K+ apps run. On analysing the NM heap using Memory 
> Analyzer, the OOM was caused by ContainerMetrics.






[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-01 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2664:
--
Attachment: YARN-2664.12.patch

Rebased patch.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, 
> YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, YARN-2664.9.patch, 
> YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.






[jira] [Commented] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-01 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359633#comment-15359633
 ] 

Subru Krishnan commented on YARN-2664:
--

Assigning to [~elgoiri] as this has been idle for a while and [~elgoiri] is 
keen on taking this up for 2.8. Thanks!

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, 
> YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, YARN-2664.9.patch, 
> YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.






[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-01 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-2664:
-
Assignee: Inigo Goiri  (was: Matteo Mazzucchelli)

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.2.patch, YARN-2664.3.patch, 
> YARN-2664.4.patch, YARN-2664.5.patch, YARN-2664.6.patch, YARN-2664.7.patch, 
> YARN-2664.8.patch, YARN-2664.9.patch, YARN-2664.patch, legal.patch, 
> screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides new functionality in the RM to ask for reservations on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-07-01 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359623#comment-15359623
 ] 

Robert Kanter commented on YARN-4676:
-

I briefly spoke with [~djp] and [~vvasudev] at Hadoop Summit.  I think we're 
all in agreement that this is good to go and we'll do a followup JIRA to handle 
the RM restart/failover issue.  I'm going to do one last look through the patch 
either later today or Tuesday.

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> 
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Daniel Zhi
>Assignee: Daniel Zhi
>  Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf, 
> GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, 
> YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, 
> YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, 
> YARN-4676.012.patch, YARN-4676.013.patch, YARN-4676.014.patch, 
> YARN-4676.015.patch, YARN-4676.016.patch
>
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to 
> gracefully decommission YARN nodes. After the user issues the refreshNodes 
> request, the ResourceManager automatically evaluates the status of all 
> affected nodes and kicks off decommission or recommission actions. The RM 
> asynchronously tracks container and application status related to 
> DECOMMISSIONING nodes so that each node is decommissioned as soon as it is 
> ready to be decommissioned. A decommissioning timeout at individual-node 
> granularity is supported and can be dynamically updated. The mechanism 
> naturally supports multiple independent graceful decommissioning "sessions", 
> where each one involves different sets of nodes with different timeout 
> settings. Such support is ideal and necessary for graceful decommission 
> requests issued by external cluster management software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackerService tracks 
> DECOMMISSIONING node status automatically and asynchronously after the 
> client/admin makes the graceful decommission request. It tracks 
> DECOMMISSIONING node status to decide when a node, after all running 
> containers on it have completed, will be transitioned into the 
> DECOMMISSIONED state. NodesListManager detects and handles include and 
> exclude list changes to kick off decommission or recommission as necessary.
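
For illustration, a minimal sketch of the per-node tracking described above, 
with hypothetical class and method names (it is not the code in the attached 
patches): a node leaves DECOMMISSIONING once it has drained or its individual 
timeout expires.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: track DECOMMISSIONING nodes with per-node timeouts.
class DecommissioningTrackerSketch {
  static final class Entry {
    final long deadlineMillis;          // per-node decommission deadline
    volatile int runningContainers;     // containers still running on the node
    Entry(long deadlineMillis, int runningContainers) {
      this.deadlineMillis = deadlineMillis;
      this.runningContainers = runningContainers;
    }
  }

  private final Map<String, Entry> decommissioning = new ConcurrentHashMap<>();

  void startDecommissioning(String nodeId, long timeoutMillis, int running) {
    decommissioning.put(nodeId,
        new Entry(System.currentTimeMillis() + timeoutMillis, running));
  }

  void onContainerFinished(String nodeId) {
    Entry e = decommissioning.get(nodeId);
    if (e != null && e.runningContainers > 0) {
      e.runningContainers--;
    }
  }

  // Polled periodically: DECOMMISSIONING -> DECOMMISSIONED once the node is
  // drained or its individual timeout has expired.
  boolean readyToDecommission(String nodeId) {
    Entry e = decommissioning.get(nodeId);
    return e != null
        && (e.runningContainers == 0
            || System.currentTimeMillis() >= e.deadlineMillis);
  }
}
{code}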



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5296) NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl

2016-07-01 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359624#comment-15359624
 ] 

Daniel Templeton commented on YARN-5296:


If the fix resolves the issue, then LGTM.

> NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl
> ---
>
> Key: YARN-5296
> URL: https://issues.apache.org/jira/browse/YARN-5296
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 2.9.0
>Reporter: Karam Singh
>Assignee: Junping Du
> Attachments: YARN-5296.patch
>
>
> Ran tests in the following manner:
> 1. Ran GridMix runs of 768 apps sequentially, around 17 times, to execute 
> about 12.9K apps.
> 2. After 4-5 hrs, checked the NM heap using Memory Analyser. It reported 
> that around 96% of the heap was being used by ContainerMetrics.
> 3. Ran 7 more GridMix runs so that around 18.2K apps ran in total. Checked 
> the NM heap using Memory Analyser again; again 96% of the heap was being 
> used by ContainerMetrics.
> 4. Started one more GridMix run; while the run was going on, NMs started 
> going down with OOM at around 18.7K+ running apps. On analysing the NM heap 
> using Memory Analyser, the OOM was caused by ContainerMetrics.
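
As a heavily hedged illustration of the leak pattern these heap dumps point to 
(hypothetical names; the actual change is in the attached YARN-5296.patch): 
per-container metrics objects that are registered but never released 
accumulate across runs until the NM heap is exhausted, so the entry has to be 
dropped once the container is done.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration: one metrics object per container, retained
// forever unless explicitly released when the container finishes.
final class PerContainerMetricsSketch {
  static final class Metrics { /* counters, gauges, ... */ }

  private final Map<String, Metrics> metricsByContainer =
      new ConcurrentHashMap<>();

  Metrics forContainer(String containerId) {
    // Without a matching release, this map grows by one entry per container
    // launched over the NM's lifetime (the 12.9K, 18.2K, 18.7K+ apps above).
    return metricsByContainer.computeIfAbsent(containerId, id -> new Metrics());
  }

  void onContainerFinished(String containerId) {
    // The cleanup step whose absence produces the OOM: drop (and unregister)
    // the metrics entry for the finished container.
    metricsByContainer.remove(containerId);
  }
}
{code}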



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5296) NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl

2016-07-01 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359620#comment-15359620
 ] 

Jian He commented on YARN-5296:
---

I analyzed the memory heap dump together with Junping. This is indeed an issue 
to be fixed.
I'd like to commit this today.  

> NMs going OutOfMemory because ContainerMetrics leak in ContainerMonitorImpl
> ---
>
> Key: YARN-5296
> URL: https://issues.apache.org/jira/browse/YARN-5296
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 2.9.0
>Reporter: Karam Singh
>Assignee: Junping Du
> Attachments: YARN-5296.patch
>
>
> Ran tests in the following manner:
> 1. Ran GridMix runs of 768 apps sequentially, around 17 times, to execute 
> about 12.9K apps.
> 2. After 4-5 hrs, checked the NM heap using Memory Analyser. It reported 
> that around 96% of the heap was being used by ContainerMetrics.
> 3. Ran 7 more GridMix runs so that around 18.2K apps ran in total. Checked 
> the NM heap using Memory Analyser again; again 96% of the heap was being 
> used by ContainerMetrics.
> 4. Started one more GridMix run; while the run was going on, NMs started 
> going down with OOM at around 18.7K+ running apps. On analysing the NM heap 
> using Memory Analyser, the OOM was caused by ContainerMetrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5174) [documentation] several updates/corrections to timeline service documentation

2016-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359372#comment-15359372
 ] 

Varun Saxena commented on YARN-5174:


Looks fine to me too. I think it's good to go in.

> [documentation] several updates/corrections to timeline service documentation
> -
>
> Key: YARN-5174
> URL: https://issues.apache.org/jira/browse/YARN-5174
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>  Labels: yarn-2928-1st-milestone
> Attachments: Hierarchy.png, 
> PublishingApplicationDatatoYARNTimelineServicev.pdf, The YARN Timeline 
> Service v.2.pdf, YARN-5174-YARN-2928.01.patch, YARN-5174-YARN-2928.02.patch, 
> YARN-5174-YARN-2928.03.patch, YARN-5174-YARN-2928.03.patch, 
> YARN-5174-YARN-2928.04.patch, YARN-5174-YARN-2928.05.patch, 
> YARN-5174-YARN-2928.06.patch, YARN-5174-YARN-2928.07.patch, 
> YARN-5174-YARN-2928.08.patch, YARN-5174-YARN-2928.09.patch, 
> YARN-5174-YARN-2928.10.patch, flow_hierarchy.png
>
>
> One part that is missing in the documentation is the need to add 
> {{hbase-site.xml}} on the client side (the client hadoop cluster). First, we 
> need to arrive at the minimally required client setting to connect to the 
> right hbase cluster. Then, we need to document it so that users know exactly 
> what to do to configure the cluster to use the timeline service v.2.
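
As a rough, hedged illustration of what the missing documentation needs to 
cover (the exact minimal set of keys is precisely what this JIRA is still 
working out), the client-side process has to pick up an hbase-site.xml that 
points at the intended HBase cluster, for example via the ZooKeeper quorum:

{code}
import org.apache.hadoop.conf.Configuration;

// Sketch only: load the client-side hbase-site.xml and verify that it points
// at the intended HBase cluster. hbase.zookeeper.quorum is used here as the
// obvious entry point; the minimal required settings are to be documented.
public final class TimelineHBaseClientConfCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.addResource("hbase-site.xml");   // must be on the client classpath
    String quorum = conf.get("hbase.zookeeper.quorum");
    if (quorum == null || quorum.isEmpty()) {
      System.err.println(
          "hbase-site.xml not found or missing hbase.zookeeper.quorum");
      System.exit(1);
    }
    System.out.println("Timeline service v.2 client would use HBase at: "
        + quorum);
  }
}
{code}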



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5306) Yarn should detect and fail fast on duplicate resources in container request

2016-07-01 Thread Yesha Vora (JIRA)
Yesha Vora created YARN-5306:


 Summary: Yarn should detect and fail fast on duplicate resources 
in container request
 Key: YARN-5306
 URL: https://issues.apache.org/jira/browse/YARN-5306
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Yesha Vora
Priority: Critical


In some cases, YARN gets duplicate copies of resources in the resource list. 
You then end up with a resource list which contains two copies of a resource 
JAR, with the timestamps of the two separate uploads, only one of which (the 
later one) is correct. At download time, the NM goes through the list and 
fails the download when it gets to the copy with the older timestamp.

We need a utility class that does a scan and check, which could be used by the 
NM at download time (so it fails with meaningful errors); the YARN client 
could perhaps do the same check before launch.
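
A minimal sketch of what such a scan-and-check utility could look like, with 
hypothetical types and names (it is not an existing YARN class), assuming each 
resource entry is identified by its target path and carries the timestamp 
recorded at upload time:

{code}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical duplicate-resource check: flag entries that name the same
// target path but carry different upload timestamps.
final class ResourceListCheckSketch {
  static final class Entry {
    final String path;       // where the resource will be localized
    final long timestamp;    // timestamp recorded when it was uploaded
    Entry(String path, long timestamp) {
      this.path = path;
      this.timestamp = timestamp;
    }
  }

  static void checkNoConflictingDuplicates(List<Entry> resources) {
    Map<String, Long> seen = new HashMap<>();
    for (Entry e : resources) {
      Long previous = seen.put(e.path, e.timestamp);
      if (previous != null && previous != e.timestamp) {
        // Fail early with a meaningful error instead of a localization
        // failure on the NM at download time.
        throw new IllegalArgumentException("Duplicate resource " + e.path
            + " with conflicting timestamps " + previous + " and "
            + e.timestamp);
      }
    }
  }
}
{code}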



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4359) Update LowCost agents logic to take advantage of YARN-4358

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359349#comment-15359349
 ] 

Hadoop QA commented on YARN-4359:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 50 unchanged - 10 fixed = 53 total (was 60) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 987 unchanged - 2 fixed = 987 total (was 989) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 35m 51s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 33s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:85209cc |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815783/YARN-4359.9.patch |
| JIRA Issue | YARN-4359 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4dfeae956341 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c25021f |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12167/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12167/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit test logs |  

[jira] [Comment Edited] (YARN-5174) [documentation] several updates/corrections to timeline service documentation

2016-07-01 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359301#comment-15359301
 ] 

Sangjin Lee edited comment on YARN-5174 at 7/1/16 5:13 PM:
---

I'll commit this patch shortly unless there is an objection. Thanks!


was (Author: sjlee0):
I'll commit this patch unless there is an objection. Thanks!

> [documentation] several updates/corrections to timeline service documentation
> -
>
> Key: YARN-5174
> URL: https://issues.apache.org/jira/browse/YARN-5174
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>  Labels: yarn-2928-1st-milestone
> Attachments: Hierarchy.png, 
> PublishingApplicationDatatoYARNTimelineServicev.pdf, The YARN Timeline 
> Service v.2.pdf, YARN-5174-YARN-2928.01.patch, YARN-5174-YARN-2928.02.patch, 
> YARN-5174-YARN-2928.03.patch, YARN-5174-YARN-2928.03.patch, 
> YARN-5174-YARN-2928.04.patch, YARN-5174-YARN-2928.05.patch, 
> YARN-5174-YARN-2928.06.patch, YARN-5174-YARN-2928.07.patch, 
> YARN-5174-YARN-2928.08.patch, YARN-5174-YARN-2928.09.patch, 
> YARN-5174-YARN-2928.10.patch, flow_hierarchy.png
>
>
> One part that is missing in the documentation is the need to add 
> {{hbase-site.xml}} on the client side (the client hadoop cluster). First, we 
> need to arrive at the minimally required client setting to connect to the 
> right hbase cluster. Then, we need to document it so that users know exactly 
> what to do to configure the cluster to use the timeline service v.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5226) remove AHS enable check from LogsCLI#fetchAMContainerLogs

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359306#comment-15359306
 ] 

Hadoop QA commented on YARN-5226:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 6s {color} | 
{color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 56s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.cli.TestLogsCLI |
|   | hadoop.yarn.client.api.impl.TestYarnClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:85209cc |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12809285/YARN-5226.1.patch |
| JIRA Issue | YARN-5226 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c3cb825c579f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c25021f |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12168/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-YARN-Build/12168/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12168/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/12168/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> remove AHS enable check from 

[jira] [Commented] (YARN-5174) [documentation] several updates/corrections to timeline service documentation

2016-07-01 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359301#comment-15359301
 ] 

Sangjin Lee commented on YARN-5174:
---

I'll commit this patch unless there is an objection. Thanks!

> [documentation] several updates/corrections to timeline service documentation
> -
>
> Key: YARN-5174
> URL: https://issues.apache.org/jira/browse/YARN-5174
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>  Labels: yarn-2928-1st-milestone
> Attachments: Hierarchy.png, 
> PublishingApplicationDatatoYARNTimelineServicev.pdf, The YARN Timeline 
> Service v.2.pdf, YARN-5174-YARN-2928.01.patch, YARN-5174-YARN-2928.02.patch, 
> YARN-5174-YARN-2928.03.patch, YARN-5174-YARN-2928.03.patch, 
> YARN-5174-YARN-2928.04.patch, YARN-5174-YARN-2928.05.patch, 
> YARN-5174-YARN-2928.06.patch, YARN-5174-YARN-2928.07.patch, 
> YARN-5174-YARN-2928.08.patch, YARN-5174-YARN-2928.09.patch, 
> YARN-5174-YARN-2928.10.patch, flow_hierarchy.png
>
>
> One part that is missing in the documentation is the need to add 
> {{hbase-site.xml}} on the client side (the client hadoop cluster). First, we 
> need to arrive at the minimally required client setting to connect to the 
> right hbase cluster. Then, we need to document it so that users know exactly 
> what to do to configure the cluster to use the timeline service v.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5174) [documentation] several updates/corrections to timeline service documentation

2016-07-01 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-5174:
--
Attachment: The YARN Timeline Service v.2.pdf

> [documentation] several updates/corrections to timeline service documentation
> -
>
> Key: YARN-5174
> URL: https://issues.apache.org/jira/browse/YARN-5174
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>  Labels: yarn-2928-1st-milestone
> Attachments: Hierarchy.png, 
> PublishingApplicationDatatoYARNTimelineServicev.pdf, The YARN Timeline 
> Service v.2.pdf, YARN-5174-YARN-2928.01.patch, YARN-5174-YARN-2928.02.patch, 
> YARN-5174-YARN-2928.03.patch, YARN-5174-YARN-2928.03.patch, 
> YARN-5174-YARN-2928.04.patch, YARN-5174-YARN-2928.05.patch, 
> YARN-5174-YARN-2928.06.patch, YARN-5174-YARN-2928.07.patch, 
> YARN-5174-YARN-2928.08.patch, YARN-5174-YARN-2928.09.patch, 
> YARN-5174-YARN-2928.10.patch, flow_hierarchy.png
>
>
> One part that is missing in the documentation is the need to add 
> {{hbase-site.xml}} on the client side (the client hadoop cluster). First, we 
> need to arrive at the minimally required client setting to connect to the 
> right hbase cluster. Then, we need to document it so that users know exactly 
> what to do to configure the cluster to use the timeline service v.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2928) YARN Timeline Service: Next generation

2016-07-01 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-2928:
--
Attachment: The YARN Timeline Service v.2 Documentation.pdf

> YARN Timeline Service: Next generation
> --
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, 
> ATSv2BackendHBaseSchemaproposal.pdf, Data model proposal v1.pdf, The YARN 
> Timeline Service v.2 Documentation.pdf, Timeline Service Next Gen - Planning 
> - ppt.pptx, TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf, 
> timeline_service_v2_next_milestones.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2928) YARN Timeline Service: Next generation

2016-07-01 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-2928:
--
Attachment: (was: The YARN Timeline Service v.2 Documentation.pdf)

> YARN Timeline Service: Next generation
> --
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, 
> ATSv2BackendHBaseSchemaproposal.pdf, Data model proposal v1.pdf, Timeline 
> Service Next Gen - Planning - ppt.pptx, 
> TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf, 
> timeline_service_v2_next_milestones.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5174) [documentation] several updates/corrections to timeline service documentation

2016-07-01 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-5174:
--
Attachment: (was: The YARN Timeline Service v.2.pdf)

> [documentation] several updates/corrections to timeline service documentation
> -
>
> Key: YARN-5174
> URL: https://issues.apache.org/jira/browse/YARN-5174
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>  Labels: yarn-2928-1st-milestone
> Attachments: Hierarchy.png, 
> PublishingApplicationDatatoYARNTimelineServicev.pdf, The YARN Timeline 
> Service v.2.pdf, YARN-5174-YARN-2928.01.patch, YARN-5174-YARN-2928.02.patch, 
> YARN-5174-YARN-2928.03.patch, YARN-5174-YARN-2928.03.patch, 
> YARN-5174-YARN-2928.04.patch, YARN-5174-YARN-2928.05.patch, 
> YARN-5174-YARN-2928.06.patch, YARN-5174-YARN-2928.07.patch, 
> YARN-5174-YARN-2928.08.patch, YARN-5174-YARN-2928.09.patch, 
> YARN-5174-YARN-2928.10.patch, flow_hierarchy.png
>
>
> One part that is missing in the documentation is the need to add 
> {{hbase-site.xml}} on the client side (the client hadoop cluster). First, we 
> need to arrive at the minimally required client setting to connect to the 
> right hbase cluster. Then, we need to document it so that users know exactly 
> what to do to configure the cluster to use the timeline service v.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4359) Update LowCost agents logic to take advantage of YARN-4358

2016-07-01 Thread Ishai Menache (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishai Menache updated YARN-4359:

Attachment: YARN-4359.9.patch

> Update LowCost agents logic to take advantage of YARN-4358
> --
>
> Key: YARN-4359
> URL: https://issues.apache.org/jira/browse/YARN-4359
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Ishai Menache
> Attachments: YARN-4359.0.patch, YARN-4359.3.patch, YARN-4359.4.patch, 
> YARN-4359.5.patch, YARN-4359.6.patch, YARN-4359.7.patch, YARN-4359.8.patch, 
> YARN-4359.9.patch
>
>
> Given the improvements of YARN-4358, the LowCost agent should be improved to 
> leverage this, and operate on RLESparseResourceAllocation (ideally leveraging 
> the improvements of YARN-3454 to compute available resources)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5174) [documentation] several updates/corrections to timeline service documentation

2016-07-01 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359144#comment-15359144
 ] 

Sangjin Lee commented on YARN-5174:
---

The latest patch looks good to me. Is there anything else we should add?

> [documentation] several updates/corrections to timeline service documentation
> -
>
> Key: YARN-5174
> URL: https://issues.apache.org/jira/browse/YARN-5174
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>  Labels: yarn-2928-1st-milestone
> Attachments: Hierarchy.png, 
> PublishingApplicationDatatoYARNTimelineServicev.pdf, The YARN Timeline 
> Service v.2.pdf, YARN-5174-YARN-2928.01.patch, YARN-5174-YARN-2928.02.patch, 
> YARN-5174-YARN-2928.03.patch, YARN-5174-YARN-2928.03.patch, 
> YARN-5174-YARN-2928.04.patch, YARN-5174-YARN-2928.05.patch, 
> YARN-5174-YARN-2928.06.patch, YARN-5174-YARN-2928.07.patch, 
> YARN-5174-YARN-2928.08.patch, YARN-5174-YARN-2928.09.patch, 
> YARN-5174-YARN-2928.10.patch, flow_hierarchy.png
>
>
> One part that is missing in the documentation is the need to add 
> {{hbase-site.xml}} on the client side (the client hadoop cluster). First, we 
> need to arrive at the minimally required client setting to connect to the 
> right hbase cluster. Then, we need to document it so that users know exactly 
> what to do to configure the cluster to use the timeline service v.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5305) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token III

2016-07-01 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358963#comment-15358963
 ] 

Naganarasimha G R commented on YARN-5305:
-

[~xinxianyin], IIUC, when YARN-5175 is in place I think this issue should not 
occur. How about working on that instead of this?

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token III
> ---
>
> Key: YARN-5305
> URL: https://issues.apache.org/jira/browse/YARN-5305
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098 and YARN-5302, this problem happens when the AM 
> submits a startContainer request with a new HDFS token (say, tokenB) which is 
> not managed by YARN, so two tokens exist in the credentials of the user on 
> the NM: one is tokenB, the other is the one renewed by the RM (tokenA). If 
> tokenB is selected when connecting to HDFS and tokenB has expired, an 
> exception happens.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358944#comment-15358944
 ] 

Naganarasimha G R commented on YARN-5302:
-

It seems to be handled in YARN-2704. [~xinxianyin], please verify whether 
YARN-2704 fixes this and YARN-5305 (not particularly sure about 5305).

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch
>
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When the app aggregators are 
> initialized, an exception happens because of the expired tokens. The app is a 
> long-running service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, String> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5295) YARN queue-mappings to check Queue is present before submitting job

2016-07-01 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358899#comment-15358899
 ] 

Prabhu Joseph commented on YARN-5295:
-

Hi [~sunilg], in addition, we also need to include the code snippet below in 
UserGroupMappingPlacementRule#getMappedQueue before returning the mapped 
queue, so that it returns a valid queue which is an existing leaf queue.

{code}
 for (QueueMapping mapping : mappings) {
 if (queue == null || !(queue instanceof LeafQueue)) {
  continue;
 }
 return queue;
}
{code}

> YARN queue-mappings to check Queue is present before submitting job
> ---
>
> Key: YARN-5295
> URL: https://issues.apache.org/jira/browse/YARN-5295
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.7.2
>Reporter: Prabhu Joseph
>
> In YARN queue-mappings, YARN should check if the queue is present before 
> submitting the job. If it is not present, it should go to the next available 
> mapping.
> For example, if we have
> yarn.scheduler.capacity.queue-mappings=u:%user:%user,g:edw:platform
> and I submit a job as user "test" and there is no "test" queue, then it 
> should check the second mapping (g:edw:platform) in the list, and if "test" 
> is part of the edw group it should submit the job to the platform queue.
> The sanity checks below have to be done for the mapped queue in the list; if 
> a check fails, the next queue mapping has to be chosen. Only when no queue 
> mapping passes the sanity checks should the application be rejected.
> 1. the queue is present
> 2. the queue is a leaf queue
> 3. the user has either the Submit_Applications or Administer_Queue ACL on 
> the queue.
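
A minimal sketch of the proposed sanity-check loop, using hypothetical helper 
interfaces rather than the actual CapacityScheduler API: the first mapping 
whose target queue passes all three checks wins, and the application is 
rejected only when none does.

{code}
import java.util.List;

// Hypothetical helpers standing in for the scheduler's queue lookup and ACLs.
interface QueueView {
  boolean isLeaf();
  boolean userCanSubmitOrAdminister(String user);
}

interface QueueResolver {
  QueueView resolve(String queueName);    // null if the queue does not exist
}

final class QueueMappingCheckSketch {
  static String pickQueue(List<String> mappedQueueNames, String user,
      QueueResolver resolver) {
    for (String name : mappedQueueNames) {
      QueueView q = resolver.resolve(name);
      if (q == null) {
        continue;                          // check 1: queue is present
      }
      if (!q.isLeaf()) {
        continue;                          // check 2: queue is a leaf queue
      }
      if (!q.userCanSubmitOrAdminister(user)) {
        continue;                          // check 3: submit/admin ACL
      }
      return name;                         // first mapping passing all checks
    }
    return null;                           // no mapping passed: reject the app
  }
}
{code}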



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358718#comment-15358718
 ] 

Xianyin Xin commented on YARN-5302:
---

Hi [~varun_saxena], initAppAggregator uses the credentials read from the 
NMStateStore, so it has nothing to do with the new token sent from the RM.

{quote}
Do we not update the token in NM state store when it changes ?
{quote}

I just uploaded a patch which adopts this approach, but maybe there are other, 
better solutions.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch
>
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When the app aggregators are 
> initialized, an exception happens because of the expired tokens. The app is a 
> long-running service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, String> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-679) add an entry point that can start any Yarn service

2016-07-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090020#comment-15090020
 ] 

Steve Loughran edited comment on YARN-679 at 7/1/16 9:45 AM:
-

GitHub user steveloughran opened a pull request:




was (Author: githubbot):
GitHub user steveloughran opened a pull request:

https://github.com/apache/hadoop/pull/68

YARN-679 service launcher

Pull-request version of YARN-679; initially the 005 patch plus corrections 
of javadocs and checkstyles

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/steveloughran/hadoop stevel/YARN-679-launcher

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/68.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #68


commit 8190fcbea75d203a43052339736ea2a412d44f16
Author: Steve Loughran 
Date:   2014-06-03T17:09:26Z

YARN-679: launcher code move

commit 5216a290371eb9050bf1fc98cd82aeea05f2f9d5
Author: Steve Loughran 
Date:   2014-06-03T18:43:41Z

YARN-679 service launcher adapting to changes in ExitUtil; passng params 
down as a list


commit a8ea0b26cb101dbfc47bb8349bfa3510d0701efe
Author: Steve Loughran 
Date:   2014-06-04T09:46:45Z

YARN-679 add javadocs & better launching for service-launcher

commit dcb4599ca9ed1feadff2d0149819640740405201
Author: Steve Loughran 
Date:   2014-06-04T10:54:44Z

YARN-679 move IRQ escalation into its own class for cleanliness and 
testability; lots of javadocs

commit bdd41f632deeb60a0b309e891755630d93956280
Author: Steve Loughran 
Date:   2014-06-04T13:19:53Z

YARN-679 initial TestInterruptHandling test

commit ff422b3dd70a9a39d7668b063811acee285fcbba
Author: Steve Loughran 
Date:   2014-06-04T14:26:33Z

YARN-679 TestInterruptHandling

commit 1d35197f8a8d80d3ca9aa4691b7f086686fcb454
Author: Steve Loughran 
Date:   2014-06-04T14:40:13Z

YARN-679 TestInterruptHandling final test -that blocking service stops 
don't stop shutdown from kicking in


commit ddbdfae3f7e2ce79f3c0138bc5c855bde8094c2f
Author: Steve Loughran 
Date:   2014-06-04T15:41:19Z

YARN-679: service exception handling improvements during creation, 
sprintf-formatted exception creation

commit db0a2ef4e8a46bfab6db4ec7a89cde70779432c8
Author: Steve Loughran 
Date:   2014-06-04T15:56:11Z

YARN-679 service instantiation failures

commit 2a95da1a320811c381b93c14125d56e2d21798c1
Author: Steve Loughran 
Date:   2014-06-04T17:41:54Z

YARN-679 lots more on exception handling and error code propagation, 
including making ServiceStateException have an exit code and propagate any 
inner one

commit 6fc00fa46e47d1ae6039d2e6d16b8bfb61c87ea1
Author: Steve Loughran 
Date:   2014-06-04T19:12:13Z

YARN-679 move test services into their own package; test for stop in 
runnable

commit 4dfed85a0a86440784583c41d2249d6c1106889d
Author: Steve Loughran 
Date:   2014-06-04T20:00:20Z

YARN-679 conf arg passdown validated

commit 6c12bb43a1d4554e7e196db7f9994562bd899fee
Author: Steve Loughran 
Date:   2014-06-05T10:03:34Z

YARN-679 test for service launch

commit 803250fb6810e7bd2373c53c9c07d9548c9eb71d
Author: Steve Loughran 
Date:   2014-06-05T12:46:26Z

YARN-679 test for bindArgs operations

commit f21f0fe6bdd8b1815080bdead225572c93430a24
Author: Steve Loughran 
Date:   2014-06-05T13:05:30Z

YARN-679 add AbstractLaunchedService base class for launched services, 
tests to verify that a subclass of this rejects arguments -but doesn't reject 
--conf args as they are stripped

commit a7056381a61fac239f71c7ecb8c40b74c4330864
Author: Steve Loughran 
Date:   2014-06-05T13:24:24Z

YARN-679 exception throwing/catching in execute

commit 24b74787dc52ca41dd6fde9db6c1dddb471ba1b8
Author: Steve Loughran 
Date:   2014-06-05T14:13:35Z

YARN-679 verify that constructor inits are handled

commit 554e317a0f2ef6ea353887e0cc501e0d27eb9a27
Author: Steve Loughran 
Date:   2014-06-05T14:28:14Z

services that only have a (String) constructor are handled by giving them 
their classname as a name

commit 49e457785752c1c27137ce1bb448b00f565cff20
Author: Steve Loughran 
Date:   2014-06-05T14:31:43Z

YARN-679 optimise imports

commit 62984ff26819965e86eb8727d1c5d8b73cd7fce9
Author: Steve Loughran 
Date:   2014-06-05T16:45:29Z

YARN-679 inner Launching logic with assertions and checks that Throwables 
get picked 

[jira] [Comment Edited] (YARN-679) add an entry point that can start any Yarn service

2016-07-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355358#comment-15355358
 ] 

Steve Loughran edited comment on YARN-679 at 7/1/16 9:44 AM:
-

Github user steveloughran closed the pull request at:





was (Author: githubbot):
Github user steveloughran closed the pull request at:

https://github.com/apache/hadoop/pull/68


> add an entry point that can start any Yarn service
> --
>
> Key: YARN-679
> URL: https://issues.apache.org/jira/browse/YARN-679
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: YARN-679-001.patch, YARN-679-002.patch, 
> YARN-679-002.patch, YARN-679-003.patch, YARN-679-004.patch, 
> YARN-679-005.patch, YARN-679-006.patch, YARN-679-007.patch, 
> YARN-679-008.patch, org.apache.hadoop.servic...mon 3.0.0-SNAPSHOT API).pdf
>
>  Time Spent: 72h
>  Remaining Estimate: 0h
>
> There's no need to write separate .main classes for every Yarn service, given 
> that the startup mechanism should be identical: create, init, start, wait for 
> stopped -with an interrupt handler to trigger a clean shutdown on a control-c 
> interrupt.
> Provide one that takes any classname, and a list of config files/options
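
A minimal sketch of such an entry point, assuming only the public 
org.apache.hadoop.service.Service API and a no-argument service constructor; 
it is simplified (no config file/option parsing) and is not the ServiceLauncher 
from the attached patches.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.Service;

// Hypothetical generic entry point: create, init, start any Service subclass
// by class name, then wait for it to stop; control-c triggers a clean shutdown.
public final class GenericServiceMain {
  public static void main(String[] args) throws Exception {
    if (args.length < 1) {
      System.err.println("Usage: GenericServiceMain <service-classname>");
      System.exit(2);
    }
    Class<? extends Service> clazz =
        Class.forName(args[0]).asSubclass(Service.class);
    final Service service = clazz.getDeclaredConstructor().newInstance();

    // Clean shutdown on control-c / SIGTERM.
    Runtime.getRuntime().addShutdownHook(new Thread(service::stop));

    service.init(new Configuration());
    service.start();
    service.waitForServiceToStop(0);   // block until the service stops
    System.exit(0);
  }
}
{code}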



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-679) add an entry point that can start any Yarn service

2016-07-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355323#comment-15355323
 ] 

Steve Loughran edited comment on YARN-679 at 7/1/16 9:44 AM:
-

bq. I don't know why the patch is failing; I just did a test apply to trunk and 
it all worked. recreating and Resubmitting with a new version number

..

bq.  


Two things:

1. Once a JIRA references a github pull request, Yetus prioritizes that over 
any attached files.  There probably should be a change to compare date/time 
stamps but that's a tremendous amount of work and we just haven't gotten to it 
yet.

2. Yetus has more and more trouble applying a PR the more and more individual 
commits there are.  This has a lot to do with how the github API presents diff 
files vs. patch files and git's own ability to apply said files. 

Two options (either one will work):

1. Squash the commits into a single commit and use the GH PR
2. Remove the references to the GH PR or at least change the URL enough that 
Yetus doesn't pick it up. It will then use the attached files.




was (Author: aw):
bq. I don't know why the patch is failing; I just did a test apply to trunk and 
it all worked. recreating and Resubmitting with a new version number

..

bq.  GITHUB PR https://github.com/apache/hadoop/pull/68

Two things:

1. Once a JIRA references a github pull request, Yetus prioritizes that over 
any attached files.  There probably should be a change to compare date/time 
stamps but that's a tremendous amount of work and we just haven't gotten to it 
yet.

2. Yetus has more and more trouble applying a PR the more and more individual 
commits there are.  This has a lot to do with how the github API presents diff 
files vs. patch files and git's own ability to apply said files. 

Two options (either one will work):

1. Squash the commits into a single commit and use the GH PR
2. Remove the references to the GH PR or at least change the URL enough that 
Yetus doesn't pick it up. It will then use the attached files.



> add an entry point that can start any Yarn service
> --
>
> Key: YARN-679
> URL: https://issues.apache.org/jira/browse/YARN-679
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: YARN-679-001.patch, YARN-679-002.patch, 
> YARN-679-002.patch, YARN-679-003.patch, YARN-679-004.patch, 
> YARN-679-005.patch, YARN-679-006.patch, YARN-679-007.patch, 
> YARN-679-008.patch, org.apache.hadoop.servic...mon 3.0.0-SNAPSHOT API).pdf
>
>  Time Spent: 72h
>  Remaining Estimate: 0h
>
> There's no need to write separate .main classes for every Yarn service, given 
> that the startup mechanism should be identical: create, init, start, wait for 
> stopped -with an interrupt handler to trigger a clean shutdown on a control-c 
> interrupt.
> Provide one that takes any classname, and a list of config files/options



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Attachment: YARN-5032.001.patch

Uploaded a preview patch which updates the new token in the NMStateStore.
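
For illustration, a minimal sketch of that idea with hypothetical interfaces 
(not the actual NM recovery API): whenever the NM receives refreshed 
credentials for an application, it also persists them, so a recovered NM reads 
fresh tokens instead of the expired ones it stored at app start.

{code}
import java.io.IOException;
import org.apache.hadoop.security.Credentials;

// Hypothetical stand-in for the NM state store.
interface AppCredentialsStore {
  void storeAppCredentials(String appId, Credentials credentials)
      throws IOException;
}

final class TokenUpdateSketch {
  private final AppCredentialsStore store;

  TokenUpdateSketch(AppCredentialsStore store) {
    this.store = store;
  }

  // Called when updated tokens for an app arrive (e.g. via an RM heartbeat):
  // persist them so they survive an NM restart and recovery.
  void onCredentialsUpdated(String appId, Credentials updated)
      throws IOException {
    store.storeAppCredentials(appId, updated);
  }
}
{code}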

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch
>
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When the app aggregators are 
> initialized, an exception happens because of the expired tokens. The app is a 
> long-running service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, String> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin reassigned YARN-5302:
-

Assignee: Xianyin Xin

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When the app aggregators are 
> initialized, an exception happens because of the expired tokens. The app is a 
> long-running service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, String> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358553#comment-15358553
 ] 

Varun Saxena edited comment on YARN-5302 at 7/1/16 7:30 AM:


Ok. So this happens when RM has renewed the token but not yet passed the token 
onto the NM in Heartbeat because NM restarted.
Do we not update the token in NM state store when it changes ?


was (Author: varun_saxena):
Ok. So this happens when RM has renewed the token but not yet passed the token 
onto the NM in Heartbeat because NM restarted.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When the app aggregators are 
> initialized, an exception happens because of the expired tokens. The app is a 
> long-running service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, String> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4855) Should check if node exists when replace nodelabels

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358559#comment-15358559
 ] 

Hadoop QA commented on YARN-4855:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The 
patch generated 21 new + 123 unchanged - 5 fixed = 144 total (was 128) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 16s {color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 44s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.api.impl.TestYarnClient |
|   | hadoop.yarn.client.cli.TestLogsCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:85209cc |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815691/YARN-4855.004.patch |
| JIRA Issue | YARN-4855 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 9bb2dde7dc74 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c25021f |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12166/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12166/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-YARN-Build/12166/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12166/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | 

[jira] [Comment Edited] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358553#comment-15358553
 ] 

Varun Saxena edited comment on YARN-5302 at 7/1/16 7:21 AM:


OK, so this happens when the RM has renewed the token but has not yet passed it 
on to the NM in a heartbeat because the NM restarted.


was (Author: varun_saxena):
OK, so this happens when the RM has renewed the token but has not yet passed it 
on to the NM in a heartbeat.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When initializing app aggregators, 
> an exception occurs because of the expired tokens. The app is a long-running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, Boolean> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358553#comment-15358553
 ] 

Varun Saxena commented on YARN-5302:


OK, so this happens when the RM has renewed the token but has not yet passed it 
on to the NM in a heartbeat.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When initializing app aggregators, 
> an exception occurs because of the expired tokens. The app is a long-running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, Boolean> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5305) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token III

2016-07-01 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-5305:
-

 Summary: Yarn Application log Aggreagation fails due to NM can not 
get correct HDFS delegation token III
 Key: YARN-5305
 URL: https://issues.apache.org/jira/browse/YARN-5305
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Xianyin Xin


Different from YARN-5098 and YARN-5302, this problem happens when the AM submits 
a startContainer request with a new HDFS token (say, tokenB) that is not managed 
by YARN, so two tokens exist in the user's credentials on the NM: one is tokenB, 
the other is the one renewed on the RM (tokenA). If tokenB is selected when 
connecting to HDFS and tokenB has expired, an exception occurs.
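
A small, self-contained sketch of the two-token situation described above (dummy 
token bytes and a made-up service name, purely illustrative): both tokens end up 
in the same credentials/UGI, and the HDFS client picks a token by matching its 
service field, so the expired tokenB may be the one selected.

{code:title=TwoTokenSketch.java (illustrative sketch only)}
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

/**
 * Sketch only: two HDFS delegation tokens for the same NameNode service are
 * added to one credentials object under different aliases. Anything running
 * as this UGI sees both; which one the HDFS client uses depends on matching
 * the token's service field, so an expired tokenB may be selected.
 */
public class TwoTokenSketch {
  public static void main(String[] args) {
    Text kind = new Text("HDFS_DELEGATION_TOKEN");
    Text service = new Text("ha-hdfs:mycluster"); // made-up service name

    // tokenA: the token the RM renews on behalf of the app.
    Token<TokenIdentifier> tokenA = new Token<TokenIdentifier>(
        "idA".getBytes(StandardCharsets.UTF_8),
        "pwA".getBytes(StandardCharsets.UTF_8), kind, service);
    // tokenB: a token the AM shipped in a startContainer request.
    Token<TokenIdentifier> tokenB = new Token<TokenIdentifier>(
        "idB".getBytes(StandardCharsets.UTF_8),
        "pwB".getBytes(StandardCharsets.UTF_8), kind, service);

    Credentials creds = new Credentials();
    creds.addToken(new Text("tokenA"), tokenA);
    creds.addToken(new Text("tokenB"), tokenB);

    UserGroupInformation ugi = UserGroupInformation.createRemoteUser("appuser");
    ugi.addCredentials(creds);

    // Both tokens are now visible to code running as this UGI.
    System.out.println("Tokens visible to the UGI: " + ugi.getTokens());
  }
}
{code}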



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358518#comment-15358518
 ] 

Xianyin Xin commented on YARN-5302:
---

I think we have two options. One is to persist the new token to the 
ContainerLaunchContext in the NMStateStore; the other is to make 
{{initAppAggregator()}} asynchronous so that it can wait for the RM's new token 
if using the original token fails.
[~jianhe], do you have any ideas?
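
A minimal sketch of the second option, assuming hypothetical hooks 
(waitForRefreshedCredentials and createAppDir below are placeholders, not 
existing NM APIs): try the recovered credentials first, and only on failure wait 
for the tokens the RM pushes later, then retry.

{code:title=AsyncAggregatorInitSketch.java (illustrative sketch only)}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.ApplicationId;

/**
 * Sketch only: initialize the app log aggregator asynchronously so that a
 * failure caused by an expired recovered token can be retried once the RM
 * has delivered fresh credentials. waitForRefreshedCredentials and
 * createAppDir are hypothetical placeholders.
 */
public abstract class AsyncAggregatorInitSketch {

  private final ExecutorService initPool = Executors.newSingleThreadExecutor();

  public void initAppAggregatorAsync(final ApplicationId appId,
      final String user, final Credentials recovered) {
    initPool.submit(new Runnable() {
      @Override
      public void run() {
        UserGroupInformation ugi = UserGroupInformation.createRemoteUser(user);
        if (recovered != null) {
          ugi.addCredentials(recovered);
        }
        try {
          // First attempt: may fail if the token recovered from the
          // NM state store has already expired.
          createAppDir(user, appId, ugi);
        } catch (Exception e) {
          // Block until refreshed credentials arrive (e.g. via a later RM
          // heartbeat), then retry with the new token added to the UGI.
          ugi.addCredentials(waitForRefreshedCredentials(appId));
          try {
            createAppDir(user, appId, ugi);
          } catch (Exception retryFailure) {
            throw new RuntimeException("App dir creation failed after retry",
                retryFailure);
          }
        }
      }
    });
  }

  // Hypothetical hooks that the real NM code paths would have to supply.
  protected abstract Credentials waitForRefreshedCredentials(ApplicationId appId);

  protected abstract void createAppDir(String user, ApplicationId appId,
      UserGroupInformation ugi) throws Exception;
}
{code}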

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When initializing app aggregators, 
> an exception occurs because of the expired tokens. The app is a long-running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, Boolean> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4855) Should check if node exists when replace nodelabels

2016-07-01 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358505#comment-15358505
 ] 

Tao Jie commented on YARN-4855:
---

Attached an updated patch. [~Naganarasimha], [~sunilg], [~leftnoteasy], would you 
mind giving it a quick review?

> Should check if node exists when replace nodelabels
> ---
>
> Key: YARN-4855
> URL: https://issues.apache.org/jira/browse/YARN-4855
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-4855.001.patch, YARN-4855.002.patch, 
> YARN-4855.003.patch, YARN-4855.004.patch
>
>
> Today when we add node labels to nodes, the operation succeeds without any 
> message even if the nodes are not existing NodeManagers in the cluster.
> It could work like this:
> When we use *yarn rmadmin -replaceLabelsOnNode --fail-on-unkown-nodes 
> "node1=label1"*, the request would be denied if a node is unknown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4855) Should check if node exists when replace nodelabels

2016-07-01 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-4855:
--
Attachment: YARN-4855.004.patch

> Should check if node exists when replace nodelabels
> ---
>
> Key: YARN-4855
> URL: https://issues.apache.org/jira/browse/YARN-4855
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-4855.001.patch, YARN-4855.002.patch, 
> YARN-4855.003.patch, YARN-4855.004.patch
>
>
> Today when we add node labels to nodes, the operation succeeds without any 
> message even if the nodes are not existing NodeManagers in the cluster.
> It could work like this:
> When we use *yarn rmadmin -replaceLabelsOnNode --fail-on-unkown-nodes 
> "node1=label1"*, the request would be denied if a node is unknown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Description: 
Different from YARN-5098, this happens on the NM side. When the NM recovers, 
credentials are read from the NMStateStore. When initializing app aggregators, 
an exception occurs because of the expired tokens. The app is a long-running 
service.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map<ApplicationAccessType, Boolean> appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}

  was:
Different from YARN-5098, this happens on the NM side. When the NM recovers, 
credentials are read from the NMStateStore. When initializing app aggregators, 
an exception occurs because of the expired tokens.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map<ApplicationAccessType, Boolean> appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}


> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When initializing app aggregators, 
> an exception occurs because of the expired tokens. The app is a long-running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, Boolean> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358491#comment-15358491
 ] 

Xianyin Xin commented on YARN-5302:
---

Thanks Varun. Maybe you are referring to YARN-4783. But from the discussion on 
YARN-4783, it seems that in that case the RM has cancelled the token, so it is 
not secure to continue to maintain an HDFS delegation token for the app. In this 
case, the app is still running, but the RM has requested a new HDFS token. 
Because this exception happens during NM recovery, the RM's new token hasn't 
been passed to the NM yet. The old token is read from the state store and causes 
the exception.

Sorry for the insufficient information.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When initializing app aggregators, 
> an exception occurs because of the expired tokens.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, Boolean> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Description: 
Different from YARN-5098, this happens on the NM side. When the NM recovers, 
credentials are read from the NMStateStore. When initializing app aggregators, 
an exception occurs because of the expired tokens.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map<ApplicationAccessType, Boolean> appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}

  was:
Different from YARN-5089, this happens on the NM side. When the NM recovers, 
credentials are read from the NMStateStore. When initializing app aggregators, 
an exception occurs because of the expired tokens.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map<ApplicationAccessType, Boolean> appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}


> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When initializing app aggregators, 
> an exception occurs because of the expired tokens.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, Boolean> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5282) Fix typos in CapacityScheduler documentation

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358478#comment-15358478
 ] 

Hudson commented on YARN-5282:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10041 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10041/])
YARN-5282. Fix typos in CapacityScheduler documentation. (Ray Chiang via Varun 
Saxena) (varunsaxena: rev 8ade81228e126c0575818d73b819f43b3da85c6e)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md


> Fix typos in CapacityScheduler documentation
> 
>
> Key: YARN-5282
> URL: https://issues.apache.org/jira/browse/YARN-5282
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Trivial
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: YARN-5282.001.patch
>
>
> Found some minor typos while reading the CapacityScheduler documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358475#comment-15358475
 ] 

Varun Saxena commented on YARN-5302:


Is this the same issue which we had discussed a few days back? When both the RM 
and the NM restart?

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5089, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When initializing app aggregators, 
> an exception occurs because of the expired tokens.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, Boolean> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5282) Fix typos in CapacityScheduler documentation

2016-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358472#comment-15358472
 ] 

Varun Saxena commented on YARN-5282:


Committed this to trunk, branch-2.
Thanks Ray Chiang for your contribution.

> Fix typos in CapacityScheduler documentation
> 
>
> Key: YARN-5282
> URL: https://issues.apache.org/jira/browse/YARN-5282
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Trivial
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: YARN-5282.001.patch
>
>
> Found some minor typos while reading the CapacityScheduler documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5282) Fix typos in CapacityScheduler documentation

2016-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358471#comment-15358471
 ] 

Varun Saxena commented on YARN-5282:


Looks good to me too. Committing this.

> Fix typos in CapacityScheduler documentation
> 
>
> Key: YARN-5282
> URL: https://issues.apache.org/jira/browse/YARN-5282
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Trivial
>  Labels: supportability
> Attachments: YARN-5282.001.patch
>
>
> Found some minor typos while reading the CapacityScheduler documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5304) Ship single node HBase config option with single startup command

2016-07-01 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358463#comment-15358463
 ] 

Joep Rottinghuis commented on YARN-5304:


If we tackle this issue well, it will address at least some of the concerns 
raised in YARN-5281.

> Ship single node HBase config option with single startup command
> 
>
> Key: YARN-5304
> URL: https://issues.apache.org/jira/browse/YARN-5304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Joep Rottinghuis
>Assignee: Joep Rottinghuis
>
> For small to medium Hadoop deployments we should make it dead-simple to use 
> the timeline service v2. We should have a single command to launch and stop 
> the timeline service back-end for the default HBase implementation.
> A default config with all the recommended settings should be packaged so that 
> a single command launches all the needed daemons (on the RM node).
> A timeline admin command, perhaps an init command, might be needed, or perhaps 
> the timeline service can even auto-detect that and create the tables, deploy 
> the needed coprocessors, etc.
> The overall purpose is to ensure nobody needs to be an HBase expert to get 
> this going. Cluster operators with HBase experience can choose their own more 
> sophisticated deployment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5304) Ship single node HBase config option with single startup command

2016-07-01 Thread Joep Rottinghuis (JIRA)
Joep Rottinghuis created YARN-5304:
--

 Summary: Ship single node HBase config option with single startup 
command
 Key: YARN-5304
 URL: https://issues.apache.org/jira/browse/YARN-5304
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Joep Rottinghuis
Assignee: Joep Rottinghuis


For small to medium Hadoop deployments we should make it dead-simple to use the 
timeline service v2. We should have a single command to launch and stop the 
timeline service back-end for the default HBase implementation.
A default config with all the recommended settings should be packaged so that a 
single command launches all the needed daemons (on the RM node).

A timeline admin command, perhaps an init command, might be needed, or perhaps 
the timeline service can even auto-detect that and create the tables, deploy the 
needed coprocessors, etc.

The overall purpose is to ensure nobody needs to be an HBase expert to get this 
going. Cluster operators with HBase experience can choose their own more 
sophisticated deployment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org