[jira] [Commented] (YARN-3409) Support Node Attribute functionality

2018-01-08 Thread LiangYe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317920#comment-16317920
 ] 

LiangYe commented on YARN-3409:
---

Node labels used for partitions have exclusivity, and some kinds of node
attributes may also need exclusivity. Could this feature be considered in branch
YARN-3409, [Naganarasimha G R
|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=Naganarasimha]?

> Support Node Attribute functionality
> 
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, client, RM
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: 3409-apiChanges_v2.pdf (4).pdf, 
> Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a particular set of nodes 
> can be shared by a group of entities (teams, departments, etc.). Partitions of 
> a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priority can apply to a partition (e.g. only the market team can use 
> the partition, or the market team has priority to use it).
> - Capacity percentages can apply to a partition (the market team has a 40% 
> minimum capacity and the dev team has a 60% minimum capacity of the partition).
> Attributes are orthogonal to partitions; they describe features of a node's 
> hardware/software purely for affinity. Some examples of attributes:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (Windows, Linux, etc.)
> With this, an application can ask for resources on nodes satisfying 
> (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64).
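
For illustration only, a self-contained sketch of what it means for a node's 
advertised attributes to satisfy such an expression; the attribute keys and 
sample values below are invented examples, not part of the YARN-3409 design:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustration only (not the YARN-3409 API): checking whether a node's
// advertised attributes satisfy a request like
// (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64).
public class AttributeAffinitySketch {
  public static void main(String[] args) {
    Map<String, String> nodeAttributes = new HashMap<>();
    nodeAttributes.put("glibc.version", "2.23");   // assumed sample values
    nodeAttributes.put("jdk.update", "151");       // i.e. JDK 8u151
    nodeAttributes.put("arch", "x86_64");

    boolean satisfies =
        Double.parseDouble(nodeAttributes.get("glibc.version")) >= 2.20
            && Integer.parseInt(nodeAttributes.get("jdk.update")) >= 20
            && "x86_64".equals(nodeAttributes.get("arch"));

    // A scheduler would only place the container on nodes where this holds.
    System.out.println("node satisfies the request: " + satisfies);
  }
}
{code}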



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING

2018-01-08 Thread lujie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317592#comment-16317592
 ] 

lujie edited comment on YARN-6948 at 1/9/18 6:06 AM:
-

After discussing [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can use the same unit test strategy. 
[^YARN-6948_1.patch] is not clean and has checkstyle errors, so I have reattached 
[^YARN-6948_2.patch].


was (Author: xiaoheipangzi):
After discuss [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can have same unit test strategy. 
YARN-6948_1.patch is not clean and has checkstyle errors, I reattach the 
YARN-6948_2

> Invalid event: ATTEMPT_ADDED at FINAL_SAVING
> 
>
> Key: YARN-6948
> URL: https://issues.apache.org/jira/browse/YARN-6948
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.0, 3.0.0-alpha4
>Reporter: lujie
> Attachments: YARN-6948_1.patch, YARN-6948_2.patch, yarn-6948.png, 
> yarn-6948.txt
>
>
> When I send a kill command to a running job, I check the logs and find the 
> following exception:
> {code:java}
> 2017-08-03 01:35:20,485 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> ATTEMPT_ADDED at FINAL_SAVING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> {code}
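
For context, RMAppAttemptImpl builds its transition table with 
org.apache.hadoop.yarn.state.StateMachineFactory, and the exception above means 
no arc is registered for ATTEMPT_ADDED while the attempt is in FINAL_SAVING. A 
minimal sketch of how such transitions are registered in general (illustrative 
only, not the YARN-6948 patch):

{code:java}
import org.apache.hadoop.yarn.state.SingleArcTransition;
import org.apache.hadoop.yarn.state.StateMachine;
import org.apache.hadoop.yarn.state.StateMachineFactory;

// Toy state machine built the same way RMAppAttemptImpl builds its
// transition table. Delivering an event type with no registered arc for
// the current state produces the InvalidStateTransitonException seen above.
public class TransitionSketch {
  enum State { NEW, FINAL_SAVING }
  enum EventType { START, ATTEMPT_ADDED }
  static class Event {
    final EventType type;
    Event(EventType type) { this.type = type; }
  }

  private static final StateMachineFactory<TransitionSketch, State, EventType, Event>
      FACTORY =
          new StateMachineFactory<TransitionSketch, State, EventType, Event>(State.NEW)
              // Registered arc: START moves NEW -> FINAL_SAVING.
              .addTransition(State.NEW, State.FINAL_SAVING, EventType.START,
                  (SingleArcTransition<TransitionSketch, Event>) (operand, event) -> { })
              // Note: no arc is registered for ATTEMPT_ADDED at FINAL_SAVING.
              .installTopology();

  public static void main(String[] args) throws Exception {
    StateMachine<State, EventType, Event> sm = FACTORY.make(new TransitionSketch());
    sm.doTransition(EventType.START, new Event(EventType.START));
    // Invalid: no arc for this (state, event) pair, so this call throws.
    sm.doTransition(EventType.ATTEMPT_ADDED, new Event(EventType.ATTEMPT_ADDED));
  }
}
{code}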



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING

2018-01-08 Thread lujie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317592#comment-16317592
 ] 

lujie edited comment on YARN-6948 at 1/9/18 6:05 AM:
-

After discussing [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can use the same unit test strategy. 
YARN-6948_1.patch is not clean and has checkstyle errors, so I have reattached 
YARN-6948_2


was (Author: xiaoheipangzi):
After discuss [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can have same unit test strategy. 

> Invalid event: ATTEMPT_ADDED at FINAL_SAVING
> 
>
> Key: YARN-6948
> URL: https://issues.apache.org/jira/browse/YARN-6948
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.0, 3.0.0-alpha4
>Reporter: lujie
> Attachments: YARN-6948_1.patch, YARN-6948_2.patch, yarn-6948.png, 
> yarn-6948.txt
>
>
> When I send a kill command to a running job, I check the logs and find the 
> following exception:
> {code:java}
> 2017-08-03 01:35:20,485 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> ATTEMPT_ADDED at FINAL_SAVING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING

2018-01-08 Thread lujie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317592#comment-16317592
 ] 

lujie edited comment on YARN-6948 at 1/9/18 6:04 AM:
-

After discussing [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can use the same unit test strategy. 


was (Author: xiaoheipangzi):
After discuss [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can have same unit test strategy. The only 
difference is that I override the onInvalidTranstion in a independent class 
RMAppAttemptImplForTest. 
And there exists two checksyte errors in my locally running,   but i have no 
idea to fix them, any suggestion?

> Invalid event: ATTEMPT_ADDED at FINAL_SAVING
> 
>
> Key: YARN-6948
> URL: https://issues.apache.org/jira/browse/YARN-6948
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.0, 3.0.0-alpha4
>Reporter: lujie
> Attachments: YARN-6948_1.patch, YARN-6948_2.patch, yarn-6948.png, 
> yarn-6948.txt
>
>
> When I send a kill command to a running job, I check the logs and find the 
> following exception:
> {code:java}
> 2017-08-03 01:35:20,485 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> ATTEMPT_ADDED at FINAL_SAVING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING

2018-01-08 Thread lujie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-6948:

Attachment: YARN-6948_2.patch

> Invalid event: ATTEMPT_ADDED at FINAL_SAVING
> 
>
> Key: YARN-6948
> URL: https://issues.apache.org/jira/browse/YARN-6948
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.0, 3.0.0-alpha4
>Reporter: lujie
> Attachments: YARN-6948_1.patch, YARN-6948_2.patch, yarn-6948.png, 
> yarn-6948.txt
>
>
> When I send a kill command to a running job, I check the logs and find the 
> following exception:
> {code:java}
> 2017-08-03 01:35:20,485 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> ATTEMPT_ADDED at FINAL_SAVING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7714) YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer

2018-01-08 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317737#comment-16317737
 ] 

Rohith Sharma K S commented on YARN-7714:
-

We have a separate YARN_TIMELINEREADER_OPTS for the timeline reader; 
YARN_TIMELINESERVER_OPTS is used for ATS v1/v1.5.

> YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer
> ---
>
> Key: YARN-7714
> URL: https://issues.apache.org/jira/browse/YARN-7714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> From hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh,
> {code}
> # Specify the max heapsize for the timelineserver.  If no units are
> # given, it will be assumed to be in MB.
> # This value will be overridden by an Xmx setting specified in either
> # HADOOP_OPTS and/or YARN_TIMELINESERVER_OPTS.
> # Default is the same as HADOOP_HEAPSIZE_MAX.
> #export YARN_TIMELINE_HEAPSIZE=
> # Specify the JVM options to be used when starting the TimeLineServer.
> # These options will be appended to the options specified as HADOOP_OPTS
> # and therefore may override any similar flags set in HADOOP_OPTS
> #
> # See ResourceManager for some examples
> #
> #export YARN_TIMELINESERVER_OPTS=
> {code}
> However, YARN_TIMELINESERVER_OPTS does not work. The correct one to set is 
> YARN_TIMELINEREADER_OPTS instead.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317705#comment-16317705
 ] 

Eric Yang commented on YARN-7605:
-

[~jianhe] {quote}
IIUC, the code loads from HDFS if the app is not running. The app could be 
pending or finished, so it may still hit HDFS a lot. For a getStatus-style API, 
the consumer typically calls it in a loop. This gets very bad if HDFS is down 
or failing over: the API call will spin and quickly overload the RM. IMO, we 
may need a better solution for serving persistent app status. The current 
approach may accidentally create a bottleneck.
{quote}

What API would you recommend to retrieve the combined status and spec object?  
The code used to create a new service object and only set the service name, 
status, and remaining time.  That leaves a lot of missing information, which 
can confuse users.  This is the reason I changed it to return the spec from 
HDFS.  If the spec were stored in Solr or HBase with in-memory caching, that 
would reduce the bottleneck.  However, those proposals were not well received 
during the early implementation, so there aren't many options to make this 
more robust at this time.  I am open to suggestions.
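
(For illustration only, a minimal sketch of the in-memory caching idea 
mentioned above; the {{loadSpecFromHdfs}} helper and the cache parameters are 
placeholders, not code from any YARN-7605 patch.)

{code:java}
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

// Illustration only: cache the persisted service spec so that repeated
// getStatus calls do not each read from HDFS.
public class SpecCacheSketch {
  private final LoadingCache<String, String> specCache =
      CacheBuilder.newBuilder()
          .maximumSize(1000)                       // bound memory use
          .expireAfterWrite(30, TimeUnit.SECONDS)  // tolerate slightly stale specs
          .build(new CacheLoader<String, String>() {
            @Override
            public String load(String serviceName) throws Exception {
              return loadSpecFromHdfs(serviceName);
            }
          });

  public String getSpec(String serviceName) throws Exception {
    return specCache.get(serviceName);
  }

  // Placeholder for whatever the real code does today to read the spec JSON.
  private String loadSpecFromHdfs(String serviceName) throws Exception {
    throw new UnsupportedOperationException("stand-in for the real HDFS read");
  }
}
{code}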

{quote}
How is it retrieving it from the AM? Only the RM knows the app's remaining time.
{quote}

It appears that the application master stores the information used to update 
the status of the service.  I am not sure how it is computed, but the remaining 
time is also updated.  One could argue that it may lag behind the current time; 
if that is the case, I can put the code back.  However, I did not find the 
remaining time to be stale when retrieved via the AM proxy.

> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch, 
> YARN-7605.014.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of making direct file system and resource manager RPC 
> calls.  This change helped centralize YARN metadata under the yarn user instead 
> of crawling through every user's home directory to find it.  The next step is 
> to make sure "doAs" calls work properly for the API service.  The metadata is 
> stored by the yarn user, but the actual workload still needs to run as the end 
> user, hence the API service must authenticate the end user's Kerberos 
> credential and perform a doAs call when requesting containers via 
> ServiceClient.
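
For illustration, a minimal sketch of the standard Hadoop proxy-user doAs 
pattern the description refers to (the work done inside run() is a placeholder, 
not the actual ServiceClient call from the YARN-7605 patches):

{code:java}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.security.UserGroupInformation;

// Illustration of the doAs pattern described above: the API service
// (running as the yarn user) authenticates the end user, then performs the
// actual work under that user's identity via a proxy UGI.
public class DoAsSketch {
  public static String submitAsUser(final String endUser) throws Exception {
    UserGroupInformation proxyUser = UserGroupInformation.createProxyUser(
        endUser, UserGroupInformation.getLoginUser());
    return proxyUser.doAs(new PrivilegedExceptionAction<String>() {
      @Override
      public String run() throws Exception {
        // Placeholder for the real work, e.g. asking ServiceClient to
        // request containers on behalf of endUser.
        return "running as "
            + UserGroupInformation.getCurrentUser().getShortUserName();
      }
    });
  }
}
{code}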



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317697#comment-16317697
 ] 

genericqa commented on YARN-6948:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 198 unchanged - 0 fixed = 201 total (was 198) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  3s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 63m  
1s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6948 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905211/YARN-6948_1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 75d281d1fcc0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b3290c4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19146/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19146/testReport/ |
| Max. process+thread count | 885 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourc

[jira] [Commented] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317647#comment-16317647
 ] 

Jian He commented on YARN-7605:
---

bq. I added a check to return EXIT_NOT_FOUND in actionDestroy; if the file is 
not found in HDFS, it will return the proper application-not-found information.
Still, the logic that "result != 0" means the application is not found is 
fragile: the next time a new exit code is added, this error message breaks. It 
is also currently inconsistent between the CLI and the REST API; the CLI 
ignores the error code whereas the REST API throws an exception.
It's better to make them consistent. The ApplicationNotFound exception is also 
confusing: ApplicationNotFound usually means not found in the RM, but here it 
means the app folder does not exist in HDFS. I think we can explicitly return 
the fact that the app doesn't exist in HDFS, like the CLI does, instead of 
throwing an exception.
bq. I refactored the code to load only if the RM doesn't find the app.
IIUC, the code loads from HDFS if the app is not running. The app could be 
pending or finished, so it may still hit HDFS a lot. For a getStatus-style API, 
the consumer typically calls it in a loop. This gets very bad if HDFS is down 
or failing over: the API call will spin and quickly overload the RM. IMO, we 
may need a better solution for serving persistent app status. The current 
approach may accidentally create a bottleneck. 
bq. Redundant fetch of information. If it is running, the remaining time is in 
the copy retrieved from the AM.
How is it retrieving it from the AM? Only the RM knows the app's remaining time.
bq. The patch failed due to the revert of YARN-7540. Please commit YARN-7540 so 
this patch can be applied. Thank you.
Once this patch review is done, we can combine both patches, run Jenkins, and 
commit together.

> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch, 
> YARN-7605.014.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of making direct file system and resource manager RPC 
> calls.  This change helped centralize YARN metadata under the yarn user instead 
> of crawling through every user's home directory to find it.  The next step is 
> to make sure "doAs" calls work properly for the API service.  The metadata is 
> stored by the yarn user, but the actual workload still needs to run as the end 
> user, hence the API service must authenticate the end user's Kerberos 
> credential and perform a doAs call when requesting containers via 
> ServiceClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7718) DistributedShell failed to specify resource other than memory/vcores from container_resources

2018-01-08 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317625#comment-16317625
 ] 

Sunil G commented on YARN-7718:
---

Thanks [~leftnoteasy]. Makes sense. I'll run some tests and commit today.

> DistributedShell failed to specify resource other than memory/vcores from 
> container_resources
> -
>
> Key: YARN-7718
> URL: https://issues.apache.org/jira/browse/YARN-7718
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
> Attachments: YARN-7718.001.patch
>
>
> After YARN-7242, there is a bug in reading resource values other than 
> memory/vcores.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING

2018-01-08 Thread lujie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317592#comment-16317592
 ] 

lujie edited comment on YARN-6948 at 1/9/18 3:03 AM:
-

After discussing [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can use the same unit test strategy. The only 
difference is that I override onInvalidTranstion in an independent class, 
RMAppAttemptImplForTest. 
There are also two checkstyle errors in my local run, but I have no idea how to 
fix them; any suggestion?


was (Author: xiaoheipangzi):
After discuss [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can have same unit test strategy. The only 
difference is that I override the onInvalidTranstion in a independent class 
RMAppAttemptImplForTest. 

> Invalid event: ATTEMPT_ADDED at FINAL_SAVING
> 
>
> Key: YARN-6948
> URL: https://issues.apache.org/jira/browse/YARN-6948
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.0, 3.0.0-alpha4
>Reporter: lujie
> Attachments: YARN-6948_1.patch, yarn-6948.png, yarn-6948.txt
>
>
> When I send a kill command to a running job, I check the logs and find the 
> following exception:
> {code:java}
> 2017-08-03 01:35:20,485 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> ATTEMPT_ADDED at FINAL_SAVING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING

2018-01-08 Thread lujie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317592#comment-16317592
 ] 

lujie edited comment on YARN-6948 at 1/9/18 3:03 AM:
-

After discussing [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can use the same unit test strategy. The only 
difference is that I override onInvalidTranstion in an independent class, 
RMAppAttemptImplForTest. 
There are also two checkstyle errors in my local run, but I have no idea how to 
fix them; any suggestion?


was (Author: xiaoheipangzi):
After discuss [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can have same unit test strategy. The only 
difference is that I override the onInvalidTranstion in a independent class 
RMAppAttemptImplForTest. 
And there exists two checksyte errors in my locally running,   but i have no 
idea to fix them, any suggesttion

> Invalid event: ATTEMPT_ADDED at FINAL_SAVING
> 
>
> Key: YARN-6948
> URL: https://issues.apache.org/jira/browse/YARN-6948
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.0, 3.0.0-alpha4
>Reporter: lujie
> Attachments: YARN-6948_1.patch, yarn-6948.png, yarn-6948.txt
>
>
> When I send a kill command to a running job, I check the logs and find the 
> following exception:
> {code:java}
> 2017-08-03 01:35:20,485 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> ATTEMPT_ADDED at FINAL_SAVING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING

2018-01-08 Thread lujie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-6948:

Attachment: YARN-6948_1.patch

After discussing [YARN-7663|https://issues.apache.org/jira/browse/YARN-7663] with 
[#Jason Lowe], I think this bug can use the same unit test strategy. The only 
difference is that I override onInvalidTranstion in an independent class, 
RMAppAttemptImplForTest. 
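
A self-contained sketch of that test strategy (the classes below are stand-ins, 
not the actual YARN-6948 patch): the production class only logs invalid 
transitions, while the test subclass overrides the hook and records them so an 
assertion can fail the test.

{code:java}
// Illustration only: stand-ins for RMAppAttemptImpl and the
// RMAppAttemptImplForTest subclass mentioned above.
public class InvalidTransitionTestSketch {

  // Stand-in for the production class: invalid transitions are only logged.
  static class AppAttempt {
    protected void onInvalidTransition(String eventType, String state) {
      System.err.println("Can't handle " + eventType + " at " + state);
    }
  }

  // Stand-in for RMAppAttemptImplForTest: remember the invalid transition
  // so the test can assert that none occurred.
  static class AppAttemptForTest extends AppAttempt {
    String lastInvalidTransition;

    @Override
    protected void onInvalidTransition(String eventType, String state) {
      lastInvalidTransition = eventType + " at " + state;
    }
  }

  public static void main(String[] args) {
    AppAttemptForTest attempt = new AppAttemptForTest();
    // Simulate the state machine delivering the bad event; the overridden
    // hook records it and the assertion below fails the "test".
    attempt.onInvalidTransition("ATTEMPT_ADDED", "FINAL_SAVING");
    if (attempt.lastInvalidTransition != null) {
      throw new AssertionError(
          "unexpected invalid transition: " + attempt.lastInvalidTransition);
    }
  }
}
{code}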

> Invalid event: ATTEMPT_ADDED at FINAL_SAVING
> 
>
> Key: YARN-6948
> URL: https://issues.apache.org/jira/browse/YARN-6948
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.0, 3.0.0-alpha4
>Reporter: lujie
> Attachments: YARN-6948_1.patch, yarn-6948.png, yarn-6948.txt
>
>
> When I send a kill command to a running job, I check the logs and find the 
> following exception:
> {code:java}
> 2017-08-03 01:35:20,485 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> ATTEMPT_ADDED at FINAL_SAVING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7663) RMAppImpl:Invalid event: START at KILLED

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317555#comment-16317555
 ] 

genericqa commented on YARN-7663:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 57s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 48s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7663 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905185/YARN-7663_7.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3aeb3678a6ce 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 59ab5da |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/19143/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19143/testReport/ |
| Max. process+thread count | 847 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yar

[jira] [Commented] (YARN-6486) FairScheduler: Deprecate continuous scheduling

2018-01-08 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317553#comment-16317553
 ] 

Yufei Gu commented on YARN-6486:


Thanks for the patch.
For class {{FairScheduler}}, I guess you want to deprecate nodeLocalityDelayMs 
and rackLocalityDelayMs instead of nodeLocalityThreshold and 
rackLocalityThreshold. 
For class {{FairSchedulerConfiguration}}, I saw you mention in comments that 
fields like LOCALITY_DELAY_NODE_MS are deprecated; any reason you don't 
annotate them with {{@Deprecated}}? 
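
For illustration, the annotation pattern being asked about (a sketch only; the 
property key value shown is an assumption, see the real 
FairSchedulerConfiguration for the actual constant):

{code:java}
// Sketch of marking a continuous-scheduling-related config constant as
// deprecated in code, not only in the comment text.
public class FairSchedulerConfigurationSketch {

  /**
   * Node-locality delay used by continuous scheduling.
   * @deprecated Continuous scheduling is deprecated (YARN-6486); this knob
   * will be removed together with it.
   */
  @Deprecated
  public static final String LOCALITY_DELAY_NODE_MS =
      "yarn.scheduler.fair.locality-delay-node-ms";
}
{code}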

> FairScheduler: Deprecate continuous scheduling
> --
>
> Key: YARN-6486
> URL: https://issues.apache.org/jira/browse/YARN-6486
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: fairscheduler
>Affects Versions: 2.9.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-6486.001.patch, YARN-6486.002.patch, 
> YARN-6486.003.patch, YARN-6486.004.patch
>
>
> Mark continuous scheduling as deprecated in 2.9 and remove the code in 3.0. 
> Removing continuous scheduling from the code will be logged as a separate jira



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317550#comment-16317550
 ] 

Eric Yang commented on YARN-7605:
-

Patch failed due to revert of YARN-7540.  Please commit YARN-7540 so this patch 
can be applied.  Thank you.

> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch, 
> YARN-7605.014.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of making direct file system and resource manager RPC 
> calls.  This change helped centralize YARN metadata under the yarn user instead 
> of crawling through every user's home directory to find it.  The next step is 
> to make sure "doAs" calls work properly for the API service.  The metadata is 
> stored by the yarn user, but the actual workload still needs to run as the end 
> user, hence the API service must authenticate the end user's Kerberos 
> credential and perform a doAs call when requesting containers via 
> ServiceClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317547#comment-16317547
 ] 

genericqa commented on YARN-7605:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} YARN-7605 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7605 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905204/YARN-7605.014.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19145/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch, 
> YARN-7605.014.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of making direct file system and resource manager RPC 
> calls.  This change helped centralize YARN metadata under the yarn user instead 
> of crawling through every user's home directory to find it.  The next step is 
> to make sure "doAs" calls work properly for the API service.  The metadata is 
> stored by the yarn user, but the actual workload still needs to run as the end 
> user, hence the API service must authenticate the end user's Kerberos 
> credential and perform a doAs call when requesting containers via 
> ServiceClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7605:

Attachment: YARN-7605.014.patch

> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch, 
> YARN-7605.014.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of making direct file system and resource manager RPC 
> calls.  This change helped centralize YARN metadata under the yarn user instead 
> of crawling through every user's home directory to find it.  The next step is 
> to make sure "doAs" calls work properly for the API service.  The metadata is 
> stored by the yarn user, but the actual workload still needs to run as the end 
> user, hence the API service must authenticate the end user's Kerberos 
> credential and perform a doAs call when requesting containers via 
> ServiceClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7605:

Attachment: (was: YARN-7605.014.patch)

> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of making direct file system and resource manager RPC 
> calls.  This change helped centralize YARN metadata under the yarn user instead 
> of crawling through every user's home directory to find it.  The next step is 
> to make sure "doAs" calls work properly for the API service.  The metadata is 
> stored by the yarn user, but the actual workload still needs to run as the end 
> user, hence the API service must authenticate the end user's Kerberos 
> credential and perform a doAs call when requesting containers via 
> ServiceClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317538#comment-16317538
 ] 

genericqa commented on YARN-7605:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} YARN-7605 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7605 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905201/YARN-7605.014.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19144/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch, 
> YARN-7605.014.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of making direct file system and resource manager RPC 
> calls.  This change helped centralize YARN metadata under the yarn user instead 
> of crawling through every user's home directory to find it.  The next step is 
> to make sure "doAs" calls work properly for the API service.  The metadata is 
> stored by the yarn user, but the actual workload still needs to run as the end 
> user, hence the API service must authenticate the end user's Kerberos 
> credential and perform a doAs call when requesting containers via 
> ServiceClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7605:

Attachment: YARN-7605.014.patch

[~jianhe], thanks for the review:

{quote}
Previous comment: here it still assumes that if result != 0, it is an 
ApplicationNotFoundException. I think this is inappropriate; in fact, the 
implementation will not return result != 0, so this condition could simply be 
removed.
{quote}

I added a check to return EXIT_NOT_FOUND in actionDestroy; if the file is not 
found in HDFS, it will return the proper application-not-found information.

{quote}
It is inappropriate to assume all exceptions are caused by "Permission denied"?
{quote}

I removed the exception capture and let the exception be handled at the upper 
layer.

{quote}
In getStatus it is changed to load from HDFS every time; won't this hit HDFS 
too much?
{quote}

I refactored the code to load only if the RM doesn't find the app.

{quote}
Why is the code below in getStatus removed?
{quote}

It was a redundant fetch of information.  If the app is running, the remaining 
time is in the copy retrieved from the AM.

{quote}
I meant: why does this patch add InterruptedException to the submitApp 
interface? It wasn't throwing InterruptedException before.
{quote}

I removed InterruptedException; sorry about that.

{quote}
It looks like the methods do not exceed the 80-column limit but are split onto 
separate lines unnecessarily. What were the checkstyle warnings about?
{quote}

Both lines are 81 characters.

{quote}
There's always this line printed when trying the CLI; any way to get rid of 
this?
{quote}

Not sure how to get rid of this.

{quote}
For flexing it doesn't print the flexed from/to numbers; I think it is useful 
to print the component name and the from/to numbers.
{quote}

Fixed.

{quote}
Both stop and destroy will print the same logging as below; I think we can 
differentiate them:
"Successfully stopped service"
{quote}

Fixed.

{quote}
"yarn app -save" doesn't have logging to indicate whether it is successful or 
not.
{quote}

Fixed.

> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch, 
> YARN-7605.014.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of making direct file system and resource manager RPC 
> calls.  This change helped centralize YARN metadata under the yarn user instead 
> of crawling through every user's home directory to find it.  The next step is 
> to make sure "doAs" calls work properly for the API service.  The metadata is 
> stored by the yarn user, but the actual workload still needs to run as the end 
> user, hence the API service must authenticate the end user's Kerberos 
> credential and perform a doAs call when requesting containers via 
> ServiceClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7720) [Federation] Race condition between second app attempt and UAM timeout when first attempt node is down

2018-01-08 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-7720:
---
Summary: [Federation] Race condition between second app attempt and UAM 
timeout when first attempt node is down  (was: [Federation] Race condition 
between second app attempt and UAM heartbeat when first attempt node is down)

> [Federation] Race condition between second app attempt and UAM timeout when 
> first attempt node is down
> --
>
> Key: YARN-7720
> URL: https://issues.apache.org/jira/browse/YARN-7720
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>
> In Federation, multiple attempts of an application share the same UAM in each 
> secondary sub-cluster. When the first attempt fails, we rely on the fact that 
> the secondary RM won't kill the existing UAM before the AM heartbeat timeout 
> (default 10 minutes). When the second attempt comes up in the home sub-cluster, 
> it will pick up the UAM token from the Yarn Registry and resume the UAM 
> heartbeat to the secondary RMs. 
> The default heartbeat timeouts for the NM and the AM are both 10 minutes. The 
> problem is that when the first attempt's node goes down or loses connectivity, 
> only after 10 minutes will the home RM mark the first attempt as failed and 
> then schedule the second attempt on some other node. By then the UAMs in the 
> secondary sub-clusters are already timing out, and they might not survive 
> until the second attempt comes up. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7720) [Federation] Race condition between second app attempt and UAM heartbeat when first attempt node is down

2018-01-08 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317509#comment-16317509
 ] 

Botong Huang commented on YARN-7720:


As a fix in our clusters, we increased the AM heartbeat timeout to 15 mins and 
kept the NM timeout at 10 mins. This is a new requirement from Federation: the 
UAM timeout needs to be longer than the NM timeout. I am wondering if it is 
okay to change the default value in general. Any other ideas on this issue? 
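
For reference, the workaround amounts to something like the following (a
sketch using the standard liveness-monitor properties, values in milliseconds):
{code:java}
Configuration conf = new YarnConfiguration();
// raise the AM heartbeat timeout to 15 minutes so the UAMs in the secondaries
// outlive the 10 minute NM timeout of the failed first-attempt node
conf.setLong("yarn.am.liveness-monitor.expiry-interval-ms", 15 * 60 * 1000L);
// leave the NM heartbeat timeout at its 10 minute default
conf.setLong("yarn.nm.liveness-monitor.expiry-interval-ms", 10 * 60 * 1000L);
{code}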

> [Federation] Race condition between second app attempt and UAM heartbeat when 
> first attempt node is down
> 
>
> Key: YARN-7720
> URL: https://issues.apache.org/jira/browse/YARN-7720
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>
> In Federation, multiple attempts of an application share the same UAM in each 
> secondary sub-cluster. When the first attempt fails, we rely on the fact that 
> secondary RM won't kill the existing UAM before the AM heartbeat timeout 
> (default at 10 min). When second attempt comes up in the home sub-cluster, it 
> will pick up the UAM token from Yarn Registry and resume the UAM heartbeat to 
> secondary RMs. 
> The default heartbeat timeout for NM and AM are both 10 mins. The problem is 
> that when the first attempt node goes down or out of connection, only after 
> 10 mins will the home RM mark the first attempt as failed, and then schedule 
> the 2nd attempt in some other node. By then the UAMs in secondaries are 
> already timing out, and they might not survive until the second attempt comes 
> up. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7720) [Federation] Race condition between second app attempt and UAM heartbeat when first attempt node is down

2018-01-08 Thread Botong Huang (JIRA)
Botong Huang created YARN-7720:
--

 Summary: [Federation] Race condition between second app attempt 
and UAM heartbeat when first attempt node is down
 Key: YARN-7720
 URL: https://issues.apache.org/jira/browse/YARN-7720
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Botong Huang
Assignee: Botong Huang


In Federation, multiple attempts of an application share the same UAM in each 
secondary sub-cluster. When the first attempt fails, we rely on the fact that 
secondary RM won't kill the existing UAM before the AM heartbeat timeout 
(default at 10 min). When second attempt comes up in the home sub-cluster, it 
will pick up the UAM token from Yarn Registry and resume the UAM heartbeat to 
secondary RMs. 

The default heartbeat timeout for NM and AM are both 10 mins. The problem is 
that when the first attempt node goes down or out of connection, only after 10 
mins will the home RM mark the first attempt as failed, and then schedule the 
2nd attempt in some other node. By then the UAMs in secondaries are already 
timing out, and they might not survive until the second attempt comes up. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7689) TestRMContainerAllocator fails after YARN-6124

2018-01-08 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317469#comment-16317469
 ] 

Wilfred Spiegelenburg commented on YARN-7689:
-

We have the same existing warnings for other fields in the 
AbstractYarnScheduler. The follow-on jira to move the init to the 
AbstractYarnScheduler should remove the need for any getter/setter methods and 
also make the variable private. All other variables in the file raise the same 
checkstyle warning at the moment, and the variable definition has not really 
changed with this fix.

> TestRMContainerAllocator fails after YARN-6124
> --
>
> Key: YARN-7689
> URL: https://issues.apache.org/jira/browse/YARN-7689
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-7689.001.patch, YARN-7689.002.patch
>
>
> After the change that was made for YARN-6124 multiple tests in the 
> TestRMContainerAllocator from MapReduce fail with the following NPE:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.reinitialize(AbstractYarnScheduler.java:1437)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.reinitialize(FifoScheduler.java:320)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$ExcessReduceContainerAllocateScheduler.(TestRMContainerAllocator.java:1808)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager2.createScheduler(TestRMContainerAllocator.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:659)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1133)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:316)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.serviceInit(MockRM.java:1334)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:162)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:141)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:137)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager.(TestRMContainerAllocator.java:928)
> {code}
> In the test we just call reinitiaize on a scheduler and never call init.
> The stop of the service is guarded and so should the start and the re-init.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7719) [Yarn services] Yarn application logs does not collect all AM log files

2018-01-08 Thread Yesha Vora (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yesha Vora updated YARN-7719:
-
Priority: Critical  (was: Major)

> [Yarn services] Yarn application logs does not collect all AM log files
> ---
>
> Key: YARN-7719
> URL: https://issues.apache.org/jira/browse/YARN-7719
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Priority: Critical
>
> Steps:
> 1) Run Yarn Service Application such as httpd
> 2) Gather yarn application log after application is finished 
> The log collection only shows the content of container-localizer-syslog. 
> Log collection should also gather the files below from the AM.
> * directory.info
> * launch_container.sh
> * prelaunch.err
> * prelaunch.out
> * serviceam-err.txt
> * serviceam-out.txt
> Without these log files, debugging a failed app becomes impossible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7719) [Yarn services] Yarn application logs does not collect all AM log files

2018-01-08 Thread Yesha Vora (JIRA)
Yesha Vora created YARN-7719:


 Summary: [Yarn services] Yarn application logs does not collect 
all AM log files
 Key: YARN-7719
 URL: https://issues.apache.org/jira/browse/YARN-7719
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yesha Vora


Steps:

1) Run Yarn Service Application such as httpd
2) Gather yarn application log after application is finished 

The log collection only shows the content of container-localizer-syslog. 
Log collection should also gather the files below from the AM.

* directory.info
* launch_container.sh
* prelaunch.err
* prelaunch.out
* serviceam-err.txt
* serviceam-out.txt

Without these log files, debugging a failed app becomes impossible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4227) Ignore expired containers from removed nodes in FairScheduler

2018-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317431#comment-16317431
 ] 

Hudson commented on YARN-4227:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13463 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13463/])
YARN-4227. Ignore expired containers from removed nodes in (rchiang: rev 
59ab5da0a0337c49a58bc9b2db9d1a89f4d5b9dd)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java


> Ignore expired containers from removed nodes in FairScheduler
> -
>
> Key: YARN-4227
> URL: https://issues.apache.org/jira/browse/YARN-4227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.3.0, 2.5.0, 2.7.1
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Fix For: 3.1.0, 2.10.0
>
> Attachments: YARN-4227.006.patch, YARN-4227.2.patch, 
> YARN-4227.3.patch, YARN-4227.4.patch, YARN-4227.5.patch, YARN-4227.patch
>
>
> Under some circumstances the node is removed before an expired container 
> event is processed causing the RM to exit:
> {code}
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: 
> Expired:container_1436927988321_1307950_01_12 Timed out after 600 secs
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1436927988321_1307950_01_12 Container Transitioned from 
> ACQUIRED to EXPIRED
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp: 
> Completed container: container_1436927988321_1307950_01_12 in state: 
> EXPIRED event:EXPIRE
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=system_op   
>OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS  
> APPID=application_1436927988321_1307950 
> CONTAINERID=container_1436927988321_1307950_01_12
> 2015-10-04 21:14:01,063 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type CONTAINER_EXPIRED to the scheduler
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:849)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1273)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:585)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {code}
> The stack trace is from 2.3.0 but the same issue has been observed in 2.5.0 
> and 2.6.0 by different customers.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7718) DistributedShell failed to specify resource other than memory/vcores from container_resources

2018-01-08 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317410#comment-16317410
 ] 

Wangda Tan commented on YARN-7718:
--

[~sunilg] could you help to review and get this committed?

Thanks,

> DistributedShell failed to specify resource other than memory/vcores from 
> container_resources
> -
>
> Key: YARN-7718
> URL: https://issues.apache.org/jira/browse/YARN-7718
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
> Attachments: YARN-7718.001.patch
>
>
> After YARN-7242, there is a bug reading resource values other than 
> memory/vcores.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7663) RMAppImpl:Invalid event: START at KILLED

2018-01-08 Thread lujie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-7663:

Attachment: YARN-7663_7.patch

I have moved the TODO to the newly added method, and fixed the checkstyle error.

> RMAppImpl:Invalid event: START at KILLED
> 
>
> Key: YARN-7663
> URL: https://issues.apache.org/jira/browse/YARN-7663
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: lujie
>Assignee: lujie
>Priority: Minor
>  Labels: patch
> Attachments: YARN-7663_1.patch, YARN-7663_2.patch, YARN-7663_3.patch, 
> YARN-7663_4.patch, YARN-7663_5.patch, YARN-7663_6.patch, YARN-7663_7.patch
>
>
> Send kill to application, the RM log shows:
> {code:java}
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> START at KILLED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:805)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:901)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:885)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> if a sleep is inserted before the point where the START event is created, this 
> bug deterministically reproduces. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7718) DistributedShell failed to specify resource other than memory/vcores from container_resources

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317384#comment-16317384
 ] 

genericqa commented on YARN-7718:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m 
40s{color} | {color:green} hadoop-yarn-applications-distributedshell in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7718 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905176/YARN-7718.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7b273cd05fb0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2ee0d64 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19142/testReport/ |
| Max. process+thread count | 639 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19142/conso

[jira] [Commented] (YARN-6599) Support rich placement constraints in scheduler

2018-01-08 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317379#comment-16317379
 ] 

Arun Suresh commented on YARN-6599:
---

[~leftnoteasy], did a slightly deeper dive.

This is my understanding of the general flow - I'd break this down into 2 phases.
# When the app makes the allocate call: This is where you do the 
updatePendingAsk etc for each SchedulingRequest. This is also where you 
instantiate the AppPlacementAllocator and map it to the request. Looks like the 
really interesting method is {{validateAndSetSchedulingRequest}}, which is 
where you validate the placement constraints. This method sets the valid 
targetAllocationTags etc.
# When the node heartbeats: At this point, in the leafQueue, we pick the 
SchedulingRequest with highest priority and finally, we call the 
{{canAllocate}} method, which checks if the Node can be assigned to the 
scheduling request based on the placementConstraint, right?

Given the above, we should agree that this approach - while it ensures that 
allocating a SchedulingRequest to a node will guarantee that constraints are 
NOT violated - does nothing in the way of trying to FIND the nodes that will 
meet the constraints. Agreed?

My opinion - and something we should call out / document - is that:
* If an app is more concerned about priority ordering of its requests, then we 
can recommend using this approach.
* If the app is more concerned about an optimal placement, then the processor 
route will give better results - since it actively tries to find nodes that 
satisfy constraints by considering multiple schedulingRequests and multiple 
nodes.
Thoughts?

Also some other comments:
* In the Scheduler, you are adding a new List parameter to 
the allocate() method. Do you think a better approach might be to create a 
common superclass / interface which both SchedulingRequest and 
ResourceRequest extend, and change the existing parameter to {{List}}? A rough 
sketch of the idea is below.
* If we do the above, then you won't have to duplicate methods like 
application.updateResourceRequests and normalizeSchedulingRequests
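
A rough sketch of what I mean (the name is invented for illustration, not an
existing YARN API):
{code:java}
public interface SchedulerAsk {
  // common marker for ResourceRequest and SchedulingRequest, so that the
  // scheduler allocate() entry point can take a single list of mixed asks
  // instead of growing a separate parameter for each request type
}
{code}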


> Support rich placement constraints in scheduler
> ---
>
> Key: YARN-6599
> URL: https://issues.apache.org/jira/browse/YARN-6599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6599-YARN-6592.003.patch, 
> YARN-6599-YARN-6592.004.patch, YARN-6599-YARN-6592.005.patch, 
> YARN-6599-YARN-6592.006.patch, YARN-6599-YARN-6592.007.patch, 
> YARN-6599-YARN-6592.008.patch, YARN-6599-YARN-6592.wip.002.patch, 
> YARN-6599.poc.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7516) Security check for untrusted docker image

2018-01-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317340#comment-16317340
 ] 

Eric Badger commented on YARN-7516:
---

bq. How about docker.privileged-containers.registries for the last issue?
I think that would be fine. It would allow for a 
{{docker.non-privileged-containers.registries}} in the future

> Security check for untrusted docker image
> -
>
> Key: YARN-7516
> URL: https://issues.apache.org/jira/browse/YARN-7516
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7516.001.patch, YARN-7516.002.patch, 
> YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch
>
>
> Hadoop YARN Services can support using a private docker registry image or a 
> docker image from docker hub.  In the current implementation, Hadoop security 
> is enforced through username and group membership, and enforces uid:gid 
> consistency in the docker container and the distributed file system.  There is 
> a cloud use case for the ability to run untrusted docker images on the same 
> cluster for testing.  
> The basic requirement for an untrusted container is to ensure all kernel and 
> root privileges are dropped, and there is no interaction with the distributed 
> file system, to avoid contamination.  We can probably enforce detection of 
> untrusted docker images by checking the following:
> # If the docker image is from the public docker hub repository, the container 
> is automatically flagged as insecure, disk volume mounts are disabled 
> automatically, and all kernel capabilities are dropped.
> # If the docker image is from a private repository in docker hub, and there is 
> a white list to allow the private repository, disk volume mounts are allowed 
> and kernel capabilities follow the allowed list.
> # If the docker image is from a private trusted registry with an image name 
> like "private.registry.local:5000/centos", and the white list allows this 
> private trusted repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4227) Ignore expired containers from removed nodes in FairScheduler

2018-01-08 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-4227:
-
Summary: Ignore expired containers from removed nodes in FairScheduler  
(was: FairScheduler: RM quits processing expired container from a removed node)

> Ignore expired containers from removed nodes in FairScheduler
> -
>
> Key: YARN-4227
> URL: https://issues.apache.org/jira/browse/YARN-4227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.3.0, 2.5.0, 2.7.1
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: YARN-4227.006.patch, YARN-4227.2.patch, 
> YARN-4227.3.patch, YARN-4227.4.patch, YARN-4227.5.patch, YARN-4227.patch
>
>
> Under some circumstances the node is removed before an expired container 
> event is processed causing the RM to exit:
> {code}
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: 
> Expired:container_1436927988321_1307950_01_12 Timed out after 600 secs
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1436927988321_1307950_01_12 Container Transitioned from 
> ACQUIRED to EXPIRED
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp: 
> Completed container: container_1436927988321_1307950_01_12 in state: 
> EXPIRED event:EXPIRE
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=system_op   
>OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS  
> APPID=application_1436927988321_1307950 
> CONTAINERID=container_1436927988321_1307950_01_12
> 2015-10-04 21:14:01,063 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type CONTAINER_EXPIRED to the scheduler
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:849)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1273)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:585)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {code}
> The stack trace is from 2.3.0 but the same issue has been observed in 2.5.0 
> and 2.6.0 by different customers.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7516) Security check for untrusted docker image

2018-01-08 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317330#comment-16317330
 ] 

Eric Yang commented on YARN-7516:
-

[~ebadger] Thanks for the review.  I will fix the first two issues.  For the 
prefix check, I will improve it by optionally adding '/' as a delimiter between 
the registry name and the start of the image name, to make sure that the 
registry name is an exact match.  How about 
{{docker.privileged-containers.registries}} for the last issue?
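
To illustrate the intended matching (a Java sketch of the logic only; the real
check lives in the C container-executor):
{code:java}
static boolean isTrustedImage(String imageName, List<String> trustedRegistries) {
  for (String registry : trustedRegistries) {
    // the registry must be the whole image name or the whole prefix up to '/',
    // so "a.b.c.d/centos" no longer matches a trusted "a.b.c"
    if (imageName.equals(registry) || imageName.startsWith(registry + "/")) {
      return true;
    }
  }
  return false;
}
{code}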

> Security check for untrusted docker image
> -
>
> Key: YARN-7516
> URL: https://issues.apache.org/jira/browse/YARN-7516
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7516.001.patch, YARN-7516.002.patch, 
> YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch
>
>
> Hadoop YARN Services can support using a private docker registry image or a 
> docker image from docker hub.  In the current implementation, Hadoop security 
> is enforced through username and group membership, and enforces uid:gid 
> consistency in the docker container and the distributed file system.  There is 
> a cloud use case for the ability to run untrusted docker images on the same 
> cluster for testing.  
> The basic requirement for an untrusted container is to ensure all kernel and 
> root privileges are dropped, and there is no interaction with the distributed 
> file system, to avoid contamination.  We can probably enforce detection of 
> untrusted docker images by checking the following:
> # If the docker image is from the public docker hub repository, the container 
> is automatically flagged as insecure, disk volume mounts are disabled 
> automatically, and all kernel capabilities are dropped.
> # If the docker image is from a private repository in docker hub, and there is 
> a white list to allow the private repository, disk volume mounts are allowed 
> and kernel capabilities follow the allowed list.
> # If the docker image is from a private trusted registry with an image name 
> like "private.registry.local:5000/centos", and the white list allows this 
> private trusted repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7713) Add parallel copying of directories into FSDownload

2018-01-08 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-7713:
-
Summary: Add parallel copying of directories into FSDownload  (was: Add 
parallel copying of directories into)

> Add parallel copying of directories into FSDownload
> ---
>
> Key: YARN-7713
> URL: https://issues.apache.org/jira/browse/YARN-7713
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>
> YARN currently copies directories sequentially when localizing. This could be 
> improved to copy in parallel, since the source blocks are normally on different 
> nodes.
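
A generic illustration of the idea (not the actual FSDownload change; names
and pool size are arbitrary):
{code:java}
ExecutorService pool = Executors.newFixedThreadPool(4);
List<Future<Boolean>> copies = new ArrayList<>();
for (FileStatus child : fs.listStatus(srcDir)) {
  // each child of the directory is copied by its own task
  Callable<Boolean> task = () ->
      FileUtil.copy(fs, child.getPath(), localFs, dstDir, false, conf);
  copies.add(pool.submit(task));
}
for (Future<Boolean> f : copies) {
  f.get();   // surface any copy failure
}
pool.shutdown();
{code}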



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7663) RMAppImpl:Invalid event: START at KILLED

2018-01-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317315#comment-16317315
 ] 

Jason Lowe commented on YARN-7663:
--

Thanks for updating the patch!

I thought the TODO comment was going to be moved to the onInvalidStateTransition 
method?  Once that's fixed, along with the over-indentation in the unit test 
that led to the last over-80-columns checkstyle nit, I think the patch is ready 
to go.  The unit test failures appear to be unrelated and they pass locally for 
me with the patch applied.
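
For clarity, the hook I have in mind is roughly the following (a sketch only;
the exact signature is up to the patch):
{code:java}
protected void onInvalidStateTransition(RMAppEventType rmAppEventType,
    RMAppState state) {
  // TODO: graceful handling of invalid events, e.g. ignore START at KILLED
  LOG.error("Can't handle this event at current state, eventType="
      + rmAppEventType + ", state=" + state);
}
{code}
A unit test could then override this method to fail whenever an invalid 
transition is hit, instead of only logging it.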


> RMAppImpl:Invalid event: START at KILLED
> 
>
> Key: YARN-7663
> URL: https://issues.apache.org/jira/browse/YARN-7663
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: lujie
>Assignee: lujie
>Priority: Minor
>  Labels: patch
> Attachments: YARN-7663_1.patch, YARN-7663_2.patch, YARN-7663_3.patch, 
> YARN-7663_4.patch, YARN-7663_5.patch, YARN-7663_6.patch
>
>
> Send kill to application, the RM log shows:
> {code:java}
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> START at KILLED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:805)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:901)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:885)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> if a sleep is inserted before the point where the START event is created, this 
> bug deterministically reproduces. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7605) Implement doAs for Api Service REST API

2018-01-08 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317313#comment-16317313
 ] 

Jian He commented on YARN-7605:
---

- Previous comment: Here it still assumes that if {{result \!= 0}}, it is an 
ApplicationNotFoundException. I think this is inappropriate; in fact the 
implementation will not return result != 0, so this condition may simply be 
removed.
{code}
if (result == 0) {
  ServiceStatus serviceStatus = new ServiceStatus();
  serviceStatus.setDiagnostics("Successfully stopped service " +
  appName);
  return formatResponse(Status.OK, serviceStatus);
} else {
  throw new ApplicationNotFoundException("Service " + appName +
  " is not found in YARN.");
}
{code}
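That is, something like the following sketch, assuming stopService keeps 
returning normally on success:
{code:java}
ServiceStatus serviceStatus = new ServiceStatus();
serviceStatus.setDiagnostics("Successfully stopped service " + appName);
return formatResponse(Status.OK, serviceStatus);
{code}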
- Previous comment: It is inappropriate to assume all exceptions are caused by 
"Permission denied"; this will confuse users. And why not just use 
LOG.warn("...", e) instead of a separate log statement for the exception?
{code}
try {
  proxy.flexComponents(requestBuilder.build());
} catch (YarnException | IOException e) {
  LOG.warn("Exception caught during flex operation.");
  LOG.warn(ExceptionUtils.getFullStackTrace(e));
  throw new YarnException("Permission denied to perform flex operation.");
}
{code}
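That is, something along these lines (a sketch, not the exact wording):
{code:java}
try {
  proxy.flexComponents(requestBuilder.build());
} catch (YarnException | IOException e) {
  // log the original exception with its stack trace in one call
  LOG.warn("Exception caught during flex operation.", e);
  // preserve the real cause instead of assuming a permission problem
  throw new YarnException("Failed to flex components: " + e.getMessage(), e);
}
{code}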
- In getStatus it is changed to load from HDFS every time; won't this hit HDFS 
too much? 
{code}
Service appSpec = ServiceApiUtil.loadService(fs, serviceName);
{code} 

- why is below code in getStatus removed ?
{code}
ApplicationTimeout lifetime =
appReport.getApplicationTimeouts().get(ApplicationTimeoutType.LIFETIME);
if (lifetime != null) {
  appSpec.setLifetime(lifetime.getRemainingTime());
}
{code}
- bq. In ServiceClient, several methods shouldn't throw InterruptedException 
because AppAdminClient definition doesn't throw InterruptedException. This is 
the reason that InterruptedException were converted to YarnException
I meant why this patch added the InterruptedException to this submitApp 
interface? It wasn't throwing InterruptedException before.
{code}
  private ApplicationId submitApp(Service app)
  throws IOException, YarnException, InterruptedException {
{code}
- bq. Formatting for createAMProxy, actionSave were to remove some check style 
issues that exist in ServiceClient.
Looks like the methods are not exceeding the 80 column limit and look truncated 
to separate lines unnecessarily. What were the checkstyle issues about?
- there's always this line printed when trying the CLI, any way to get rid of 
this ?
{code}18/01/08 14:31:35 INFO util.log: Logging initialized @1287ms
{code}
- For flexing it doesn't print flexed from/to numbers, I think it is useful to 
print 'component name' and from/to numbers 
{code}
18/01/08 14:33:44 INFO client.ApiServiceClient: Service jian2 is successfully 
flexed.
{code}

- both stop and destroy will print the same logging as below, I think we can 
differentiate them
“Successfully stopped service”
-  "yarn app -save" doesn't have logging to indicate whether it is successful 
or not

> Implement doAs for Api Service REST API
> ---
>
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: yarn-native-services
>
> Attachments: YARN-7605.001.patch, YARN-7605.004.patch, 
> YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, 
> YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, 
> YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch
>
>
> In YARN-7540, all client entry points for the API service were centralized to 
> use the REST API instead of having direct file system and resource manager RPC 
> calls.  This change helped to centralize yarn metadata so it is owned by the 
> yarn user instead of crawling through every user's home directory to find 
> metadata.  The next step is to make sure "doAs" calls work properly for the 
> API service.  The metadata is stored by the YARN user, but the actual workload 
> still needs to be performed as the end user, hence the API service must 
> authenticate the end user's kerberos credential and perform a doAs call when 
> requesting containers via ServiceClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7689) TestRMContainerAllocator fails after YARN-6124

2018-01-08 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317310#comment-16317310
 ] 

Miklos Szegedi commented on YARN-7689:
--

Thank you, [~wilfreds]. Could you address the outstanding checkstyle issue?

> TestRMContainerAllocator fails after YARN-6124
> --
>
> Key: YARN-7689
> URL: https://issues.apache.org/jira/browse/YARN-7689
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-7689.001.patch, YARN-7689.002.patch
>
>
> After the change that was made for YARN-6124 multiple tests in the 
> TestRMContainerAllocator from MapReduce fail with the following NPE:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.reinitialize(AbstractYarnScheduler.java:1437)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.reinitialize(FifoScheduler.java:320)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$ExcessReduceContainerAllocateScheduler.(TestRMContainerAllocator.java:1808)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager2.createScheduler(TestRMContainerAllocator.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:659)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1133)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:316)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.serviceInit(MockRM.java:1334)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:162)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:141)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:137)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager.(TestRMContainerAllocator.java:928)
> {code}
> In the test we just call reinitiaize on a scheduler and never call init.
> The stop of the service is guarded and so should the start and the re-init.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7718) DistributedShell failed to specify resource other than memory/vcores from container_resources

2018-01-08 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7718:
-
Attachment: YARN-7718.001.patch

Attached ver.1 patch; we should set container_resources regardless of whether 
profiles are enabled.
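
A sketch of the intended behaviour (assuming the values from
-container_resources have already been parsed into a Map<String, Long>):
{code:java}
for (Map.Entry<String, Long> entry : containerResources.entrySet()) {
  // apply every requested resource type, not only memory/vcores, and do it
  // whether or not resource profiles are enabled
  resource.setResourceValue(entry.getKey(), entry.getValue());
}
{code}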

> DistributedShell failed to specify resource other than memory/vcores from 
> container_resources
> -
>
> Key: YARN-7718
> URL: https://issues.apache.org/jira/browse/YARN-7718
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
> Attachments: YARN-7718.001.patch
>
>
> After YARN-7242, there is a bug reading resource values other than 
> memory/vcores.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7718) DistributedShell failed to specify resource other than memory/vcores from container_resources

2018-01-08 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-7718:


 Summary: DistributedShell failed to specify resource other than 
memory/vcores from container_resources
 Key: YARN-7718
 URL: https://issues.apache.org/jira/browse/YARN-7718
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wangda Tan
Priority: Critical


After YARN-7242, there is a bug reading resource values other than memory/vcores.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2018-01-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317251#comment-16317251
 ] 

Jason Lowe commented on YARN-6929:
--

Sorry for the delay.  Looks like the patch needs to be rebased.

How is this going to work with existing log aggregation setups?  In other 
words, as soon as this is deployed, will users be able to access logs that were 
aggregated under the old scheme?  Or during a rolling upgrade where some 
nodemanagers are aggregating to the old location and some are aggregating to 
the new location, will a log reader be able to find all of the logs?

I'm not thrilled with seeing the hdfs and hdfs-client dependency added for what 
is essentially a config key.  This is especially true since YARN should not 
require HDFS be the underlying filesystem, and therefore this key may not be 
set.  Even if HDFS is being used as the underlying filesystem, 
dfs.namenode.fs-limits.max-directory-items is a property intended to be set and 
read on the namenode server and not intended for client consumption.  There's 
no guarantee it has been set properly on the client side.  And if it isn't set 
properly on the client side, the log reader will not be able to find the logs 
properly.  As such I think it makes more sense to have a separate config 
property for this.  If users want to tie the two values together they can set 
the YARN property value to "$\{dfs.namenode.fs-limits.max-directory-items\}"
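
For example, something like this on the YARN side (the property name here is
made up for illustration, not an existing key):
{code:java}
Configuration conf = new YarnConfiguration();
// default mirrors the dfs.namenode.fs-limits.max-directory-items default, but
// the value is owned by YARN and can be tied to the HDFS key via
// "${dfs.namenode.fs-limits.max-directory-items}" if desired
int maxDirItems = conf.getInt("yarn.log-aggregation.max-directory-items", 1048576);
{code}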

These logs look like they were added for debug purposes and should either be 
removed or marked as debug logs:
{noformat}
  LOG.info("UserDir="+userDir.getPath());
[...]
LOG.info("USERDIR="+userDirPath);
[...]
  LOG.info("CLUSTERTIMETSMA="+clusterTimestampDir.getPath());
[...]
  LOG.info("BUCKET="+bucketDir.getPath());

 {noformat}

CLUSTERTIMETSMA is a typo if the log is preserved.

To make this more efficient and reduce the load on the remote filesystem, we 
should avoid a double liststat call.  We know how many entries are in the 
original liststat, and we know that deleting any child in the directory is 
going to update the modtime of the parent directory.  If we delete a directory 
then we know the modtime is updated, so it won't be worth immediately checking 
again to see if it should be deleted.  It will only be deleted if there were 
zero entries in the directory at the start of the loop, so we don't need to 
stat it again after the loop.

Speaking of deleting bucket/cluster directories, won't it be a problem if one 
thread is trying to delete a bucket just as another thread tries to aggregate 
an app there?  In other words, just because the directory looks empty now, how 
do we know it will remain empty?  The writing code path goes out of its way to 
check if directories are there and creates them if missing, but the deletion 
code could be removing a directory just after the write code checks to make 
sure it is there.

Nit: To get a string value the code should use String.valueOf or Long.toString, 
Integer.toString, etc. rather than adding an empty string to something.

Nit: This change seems unnecessary:
{noformat}
-  LOG.info("aggregated log deletion started.");
+  LOG.info("aggregated log deletion started here.");
{noformat}

Nit: It looks like there are a number of reformatting/whitespace-only changes 
which are unrelated to this patch and make it harder to read and backport, 
e.g.:
{noformat}
 logIOException("Error reading root log dir this deletion " +
-   "attempt is being aborted", e);
+ "attempt is being aborted", e);
   }
{noformat}


> yarn.nodemanager.remote-app-log-dir structure is not scalable
> -
>
> Key: YARN-6929
> URL: https://issues.apache.org/jira/browse/YARN-6929
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
> Attachments: YARN-6929.1.patch, YARN-6929.2.patch, YARN-6929.2.patch, 
> YARN-6929.3.patch, YARN-6929.patch
>
>
> The current directory structure for yarn.nodemanager.remote-app-log-dir is 
> not scalable. The maximum subdirectory limit is 1048576 by default (HDFS-6102). 
> With a yarn.log-aggregation.retain-seconds retention of 7 days, there is a 
> higher chance that LogAggregationService fails to create a new directory with 
> FSLimitException$MaxDirectoryItemsExceededException.
> The current structure is 
> //logs/. This can be 
> improved by adding the date as a subdirectory like 
> //logs// 
> {code}
> WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService:
>  Application failed to init aggregation 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apac

[jira] [Commented] (YARN-4227) FairScheduler: RM quits processing expired container from a removed node

2018-01-08 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317212#comment-16317212
 ] 

Ray Chiang commented on YARN-4227:
--

+1.  LGTM.

I'll commit this upstream soon.
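
For reference, the kind of guard this needs in FairScheduler.completedContainer 
looks roughly like the following (a sketch with assumed helper names, not the 
committed patch):
{code:java}
FSSchedulerNode node = getFSSchedulerNode(rmContainer.getContainer().getNodeId());
if (node == null) {
  // the node was removed before the expired-container event was processed
  LOG.info("Skipping completed container " + rmContainer.getContainerId()
      + ": node already removed");
  return;
}
{code}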

> FairScheduler: RM quits processing expired container from a removed node
> 
>
> Key: YARN-4227
> URL: https://issues.apache.org/jira/browse/YARN-4227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.3.0, 2.5.0, 2.7.1
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: YARN-4227.006.patch, YARN-4227.2.patch, 
> YARN-4227.3.patch, YARN-4227.4.patch, YARN-4227.5.patch, YARN-4227.patch
>
>
> Under some circumstances the node is removed before an expired container 
> event is processed causing the RM to exit:
> {code}
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: 
> Expired:container_1436927988321_1307950_01_12 Timed out after 600 secs
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1436927988321_1307950_01_12 Container Transitioned from 
> ACQUIRED to EXPIRED
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp: 
> Completed container: container_1436927988321_1307950_01_12 in state: 
> EXPIRED event:EXPIRE
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=system_op   
>OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS  
> APPID=application_1436927988321_1307950 
> CONTAINERID=container_1436927988321_1307950_01_12
> 2015-10-04 21:14:01,063 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type CONTAINER_EXPIRED to the scheduler
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:849)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1273)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:585)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {code}
> The stack trace is from 2.3.0 but the same issue has been observed in 2.5.0 
> and 2.6.0 by different customers.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7516) Security check for untrusted docker image

2018-01-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317178#comment-16317178
 ] 

Eric Badger commented on YARN-7516:
---

Hey [~eyang], thanks for the updated patch!

{noformat}
+  char **trusted_repo = 
get_configuration_values_delimiter("docker.trusted.registries", 
CONTAINER_EXECUTOR_CFG_DOCKER_SECTION, conf, ",");
{noformat}
Minor nit: the variable and config value aren't consistent (repo vs. registry)

{noformat}
+  for (i = 0; trusted_repo[i] != NULL; i++) {
+if (strncmp(image_name, trusted_repo[i], strlen(trusted_repo[i]))==0) {
+  fprintf(ERRORFILE, "image: %s is trusted in %s.\n", image_name, 
trusted_repo[i]);
+  found=true;
+}
+  }
{noformat}
Once we find the image, we can break out of the loop. No need to continue 
through the rest of the array.

{noformat}
+if (strncmp(image_name, trusted_repo[i], strlen(trusted_repo[i]))==0) {
{noformat}
I'm far from a web URL expert, so this may be obviously impossible, but I'm 
wondering if the following case could be exploited: the trusted registry is 
a.b.c and some malicious domain gets a.b.c.d. Since we are doing a strncmp only 
on the length of the trusted repo, we would allow a.b.c.d. 

{noformat}
+| `docker.trusted.registries` | Comma separated list of trusted docker 
registries for running trusted privileged docker containers.  By default, no 
registries are defined. |
{noformat}
I'm having a hard time coming to this definition of "trusted". In my mind, I 
could trust a registry without wanting to allow privileged containers. For me, 
trusting a registry means that I will allow containers from it to run at all. 
Then the next step would be allowing them to run with privileges. Possibly this 
doesn't all need to happen in this JIRA, but I'm hesitant to use the word 
"trusted", since I think that there are different levels of trust going on 
here. 

> Security check for untrusted docker image
> -
>
> Key: YARN-7516
> URL: https://issues.apache.org/jira/browse/YARN-7516
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7516.001.patch, YARN-7516.002.patch, 
> YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch
>
>
> Hadoop YARN Services can support using a private docker registry image or a 
> docker image from docker hub.  In the current implementation, Hadoop security 
> is enforced through username and group membership, and enforces uid:gid 
> consistency in the docker container and the distributed file system.  There is 
> a cloud use case for the ability to run untrusted docker images on the same 
> cluster for testing.  
> The basic requirement for an untrusted container is to ensure all kernel and 
> root privileges are dropped, and there is no interaction with the distributed 
> file system, to avoid contamination.  We can probably enforce detection of 
> untrusted docker images by checking the following:
> # If the docker image is from the public docker hub repository, the container 
> is automatically flagged as insecure, disk volume mounts are disabled 
> automatically, and all kernel capabilities are dropped.
> # If the docker image is from a private repository in docker hub, and there is 
> a white list to allow the private repository, disk volume mounts are allowed 
> and kernel capabilities follow the allowed list.
> # If the docker image is from a private trusted registry with an image name 
> like "private.registry.local:5000/centos", and the white list allows this 
> private trusted repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7712) Add ability to ignore timestamps in localized files

2018-01-08 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317154#comment-16317154
 ] 

Chris Douglas commented on YARN-7712:
-

As [~ste...@apache.org] 
[suggested|https://issues.apache.org/jira/browse/HDFS-7878?focusedCommentId=15512866&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15512866],
 we could also use the {{PathHandle}} API for YARN dependencies.
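
As a rough sketch of that idea (assuming the FileSystem#getPathHandle and 
open(PathHandle) calls introduced by HDFS-7878, which only some file systems 
such as HDFS implement; the path is a placeholder and this is not code from any 
patch here), a dependency could be pinned by handle rather than by timestamp:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathHandle;

public class PathHandleSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Resolve the dependency once and capture an opaque handle for this exact
    // file; whether a later open fails after the file is replaced depends on
    // the HandleOpt flags chosen (none are passed in this sketch).
    FileStatus stat = fs.getFileStatus(new Path("/apps/libs/dependency.jar"));
    PathHandle handle = fs.getPathHandle(stat);

    try (FSDataInputStream in = fs.open(handle)) {
      System.out.println("first byte: " + in.read());
    }
  }
}
{code}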

> Add ability to ignore timestamps in localized files
> ---
>
> Key: YARN-7712
> URL: https://issues.apache.org/jira/browse/YARN-7712
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>
> YARN currently requires and checks the timestamp of localized files and 
> fails if the file on HDFS does not match the one requested. This jira 
> adds the ability to ignore the timestamp based on the request of the client.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7704) Document improvement for registry dns

2018-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317134#comment-16317134
 ] 

Hudson commented on YARN-7704:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13460 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13460/])
YARN-7704. Document improvement for registry dns. Contributed by Jian He 
(billie: rev dc54747d70fc4dc77767051d0f8f89ccda7ba3c0)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/YarnServiceAPI.md
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Examples.md
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/ServiceDiscovery.md
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api/src/main/resources/definition/YARN-Simplified-V1-API-Layer-For-Services.yaml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/QuickStart.md


> Document improvement for registry dns
> -
>
> Key: YARN-7704
> URL: https://issues.apache.org/jira/browse/YARN-7704
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Fix For: yarn-native-services
>
> Attachments: YARN-7704.01.patch, YARN-7704.02.patch
>
>
> Add document for how to point the cluster to use the registry dns



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7717) Add configuration consistency for module.enabled and docker.privileged-containers.enabled

2018-01-08 Thread Yesha Vora (JIRA)
Yesha Vora created YARN-7717:


 Summary: Add configuration consistency for module.enabled and 
docker.privileged-containers.enabled
 Key: YARN-7717
 URL: https://issues.apache.org/jira/browse/YARN-7717
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Yesha Vora


container-executor.cfg has two properties related to dockerization:
1) module.enabled = true/false
2) docker.privileged-containers.enabled = 1/0

Here, the two properties take different kinds of values to enable or disable 
the feature: module.enabled takes the strings true/false, while 
docker.privileged-containers.enabled takes the integer values 1/0.

The behavior of these properties should be consistent. Both should take the 
strings true or false to enable or disable the feature.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7704) Document improvement for registry dns

2018-01-08 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317091#comment-16317091
 ] 

Billie Rinaldi commented on YARN-7704:
--

+1 for patch 02. This doesn't add tests, but it's a documentation-only patch.

> Document improvement for registry dns
> -
>
> Key: YARN-7704
> URL: https://issues.apache.org/jira/browse/YARN-7704
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Fix For: yarn-native-services
>
> Attachments: YARN-7704.01.patch, YARN-7704.02.patch
>
>
> Add document for how to point the cluster to use the registry dns



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7704) Document improvement for registry dns

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317051#comment-16317051
 ] 

genericqa commented on YARN-7704:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
36m 56s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 54s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
27s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
18s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7704 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905142/YARN-7704.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  |
| uname | Linux 0a35c80d30cd 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 01f3f21 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19140/testReport/ |
| Max. process+thread count | 302 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19140/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Document improvement for registry dns
> 

[jira] [Commented] (YARN-7716) metricsTimeStart and metricsTimeEnd should be all lower case in the doc

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317033#comment-16317033
 ] 

genericqa commented on YARN-7716:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
23m 53s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 37s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7716 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905145/YARN-7716.00.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux c02b9368c87b 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 01f3f21 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 408 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19141/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> metricsTimeStart and metricsTimeEnd should be all lower case in the doc
> ---
>
> Key: YARN-7716
> URL: https://issues.apache.org/jira/browse/YARN-7716
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-7716.00.patch
>
>
> The TimelineV2 REST API is case sensitive. The doc refers to 
> `metricstimestart` and `metricstimeend` as metricsTimeStart and 
> metricsTimeEnd. When users follow the API doc, it appears that the two 
> parameters do not work properly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7015) Handle Container ExecType update (Promotion/Demotion) in cgroups resource handlers

2018-01-08 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-7015.
--
Resolution: Duplicate

> Handle Container ExecType update (Promotion/Demotion) in cgroups resource 
> handlers
> --
>
> Key: YARN-7015
> URL: https://issues.apache.org/jira/browse/YARN-7015
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Miklos Szegedi
>
> YARN-5085 adds support for change of container execution type 
> (Promotion/Demotion).
> Modifications to the ContainerManagementProtocol, ContainerManager and 
> ContainerScheduler to handle this change are now in trunk. Opening this JIRA 
> to track changes (if any) required in the cgroups resourcehandlers to 
> accommodate this in the context of YARN-1011.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7714) YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer

2018-01-08 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316957#comment-16316957
 ] 

Haibo Chen commented on YARN-7714:
--

Ah, I see. I was confusing timelineserver with timelinereader.

Not sure whether this will be a common source of confusion. If not, we could 
close this as Not A Problem.

> YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer
> ---
>
> Key: YARN-7714
> URL: https://issues.apache.org/jira/browse/YARN-7714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> From hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh,
> {code}
> # Specify the max heapsize for the timelineserver.  If no units are
> # given, it will be assumed to be in MB.
> # This value will be overridden by an Xmx setting specified in either
> # HADOOP_OPTS and/or YARN_TIMELINESERVER_OPTS.
> # Default is the same as HADOOP_HEAPSIZE_MAX.
> #export YARN_TIMELINE_HEAPSIZE=
> # Specify the JVM options to be used when starting the TimeLineServer.
> # These options will be appended to the options specified as HADOOP_OPTS
> # and therefore may override any similar flags set in HADOOP_OPTS
> #
> # See ResourceManager for some examples
> #
> #export YARN_TIMELINESERVER_OPTS=
> {code}
> However, YARN_TIMELINESERVER_OPTS does not work. The correct one to set is 
> YARN_TIMELINEREADER_OPTS instead.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7716) metricsTimeStart and metricsTimeEnd should be all lower case in the doc

2018-01-08 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7716:
-
Attachment: YARN-7716.00.patch

> metricsTimeStart and metricsTimeEnd should be all lower case in the doc
> ---
>
> Key: YARN-7716
> URL: https://issues.apache.org/jira/browse/YARN-7716
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-7716.00.patch
>
>
> The TimelineV2 REST API is case sensitive. The doc refers to 
> `metricstimestart` and `metricstimeend` as metricsTimeStart and 
> metricsTimeEnd. When users follow the API doc, it appears that the two 
> parameters do not work properly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7714) YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer

2018-01-08 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316880#comment-16316880
 ] 

Vrushali C commented on YARN-7714:
--

Hmm, there isn't a timeline server in ATSv2, but it exists in 1 and 1.5; 
perhaps this option is used for those versions? Also, in 1/1.5 there isn't a 
separate reader, I think.

So, I think it's 
ATSv1 and ATSv1.5 => timeline server
ATSv2 => timeline reader

What do you think? 
cc [~varun_saxena] [~rohithsharma]

> YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer
> ---
>
> Key: YARN-7714
> URL: https://issues.apache.org/jira/browse/YARN-7714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> From hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh,
> {code}
> # Specify the max heapsize for the timelineserver.  If no units are
> # given, it will be assumed to be in MB.
> # This value will be overridden by an Xmx setting specified in either
> # HADOOP_OPTS and/or YARN_TIMELINESERVER_OPTS.
> # Default is the same as HADOOP_HEAPSIZE_MAX.
> #export YARN_TIMELINE_HEAPSIZE=
> # Specify the JVM options to be used when starting the TimeLineServer.
> # These options will be appended to the options specified as HADOOP_OPTS
> # and therefore may override any similar flags set in HADOOP_OPTS
> #
> # See ResourceManager for some examples
> #
> #export YARN_TIMELINESERVER_OPTS=
> {code}
> However, YARN_TIMELINESERVER_OPTS does not work. The correct one to set is 
> YARN_TIMELINEREADER_OPTS instead.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7716) metricsTimeStart and metricsTimeEnd should be all lower case in the doc

2018-01-08 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-7716:


 Summary: metricsTimeStart and metricsTimeEnd should be all lower 
case in the doc
 Key: YARN-7716
 URL: https://issues.apache.org/jira/browse/YARN-7716
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelinereader
Affects Versions: 3.0.0
Reporter: Haibo Chen
Assignee: Haibo Chen


The TimelineV2 REST API is case sensitive. The doc refers to `metricstimestart` 
and `metricstimeend` as metricsTimeStart and metricsTimeEnd. When users follow 
the API doc, it appears that the two parameters do not work properly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7715) Update CPU and Memory cgroups params on container update as well.

2018-01-08 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316871#comment-16316871
 ] 

Arun Suresh commented on YARN-7715:
---

I was thinking we could introduce callbacks into the ResourceHandler modules 
for container update as well.
Thoughts [~miklos.szeg...@cloudera.com], [~haibo.chen], [~kkaranasos] ?
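
Purely as a hypothetical sketch of that idea (the names and signatures below 
are illustrative only and are not the existing ResourceHandler API), the extra 
hook could look something like this:

{code}
// Hypothetical only: an update hook next to the launch-time hook, so cgroups
// parameters can be rewritten when a container's resources or execution type
// change after start (e.g. OPPORTUNISTIC promoted to GUARANTEED).
public interface UpdatableResourceHandlerSketch {

  /** Configure cgroups limits when the container is first launched. */
  void onContainerStart(String containerId, long memoryBytes, int vcores,
      boolean opportunistic);

  /**
   * Re-apply cgroups limits after a YARN-5085 style container update, using
   * the new resource values and execution type instead of the launch-time ones.
   */
  void onContainerUpdate(String containerId, long newMemoryBytes, int newVcores,
      boolean opportunistic);
}
{code}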

> Update CPU and Memory cgroups params on container update as well.
> -
>
> Key: YARN-7715
> URL: https://issues.apache.org/jira/browse/YARN-7715
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>
> In YARN-6673 and YARN-6674, the cgroups resource handlers update the cgroups 
> params for the containers, based on opportunistic or guaranteed, in the 
> *preStart* method.
> Now that YARN-5085 is in, Container executionType (as well as the cpu, memory 
> and any other resources) can be updated after the container has started. This 
> means we need the ability to change cgroups params after container start.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7715) Update CPU and Memory cgroups params on container update as well.

2018-01-08 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-7715:
-

 Summary: Update CPU and Memory cgroups params on container update 
as well.
 Key: YARN-7715
 URL: https://issues.apache.org/jira/browse/YARN-7715
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Arun Suresh


In YARN-6673 and YARN-6674, the cgroups resource handlers update the cgroups 
params for the containers, based on opportunistic or guaranteed, in the 
*preStart* method.

Now that YARN-5085 is in, Container executionType (as well as the cpu, memory 
and any other resources) can be updated after the container has started. This 
means we need the ability to change cgroups params after container start.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7704) Document improvement for registry dns

2018-01-08 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-7704:
--
Attachment: YARN-7704.02.patch

also updated the outdated "/ws/v1/services" to "/app/v1/services"

> Document improvement for registry dns
> -
>
> Key: YARN-7704
> URL: https://issues.apache.org/jira/browse/YARN-7704
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Fix For: yarn-native-services
>
> Attachments: YARN-7704.01.patch, YARN-7704.02.patch
>
>
> Add document for how to point the cluster to use the registry dns



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7714) YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer

2018-01-08 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7714:
-
Description: 
From hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh,
{code}
# Specify the max heapsize for the timelineserver.  If no units are
# given, it will be assumed to be in MB.
# This value will be overridden by an Xmx setting specified in either
# HADOOP_OPTS and/or YARN_TIMELINESERVER_OPTS.
# Default is the same as HADOOP_HEAPSIZE_MAX.
#export YARN_TIMELINE_HEAPSIZE=

# Specify the JVM options to be used when starting the TimeLineServer.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# See ResourceManager for some examples
#
#export YARN_TIMELINESERVER_OPTS=
{code}

However, YARN_TIMELINESERVER_OPTS does not work. The correct one to set is 
YARN_TIMELINEREADER_OPTS instead.

> YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer
> ---
>
> Key: YARN-7714
> URL: https://issues.apache.org/jira/browse/YARN-7714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> From hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh,
> {code}
> # Specify the max heapsize for the timelineserver.  If no units are
> # given, it will be assumed to be in MB.
> # This value will be overridden by an Xmx setting specified in either
> # HADOOP_OPTS and/or YARN_TIMELINESERVER_OPTS.
> # Default is the same as HADOOP_HEAPSIZE_MAX.
> #export YARN_TIMELINE_HEAPSIZE=
> # Specify the JVM options to be used when starting the TimeLineServer.
> # These options will be appended to the options specified as HADOOP_OPTS
> # and therefore may override any similar flags set in HADOOP_OPTS
> #
> # See ResourceManager for some examples
> #
> #export YARN_TIMELINESERVER_OPTS=
> {code}
> However, YARN_TIMELINESERVER_OPTS does not work. The correct one to set is 
> YARN_TIMELINEREADER_OPTS instead.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7714) YARN_TIMELINESERVER_OPTS is not valid env variable for TimelineReaderServer

2018-01-08 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-7714:


 Summary: YARN_TIMELINESERVER_OPTS is not valid env variable for 
TimelineReaderServer
 Key: YARN-7714
 URL: https://issues.apache.org/jira/browse/YARN-7714
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelinereader
Affects Versions: 3.0.0
Reporter: Haibo Chen
Assignee: Haibo Chen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7713) Add parallel copying of directories into

2018-01-08 Thread Miklos Szegedi (JIRA)
Miklos Szegedi created YARN-7713:


 Summary: Add parallel copying of directories into
 Key: YARN-7713
 URL: https://issues.apache.org/jira/browse/YARN-7713
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Miklos Szegedi
Assignee: Miklos Szegedi


YARN currently copies directories sequentially when localizing. This could be 
improved to copy in parallel, since the source blocks are normally on different 
nodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7712) Add ability to ignore timestamps in localized files

2018-01-08 Thread Miklos Szegedi (JIRA)
Miklos Szegedi created YARN-7712:


 Summary: Add ability to ignore timestamps in localized files
 Key: YARN-7712
 URL: https://issues.apache.org/jira/browse/YARN-7712
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Miklos Szegedi
Assignee: Miklos Szegedi


YARN currently requires and checks the timestamp of localized files and fails 
if the file on HDFS does not match the one requested. This jira adds the 
ability to ignore the timestamp based on the request of the client.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2185) Use pipes when localizing archives

2018-01-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316777#comment-16316777
 ] 

Jason Lowe commented on YARN-2185:
--

Thanks for the patch!

This patch adds parallel copying of directories and ability to ignore 
timestamps which IMHO is outside the scope of this patch.  I think those should 
be handled in separate JIRAs with appropriate unit tests.

Do we really want a while loop to pull off extensions?  That's not how it 
behaved before, and it will now do different, potentially unexpected, things 
for a ".jar.gz", ".tgz.tgz", ".gz.jar", etc.  Granted these are unlikely 
filenames to encounter, but if somehow a user had those before and were 
working, this will change that behavior.

Does it make sense to use an external process to unpack .jar and .zip files?  
The original code did this via the JDK directly rather than requiring a 
subprocess, and given there's a ZipInputStream and JarInputStream, it seems we 
could do the same here with the input stream rather than requiring the overhead 
of a subprocess and copying of data between streams.

This now logs an INFO message for all stderr and stdout for the subprocess 
(i.e.: at least every entry in the archive), which is not something it did 
before.  This should be debug, if logged at all, and it requires buffering all 
of the output.  Some of our users have some crazy use cases with very 
large/deep tarballs, and grabbing all the output from tar would not be a 
trivial amount of memory for the NM or a localizer heap, especially if there 
are multiple of these at the same time.  If the log message is kept, the 
verbose flag should be passed to the command only if the output will be logged.

I think it would be better to explicitly create the directory from the Java 
code rather than squirrel it into the tar command.  We can provide a better 
error message with more context if done directly, and as it is now the mkdir 
can fail but blindly proceed to the tar command.

Wouldn't "cd _destpath_ && tar -xf" be better than "cd _destpath_; tar -xf"?  
Otherwise if somehow the change directory fails it will end up unpacking to an 
unexpected place.

Is the "rm -rf" necessary?

Has this been tested on Windows?  The zipOrJar path appears to be eligible for 
Windows but uses shell syntax that {{cmd}} is not going to understand, like 
semicolons between commands.

Does the single-quote escaping work properly?  Escaping double quotes with a 
backslash works within a double-quoted string, but the same is not true for 
single-quoted strings.  The shell does not interpret any characters in a 
single-quoted string, even backslash characters.  Therefore the only way I know 
of to put a single-quote character in a single-quoted string is to terminate 
the current string, insert an escaped single quote (since we're not in a 
single-quote parsing context at this point), then start the single-quote string 
up again.  In other words, each {{'}} character trying to be escaped within a 
single-quote context becomes the four character sequence: {{'\''}}  For 
example, single-quoting {{abc'def'ghi}} becomes {{'abc'\''def'\''ghi'}}.
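
A minimal sketch of that escaping rule (an illustrative helper only, not the 
code in the patch):

{code}
public final class ShellQuoteSketch {
  private ShellQuoteSketch() {}

  /**
   * Wrap a string for use in a POSIX shell command using single quotes.
   * Because backslash has no special meaning inside single quotes, each
   * embedded single quote is emitted as the four-character sequence '\''
   * (close the string, add an escaped quote, reopen the string).
   */
  public static String singleQuote(String s) {
    return "'" + s.replace("'", "'\\''") + "'";
  }

  public static void main(String[] args) {
    // Prints: 'abc'\''def'\''ghi'
    System.out.println(singleQuote("abc'def'ghi"));
  }
}
{code}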

I think we should shutdown the executor service if we created it, otherwise we 
risk running out of threads before the garbage collector gets around to 
noticing it can free everything up.  Either that or require the caller to 
provide the executor.  Note that if the caller provides a single thread 
executor (or some other executor that can only execute serially) then this can 
deadlock if the process emits a lot of output on stderr.  The thread trying to 
consume stdout data is blocked until the child process completes, but the child 
process is blocked trying to emit stderr output that the parent process is not 
consuming.  I wonder if it would be better to add a method to Shell that takes 
an input stream so we could reuse all the subprocess handling code there.

This debug log was upgraded to an info, yet it still has the debug check in 
front of it.  I'm not sure this should be upgraded to an info log as part of 
this change.
{code}
if (LOG.isDebugEnabled()) {
  LOG.info(String.format("Starting to download %s %s %s",
  sCopy,
  resource.getType(),
  resource.getPattern()));
}
{code}

I don't see how the following code treats PATTERN files that don't end in 
'.jar' like ARCHIVE files since the resource type check will still find PATTERN 
instead of ARCHIVE when it falls through from the first {{if}} to the second:
{code}
  if (resource.getType() == LocalResourceType.PATTERN) {
if (destinationFile.endsWith(".jar")) {
  // Unpack and keep a copy of the whole jar for mapreduce
  String p = resource.getPattern();
  RunJar.unJarAndSave(inputStream, new File(destination.toUri()),
  source.getName(),
  p == null ? RunJar.MATCH_ANY : Pa

[jira] [Commented] (YARN-7624) RM gives YARNFeatureNotEnabledException even when resource profile feature is not enabled

2018-01-08 Thread Manikandan R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316635#comment-16316635
 ] 

Manikandan R commented on YARN-7624:


[~sunilg] On debugging this, the call to {{yarnClient.getResourceProfiles()}} 
throws this exception through 
{{ResourceProfilesManagerImpl#checkAndThrowExceptionWhenFeatureDisabled}}, and 
the logging happens on the server side when this feature is turned off. Can we 
assume {{yarnClient.getResourceProfiles()}} should be called only when this feature is 
turned ON and make the changes at client side?
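
A hedged sketch of what that client-side guard could look like (assuming 
{{YarnClient#getResourceProfiles()}} returns a {{Map<String, Resource>}} and 
that the client configuration mirrors the RM setting; this is not code from any 
existing patch):

{code}
import java.util.Collections;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;

public final class ResourceProfilesGuardSketch {
  private ResourceProfilesGuardSketch() {}

  /**
   * Only ask the RM for resource profiles when the feature flag is on, so a
   * cluster with the feature disabled never triggers
   * YARNFeatureNotEnabledException on this call path.
   */
  public static Map<String, Resource> profilesIfEnabled(YarnClient client,
      Configuration conf) throws Exception {
    boolean enabled = conf.getBoolean(
        "yarn.resourcemanager.resource-profiles.enabled", false);
    return enabled ? client.getResourceProfiles()
        : Collections.<String, Resource>emptyMap();
  }
}
{code}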

> RM gives YARNFeatureNotEnabledException even when resource profile feature is 
> not enabled
> -
>
> Key: YARN-7624
> URL: https://issues.apache.org/jira/browse/YARN-7624
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 3.1.0
>Reporter: Weiwei Yang
>Assignee: Sunil G
>
> A single node setup, I haven't enabled resource profile feature. Property 
> {{yarn.resourcemanager.resource-profiles.enabled}} was not set. Start yarn, 
> launch a job, I got following error message in RM log
> {noformat}
> org.apache.hadoop.yarn.exceptions.YARNFeatureNotEnabledException: Resource 
> profile is not enabled, please enable resource profile feature before using 
> its functions. (by setting yarn.resourcemanager.resource-profiles.enabled to 
> true)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.checkAndThrowExceptionWhenFeatureDisabled(ResourceProfilesManagerImpl.java:191)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceProfilesManagerImpl.getResourceProfiles(ResourceProfilesManagerImpl.java:214)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1822)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {noformat}
> this is confusing because I did not enable this feature, why I still get this 
> error?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316623#comment-16316623
 ] 

Eric Badger commented on YARN-7677:
---

bq. This change will make yarnfile content more consistent that environment 
variable and mounting directories both needs to present in yarnfile to show 
HADOOP_CONF_DIR is exposed to docker container.
Yes, that is correct. I'll go ahead and put up a patch in a little bit once I 
get a free moment.

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2185) Use pipes when localizing archives

2018-01-08 Thread Gergo Repas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316339#comment-16316339
 ] 

Gergo Repas commented on YARN-2185:
---

Thanks [~miklos.szeg...@cloudera.com].
+1 (non-binding)

> Use pipes when localizing archives
> --
>
> Key: YARN-2185
> URL: https://issues.apache.org/jira/browse/YARN-2185
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Jason Lowe
>Assignee: Miklos Szegedi
> Attachments: YARN-2185.000.patch, YARN-2185.001.patch, 
> YARN-2185.002.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it, 
> and then removes it.  It would be more efficient to stream the data as it's 
> being unpacked to avoid both the extra disk space requirements and the 
> additional disk activity from storing the archive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7710) http://ip:8088/cluster show different ID with same name

2018-01-08 Thread jimmy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jimmy updated YARN-7710:

Priority: Blocker  (was: Minor)

> http://ip:8088/cluster show different ID with same name  
> -
>
> Key: YARN-7710
> URL: https://issues.apache.org/jira/browse/YARN-7710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications
>Affects Versions: 2.7.3
> Environment: hadoop2.7.3 
> jdk 1.8
>Reporter: jimmy
>Priority: Blocker
>
> 1. Create five threads.
> 2. Submit five streaming jobs with different names.
> 3. Visit http://ip:8088; we can sometimes see the same name for different IDs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7557) It should be possible to specify resource types in the fair scheduler increment value

2018-01-08 Thread Gergo Repas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315948#comment-16315948
 ] 

Gergo Repas commented on YARN-7557:
---

Thanks [~rkanter].

> It should be possible to specify resource types in the fair scheduler 
> increment value
> -
>
> Key: YARN-7557
> URL: https://issues.apache.org/jira/browse/YARN-7557
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 3.0.0-beta1
>Reporter: Daniel Templeton
>Assignee: Gergo Repas
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: YARN-7557.000.patch, YARN-7557.001.patch, 
> YARN-7557.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4227) FairScheduler: RM quits processing expired container from a removed node

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315915#comment-16315915
 ] 

genericqa commented on YARN-4227:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  6m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 20s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 31s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 60m  
6s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}108m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-4227 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905009/YARN-4227.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f7617b2b3e93 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 01f3f21 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19138/testReport/ |
| Max. process+thread count | 836 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19138/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> FairScheduler: RM quits processing ex

[jira] [Commented] (YARN-6486) FairScheduler: Deprecate continuous scheduling

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315929#comment-16315929
 ] 

genericqa commented on YARN-6486:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 20 unchanged - 1 fixed = 20 total (was 21) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 254 unchanged - 8 fixed = 255 total (was 262) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 22s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 61m 
13s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6486 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905007/YARN-6486.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 63740b3dca0e 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 01f3f21 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19137/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19137/testReport/ |
| Max. process+thre

[jira] [Commented] (YARN-7699) queueUsagePercentage is coming as INF for getApp REST api call

2018-01-08 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315911#comment-16315911
 ] 

genericqa commented on YARN-7699:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m  
3s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-3.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
14s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} branch-3.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 15s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 47s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:20ca677 |
| JIRA Issue | YARN-7699 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905023/YARN-7699-branch-3.0.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 312eb4ac55a2 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.0 / 53a7bd1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/19136/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19136/testReport/ |
| 

[jira] [Commented] (YARN-7711) YARN UI2 should redirect into Active RM in HA

2018-01-08 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315860#comment-16315860
 ] 

Sunil G commented on YARN-7711:
---

Thanks [~rohithsharma] [~naganarasimha...@apache.org]

UI2 is based on a js framework (Ember). We download static assets (such as 
images, js files, icons) from the server that is hosting them (currently both 
RMs serve these). Once we have the static assets, the page is rendered and then 
server calls are made to fetch data (this is done only from the ACTIVE RM). 
Hence after a failover, the basic static content may be served from the standby 
RM, but data is fetched only from the Active RM. In contrast to the old UI, 
which is a server-side implementation, the new UI is end-to-end client side, so 
it helps avoid these redirects, etc.

> YARN UI2 should redirect into Active RM in HA
> -
>
> Key: YARN-7711
> URL: https://issues.apache.org/jira/browse/YARN-7711
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>
> When UI2 is enabled in HA mode, if REST query goes into stand by RM, then it 
> is not redirecting into active RM. 
> It should redirect into Active RM as old UI redirect into active!
> cc :/ [~sunilg]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7711) YARN UI2 should redirect into Active RM in HA

2018-01-08 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315849#comment-16315849
 ] 

Naganarasimha G R commented on YARN-7711:
-

Hi [~rohithsharma] & [~sunilg], agreed that it was getting redirected to the 
Active RM in the old web UI, but I was wondering whether it would be good to 
have the same behavior in the new web UI too: whenever there is an RM failover 
we will not be able to access the logs, JMX metrics, etc. from the standby's 
web UI, and not everyone may have access to the hadoop cluster to see the 
standby's admin logs. Thoughts?


> YARN UI2 should redirect into Active RM in HA
> -
>
> Key: YARN-7711
> URL: https://issues.apache.org/jira/browse/YARN-7711
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>
> When UI2 is enabled in HA mode, if REST query goes into stand by RM, then it 
> is not redirecting into active RM. 
> It should redirect into Active RM as old UI redirect into active!
> cc :/ [~sunilg]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org