[jira] [Commented] (YARN-8380) Support shared mounts in docker runtime

2018-05-31 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496185#comment-16496185
 ] 

Rohith Sharma K S commented on YARN-8380:
-

Just FYI: whenever a docker volume is mounted with shared, I see an error like 
*_docker: Error response from daemon: linux mounts: Could not find source mount 
of /var/lib/kubelet_*. Do you know of any reason for this?
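
For what it's worth, one commonly reported cause of that error is that the host 
path is not itself a mount point with shared propagation. A minimal workaround 
sketch, assuming the host path is /var/lib/kubelet and using a placeholder image 
(not verified against this environment):

{code}
# Assumption: the ":shared" volume flag expects the source path to be a mount
# point whose propagation is shared. A self-bind mount marked shared usually
# clears the "Could not find source mount" error.
mount --bind /var/lib/kubelet /var/lib/kubelet
mount --make-shared /var/lib/kubelet

# The shared volume mount should then be accepted (image/command are placeholders):
docker run --rm -v /var/lib/kubelet:/var/lib/kubelet:shared centos:7 true
{code}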

> Support shared mounts in docker runtime
> ---
>
> Key: YARN-8380
> URL: https://issues.apache.org/jira/browse/YARN-8380
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Major
>
> The docker run command supports the shared mount type, but the docker runtime 
> currently supports only the ro and rw mount types.
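
For illustration, the volume-option syntax in question looks roughly like the 
following (paths and image are placeholders); ro and rw are what the docker 
runtime currently validates, while shared additionally sets mount propagation:

{code}
docker run --rm -v /host/data:/data:ro     busybox true   # read-only
docker run --rm -v /host/data:/data:rw     busybox true   # read-write (default)
docker run --rm -v /host/data:/data:shared busybox true   # shared mount propagation
{code}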






[jira] [Commented] (YARN-8368) yarn app start cli should print applicationId

2018-05-31 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496188#comment-16496188
 ] 

Rohith Sharma K S commented on YARN-8368:
-

Thanks [~billie.rinaldi] for reviewing and committing the patch.

> yarn app start cli should print applicationId
> -
>
> Key: YARN-8368
> URL: https://issues.apache.org/jira/browse/YARN-8368
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Rohith Sharma K S
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8368.01.patch, YARN-8368.02.patch
>
>
> The yarn app start CLI should print the application ID, similar to the yarn app launch command.
> {code:java}
> bash-4.2$ yarn app -start hbase-app-test
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> 18/05/24 15:15:53 INFO client.RMProxy: Connecting to ResourceManager at 
> xxx/xxx:8050
> 18/05/24 15:15:54 INFO client.RMProxy: Connecting to ResourceManager at 
> xxx/xxx:8050
> 18/05/24 15:15:55 INFO client.ApiServiceClient: Service hbase-app-test is 
> successfully started.{code}
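
Until the start CLI prints the ID, a possible workaround sketch (assuming the 
YARN service name is registered as the application name, as it is for YARN 
services) is to look it up with the generic application CLI:

{code}
# List running applications and filter by the service name to find its ID:
yarn application -list -appStates RUNNING | grep hbase-app-test

# The service status output should also carry the application id in its JSON:
yarn app -status hbase-app-test
{code}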






[jira] [Commented] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496215#comment-16496215
 ] 

genericqa commented on YARN-8308:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core:
 The patch generated 2 new + 23 unchanged - 0 fixed = 25 total (was 23) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
55s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8308 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925862/YARN-8308.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 01f990458225 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 02c4b89 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20906/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20906/testReport/ 

[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496223#comment-16496223
 ] 

genericqa commented on YARN-8375:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 36s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 78m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestCGroupElasticMemoryController
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8375 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925864/YARN-8375.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 40e64adb729f 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 02c4b89 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20905/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20905/testReport/ |
| Max. process+thread count | 301 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-pro

[jira] [Commented] (YARN-8276) [UI2] After version field became mandatory, form-based submission of new YARN service through UI2 doesn't work

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496263#comment-16496263
 ] 

Sunil Govindan commented on YARN-8276:
--

Yes, the changes look fine to me.

I'll commit this shortly.

> [UI2] After version field became mandatory, form-based submission of new YARN 
> service through UI2 doesn't work
> --
>
> Key: YARN-8276
> URL: https://issues.apache.org/jira/browse/YARN-8276
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Gergely Novák
>Assignee: Gergely Novák
>Priority: Critical
> Attachments: YARN-8276.001.patch
>
>
> After the version field became mandatory in YARN service, one cannot create a new 
> service through the UI: there is no way to specify the version field, and the 
> service fails with the following message:
> {code}
> "Error: Adapter operation failed". 
> {code}
> Checking through browser dev tools, the REST response is the following:
> {code}
> {"diagnostics":"Version of service sleeper-service is either empty or not 
> provided"}
> {code}
> Discovered by [~vinodkv].
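
For reference, a minimal spec sketch that includes the now-mandatory version 
field and goes through the CLI path, which does accept it (all names, values, 
and the launch command are placeholders):

{code}
cat > sleeper-service.json <<'EOF'
{
  "name": "sleeper-service",
  "version": "1.0.0",
  "components": [
    {
      "name": "sleeper",
      "number_of_containers": 1,
      "launch_command": "sleep 900000",
      "resource": { "cpus": 1, "memory": "256" }
    }
  ]
}
EOF
yarn app -launch sleeper-service sleeper-service.json
{code}

The UI2 form would need an equivalent input for version to produce a valid spec.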






[jira] [Updated] (YARN-8367) Fix NPE in SingleConstraintAppPlacementAllocator when placement constraint in SchedulingRequest is null

2018-05-31 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8367:
--
Summary: Fix NPE in SingleConstraintAppPlacementAllocator when placement 
constraint in SchedulingRequest is null  (was: 2 components, one with placement 
constraint and one without causes NPE in SingleConstraintAppPlacementAllocator)

> Fix NPE in SingleConstraintAppPlacementAllocator when placement constraint in 
> SchedulingRequest is null
> ---
>
> Key: YARN-8367
> URL: https://issues.apache.org/jira/browse/YARN-8367
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.1.0
>Reporter: Gour Saha
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8367.001.patch
>
>
> While testing the fix for YARN-8350, [~billie.rinaldi] encountered this NPE 
> in the AM log. Filing this on her behalf -
> {noformat}
> 2018-05-25 21:11:54,006 [AMRM Heartbeater thread] ERROR 
> impl.AMRMClientAsyncImpl - Exception on heartbeat
> java.lang.NullPointerException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.validateAndSetSchedulingRequest(SingleConstraintAppPlacementAllocator.java:245)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.internalUpdatePendingAsk(SingleConstraintAppPlacementAllocator.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.updatePendingAsk(SingleConstraintAppPlacementAllocator.java:207)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.addSchedulingRequests(AppSchedulingInfo.java:269)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updateSchedulingRequests(AppSchedulingInfo.java:240)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateSchedulingRequests(SchedulerApplicationAttempt.java:469)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:1154)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:278)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.SchedulerPlacementProcessor.allocate(SchedulerPlacementProcessor.java:53)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:433)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>

[jira] [Updated] (YARN-7953) [GQ] Data structures for federation global queues calculations

2018-05-31 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi updated YARN-7953:

Attachment: YARN-7953-YARN-7402.v2.patch

> [GQ] Data structures for federation global queues calculations
> --
>
> Key: YARN-7953
> URL: https://issues.apache.org/jira/browse/YARN-7953
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-7953-YARN-7402.v1.patch, 
> YARN-7953-YARN-7402.v2.patch, YARN-7953.v1.patch
>
>
> This Jira tracks data structures and helper classes used by the core 
> algorithms of YARN-7402 umbrella Jira (currently YARN-7403, and YARN-7834).






[jira] [Updated] (YARN-8258) [UI2] New UI webappcontext should inherit all filters from default context

2018-05-31 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8258:
-
Attachment: YARN-8258.002.patch

> [UI2] New UI webappcontext should inherit all filters from default context
> --
>
> Key: YARN-8258
> URL: https://issues.apache.org/jira/browse/YARN-8258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8258.001.patch, YARN-8258.002.patch
>
>
> Thanks [~ssath...@hortonworks.com] for finding this.
> Ideally all filters from the default context have to be inherited by the UI2 
> context as well.






[jira] [Updated] (YARN-8258) [UI2] New UI webappcontext should inherit all filters from default context

2018-05-31 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8258:
-
Issue Type: Bug  (was: Improvement)

> [UI2] New UI webappcontext should inherit all filters from default context
> --
>
> Key: YARN-8258
> URL: https://issues.apache.org/jira/browse/YARN-8258
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8258.001.patch, YARN-8258.002.patch
>
>
> Thanks [~ssath...@hortonworks.com] for finding this.
> Ideally all filters from the default context have to be inherited by the UI2 
> context as well.






[jira] [Updated] (YARN-8258) YARN webappcontext for UI2 should inherit all filters from default context

2018-05-31 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8258:
-
Summary: YARN webappcontext for UI2 should inherit all filters from default 
context  (was: [UI2] New UI webappcontext should inherit all filters from 
default context)

> YARN webappcontext for UI2 should inherit all filters from default context
> --
>
> Key: YARN-8258
> URL: https://issues.apache.org/jira/browse/YARN-8258
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8258.001.patch, YARN-8258.002.patch
>
>
> Thanks [~ssath...@hortonworks.com] for finding this.
> Ideally all filters from the default context have to be inherited by the UI2 
> context as well.






[jira] [Commented] (YARN-8258) YARN webappcontext for UI2 should inherit all filters from default context

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496387#comment-16496387
 ] 

Sunil Govindan commented on YARN-8258:
--

Attaching a patch for review. It now adds all filters from the default context to 
the new UI2 context, and exposes the same set of URLs via those filters.

[~vinodkv] [~leftnoteasy] [~rohithsharma], please help review the patch.

> YARN webappcontext for UI2 should inherit all filters from default context
> --
>
> Key: YARN-8258
> URL: https://issues.apache.org/jira/browse/YARN-8258
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8258.001.patch, YARN-8258.002.patch
>
>
> Thanks [~ssath...@hortonworks.com] for finding this.
> Ideally all filters from the default context have to be inherited by the UI2 
> context as well.






[jira] [Updated] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8197:
-
Attachment: YARN-8197.003.patch

> Tracking URL in the app state does not get redirected to MR ApplicationMaster 
> for Running applications
> --
>
> Key: YARN-8197
> URL: https://issues.apache.org/jira/browse/YARN-8197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8197.001.patch, YARN-8197.002.patch, 
> YARN-8197.003.patch
>
>
> {code}
> org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: Could not determine the proxy server for 
> redirection
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Commented] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496404#comment-16496404
 ] 

Sunil Govindan commented on YARN-8197:
--

Thanks [~vinodkv] [~eyang].

Updating the latest patch, addressing the comments.

> Tracking URL in the app state does not get redirected to MR ApplicationMaster 
> for Running applications
> --
>
> Key: YARN-8197
> URL: https://issues.apache.org/jira/browse/YARN-8197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8197.001.patch, YARN-8197.002.patch, 
> YARN-8197.003.patch
>
>
> {code}
> org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: Could not determine the proxy server for 
> redirection
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Commented] (YARN-7953) [GQ] Data structures for federation global queues calculations

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496410#comment-16496410
 ] 

genericqa commented on YARN-7953:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-7402 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
16s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} YARN-7402 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 69m  
6s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}126m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-7953 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925885/YARN-7953-YARN-7402.v2.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  findbugs  checkstyle  |
| uname | Linux 46cfa0e6050c 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | YARN-7402 / 262ca7f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20907/testReport/ |
| Max. process+thread count | 857 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/

[jira] [Commented] (YARN-8258) YARN webappcontext for UI2 should inherit all filters from default context

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496422#comment-16496422
 ] 

genericqa commented on YARN-8258:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 2 new + 
41 unchanged - 0 fixed = 43 total (was 41) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
5s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8258 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925895/YARN-8258.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f976e16ebf11 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d1e2b80 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20908/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20908/testReport/ |
| Max. process+thread count | 352 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-y

[jira] [Commented] (YARN-8259) Revisit liveliness checks for Docker containers

2018-05-31 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496492#comment-16496492
 ] 

Shane Kumpf commented on YARN-8259:
---

I've been doing additional testing here and could use input from the community 
as all of the solutions have cons. Here is what I've tested and been 
considering.

1) */proc/pid check as yarn*

Pros:
 * No c-e changes
 * Works with Docker live restore

Cons:
 * Breaks down when using hide pid
 * Portability


2) */proc/pid or kill -0 as privileged user*

Pros:
 * Works with Docker live restore

Cons:
 * Circumvents hidepid, allows the yarn user to check the existence of any pid 
due to use of elevated privileges.
 * Portability (/proc method)


3) *docker inspect*

Pros:
 * No c-e changes
 * Uses the Docker API

Cons:
 * Requires retry handling to support Docker live restore.
 ** In the case of a Docker daemon upgrade, this means the upgrade must 
complete before the retries are exhausted, which could mean hundreds of retries.


4) *Hybrid* (Keep existing kill -0 for non-privileged, docker inspect for 
privileged)

Pros:
 * No c-e changes
 * Limits impacts to live restore

Cons:
 * Requires retry handling to support Docker live restore.
 * Different handling based on container type.


I believe #2 is a non-starter as it silently bypasses the hidepid option.  I'm 
leaning towards striking #3 from the list as well, as we really need the 
recovery logic to be solid, so I don't want to unnecessarily impact 
non-privileged containers, which appear to be working well.

At this point, I'm leaning towards #4 or #1 (with docs indicating that the NM 
user must be whitelisted if hidepid is enabled).

> Revisit liveliness checks for Docker containers
> ---
>
> Key: YARN-8259
> URL: https://issues.apache.org/jira/browse/YARN-8259
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.2, 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Blocker
>  Labels: Docker
> Attachments: YARN-8259.001.patch
>
>
> As privileged containers may execute as a user that does not match the YARN 
> run as user, sending the null signal for liveliness checks could fail. We 
> need to reconsider how liveliness checks are handled in the Docker case.






[jira] [Comment Edited] (YARN-8259) Revisit liveliness checks for Docker containers

2018-05-31 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496492#comment-16496492
 ] 

Shane Kumpf edited comment on YARN-8259 at 5/31/18 12:51 PM:
-

I've been doing additional testing here and could use input from the community 
as all of the solutions have cons. Here is what I've tested and been 
considering.

1) */proc/pid check as yarn*

Pros:
 * No c-e changes
 * Works with Docker live restore

Cons:
 * Breaks down when using hide pid
 * Portability


2) */proc/pid or kill -0 as privileged user*

Pros:
 * Works with Docker live restore

Cons:
 * Circumvents hidepid, allows the yarn user to check the existence of any pid 
due to use of elevated privileges.
 * Portability (/proc method)


3) *docker inspect*

Pros:
 * No c-e changes
 * Uses the Docker API

Cons:
 * Requires retry handling to support Docker live restore.
 ** In the case of a Docker daemon upgrade, this means the upgrade must 
complete before the retries are exhausted, which could mean hundreds of retries.


4) *Hybrid* (Keep existing kill -0 for non-privileged, docker inspect for 
privileged)

Pros:
 * No c-e changes
 * Limits impacts to live restore

Cons:
 * Requires retry handling to support Docker live restore.
 * Different handling based on container type.


I believe #2 is a non-starter as it silently bypasses the hidepid option.  I'm 
leaning towards striking #3 from the list as well, as we really need the 
recovery logic to be solid, so I don't want to unnecessarily impact 
non-privileged containers, which appear to be working well.

At this point, I'm leaning towards #4 or #1 (with docs indicating that the NM 
user must be whitelisted if hidepid is enabled).


was (Author: shaneku...@gmail.com):
I've been doing additional testing here and could use input from the community 
as all of the solutions have cons. Here is what I've tested and been 
considering.

1) */proc/pid check as yarn*

Pros:
 * No c-e changes
 * Works for with Docker live restore

Cons:
 * Breaks down when using hide pid
 * Portability


2) */proc/pid or kill -0 as privileged user*

Pros:
 * Works for with Docker live restore

Cons:
 * Circumvents hidepid, allows the yarn user to check the existence of any pid 
due to use of elevated privileges.
 * Portability (/proc method)


3) *docker inspect*

Pros:
 * No c-e changes
 * Uses the Docker API

Cons:
 * Requires retry handling to support Docker live restore.
 ** In the case of a Docker daemon upgrade, this means the upgrade must 
complete before the retries are exhausted, which could mean hundreds of retries.


4) *Hybrid* (Keep existing kill -0 for non-privileged, docker inspect for 
privileged)

Pros:
 * No c-e changes
 * Limits impacts to live restore

Cons:
 * Requires retry handling to support Docker live restore.
 * Different handling based on container type.


I believe #2 is a non-starter as it silently bypasses the hidepid option.  I'm 
leaning towards striking #3 from the list as well, as we really need the 
recovery logic to be solid, so I don't want to unnecessary impact 
non-privileged containers which appear to be working well.

At this point, I'm leaning towards #4 or #1 (with docs indicating that the NM 
user must be whitelisted if hidepid is enabled).

> Revisit liveliness checks for Docker containers
> ---
>
> Key: YARN-8259
> URL: https://issues.apache.org/jira/browse/YARN-8259
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.2, 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Blocker
>  Labels: Docker
> Attachments: YARN-8259.001.patch
>
>
> As privileged containers may execute as a user that does not match the YARN 
> run as user, sending the null signal for liveliness checks could fail. We 
> need to reconsider how liveliness checks are handled in the Docker case.






[jira] [Updated] (YARN-8367) Fix NPE in SingleConstraintAppPlacementAllocator when placement constraint in SchedulingRequest is null

2018-05-31 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8367:
--
Fix Version/s: 3.1.1

> Fix NPE in SingleConstraintAppPlacementAllocator when placement constraint in 
> SchedulingRequest is null
> ---
>
> Key: YARN-8367
> URL: https://issues.apache.org/jira/browse/YARN-8367
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.1.0
>Reporter: Gour Saha
>Assignee: Weiwei Yang
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8367.001.patch
>
>
> While testing the fix for YARN-8350, [~billie.rinaldi] encountered this NPE 
> in the AM log. Filing this on her behalf -
> {noformat}
> 2018-05-25 21:11:54,006 [AMRM Heartbeater thread] ERROR 
> impl.AMRMClientAsyncImpl - Exception on heartbeat
> java.lang.NullPointerException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.validateAndSetSchedulingRequest(SingleConstraintAppPlacementAllocator.java:245)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.internalUpdatePendingAsk(SingleConstraintAppPlacementAllocator.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.updatePendingAsk(SingleConstraintAppPlacementAllocator.java:207)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.addSchedulingRequests(AppSchedulingInfo.java:269)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updateSchedulingRequests(AppSchedulingInfo.java:240)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateSchedulingRequests(SchedulerApplicationAttempt.java:469)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:1154)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:278)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.SchedulerPlacementProcessor.allocate(SchedulerPlacementProcessor.java:53)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:433)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 

[jira] [Commented] (YARN-8367) Fix NPE in SingleConstraintAppPlacementAllocator when placement constraint in SchedulingRequest is null

2018-05-31 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496508#comment-16496508
 ] 

Weiwei Yang commented on YARN-8367:
---

Just committed to trunk and cherry-picked to branch-3.1.

> Fix NPE in SingleConstraintAppPlacementAllocator when placement constraint in 
> SchedulingRequest is null
> ---
>
> Key: YARN-8367
> URL: https://issues.apache.org/jira/browse/YARN-8367
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.1.0
>Reporter: Gour Saha
>Assignee: Weiwei Yang
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8367.001.patch
>
>
> While testing the fix for YARN-8350, [~billie.rinaldi] encountered this NPE 
> in the AM log. Filing this on her behalf -
> {noformat}
> 2018-05-25 21:11:54,006 [AMRM Heartbeater thread] ERROR 
> impl.AMRMClientAsyncImpl - Exception on heartbeat
> java.lang.NullPointerException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.validateAndSetSchedulingRequest(SingleConstraintAppPlacementAllocator.java:245)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.internalUpdatePendingAsk(SingleConstraintAppPlacementAllocator.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.updatePendingAsk(SingleConstraintAppPlacementAllocator.java:207)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.addSchedulingRequests(AppSchedulingInfo.java:269)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updateSchedulingRequests(AppSchedulingInfo.java:240)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateSchedulingRequests(SchedulerApplicationAttempt.java:469)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:1154)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:278)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.SchedulerPlacementProcessor.allocate(SchedulerPlacementProcessor.java:53)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:433)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop

[jira] [Commented] (YARN-8367) Fix NPE in SingleConstraintAppPlacementAllocator when placement constraint in SchedulingRequest is null

2018-05-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496527#comment-16496527
 ] 

Hudson commented on YARN-8367:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14322 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14322/])
YARN-8367. Fix NPE in SingleConstraintAppPlacementAllocator when (wwei: rev 
6468071f137e6d918a7b4799ad54558fa26b25ce)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/SingleConstraintAppPlacementAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestSchedulingRequestContainerAllocation.java


> Fix NPE in SingleConstraintAppPlacementAllocator when placement constraint in 
> SchedulingRequest is null
> ---
>
> Key: YARN-8367
> URL: https://issues.apache.org/jira/browse/YARN-8367
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.1.0
>Reporter: Gour Saha
>Assignee: Weiwei Yang
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8367.001.patch
>
>
> While testing the fix for YARN-8350, [~billie.rinaldi] encountered this NPE 
> in the AM log. Filing this on her behalf -
> {noformat}
> 2018-05-25 21:11:54,006 [AMRM Heartbeater thread] ERROR 
> impl.AMRMClientAsyncImpl - Exception on heartbeat
> java.lang.NullPointerException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.validateAndSetSchedulingRequest(SingleConstraintAppPlacementAllocator.java:245)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.internalUpdatePendingAsk(SingleConstraintAppPlacementAllocator.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.updatePendingAsk(SingleConstraintAppPlacementAllocator.java:207)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.addSchedulingRequests(AppSchedulingInfo.java:269)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.updateSchedulingRequests(AppSchedulingInfo.java:240)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateSchedulingRequests(SchedulerApplicationAttempt.java:469)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:1154)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:278)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.SchedulerPlacementProcessor.allocate(SchedulerPlacementProcessor.java:53)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:433)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntim

[jira] [Updated] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-05-31 Thread Manikandan R (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-4606:
---
Attachment: YARN-4606.003.patch

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Manikandan R
>Priority: Critical
> Attachments: YARN-4606.001.patch, YARN-4606.002.patch, 
> YARN-4606.003.patch, YARN-4606.1.poc.patch, YARN-4606.POC.2.patch, 
> YARN-4606.POC.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers 
> that user an active user. This can lead to starvation of active applications, 
> for example:
> - App1(belongs to user1)/app2(belongs to user2) are active, app3(belongs to 
> user3)/app4(belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources, 
> so the computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-05-31 Thread Manikandan R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496539#comment-16496539
 ] 

Manikandan R commented on YARN-4606:


Attached .003 patch for review.
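
To make the intent concrete, here is a minimal, hedged sketch of counting only users that have at least one active (non-pending) application; the class and method names are hypothetical and are not taken from the patch.

{code:java}
import java.util.Map;
import java.util.Set;

final class ActiveUserCount {
  // A user counts as active only if at least one of its applications is
  // active; users whose apps are all pending (e.g. blocked by max-am-percent)
  // are skipped, so user3/user4 from the example above no longer inflate
  // #active-users.
  static int countActiveUsers(Map<String, Set<String>> activeAppsByUser) {
    int activeUsers = 0;
    for (Set<String> activeApps : activeAppsByUser.values()) {
      if (!activeApps.isEmpty()) {
        activeUsers++;
      }
    }
    return activeUsers;
  }
}
{code}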

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Manikandan R
>Priority: Critical
> Attachments: YARN-4606.001.patch, YARN-4606.002.patch, 
> YARN-4606.003.patch, YARN-4606.1.poc.patch, YARN-4606.POC.2.patch, 
> YARN-4606.POC.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers 
> that user an active user. This can lead to starvation of active applications, 
> for example:
> - App1(belongs to user1)/app2(belongs to user2) are active, app3(belongs to 
> user3)/app4(belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources, 
> so the computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-31 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496551#comment-16496551
 ] 

Billie Rinaldi commented on YARN-8333:
--

Thanks, [~eyang]! +1 for patch 3. I will commit this to trunk and branch-3.1.
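
As a quick illustration of how a client consumes the new component-level record (a plain JDK lookup, not part of the patch), one resolution returns every instance address and the caller can rotate across them. The hostname matches the example records in the description below.

{code:java}
import java.net.InetAddress;

public class MultiARecordLookup {
  public static void main(String[] args) throws Exception {
    // All instance IPs (123.123.123.120-123 in the example) come back
    // from a single lookup of the component-level name.
    InetAddress[] addrs =
        InetAddress.getAllByName("appcatalog.appname.hbase.ycluster");
    for (InetAddress addr : addrs) {
      System.out.println(addr.getHostAddress());
    }
  }
}
{code}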

> Load balance YARN services using RegistryDNS multiple A records
> ---
>
> Key: YARN-8333
> URL: https://issues.apache.org/jira/browse/YARN-8333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8333.001.patch, YARN-8333.002.patch, 
> YARN-8333.003.patch
>
>
> For scaling stateless containers, it would be great to support DNS round 
> robin for fault tolerance and load balancing.  The current DNS record format 
> for RegistryDNS is 
> [container-instance].[application-name].[username].[domain].  For example:
> {code}
> appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
> It would be nice to add multi-A record that contains all IP addresses of the 
> same component in addition to the instance based records.  For example:
> {code}
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.123
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496567#comment-16496567
 ] 

Hudson commented on YARN-8333:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14323 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14323/])
YARN-8333. Load balance YARN services using RegistryDNS multiple A (billie: rev 
6bc92e304fe05e80f13830104d1fd2c59da8344b)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/server/dns/ContainerServiceRecordProcessor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/server/dns/TestRegistryDNS.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/ServiceDiscovery.md
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/server/dns/BaseServiceRecordProcessor.java


> Load balance YARN services using RegistryDNS multiple A records
> ---
>
> Key: YARN-8333
> URL: https://issues.apache.org/jira/browse/YARN-8333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8333.001.patch, YARN-8333.002.patch, 
> YARN-8333.003.patch
>
>
> For scaling stateless containers, it would be great to support DNS round 
> robin for fault tolerance and load balancing.  The current DNS record format 
> for RegistryDNS is 
> [container-instance].[application-name].[username].[domain].  For example:
> {code}
> appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
> It would be nice to add multi-A record that contains all IP addresses of the 
> same component in addition to the instance based records.  For example:
> {code}
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.123
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8380) Support shared mounts in docker runtime

2018-05-31 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496574#comment-16496574
 ] 

Billie Rinaldi commented on YARN-8380:
--

[~rohithsharma] You can address this error by running the following on the 
directory you would like to mount as shared:
{noformat}
/bin/mount --bind /dir /dir
/bin/mount --make-shared /dir
{noformat}
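
Once the directory is a shared mount, a container can request shared bind propagation. For example, with the plain docker CLI (illustrative only; the image and command are arbitrary and this is not the exact invocation the YARN docker runtime generates):
{noformat}
# /dir must already be a shared mount (see the commands above)
docker run --rm -v /dir:/dir:shared centos:7 ls /dir
{noformat}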

> Support shared mounts in docker runtime
> ---
>
> Key: YARN-8380
> URL: https://issues.apache.org/jira/browse/YARN-8380
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Major
>
> The docker run command supports the mount type shared, but currently we are 
> only supporting ro and rw mount types in the docker runtime.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-31 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496604#comment-16496604
 ] 

Billie Rinaldi commented on YARN-8333:
--

Jenkins failure was due to the build using the wrong version of libprotoc.

> Load balance YARN services using RegistryDNS multiple A records
> ---
>
> Key: YARN-8333
> URL: https://issues.apache.org/jira/browse/YARN-8333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8333.001.patch, YARN-8333.002.patch, 
> YARN-8333.003.patch
>
>
> For scaling stateless containers, it would be great to support DNS round 
> robin for fault tolerance and load balancing.  The current DNS record format 
> for RegistryDNS is 
> [container-instance].[application-name].[username].[domain].  For example:
> {code}
> appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
> It would be nice to add multi-A record that contains all IP addresses of the 
> same component in addition to the instance based records.  For example:
> {code}
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.123
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6677:
-
Attachment: YARN-6677.01.patch

> Preempt opportunistic containers when root container cgroup goes over memory 
> limit
> --
>
> Key: YARN-6677
> URL: https://issues.apache.org/jira/browse/YARN-6677
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-6677.00.patch, YARN-6677.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-31 Thread Gour Saha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated YARN-8308:

Attachment: YARN-8308.002.patch

> Yarn service app fails due to issues with Renew Token
> -
>
> Key: YARN-8308
> URL: https://issues.apache.org/jira/browse/YARN-8308
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-8308.001.patch, YARN-8308.002.patch
>
>
> When a Yarn service application runs beyond 
> dfs.namenode.delegation.token.max-lifetime, 
> the application fails with the error below. 
> {code}
> 2018-05-15 23:14:35,652 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
> 2018-05-15 23:14:35,654 [main] INFO  service.AbstractService - Service 
> Service Master failed in state INITED
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:883)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>   at 
> org.apache.hadoop.yarn.service.utils.JsonSerDeser.load(JsonSerDeser.java:182)
>   at 
> org.apache.hadoop.yarn.service.utils.ServiceApiUtil.loadServiceFrom(ServiceApiUtil.java:337)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.loadApplicationJson(ServiceMaster.java:242)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceInit(ServiceMaster.java:91)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:316)
> 2018-05-15 23:14:35,659 [main] INFO  service.ServiceMaster - Stopping app 
> master
> 2018-05-15 23:14:35,660 [main] ERROR service.ServiceMaster - Error starting 
> service master
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496638#comment-16496638
 ] 

Haibo Chen commented on YARN-6677:
--

I updated the patch without addressing the getCGroupsHandler() comment, because 
I think it is more common to do dependency injection by overriding the method.
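
For context, a stripped-down sketch of the override-based injection pattern being referred to; the class names below are stand-ins, not the code in the patch.

{code:java}
interface CGroupsHandler { }                      // stand-in for the real type

class MemoryMonitor {
  void check() {
    CGroupsHandler handler = getCGroupsHandler(); // always goes through getter
    // ... inspect cgroup memory usage via handler ...
  }
  protected CGroupsHandler getCGroupsHandler() {  // production implementation
    return new CGroupsHandler() { };
  }
}

class TestMemoryMonitor extends MemoryMonitor {
  private final CGroupsHandler mock;
  TestMemoryMonitor(CGroupsHandler mock) { this.mock = mock; }
  @Override
  protected CGroupsHandler getCGroupsHandler() {  // tests override to inject
    return mock;
  }
}
{code}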

> Preempt opportunistic containers when root container cgroup goes over memory 
> limit
> --
>
> Key: YARN-6677
> URL: https://issues.apache.org/jira/browse/YARN-6677
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-6677.00.patch, YARN-6677.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8258) YARN webappcontext for UI2 should inherit all filters from default context

2018-05-31 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8258:
-
Attachment: YARN-8258.003.patch

> YARN webappcontext for UI2 should inherit all filters from default context
> --
>
> Key: YARN-8258
> URL: https://issues.apache.org/jira/browse/YARN-8258
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8258.001.patch, YARN-8258.002.patch, 
> YARN-8258.003.patch
>
>
> Thanks [~ssath...@hortonworks.com] for finding this.
> Ideally, all filters from the default context have to be inherited by the UI2 
> context as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8258) YARN webappcontext for UI2 should inherit all filters from default context

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496698#comment-16496698
 ] 

Sunil Govindan commented on YARN-8258:
--

Attaching latest patch after correcting some checkstyle issues.

> YARN webappcontext for UI2 should inherit all filters from default context
> --
>
> Key: YARN-8258
> URL: https://issues.apache.org/jira/browse/YARN-8258
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8258.001.patch, YARN-8258.002.patch, 
> YARN-8258.003.patch
>
>
> Thanks [~ssath...@hortonworks.com] for finding this.
> Ideally, all filters from the default context have to be inherited by the UI2 
> context as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues

2018-05-31 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496715#comment-16496715
 ] 

Eric Payne commented on YARN-8379:
--

Thanks [~leftnoteasy] for bringing this up. Yes, our use case would benefit 
from this feature. We are currently running 2.8, which does the balancing, so 
this would help us in moving to 3.x in the future.

> Add an option to allow Capacity Scheduler preemption to balance satisfied 
> queues
> 
>
> Key: YARN-8379
> URL: https://issues.apache.org/jira/browse/YARN-8379
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
>
> The existing capacity scheduler only supports preemption that brings an 
> underutilized queue up to its guaranteed resource. In addition to that, 
> there's a requirement to get a better balance between queues when all of them 
> have reached their guaranteed resource but hold different shares beyond it.
> An example: 3 queues with capacities queue_a = 30%, queue_b = 30%, queue_c 
> = 40%. At time T, queue_a is using 30% and queue_b is using 70%. Existing 
> scheduler preemption won't happen, but this is unfair, since both queues have 
> the same guaranteed resources.
> Before YARN-5864, the capacity scheduler did additional preemption to balance 
> queues. We changed the logic since it could preempt too many containers 
> between queues when all queues are satisfied.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496725#comment-16496725
 ] 

genericqa commented on YARN-8308:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 45s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
54s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8308 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925923/YARN-8308.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b89fdb280083 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1361030 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20910/testReport/ |
| Max. process+thread count | 773 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20910/console |
| 

[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496737#comment-16496737
 ] 

Eric Yang commented on YARN-8333:
-

Thank you [~billie.rinaldi] for the review and commit.

> Load balance YARN services using RegistryDNS multiple A records
> ---
>
> Key: YARN-8333
> URL: https://issues.apache.org/jira/browse/YARN-8333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8333.001.patch, YARN-8333.002.patch, 
> YARN-8333.003.patch
>
>
> For scaling stateless containers, it would be great to support DNS round 
> robin for fault tolerance and load balancing.  The current DNS record format 
> for RegistryDNS is 
> [container-instance].[application-name].[username].[domain].  For example:
> {code}
> appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
> It would be nice to add multi-A record that contains all IP addresses of the 
> same component in addition to the instance based records.  For example:
> {code}
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.123
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496748#comment-16496748
 ] 

genericqa commented on YARN-6677:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 17s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 17 new + 145 unchanged - 0 fixed = 162 total (was 145) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 32s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
17s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager
 generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 33m 59s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 82m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-6677 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925922/YARN-6677.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 19ee28db10cb 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1361030 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20909/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/20909/artifact/out/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-ya

[jira] [Commented] (YARN-8258) YARN webappcontext for UI2 should inherit all filters from default context

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496796#comment-16496796
 ] 

genericqa commented on YARN-8258:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 56s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
26s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8258 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925930/YARN-8258.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4d87eeb83c6c 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1361030 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20911/testReport/ |
| Max. process+thread count | 396 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20911/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496810#comment-16496810
 ] 

Eric Yang commented on YARN-8342:
-

Precommit build test report was for:

YARN-8350. NPE in service AM related to placement policy. Contributed by 
(detail)
HDDS-128. Support for DN to SCM signaling. Contributed by Nanda Kumar. (detail)

Somehow the unit test report is attached to the wrong JIRA, and the failing 
tests are not related to this patch. Attaching patch 003 with checkstyle fixes.

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch, YARN-8342.002.patch
>
>
> During testing of the Docker feature, I found that if a container comes from a 
> non-privileged docker registry, the specified launch command is ignored. The 
> container succeeds without any log, which is very confusing to end users, and 
> this behavior is inconsistent with containers from privileged docker 
> registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-31 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8342:

Attachment: YARN-8342.003.patch

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch, YARN-8342.002.patch, 
> YARN-8342.003.patch
>
>
> During testing of the Docker feature, I found that if a container comes from a 
> non-privileged docker registry, the specified launch command is ignored. The 
> container succeeds without any log, which is very confusing to end users, and 
> this behavior is inconsistent with containers from privileged docker 
> registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8382) cgroup file leak in NM

2018-05-31 Thread Hu Ziqian (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hu Ziqian updated YARN-8382:

Description: 
As Jiandan said in YARN-6525, the NM may time out deleting a container's cgroup 
files, with logs like below:

org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler: 
Unable to delete cgroup at: /cgroup/cpu/hadoop-yarn/container_xxx, tried to 
delete for 1000ms

 

One situation we found is that when 
*yarn.nodemanager.sleep-delay-before-sigkill.ms* is set bigger than 
*yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms*, the 
cgroup file leak happens.

 

One container process tree looks like the following graph:

bash(16097)───java(16099)─┬─\{java}(16100)
                          ├─\{java}(16101)
                          └─\{java}(16102)

 

When the NM kills a container, it sends kill -15 -pid to kill the container 
process group. The bash process exits when it receives SIGTERM, but the java 
process may do some work (shutdownHook etc.) and doesn't exit until it receives 
SIGKILL. When the bash process exits, CgroupsLCEResourcesHandler starts trying 
to delete the cgroup files. So when 
*yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms* is 
reached, the java processes may still be running, cgroup/tasks is still not 
empty, and a cgroup file leak results.

 

We add a condition that 
*yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms* must be 
bigger than *yarn.nodemanager.sleep-delay-before-sigkill.ms* to solve this 
problem.

 

  was:
As Jiandan said in YARN-6525, NM may delete  Cgroup container file timeout with 
logs like below:

org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler: 
Unable to delete cgroup at: /cgroup/cpu/hadoop-yarn/container_xxx, tried to 
delete for 1000ms

 

we found one situation is that when we set 
*yarn.nodemanager.sleep-delay-before-sigkill.ms* bigger than 
yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms, the cgroup 
file leak happens *.* 

 

One container process tree looks like follow graph:

bash(16097)───java(16099)─┬─\{java}(16100) 

                                                  ├─\{java}(16101) 

{{                       ├─\{java}(16102)}}

 

{{when NM kill a container, NM send kill -15 -pid to kill container process 
group. Bash process will exit when it received sigterm, but java process may do 
some job (shutdownHook etc.), and may exit unit receive sigkill. And when bash 
process exit, CgroupsLCEResourcesHandler begin to try to delete cgroup. So when 
*yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms* arrived, 
the java processes may still running and cgourp/tasks still not empty and cause 
a cgroup file leak.}}

 

{{we add a condition that 
*yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms* must 
bigger than *yarn.nodemanager.sleep-delay-before-sigkill.ms* to solve this 
problem.}}

 


> cgroup file leak in NM
> --
>
> Key: YARN-8382
> URL: https://issues.apache.org/jira/browse/YARN-8382
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
> Environment: we write a container with a shutdownHook which has a 
> piece of code like "while(true) sleep(100)". When 
> yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms < 
> yarn.nodemanager.sleep-delay-before-sigkill.ms, a cgroup file leak happens; 
> when yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms > 
> yarn.nodemanager.sleep-delay-before-sigkill.ms, the cgroup file is deleted 
> successfully.
>Reporter: Hu Ziqian
>Assignee: Hu Ziqian
>Priority: Major
> Attachments: YARN-8382-branch-2.8.3.001.patch, YARN-8382.001.patch
>
>
> As Jiandan said in YARN-6525, the NM may time out deleting a container's 
> cgroup files, with logs like below:
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler: 
> Unable to delete cgroup at: /cgroup/cpu/hadoop-yarn/container_xxx, tried to 
> delete for 1000ms
>  
> One situation we found is that when 
> *yarn.nodemanager.sleep-delay-before-sigkill.ms* is set bigger than 
> *yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms*, the 
> cgroup file leak happens.
>  
> One container process tree looks like the following graph:
> bash(16097)───java(16099)─┬─\{java}(16100)
>                           ├─\{java}(16101)
>                           └─\{java}(16102)
>  
> When the NM kills a container, it sends kill -15 -pid to kill the container 
> process group. The bash process exits when it receives SIGTERM, but the java 
> process may do some work (shutdownHook etc.) and doesn't exit until it 
> receives SIGKILL. When the bash process exits, CgroupsLCEResourcesHandler 
> starts trying to delete the cgroup files. So when 
> 

[jira] [Created] (YARN-8383) TimelineServer 1.5 start fails with NoClassDefFoundError

2018-05-31 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-8383:
---

 Summary: TimelineServer 1.5 start fails with NoClassDefFoundError
 Key: YARN-8383
 URL: https://issues.apache.org/jira/browse/YARN-8383
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.4
Reporter: Rohith Sharma K S


TimelineServer 1.5 start fails with NoClassDefFoundError.
{noformat}
2018-05-31 22:10:58,548 FATAL 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer:
 Error starting ApplicationHistoryServer
java.lang.NoClassDefFoundError: com/fasterxml/jackson/core/JsonFactory
at 
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.(RollingLevelDBTimelineStore.java:174)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at 
org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2306)
at 
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2271)
at 
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2367)
at 
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2393)
at 
org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.createSummaryStore(EntityGroupFSTimelineStore.java:239)
at 
org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:146)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:115)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:180)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:190)
Caused by: java.lang.ClassNotFoundException: 
com.fasterxml.jackson.core.JsonFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 15 more

{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8382) cgroup file leak in NM

2018-05-31 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496869#comment-16496869
 ] 

Miklos Szegedi commented on YARN-8382:
--

Thank you for the patch [~ziqian hu].

I think you can just use {{Math.min}} for your logic.

This can still cause an issue, since the sigkill logic may have some residual 
delay racing with deleting the cgroups. Probably the right approach would be 
to set the cgroup delete delay to the sigkill timeout + 1 second + 
NM_LINUX_CONTAINER_CGROUPS_DELETE_TIMEOUT. The cgroups timeout is probably there 
to address issues with cleaning up after the last thread has exited.

Please address the checkstyle issues.
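
To make that concrete, a rough sketch of deriving the cgroup deletion deadline 
from the two properties (assuming a {{Configuration}} handle {{conf}}; the extra 
one-second grace period and the surrounding wiring are assumptions, not the 
actual patch):
{code:java}
// Sketch only: the deadline for deleting the container cgroup should cover the
// whole SIGTERM -> SIGKILL window plus the configured cgroups delete timeout.
long sigkillDelayMs = conf.getLong(
    YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS,
    YarnConfiguration.DEFAULT_NM_SLEEP_DELAY_BEFORE_SIGKILL_MS);
long cgroupsDeleteTimeoutMs = conf.getLong(
    YarnConfiguration.NM_LINUX_CONTAINER_CGROUPS_DELETE_TIMEOUT,
    YarnConfiguration.DEFAULT_NM_LINUX_CONTAINER_CGROUPS_DELETE_TIMEOUT);
// Only start giving up on the cgroup directory after the SIGKILL has been sent.
long effectiveDeleteDeadlineMs = sigkillDelayMs + 1000L + cgroupsDeleteTimeoutMs;
{code}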

> cgroup file leak in NM
> --
>
> Key: YARN-8382
> URL: https://issues.apache.org/jira/browse/YARN-8382
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
> Environment: we write a container with a shutdownHook which has a 
> piece of code like "while(true) sleep(100)". When 
> yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms < 
> yarn.nodemanager.sleep-delay-before-sigkill.ms, a cgroup file leak happens; 
> when yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms > 
> yarn.nodemanager.sleep-delay-before-sigkill.ms, the cgroup file is deleted 
> successfully.
>Reporter: Hu Ziqian
>Assignee: Hu Ziqian
>Priority: Major
> Attachments: YARN-8382-branch-2.8.3.001.patch, YARN-8382.001.patch
>
>
> As Jiandan said in YARN-6525, NM may fail to delete a container's cgroup files 
> within the timeout, with logs like below:
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler: 
> Unable to delete cgroup at: /cgroup/cpu/hadoop-yarn/container_xxx, tried to 
> delete for 1000ms
>  
> We found one situation: when we set 
> *yarn.nodemanager.sleep-delay-before-sigkill.ms* bigger than 
> *yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms*, the 
> cgroup file leak happens.
>  
> One container process tree looks like the following graph:
> bash(16097)───java(16099)─┬─\{java}(16100)
>                           ├─\{java}(16101)
>                           └─\{java}(16102)
>  
> When NM kills a container, it sends kill -15 -pid to the container process 
> group. The bash process exits when it receives SIGTERM, but the java process 
> may still do some work (shutdownHook etc.) and does not exit until it receives 
> SIGKILL. When the bash process exits, CgroupsLCEResourcesHandler starts trying 
> to delete the cgroup files. So when 
> *yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms* expires, 
> the java processes may still be running, cgroup/tasks is still not empty, and 
> this causes a cgroup file leak.
>  
> We add a condition that 
> *yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms* must be 
> bigger than *yarn.nodemanager.sleep-delay-before-sigkill.ms* to solve this 
> problem.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-31 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496872#comment-16496872
 ] 

Haibo Chen commented on YARN-8250:
--

[~asuresh], [~leftnoteasy] and I had an offline discussion about this again. 

We think one alternative to avoid two different implementations of the 
container scheduler is to modify the behavior of the existing 
ContainerScheduler to accommodate the requirements of NM over-allocation. 
Specifically, the behavior changes of the current ContainerScheduler will 
include

Before: 

1) Upon a GUARANTEED container scheduling event, always queue the GUARANTEED 
container first and then check if any OPPORTUNISTIC container needs to be 
preempted. If so, wait for the OPPORTUNISTIC container(s) to be killed. 
Otherwise, launch the GUARANTEED container.

2) Upon an OPPORTUNISTIC container scheduling event, queue the container first 
and only launch the OPPORTUNISTIC container if there is enough room.

3) Upon any container completed or finished event that signals resources that 
have been released, check if any container (GUARANTEED containers first, then 
OPPORTUNISTIC containers) can be launched

After:

1) Upon a GUARANTEED container scheduling event, launch the GUARANTEED 
container immediately (without queuing). Rely on cgroups OOM control 
(YARN-6677) to preempt OPPORTUNISTIC containers as necessary.

2) Upon an OPPORTUNISTIC container scheduling event, simply queue the 
OPPORTUNISTIC container. 

3) Upon any container completed or finished event, do not try to launch any 
container.

4) Introduce a periodic check (in the ContainersMonitor thread) that launches 
OPPORTUNISTIC containers. Ideally, the period is configurable so that the 
latency to launch OPPORTUNISTIC containers can be reduced.

As we have discussed in previous comments, this reduces the latency to launch 
GUARANTEED containers and allows us to control how aggressively OPPORTUNISTIC 
containers are launched, which is especially important for reliability when 
over-allocation is turned on. The code can be a lot simpler as well.

*But it does increase the latency to launch OPPORTUNISTIC containers in cases 
where over-allocation is not on, because we give up opportunities to launch 
them when containers finish or are paused*. In addition, it does add a 
dependency on cgroup OOM control to preempt OPPORTUNISTIC containers, even 
though I'd argue it's best to turn on cgroup isolation anyway to ensure 
GUARANTEED containers are not adversely impacted by running OPPORTUNISTIC 
containers.

Let us know your thoughts, if the workload you guys are running is okay with 
the change. [~leftnoteasy] Please add anything that I may have missed.
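
For illustration only, the proposed behavior boils down to a skeleton like the 
one below (class, method and helper names here are hypothetical, not the actual 
patch):
{code:java}
import java.util.LinkedList;
import java.util.Queue;

// Hypothetical skeleton of the proposed scheduling behavior, generic over the
// container type to keep the sketch self-contained.
class OverAllocationContainerScheduler<C> {
  private final Queue<C> queuedOpportunistic = new LinkedList<>();

  // 1) GUARANTEED containers start immediately; cgroups OOM control (YARN-6677)
  //    is relied upon to preempt OPPORTUNISTIC containers if memory runs out.
  void onGuaranteedContainerScheduled(C container) {
    launch(container);
  }

  // 2) OPPORTUNISTIC containers are only queued here.
  void onOpportunisticContainerScheduled(C container) {
    queuedOpportunistic.add(container);
  }

  // 3) Container completion no longer triggers launches; instead,
  // 4) a periodic check (e.g. driven by the ContainersMonitor thread) decides
  //    how many queued OPPORTUNISTIC containers can start right now.
  void periodicCheck() {
    while (!queuedOpportunistic.isEmpty()
        && fitsWithinCurrentUtilization(queuedOpportunistic.peek())) {
      launch(queuedOpportunistic.poll());
    }
  }

  private void launch(C container) {
    // hand the container to the executor in the real implementation
  }

  private boolean fitsWithinCurrentUtilization(C container) {
    // consult the utilization-based resource tracker in the real implementation
    return false;
  }
}
{code}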

> Create another implementation of ContainerScheduler to support NM 
> overallocation
> 
>
> Key: YARN-8250
> URL: https://issues.apache.org/jira/browse/YARN-8250
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-8250-YARN-1011.00.patch, 
> YARN-8250-YARN-1011.01.patch, YARN-8250-YARN-1011.02.patch
>
>
> YARN-6675 adds NM over-allocation support by modifying the existing 
> ContainerScheduler and providing a utilizationBased resource tracker.
> However, the implementation adds a lot of complexity to ContainerScheduler, 
> and future tweaks of the over-allocation strategy based on how many containers 
> have been launched are even more complicated.
> As such, this Jira proposes a new ContainerScheduler that always launches 
> guaranteed containers immediately and queues opportunistic containers. It 
> relies on a periodic check to launch opportunistic containers. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496879#comment-16496879
 ] 

genericqa commented on YARN-8197:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 12s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:
 The patch generated 3 new + 15 unchanged - 0 fixed = 18 total (was 15) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-server-web-proxy in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8197 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925898/YARN-8197.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  findbugs  checkstyle  |
| uname | Linux 56d9a769774b 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1361030 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20912/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-web-proxy.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20912/testReport/ |
| Max. process+

[jira] [Commented] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496884#comment-16496884
 ] 

Sunil Govindan commented on YARN-8197:
--

Somehow I missed a couple of imports. I'll correct it in the next patch, but 
will wait for [~vinodkv] to review.

> Tracking URL in the app state does not get redirected to MR ApplicationMaster 
> for Running applications
> --
>
> Key: YARN-8197
> URL: https://issues.apache.org/jira/browse/YARN-8197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8197.001.patch, YARN-8197.002.patch, 
> YARN-8197.003.patch
>
>
> {code}
> org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: Could not determine the proxy server for 
> redirection
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Sunil Govindan (JIRA)
Sunil Govindan created YARN-8384:


 Summary: stdout, stderr logs of a Native Service container is 
coming with group as nobody
 Key: YARN-8384
 URL: https://issues.apache.org/jira/browse/YARN-8384
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-native-services
Reporter: Sunil Govindan


# ls -l
total 48
-rw-r--r-- 1 nobody hadoop   354 May 31 17:33 container-localizer-syslog
-rw-r--r-- 1 nobody hadoop  1042 May 31 17:35 directory.info
-rw-r----- 1 nobody hadoop  4944 May 31 17:35 launch_container.sh
-rw-r--r-- 1 nobody hadoop   440 May 31 17:35 prelaunch.err
-rw-r--r-- 1 nobody hadoop   100 May 31 17:35 prelaunch.out
-rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
-rw-r----- 1 nobody nobody   400 May 31 17:35 stdout.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8384:
-
Description: 
{noformat}
-rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
-rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
-rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
-rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
-rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
-rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
-rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt

{noformat}

  was:
# ls -l
total 48
-rw-r--r-- 1 nobody hadoop   354 May 31 17:33 container-localizer-syslog
-rw-r--r-- 1 nobody hadoop  1042 May 31 17:35 directory.info
-rw-r----- 1 nobody hadoop  4944 May 31 17:35 launch_container.sh
-rw-r--r-- 1 nobody hadoop   440 May 31 17:35 prelaunch.err
-rw-r--r-- 1 nobody hadoop   100 May 31 17:35 prelaunch.out
-rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
-rw-r----- 1 nobody nobody   400 May 31 17:35 stdout.txt


> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Priority: Major
>
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8385) Clean local directories when a container is killed

2018-05-31 Thread Marco Gaido (JIRA)
Marco Gaido created YARN-8385:
-

 Summary: Clean local directories when a container is killed
 Key: YARN-8385
 URL: https://issues.apache.org/jira/browse/YARN-8385
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Marco Gaido


In long running applications, it may happen that many containers are created 
and killed. A use case is Spark Thrift Server when dynamic allocation is 
enabled. A lot of containers are killed and the application keeps running 
indefinitely.

Currently, YARN seems to remove the local directories only when the whole 
application terminates. In the scenario described above, this can cause serious 
resource leakages. Please, check 
https://issues.apache.org/jira/browse/SPARK-22575.

I think YARN should clean up all the local directories of a container when it 
is killed and not when the whole application terminates.
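
As a rough illustration of the requested behavior (the usercache/appcache 
directory layout below is the standard NM local-dirs layout; the class itself is 
just a sketch, not an actual YARN patch), per-container cleanup would amount to 
something like:
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch: recursively remove a killed container's working directory under one
// configured NM local-dir instead of waiting for the whole application to finish.
public final class ContainerDirCleanup {
  public static void cleanup(String localDir, String user,
      String appId, String containerId) throws IOException {
    Path containerDir = Paths.get(localDir, "usercache", user,
        "appcache", appId, containerId);
    if (!Files.exists(containerDir)) {
      return;
    }
    try (Stream<Path> walk = Files.walk(containerDir)) {
      // delete children before parents
      walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
    }
  }
}
{code}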



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496950#comment-16496950
 ] 

Miklos Szegedi commented on YARN-6677:
--

Thanks, for the update [~haibochen].
{code:java}
40  /**
41  * The timestamp when the container start request is received.
42  * @return
43  */{code}
I think the @return is not needed
{code:java}
57  boolean isC1Opportunistic = isOpportunistic(c1);
58  boolean isC2Opportunistic = isOpportunistic(c2);
59  if (isC1Opportunistic == isC2Opportunistic) {{code}
It is probably shorter to do {{Boolean isC1Opportunistic = 
isOpportunistic(c1);}} and then {{int level0 = 
isC1Opportunistic.compareTo(isC2Opportunistic);}}...
{code:java}
61  long order = c1.getContainerLaunchTime() - c2.getContainerLaunchTime();
62  return order > 0 ? -1 : order < 0 ? 1 : 0;{code}
Similarly this could be 
{{-1 * ((Long)c1.getContainerLaunchTime()).compareTo((Long)c2.getContainerLaunchTime())}}
{code:java}
97  private boolean killGuaranteedContainerIfOOM(
98  Container container, String fileName) {
99  assert(!isOpportunistic(container));{code}
I do not see any reason to restrict this to guaranteed containers here. The 
logic is not specific to guaranteed containers; I would keep the original 
naming. Anyone can use these for opportunistic or any other type of containers 
in the future.
{code}
172 * In general we try to find a newly run container that exceeded its limits.
173 * The justification is cost, since probably this is the one that has
174 * accumulated the least amount of uncommitted data so far.
175 * We continue the process until the OOM is resolved.
{code}
This comment is removed. I think it would be useful to have something like this 
for the comparator above.
{code}
242 for (Container container : containers) {
243   boolean isOpportunistic = isOpportunistic(container);
244   if (isOpportunistic) {
{code}
I would reconsider this logic. It would be so much simpler to compare the usage 
vs. request for opportunistic containers as well. If I remember well, the way 
[~kasha] designed oversubscription was to be backward compatible. This means 
that if I run some containers as opportunistic and get them promoted 
eventually, the ordering should not be different from launching them as 
guaranteed. The current logic conflicts with that design: if o1 was launched 
before o2 but o2 does not exceed its request, o2 will be killed first. If the 
same containers run as c1 and c2, c2 will remain and c1 will be killed. This 
means that oversubscription may have a regression in cluster utilization and 
fairness. It would mean applying the killGuaranteedContainerIfOOM logic to 
opportunistic containers as well.
{code}
288 return container.getContainerTokenIdentifier() != null &&
{code}
The default should be guaranteed, not opportunistic.
{code}
58 * Test an OOM situation where there is no containers that can be 
killed.
{code}
is->are
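
Putting the two suggestions together, the comparator body could shrink to 
something like this (sketch only, reusing {{isOpportunistic}} and 
{{getContainerLaunchTime}} from the patch; the actual ordering semantics stay as 
defined there):
{code:java}
// Sketch: two-level comparison, first on the opportunistic/guaranteed flag,
// then by launch time with the most recently launched container sorting first.
int level0 = Boolean.compare(isOpportunistic(c1), isOpportunistic(c2));
if (level0 != 0) {
  return level0;
}
// Equivalent to -1 * Long.compare(c1, c2): later launch time sorts first.
return Long.compare(c2.getContainerLaunchTime(), c1.getContainerLaunchTime());
{code}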

> Preempt opportunistic containers when root container cgroup goes over memory 
> limit
> --
>
> Key: YARN-6677
> URL: https://issues.apache.org/jira/browse/YARN-6677
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-6677.00.patch, YARN-6677.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8384:

Issue Type: Sub-task  (was: Bug)
Parent: YARN-3611

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Priority: Major
>
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8384:

Description: 
When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
is set to true, and 
{{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is set 
to nobody.

This will cause the docker to run as nobody:nobody in yarn mode.
The log files will be initialized as nobody:nobody:

{noformat}
-rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
-rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
-rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
-rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
-rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
-rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
-rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
{noformat}



  was:
{noformat}
-rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
-rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
-rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
-rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
-rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
-rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
-rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt

{noformat}


> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Priority: Major
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8384:

Labels: docker  (was: )

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Priority: Major
>  Labels: docker
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8349) Remove YARN registry entries when a service is killed by the RM

2018-05-31 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497004#comment-16497004
 ] 

Wangda Tan commented on YARN-8349:
--

Thanks [~billie.rinaldi], 

A couple of questions / comments. 

1) For the registry cleanup, I think it is safe to place it under 
{{RMAppImpl.FinalTransition}} (instead of doing it in the kill-transition only). 
We probably don't need to clean up when the user calls destroy service, etc. In 
my mind, cleaning up registries in destroy service is only useful when the RM is 
down, which is a rare case to me, and all apps will go to a final state.

2) If you agree with #1, the method name should be renamed to something like 
{{cleanUpAfterServiceFinished}}

3) Now cleanUpRegistry will be called for every app, no matter whether it is a 
service or not, and it could be called when the registry server is not enabled 
(is that possible?). In this case, we need to make sure that if the registry 
server is not enabled, it won't cause an RM slowdown or crash. Could you confirm 
this?
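
To illustrate point 3, a cheap guard along these lines would keep the hook a 
no-op for ordinary applications and for clusters without the registry (the 
method name and placement are assumptions for the sketch, not the actual patch):
{code:java}
// Sketch only: skip registry cleanup unless the RM registry is enabled and the
// finished application is actually a YARN service.
private void cleanUpAfterServiceFinished(RMApp app, Configuration conf) {
  boolean registryEnabled = conf.getBoolean("hadoop.registry.rm.enabled", false);
  if (!registryEnabled || !"yarn-service".equals(app.getApplicationType())) {
    return;  // no-op for non-service apps or when the registry is off
  }
  // remove the service's registry entries (and hence its DNS records) here
}
{code}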

> Remove YARN registry entries when a service is killed by the RM
> ---
>
> Key: YARN-8349
> URL: https://issues.apache.org/jira/browse/YARN-8349
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8349.1.patch, YARN-8349.2.patch
>
>
> As the title states, when a service is killed by the RM (for exceeding its 
> lifetime for example), the YARN registry entries should be cleaned up.
> Without cleanup, DNS can contain multiple hostnames for a single IP address 
> in the case where IPs are reused. This impacts reverse lookups, which breaks 
> services, such as kerberos, that depend on those lookups.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-05-31 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7962:
-
Attachment: YARN-7962.005.patch

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch, YARN-7962.6.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>   DelegationTokenRenewerEvent evt) {
> serviceStateLock.readLock().lock();
> try {
>   if (isServiceStarted) {
> renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>   } else {
> pendingEventQueue.add(evt);
>   }
> } finally {
>   serviceStateLock.readLock().unlock();
> }
>   }
>   @Override
>   protected void serviceStop() {
> if (renewalTimer != null) {
>   renewalTimer.cancel();
> }
> appTokens.clear();
> allTokens.clear();
> this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2
>  rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-05-31 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7962:
-
Attachment: (was: YARN-7962.005.patch)

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch, YARN-7962.6.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>   DelegationTokenRenewerEvent evt) {
> serviceStateLock.readLock().lock();
> try {
>   if (isServiceStarted) {
> renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>   } else {
> pendingEventQueue.add(evt);
>   }
> } finally {
>   serviceStateLock.readLock().unlock();
> }
>   }
>   @Override
>   protected void serviceStop() {
> if (renewalTimer != null) {
>   renewalTimer.cancel();
> }
> appTokens.clear();
> allTokens.clear();
> this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2
>  rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-05-31 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7962:
-
Attachment: YARN-7962.6.patch

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch, YARN-7962.6.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>   DelegationTokenRenewerEvent evt) {
> serviceStateLock.readLock().lock();
> try {
>   if (isServiceStarted) {
> renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>   } else {
> pendingEventQueue.add(evt);
>   }
> } finally {
>   serviceStateLock.readLock().unlock();
> }
>   }
>   @Override
>   protected void serviceStop() {
> if (renewalTimer != null) {
>   renewalTimer.cancel();
> }
> appTokens.clear();
> allTokens.clear();
> this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2
>  rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-05-31 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497022#comment-16497022
 ] 

Wangda Tan commented on YARN-7962:
--

The attached patch looks good, +1. Thanks [~belugabehr]. 

I did some minor rebase to apply it on top of trunk. 

[~wilfreds], for your last comment, I'm not sure if I fully understand.

I think we want to make sure the event processing
{{renewerService.execute(new DelegationTokenRenewerRunnable(evt));}}
will not be rejected by a shutdown call.

If we move the shutdown out of the lock, it is possible that thread B shuts down 
the executor while thread A is inside {{execute}}.

Please let me know if I misunderstood this.
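
For readers following along, the shape of the fix being discussed (a sketch of 
the idea, not the committed patch) is to flip the flag and shut the pool down 
while holding the write lock:
{code:java}
// Sketch: processDelegationTokenRenewerEvent takes the read lock, so flipping
// isServiceStarted and shutting down under the write lock guarantees execute()
// is never called on an already-terminated pool.
@Override
protected void serviceStop() {
  if (renewalTimer != null) {
    renewalTimer.cancel();
  }
  appTokens.clear();
  allTokens.clear();
  serviceStateLock.writeLock().lock();
  try {
    isServiceStarted = false;
    this.renewerService.shutdown();
  } finally {
    serviceStateLock.writeLock().unlock();
  }
}
{code}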

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch, YARN-7962.6.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>   DelegationTokenRenewerEvent evt) {
> serviceStateLock.readLock().lock();
> try {
>   if (isServiceStarted) {
> renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>   } else {
> pendingEventQueue.add(evt);
>   }
> } finally {
>   serviceStateLock.readLock().unlock();
> }
>   }
>   @Override
>   protected void serviceStop() {
> if (renewalTimer != null) {
>   renewalTimer.cancel();
> }
> appTokens.clear();
> allTokens.clear();
> this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2
>  rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-05-31 Thread Billie Rinaldi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-7962:
-
Priority: Critical  (was: Minor)

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Critical
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch, YARN-7962.6.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>   DelegationTokenRenewerEvent evt) {
> serviceStateLock.readLock().lock();
> try {
>   if (isServiceStarted) {
> renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>   } else {
> pendingEventQueue.add(evt);
>   }
> } finally {
>   serviceStateLock.readLock().unlock();
> }
>   }
>   @Override
>   protected void serviceStop() {
> if (renewalTimer != null) {
>   renewalTimer.cancel();
> }
> appTokens.clear();
> allTokens.clear();
> this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2
>  rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-05-31 Thread Billie Rinaldi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-7962:
-
Target Version/s: 3.2.0, 3.1.1

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Critical
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch, YARN-7962.6.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>   DelegationTokenRenewerEvent evt) {
> serviceStateLock.readLock().lock();
> try {
>   if (isServiceStarted) {
> renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>   } else {
> pendingEventQueue.add(evt);
>   }
> } finally {
>   serviceStateLock.readLock().unlock();
> }
>   }
>   @Override
>   protected void serviceStop() {
> if (renewalTimer != null) {
>   renewalTimer.cancel();
> }
> appTokens.clear();
> allTokens.clear();
> this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2
>  rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497044#comment-16497044
 ] 

Vinod Kumar Vavilapalli commented on YARN-8197:
---

[~sunilg], this looks good. Can commit it if you can fix the import issues.

> Tracking URL in the app state does not get redirected to MR ApplicationMaster 
> for Running applications
> --
>
> Key: YARN-8197
> URL: https://issues.apache.org/jira/browse/YARN-8197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8197.001.patch, YARN-8197.002.patch, 
> YARN-8197.003.patch
>
>
> {code}
> org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: Could not determine the proxy server for 
> redirection
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8349) Remove YARN registry entries when a service is killed by the RM

2018-05-31 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497045#comment-16497045
 ] 

Billie Rinaldi commented on YARN-8349:
--

Thanks for the review, [~leftnoteasy]. I agree, moving the call to 
FinalTransition and changing the method name make sense. Regarding comment 3, 
the registry cleanup is only performed for applications with type yarn-service. 
The AppAdminClient.createAppAdminClient method uses the application type to 
determine which AppAdminClient subclass to instantiate.
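
As a hypothetical illustration of that dispatch (simplified stand-in classes, not the real AppAdminClient API):
{code:java}
public class AppAdminClientDispatchSketch {

  interface AdminClient {
    // Called from the RM-side cleanup path when an application finishes.
    void cleanUpRegistry(String appName);
  }

  static class YarnServiceAdminClient implements AdminClient {
    @Override
    public void cleanUpRegistry(String appName) {
      System.out.println("Removing YARN registry entries for service " + appName);
    }
  }

  static class DefaultAdminClient implements AdminClient {
    @Override
    public void cleanUpRegistry(String appName) {
      // Applications that are not yarn-service have no registry entries to remove.
    }
  }

  // The factory picks the subclass based on the application type, so registry
  // cleanup only happens for applications of type "yarn-service".
  static AdminClient createAdminClient(String applicationType) {
    if ("yarn-service".equals(applicationType)) {
      return new YarnServiceAdminClient();
    }
    return new DefaultAdminClient();
  }

  public static void main(String[] args) {
    createAdminClient("yarn-service").cleanUpRegistry("my-service");
    createAdminClient("MAPREDUCE").cleanUpRegistry("wordcount");
  }
}
{code}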

> Remove YARN registry entries when a service is killed by the RM
> ---
>
> Key: YARN-8349
> URL: https://issues.apache.org/jira/browse/YARN-8349
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8349.1.patch, YARN-8349.2.patch
>
>
> As the title states, when a service is killed by the RM (for exceeding its 
> lifetime for example), the YARN registry entries should be cleaned up.
> Without cleanup, DNS can contain multiple hostnames for a single IP address 
> in the case where IPs are reused. This impacts reverse lookups, which breaks 
> services, such as kerberos, that depend on those lookups.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497051#comment-16497051
 ] 

Sunil Govindan commented on YARN-8384:
--

After YARN-7684, I can see below code snippet in container-executor.c
{code:java}
char *init_log_path(const char *container_log_dir, const char *logfile) {
  ..
  ..
  if (change_owner(tmp_buffer, user_detail->pw_uid, user_detail->pw_gid) != 0) {

  }
  ..
  ..
}

{code}
So ideally here the log file owner is changed to the incoming user, and the group 
is also taken from that same user. I am not completely sure, but this seems like the problem.

 

cc [~leftnoteasy] [~eyang]

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Priority: Major
>  Labels: docker
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8258) YARN webappcontext for UI2 should inherit all filters from default context

2018-05-31 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497060#comment-16497060
 ] 

Vinod Kumar Vavilapalli commented on YARN-8258:
---

[~sunilg], can you please add more details? What was the original problem? And 
how is the patch fixing it?

> YARN webappcontext for UI2 should inherit all filters from default context
> --
>
> Key: YARN-8258
> URL: https://issues.apache.org/jira/browse/YARN-8258
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8258.001.patch, YARN-8258.002.patch, 
> YARN-8258.003.patch
>
>
> Thanks [~ssath...@hortonworks.com] for finding this.
> Ideally all filters from the default context have to be inherited by the UI2 
> context as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8258) YARN webappcontext for UI2 should inherit all filters from default context

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497071#comment-16497071
 ] 

Sunil Govindan commented on YARN-8258:
--

Yes [~vinodkv]

UI2 was missing the filters that are added to the default context. 
{{httpServer.getWebAppContext().getServletHandler()}} provides all FilterHolders 
and filter mappings through the {{getFilters}} and {{getFilterMappings}} APIs. To 
define the filters for UI2, we have to iterate through the list of FilterHolders 
available via the {{getFilters}} API and call {{HttpServer2.defineFilter}} for 
each. While doing this, {{getFilterMappings}} gives the URL path associated with 
each filter name, and UI2 should use the same path, except for the 
*authentication* filter, where UI2 has to add /*.

With this change, if a custom filter such as AuthenticationFilter or 
JWTAuthHandler is added, the UI2 context will have those filter details with the 
correct path.
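
A rough sketch of that iteration, assuming the default context's Jetty ServletHandler is in hand; {{defineFilterOnUi2}} below is a hypothetical stand-in for the {{HttpServer2.defineFilter}} call on the UI2 context:
{code:java}
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.eclipse.jetty.servlet.FilterHolder;
import org.eclipse.jetty.servlet.FilterMapping;
import org.eclipse.jetty.servlet.ServletHandler;

public class Ui2FilterInheritanceSketch {

  static void inheritFilters(ServletHandler defaultHandler) {
    // Collect the URL paths each filter is mapped to in the default context.
    Map<String, String[]> pathsByFilterName = new HashMap<>();
    for (FilterMapping mapping : defaultHandler.getFilterMappings()) {
      pathsByFilterName.put(mapping.getFilterName(), mapping.getPathSpecs());
    }

    for (FilterHolder holder : defaultHandler.getFilters()) {
      String[] paths = pathsByFilterName.get(holder.getName());
      // The authentication filter is the one exception: UI2 maps it to /*.
      if ("authentication".equals(holder.getName())) {
        paths = new String[] { "/*" };
      }
      defineFilterOnUi2(holder.getName(), holder.getClassName(), paths);
    }
  }

  // Hypothetical stand-in for HttpServer2.defineFilter(...) on the UI2 context.
  static void defineFilterOnUi2(String name, String className, String[] paths) {
    System.out.println("UI2 filter " + name + " (" + className + ") -> "
        + Arrays.toString(paths));
  }
}
{code}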

> YARN webappcontext for UI2 should inherit all filters from default context
> --
>
> Key: YARN-8258
> URL: https://issues.apache.org/jira/browse/YARN-8258
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8258.001.patch, YARN-8258.002.patch, 
> YARN-8258.003.patch
>
>
> Thanks [~ssath...@hortonworks.com] for finding this.
> Ideally all filters from the default context have to be inherited by the UI2 
> context as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497074#comment-16497074
 ] 

Sunil Govindan commented on YARN-8197:
--

Thanks [~vinodkv]. Updating a new patch after fixing the checkstyle issues.

> Tracking URL in the app state does not get redirected to MR ApplicationMaster 
> for Running applications
> --
>
> Key: YARN-8197
> URL: https://issues.apache.org/jira/browse/YARN-8197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8197.001.patch, YARN-8197.002.patch, 
> YARN-8197.003.patch, YARN-8197.004.patch
>
>
> {code}
> org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: Could not determine the proxy server for 
> redirection
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8197:
-
Attachment: YARN-8197.004.patch

> Tracking URL in the app state does not get redirected to MR ApplicationMaster 
> for Running applications
> --
>
> Key: YARN-8197
> URL: https://issues.apache.org/jira/browse/YARN-8197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8197.001.patch, YARN-8197.002.patch, 
> YARN-8197.003.patch, YARN-8197.004.patch
>
>
> {code}
> org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: Could not determine the proxy server for 
> redirection
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497051#comment-16497051
 ] 

Sunil Govindan edited comment on YARN-8384 at 5/31/18 7:57 PM:
---

After YARN-7684, I can see below code snippet in container-executor.c
{code:java}
char *init_log_path(const char *container_log_dir, const char *logfile) {
  ..
  ..
  if (change_owner(tmp_buffer, user_detail->pw_uid, user_detail->pw_gid) != 0) {

  }
  ..
  ..
}

{code}
So ideally here the log file owner is changed to the incoming user and group. I 
am not completely sure, but this seems like the problem.

 

cc [~leftnoteasy] [~eyang]


was (Author: sunilg):
After YARN-7684, I can see below code snippet in container-executor.c
{code:java}
char *init_log_path(const char *container_log_dir, const char *logfile) {
  ..
  ..
  if (change_owner(tmp_buffer, user_detail->pw_uid, user_detail->pw_gid) != 0) {

  }
  ..
  ..
}

{code}
So ideally here the log file owner is change to the incoming user and group is 
also take from same. I am not very sure, but this seems like the pblm.

 

cc [~leftnoteasy] [~eyang]

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Priority: Major
>  Labels: docker
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497083#comment-16497083
 ] 

Robert Kanter commented on YARN-8197:
-

+1 LGTM too (pending Jenkins).  Thanks for fixing this.

> Tracking URL in the app state does not get redirected to MR ApplicationMaster 
> for Running applications
> --
>
> Key: YARN-8197
> URL: https://issues.apache.org/jira/browse/YARN-8197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8197.001.patch, YARN-8197.002.patch, 
> YARN-8197.003.patch, YARN-8197.004.patch
>
>
> {code}
> org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: Could not determine the proxy server for 
> redirection
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-8384:
---

Assignee: Eric Yang

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Major
>  Labels: docker
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497100#comment-16497100
 ] 

Eric Yang commented on YARN-8384:
-

[~sunilg] Yes, that is the problem.  I will change this to use the node manager's 
gid instead, so that the node manager can access the log file.

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Major
>  Labels: docker
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8384:

Attachment: YARN-8384.001.patch

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Major
>  Labels: docker
> Attachments: YARN-8384.001.patch
>
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8384:
-
Priority: Blocker  (was: Major)

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Blocker
>  Labels: docker
> Attachments: YARN-8384.001.patch
>
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8384:
-
Target Version/s: 3.2.0, 3.1.1
Priority: Critical  (was: Blocker)

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: docker
> Attachments: YARN-8384.001.patch
>
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497148#comment-16497148
 ] 

genericqa commented on YARN-8342:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
15s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}115m 11s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 34m 10s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
51s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
26s{color} | {c

[jira] [Created] (YARN-8386) App log can not be viewed from Logs tab in secure cluster

2018-05-31 Thread Yesha Vora (JIRA)
Yesha Vora created YARN-8386:


 Summary:  App log can not be viewed from Logs tab in secure cluster
 Key: YARN-8386
 URL: https://issues.apache.org/jira/browse/YARN-8386
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Affects Versions: 3.1.0
Reporter: Yesha Vora


App logs cannot be viewed from the UI2 Logs tab.

Steps:
1) Launch yarn service 
2) Let application finish and go to Logs tab to view AM log

Here, the service AM API call is failing with a 401 authentication error.

{code}
Request URL: 
http://xxx:8188/ws/v1/applicationhistory/containers/container_e09_1527737134553_0034_01_01/logs/serviceam.log?_=1527799590942
Request Method: GET
Status Code: 401 Authentication required
 Response 
HTTP ERROR 401: Authentication required
Problem accessing 
/ws/v1/applicationhistory/containers/container_e09_1527737134553_0034_01_01/logs/serviceam.log.
Reason: Authentication required



[jira] [Commented] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497157#comment-16497157
 ] 

Eric Yang commented on YARN-8308:
-

[~gsaha] The patch moves the removeHdfsDelegationToken logic outside of 
doSecureLogin.  Is this step necessary for clusters without Kerberos?  What is 
the reasoning behind adding mandatory CLI options for keytab and principal?  
A user can submit a yarnfile that already includes those two parameters.

> Yarn service app fails due to issues with Renew Token
> -
>
> Key: YARN-8308
> URL: https://issues.apache.org/jira/browse/YARN-8308
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-8308.001.patch, YARN-8308.002.patch
>
>
> Run Yarn service application beyond 
> dfs.namenode.delegation.token.max-lifetime. 
> Here, yarn service application fails with below error. 
> {code}
> 2018-05-15 23:14:35,652 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
> 2018-05-15 23:14:35,654 [main] INFO  service.AbstractService - Service 
> Service Master failed in state INITED
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:883)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>   at 
> org.apache.hadoop.yarn.service.utils.JsonSerDeser.load(JsonSerDeser.java:182)
>   at 
> org.apache.hadoop.yarn.service.utils.ServiceApiUtil.loadServiceFrom(ServiceApiUtil.java:337)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.loadApplicationJson(ServiceMaster.java:242)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceInit(ServiceMaster.java:91)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:316)
> 2018-05-15 23:14:35,659 [main] INFO  service.ServiceMaster - Stopping app 
> master
> 2018-05-15 23:14:35,660 [main] ERROR service.ServiceMaster - Error starting 
> se

[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497173#comment-16497173
 ] 

Eric Yang commented on YARN-8342:
-

The failed unit tests are not related to this patch; they seem to be related 
to HADOOP-15490.

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch, YARN-8342.002.patch, 
> YARN-8342.003.patch
>
>
> During test of the Docker feature, I found that if a container comes from 
> non-privileged docker registry, the specified launch command will be ignored. 
> Container will success without any log, which is very confusing to end users. 
> And this behavior is inconsistent to containers from privileged docker 
> registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497190#comment-16497190
 ] 

genericqa commented on YARN-7962:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 33s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}131m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-7962 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925963/YARN-7962.6.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 85663cc4551f 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a58acd9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20914/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20914/testReport/ |
| Max. process+thread count | 847 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/ha

[jira] [Commented] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497207#comment-16497207
 ] 

genericqa commented on YARN-8197:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 43m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 49s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
29s{color} | {color:green} hadoop-yarn-server-web-proxy in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 87m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8197 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925977/YARN-8197.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  findbugs  checkstyle  |
| uname | Linux 42e746d9ff44 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 950dea8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20915/testReport/ |
| Max. process+thread count | 301 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy 
U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy 
|
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20915/console |
| Powered by | Ap

[jira] [Updated] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-05-31 Thread Suma Shivaprasad (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated YARN-8372:
---
Attachment: YARN-8372.2.patch

> ApplicationAttemptNotFoundException should be handled correctly by 
> Distributed Shell App Master
> ---
>
> Key: YARN-8372
> URL: https://issues.apache.org/jira/browse/YARN-8372
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Reporter: Charan Hebri
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-8372.1.patch, YARN-8372.2.patch
>
>
> {noformat}
> try {
>   response = client.allocate(progress);
> } catch (ApplicationAttemptNotFoundException e) {
>   handler.onShutdownRequest();
>   LOG.info("Shutdown requested. Stopping callback.");
>   return;{noformat}
> is a code snippet from AMRMClientAsyncImpl. The corresponding 
> onShutdownRequest call for the Distributed Shell App master,
> {noformat}
> @Override
> public void onShutdownRequest() {
>   done = true;
> }{noformat}
> Due to the above change, the current behavior is that whenever an application 
> attempt fails due to a NM restart (NM where the DS AM is running), an 
> ApplicationAttemptNotFoundException is thrown and all containers for that 
> attempt including the ones that are running on other NMs are killed by the AM 
> and marked as COMPLETE. The subsequent attempt spawns new containers just 
> like a new attempt. This behavior is different to a Map Reduce application 
> where the containers are not killed.
> cc [~rohithsharma]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-05-31 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497234#comment-16497234
 ] 

Suma Shivaprasad commented on YARN-8372:


Attached a patch with fixes to pass the keep_containers_across_application_attempts 
option in the AM startup options.
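
For reference, a minimal sketch of how such a flag typically reaches YARN through the submission context; the wrapper method below is illustrative and not copied from the patch:
{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

public class KeepContainersSketch {
  // Ask the RM to keep running containers from a failed attempt alive, so the
  // next AM attempt can recover them instead of starting from scratch.
  static void applyKeepContainersOption(ApplicationSubmissionContext appContext,
      boolean keepContainers) {
    appContext.setKeepContainersAcrossApplicationAttempts(keepContainers);
  }
}
{code}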

> ApplicationAttemptNotFoundException should be handled correctly by 
> Distributed Shell App Master
> ---
>
> Key: YARN-8372
> URL: https://issues.apache.org/jira/browse/YARN-8372
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Reporter: Charan Hebri
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-8372.1.patch, YARN-8372.2.patch
>
>
> {noformat}
> try {
>   response = client.allocate(progress);
> } catch (ApplicationAttemptNotFoundException e) {
>   handler.onShutdownRequest();
>   LOG.info("Shutdown requested. Stopping callback.");
>   return;{noformat}
> is a code snippet from AMRMClientAsyncImpl. The corresponding 
> onShutdownRequest call for the Distributed Shell App master,
> {noformat}
> @Override
> public void onShutdownRequest() {
>   done = true;
> }{noformat}
> Due to the above change, the current behavior is that whenever an application 
> attempt fails due to a NM restart (NM where the DS AM is running), an 
> ApplicationAttemptNotFoundException is thrown and all containers for that 
> attempt including the ones that are running on other NMs are killed by the AM 
> and marked as COMPLETE. The subsequent attempt spawns new containers just 
> like a new attempt. This behavior is different to a Map Reduce application 
> where the containers are not killed.
> cc [~rohithsharma]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-05-31 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497232#comment-16497232
 ] 

Eric Payne commented on YARN-4606:
--

Thanks [~maniraj...@gmail.com] for the updated patch. Here are my comments so 
far:
- I am concerned that this implementation adds code that is specific to 
{{CapacityScheduler}} inside of {{AppSchedulingInfo}}. I feel that this sets a 
precedent that makes it hard to maintain a clean separation between abstract 
and specific scheduler code. Also, this only fixes the problem for the 
{{CapacityScheduler}}. The previous fix in patch 001 was relying on metrics and 
I realize that is risky, but it was a more generic fix. I would be interested 
to hear thoughts from [~sunilg] and [~leftnoteasy].
- Only the {{CapacityScheduler}} has been changed to handle the new 
{{AppAMAttemptsFailedSchedulerEvent}}. Should the other schedulers handle that 
as well? If they don't handle it, don't we risk them getting unhandled event 
exceptions?
- In all places where new {{LOG.debug(...)}} statements are added, please also 
enclose them with {{if (LOG.isDebugEnabled())}} (see the sketch below). This is 
for the sake of performance, so that the strings are not built, passed to 
{{LOG.debug()}}, and then thrown away when debug logging is not enabled.
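
For clarity, a minimal sketch of the guard being asked for (variable names are 
hypothetical, not taken from the patch):
{code:java}
if (LOG.isDebugEnabled()) {
  // The string concatenation only happens when debug logging is enabled.
  LOG.debug("User " + userName + " has " + pendingApps
      + " pending applications, activeUsers=" + activeUsers);
}
{code}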


> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Manikandan R
>Priority: Critical
> Attachments: YARN-4606.001.patch, YARN-4606.002.patch, 
> YARN-4606.003.patch, YARN-4606.1.poc.patch, YARN-4606.POC.2.patch, 
> YARN-4606.POC.patch
>
>
> Currently, if all applications belong to same user in LeafQueue are pending 
> (caused by max-am-percent, etc.), ActiveUsersManager still considers the user 
> is an active user. This could lead to starvation of active applications, for 
> example:
> - App1(belongs to user1)/app2(belongs to user2) are active, app3(belongs to 
> user3)/app4(belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, there're only two users (user1/user2) are able to allocate new 
> resources. So computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8259) Revisit liveliness checks for Docker containers

2018-05-31 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497239#comment-16497239
 ] 

Eric Badger commented on YARN-8259:
---

For proposal #1, if the yarn user is whitelisted for hidepid, then isn't that 
going to get you basically the same situation as checking pids as a privileged 
user? I.e. you'll be able to see all arbitrary pids if you are able to 
compromise the yarn user. If that's a non-starter, then we have no choice but 
to go with proposal #4 (even though I would prefer #1). 

> Revisit liveliness checks for Docker containers
> ---
>
> Key: YARN-8259
> URL: https://issues.apache.org/jira/browse/YARN-8259
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.2, 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Blocker
>  Labels: Docker
> Attachments: YARN-8259.001.patch
>
>
> As privileged containers may execute as a user that does not match the YARN 
> run as user, sending the null signal for liveliness checks could fail. We 
> need to reconsider how liveliness checks are handled in the Docker case.
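
For readers unfamiliar with the probe mentioned above, a minimal standalone 
illustration of a null-signal liveliness check (plain Java, assumes a POSIX 
{{kill}} binary on the PATH; this is not the NodeManager or container-executor 
code):
{code:java}
import java.io.IOException;

public class NullSignalProbe {
  /** Returns true if the pid exists and the caller is allowed to signal it. */
  static boolean isAlive(int pid) throws IOException, InterruptedException {
    // "kill -0" delivers no signal; it only checks existence and permission.
    // It fails when the caller lacks permission to signal the process -- the
    // exact case this JIRA describes for privileged Docker containers.
    Process p = new ProcessBuilder("kill", "-0", Integer.toString(pid)).start();
    return p.waitFor() == 0;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(isAlive(Integer.parseInt(args[0])));
  }
}
{code}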



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497249#comment-16497249
 ] 

genericqa commented on YARN-8384:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
37m 37s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 34m  7s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 87m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8384 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925982/YARN-8384.001.patch |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux daa0861ae5f2 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 
19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 950dea8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20916/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20916/testReport/ |
| Max. process+thread count | 311 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20916/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Critical
> 

[jira] [Commented] (YARN-8259) Revisit liveliness checks for Docker containers

2018-05-31 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497266#comment-16497266
 ] 

Shane Kumpf commented on YARN-8259:
---

Thanks for the feedback, [~ebadger].
{quote}if the yarn user is whitelisted for hidepid, then isn't that going to 
get you basically the same situation as checking pids as a privileged user?
{quote}
Perhaps non-starter was a bit harsh. I do see what you mean but I think they 
are a bit different. To clarify, if the admin has explicitly enabled hidepid, 
allowing yarn to bypass that protection via c-e would be surprising behavior, 
IMO. If hidepid is disabled or the yarn user is explicitly whitelisted, then 
the admin should not be surprised that the yarn user can see all pids.

> Revisit liveliness checks for Docker containers
> ---
>
> Key: YARN-8259
> URL: https://issues.apache.org/jira/browse/YARN-8259
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.2, 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Blocker
>  Labels: Docker
> Attachments: YARN-8259.001.patch
>
>
> As privileged containers may execute as a user that does not match the YARN 
> run as user, sending the null signal for liveliness checks could fail. We 
> need to reconsider how liveliness checks are handled in the Docker case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6677:
-
Issue Type: Improvement  (was: Sub-task)
Parent: (was: YARN-1011)

> Preempt opportunistic containers when root container cgroup goes over memory 
> limit
> --
>
> Key: YARN-6677
> URL: https://issues.apache.org/jira/browse/YARN-6677
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-6677.00.patch, YARN-6677.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497295#comment-16497295
 ] 

genericqa commented on YARN-8372:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 14s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell:
 The patch generated 2 new + 135 unchanged - 5 fixed = 137 total (was 140) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 59s{color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 74m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.applications.distributedshell.TestDistributedShell |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8372 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12926004/YARN-8372.2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ece9b9b216f8 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 950dea8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20917/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applicati

[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497320#comment-16497320
 ] 

Eric Yang commented on YARN-8342:
-

[~shaneku...@gmail.com] Could you help to review this patch?  Thanks

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch, YARN-8342.002.patch, 
> YARN-8342.003.patch
>
>
> During test of the Docker feature, I found that if a container comes from 
> non-privileged docker registry, the specified launch command will be ignored. 
> Container will success without any log, which is very confusing to end users. 
> And this behavior is inconsistent to containers from privileged docker 
> registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-31 Thread Gour Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497333#comment-16497333
 ] 

Gour Saha commented on YARN-8308:
-

Thanks for reviewing the patch [~eyang]. I have updated the patch to ensure 
removeHdfsDelegationToken gets called for secure clusters only. The keytab and 
principal options are not mandatory in the CLI; only the service name is mandatory.
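
For context, a minimal sketch of the kind of secure-cluster guard described above 
({{removeHdfsDelegationToken}} is the method named in this comment; the surrounding 
call site is hypothetical):
{code:java}
// Only try to remove the HDFS delegation token when Kerberos security is
// enabled; in an insecure cluster there is no delegation token to remove.
if (org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled()) {
  removeHdfsDelegationToken(service);  // hypothetical call site
}
{code}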

> Yarn service app fails due to issues with Renew Token
> -
>
> Key: YARN-8308
> URL: https://issues.apache.org/jira/browse/YARN-8308
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-8308.001.patch, YARN-8308.002.patch, 
> YARN-8308.003.patch
>
>
> Run Yarn service application beyond 
> dfs.namenode.delegation.token.max-lifetime. 
> Here, yarn service application fails with below error. 
> {code}
> 2018-05-15 23:14:35,652 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
> 2018-05-15 23:14:35,654 [main] INFO  service.AbstractService - Service 
> Service Master failed in state INITED
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:883)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>   at 
> org.apache.hadoop.yarn.service.utils.JsonSerDeser.load(JsonSerDeser.java:182)
>   at 
> org.apache.hadoop.yarn.service.utils.ServiceApiUtil.loadServiceFrom(ServiceApiUtil.java:337)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.loadApplicationJson(ServiceMaster.java:242)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceInit(ServiceMaster.java:91)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:316)
> 2018-05-15 23:14:35,659 [main] INFO  service.ServiceMaster - Stopping app 
> master
> 2018-05-15 23:14:35,660 [main] ERROR service.ServiceMaster - Error starting 
> service master
> org.apache.hadoop.s

[jira] [Updated] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-31 Thread Gour Saha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated YARN-8308:

Attachment: YARN-8308.003.patch

> Yarn service app fails due to issues with Renew Token
> -
>
> Key: YARN-8308
> URL: https://issues.apache.org/jira/browse/YARN-8308
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-8308.001.patch, YARN-8308.002.patch, 
> YARN-8308.003.patch
>
>
> Run Yarn service application beyond 
> dfs.namenode.delegation.token.max-lifetime. 
> Here, yarn service application fails with below error. 
> {code}
> 2018-05-15 23:14:35,652 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
> 2018-05-15 23:14:35,654 [main] INFO  service.AbstractService - Service 
> Service Master failed in state INITED
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:883)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>   at 
> org.apache.hadoop.yarn.service.utils.JsonSerDeser.load(JsonSerDeser.java:182)
>   at 
> org.apache.hadoop.yarn.service.utils.ServiceApiUtil.loadServiceFrom(ServiceApiUtil.java:337)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.loadApplicationJson(ServiceMaster.java:242)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceInit(ServiceMaster.java:91)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:316)
> 2018-05-15 23:14:35,659 [main] INFO  service.ServiceMaster - Stopping app 
> master
> 2018-05-15 23:14:35,660 [main] ERROR service.ServiceMaster - Error starting 
> service master
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=152642

[jira] [Commented] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-31 Thread Gour Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497334#comment-16497334
 ] 

Gour Saha commented on YARN-8308:
-

I uploaded patch 003 with the fixes.

> Yarn service app fails due to issues with Renew Token
> -
>
> Key: YARN-8308
> URL: https://issues.apache.org/jira/browse/YARN-8308
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-8308.001.patch, YARN-8308.002.patch, 
> YARN-8308.003.patch
>
>
> Run Yarn service application beyond 
> dfs.namenode.delegation.token.max-lifetime. 
> Here, yarn service application fails with below error. 
> {code}
> 2018-05-15 23:14:35,652 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
> 2018-05-15 23:14:35,654 [main] INFO  service.AbstractService - Service 
> Service Master failed in state INITED
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:883)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>   at 
> org.apache.hadoop.yarn.service.utils.JsonSerDeser.load(JsonSerDeser.java:182)
>   at 
> org.apache.hadoop.yarn.service.utils.ServiceApiUtil.loadServiceFrom(ServiceApiUtil.java:337)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.loadApplicationJson(ServiceMaster.java:242)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceInit(ServiceMaster.java:91)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:316)
> 2018-05-15 23:14:35,659 [main] INFO  service.ServiceMaster - Stopping app 
> master
> 2018-05-15 23:14:35,660 [main] ERROR service.ServiceMaster - Error starting 
> service master
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, rene

[jira] [Commented] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497342#comment-16497342
 ] 

Vinod Kumar Vavilapalli commented on YARN-8384:
---

bq.  I will change this to use node manager gid instead for node manager to 
access the log file.
My understanding is different. The log files are completely specified by the 
user-land, so NM / container-executor don't even know what the file-names are 
going to be.

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: docker
> Attachments: YARN-8384.001.patch
>
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497358#comment-16497358
 ] 

Wangda Tan commented on YARN-8384:
--

Thanks [~eyang] for working on this patch.

Discussed with [~eyang], the patch looks good and there's no backward 
incompatible change. 

[~eyang], have you done any verification of the patch? I will commit the patch if 
it is verified.

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: docker
> Attachments: YARN-8384.001.patch
>
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8349) Remove YARN registry entries when a service is killed by the RM

2018-05-31 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497362#comment-16497362
 ] 

Wangda Tan commented on YARN-8349:
--

Gotcha, makes sense to me, thanks [~billie.rinaldi]! 

> Remove YARN registry entries when a service is killed by the RM
> ---
>
> Key: YARN-8349
> URL: https://issues.apache.org/jira/browse/YARN-8349
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8349.1.patch, YARN-8349.2.patch
>
>
> As the title states, when a service is killed by the RM (for exceeding its 
> lifetime for example), the YARN registry entries should be cleaned up.
> Without cleanup, DNS can contain multiple hostnames for a single IP address 
> in the case where IPs are reused. This impacts reverse lookups, which breaks 
> services, such as kerberos, that depend on those lookups.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer

2018-05-31 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497368#comment-16497368
 ] 

Wangda Tan commented on YARN-7962:
--

The failed tests happened in other JIRAs as well: 
https://builds.apache.org/job/PreCommit-YARN-Build/20853/testReport/ 

If everybody agrees, I will commit the patch by tomorrow.

> Race Condition When Stopping DelegationTokenRenewer
> ---
>
> Key: YARN-7962
> URL: https://issues.apache.org/jira/browse/YARN-7962
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Critical
> Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, 
> YARN-7962.4.patch, YARN-7962.6.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>   DelegationTokenRenewerEvent evt) {
> serviceStateLock.readLock().lock();
> try {
>   if (isServiceStarted) {
> renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>   } else {
> pendingEventQueue.add(evt);
>   }
> } finally {
>   serviceStateLock.readLock().unlock();
> }
>   }
>   @Override
>   protected void serviceStop() {
> if (renewalTimer != null) {
>   renewalTimer.cancel();
> }
> appTokens.clear();
> allTokens.clear();
> this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2
>  rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> What I think is going on here is that the {{serviceStop}} method is not 
> setting the {{isServiceStarted}} flag to 'false'.
> Please update so that the {{serviceStop}} method grabs the 
> {{serviceStateLock}} and sets {{isServiceStarted}} to _false_, before 
> shutting down the {{renewerService}} thread pool, to avoid this condition.
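
For readers following along, a minimal sketch of the serviceStop change the 
description asks for (field names taken from the snippet above; this is an 
illustration, not the attached patch):
{code:java}
@Override
protected void serviceStop() {
  // Flip the flag under the write lock so no event can slip into the executor
  // after shutdown has begun; late events go to pendingEventQueue instead.
  serviceStateLock.writeLock().lock();
  try {
    isServiceStarted = false;
  } finally {
    serviceStateLock.writeLock().unlock();
  }
  if (renewalTimer != null) {
    renewalTimer.cancel();
  }
  appTokens.clear();
  allTokens.clear();
  this.renewerService.shutdown();
}
{code}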



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497377#comment-16497377
 ] 

Eric Yang commented on YARN-8384:
-

[~vinodkv], there are 3 different paths to start a docker container:

1. Someone who runs distributed shell and appends:
{code}
 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
{code}
The filename is user defined, but the file permission depends on the umask of 
the docker image.  By default the umask is 022, and anyone can read the file 
via the "other" bits.  The file is owned by the uid:gid of the submission user 
in secure mode, or nobody:nobody in insecure mode.  This is a bit leaky by 
security standards.  The Hadoop 3.1 implementation does not break backward 
compatibility for this mode.
 
2. Yarn Native Service yarn mode
This mode initializes stdout.txt and stderr.txt with the uid of the submission 
user and the gid of the node manager.  The end user and the node manager web 
application are the only two parties that can look at the log.  If the end user 
tries to redirect logs to another filename, the generated file's permission 
will end up governed by the uid:gid and umask of the docker container.  
However, the output will still end up in stdout.txt and stderr.txt because 
those redirections are appended last in the launch command.

3. Yarn Service docker mode (ENTRY_POINT)
When using ENTRY_POINT, stdout and stderr are written to stdout.txt and 
stderr.txt through dup2 redirection.  It is not possible to use shell command 
redirection because the execution is via execvp without shell expansion.  The 
user can choose to write logs to additional mount directories, but those custom 
logs will not be aggregated by the YARN framework.

Options 1 and 2 are kept around for backward compatibility reasons, but they 
make it possible for a container to write a file with permissions that the node 
manager cannot process.  The option 3 setup, with stdout.txt and stderr.txt 
owned by the launching user and readable by the node manager, is the safest and 
is recommended for future development.
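
For illustration only, a minimal sketch of the "owned by the launching user, 
readable by the node manager group" layout using plain Java NIO (the group name, 
path, and helper class are assumptions; in YARN this step is performed by the 
privileged container-executor, not by Java code):
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.GroupPrincipal;
import java.nio.file.attribute.PosixFileAttributeView;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class LogFileInit {
  // Create a log file readable by its owner and by the given group only
  // (mode 640).  Changing the group generally requires elevated privileges,
  // which is why container-executor performs this step in YARN.
  static void createLog(Path path, String nmGroup) throws IOException {
    Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rw-r-----");
    Files.createFile(path, PosixFilePermissions.asFileAttribute(perms));
    GroupPrincipal group = path.getFileSystem().getUserPrincipalLookupService()
        .lookupPrincipalByGroupName(nmGroup);
    Files.getFileAttributeView(path, PosixFileAttributeView.class).setGroup(group);
  }

  public static void main(String[] args) throws IOException {
    createLog(Paths.get("stdout.txt"), "hadoop");  // "hadoop" is a placeholder group
  }
}
{code}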

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: docker
> Attachments: YARN-8384.001.patch
>
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8197) Tracking URL in the app state does not get redirected to MR ApplicationMaster for Running applications

2018-05-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497379#comment-16497379
 ] 

Hudson commented on YARN-8197:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14331 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14331/])
YARN-8197. Fixed AM IP Filter and Webapp proxy to redirect app (vinodkv: rev 
6b74f5d7fc509c55c331249256eec78b7e53b6ce)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/test/java/org/apache/hadoop/yarn/server/webproxy/amfilter/TestSecureAmFilter.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/test/resources/krb5.conf
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/pom.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/amfilter/AmIpFilter.java


> Tracking URL in the app state does not get redirected to MR ApplicationMaster 
> for Running applications
> --
>
> Key: YARN-8197
> URL: https://issues.apache.org/jira/browse/YARN-8197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8197.001.patch, YARN-8197.002.patch, 
> YARN-8197.003.patch, YARN-8197.004.patch
>
>
> {code}
> org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: Could not determine the proxy server for 
> redirection
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:211)
>   at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:145)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1617)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497403#comment-16497403
 ] 

Eric Yang commented on YARN-8384:
-

[~leftnoteasy] I verified that the file permissions are set up correctly with this 
patch:

{code}
[root@eyang-4 container_1527798256852_0001_01_03]# ls -la
total 4
drwxr-s---. 2 nobody hadoop  40 May 31 20:24 .
drwxr-s---. 3 nobody hadoop  51 May 31 20:24 ..
-rw-r-----. 1 nobody hadoop   0 May 31 20:24 stderr.txt
-rw-r-----. 1 nobody hadoop 460 May 31 20:24 stdout.txt
{code}

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: docker
> Attachments: YARN-8384.001.patch
>
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8384) stdout, stderr logs of a Native Service container is coming with group as nobody

2018-05-31 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497406#comment-16497406
 ] 

Wangda Tan commented on YARN-8384:
--

Thanks [~eyang], will commit the patch by tomorrow if no objections.

> stdout, stderr logs of a Native Service container is coming with group as 
> nobody
> 
>
> Key: YARN-8384
> URL: https://issues.apache.org/jira/browse/YARN-8384
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Sunil Govindan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: docker
> Attachments: YARN-8384.001.patch
>
>
> When {{yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users}} 
> is set to true, and 
> {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is 
> set to nobody.
> This will cause the docker to run as nobody:nobody in yarn mode.
> The log files will be initialized as nobody:nobody:
> {noformat}
> -rw-r--r-- 1 nobody hadoop 354 May 31 17:33 container-localizer-syslog
> -rw-r--r-- 1 nobody hadoop 1042 May 31 17:35 directory.info
> -rw-r----- 1 nobody hadoop 4944 May 31 17:35 launch_container.sh
> -rw-r--r-- 1 nobody hadoop 440 May 31 17:35 prelaunch.err
> -rw-r--r-- 1 nobody hadoop 100 May 31 17:35 prelaunch.out
> -rw-r----- 1 nobody nobody 18733 May 31 17:37 stderr.txt
> -rw-r----- 1 nobody nobody 400 May 31 17:35 stdout.txt
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497409#comment-16497409
 ] 

genericqa commented on YARN-8308:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
39s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8308 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12926022/YARN-8308.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e623aea95ab2 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 32671d8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20918/testReport/ |
| Max. process+thread count | 758 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20918/console |

[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-31 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497411#comment-16497411
 ] 

Miklos Szegedi commented on YARN-8375:
--

I updated the patch so that the test runs only on the verified platform where it 
is not flaky.
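
For reference, a minimal sketch of one common way to confine a JUnit test to a single verified platform, assuming JUnit 4's Assume API; the class name and the Linux check here are hypothetical illustrations, not the attached patch.

{code:java}
import static org.junit.Assume.assumeTrue;

import org.junit.Before;
import org.junit.Test;

// Hypothetical illustration of restricting a flaky test to the one
// verified platform (Linux); not the actual YARN-8375 patch.
public class PlatformGuardedTestSketch {

  @Before
  public void onlyOnVerifiedPlatform() {
    // Skips (rather than fails) every test in this class on any other OS.
    assumeTrue("only verified on Linux",
        System.getProperty("os.name").toLowerCase().startsWith("linux"));
  }

  @Test
  public void testSomethingPlatformSpecific() {
    // ... platform-specific test body would go here ...
  }
}
{code}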

> TestCGroupElasticMemoryController fails surefire build
> --
>
> Key: YARN-8375
> URL: https://issues.apache.org/jira/browse/YARN-8375
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Jason Lowe
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-8375.000.patch, YARN-8375.001.patch, 
> YARN-8375.002.patch
>
>
> hadoop-yarn-server-nodemanager precommit builds have been failing unit tests 
> recently because TestCGroupElasticMemoryController is either exiting or 
> timing out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


