[jira] [Updated] (YARN-7756) AMRMProxyService can't enable 'hadoop.security.authorization'

2018-01-31 Thread leiqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leiqiang updated YARN-7756:
---
Attachment: YARN-7756.v1.patch

> AMRMProxyService can't enable 'hadoop.security.authorization'
> --
>
> Key: YARN-7756
> URL: https://issues.apache.org/jira/browse/YARN-7756
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0
>Reporter: leiqiang
>Priority: Major
> Attachments: YARN-7756.v0.patch, YARN-7756.v1.patch
>
>
> After setting hadoop.security.authorization=true, starting the AMRMProxyService 
> fails with the following error:
> {quote}org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
>  failed in state STARTED; cause: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.security.authorize.AuthorizationException: Protocol 
> interface org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB is not known.
>  org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.security.authorize.AuthorizationException: Protocol 
> interface org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB is not known.
>  at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:177)
>  at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:121)
>  at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:250)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>  at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:844)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>  at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1114)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>  at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1529)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1803)
>  at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1525)
>  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1458)
>  Caused by: org.apache.hadoop.security.authorize.AuthorizationException: 
> Protocol interface org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB is 
> not known.
>  at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>  at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>  at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
>  at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>  at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>  at com.sun.proxy.$Proxy36.registerApplicationMaster(Unknown Source)
>  at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:161)
>  ... 14 more
>  Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
>  Protocol interface org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB is 
> not known.
>  at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>  at com.sun.proxy.$Proxy35.registerApplicationMaster(Unknown Source)
>  at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
>  ... 21 more
> {quote}
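The stack trace indicates that service-level authorization on the AMRMProxy's RPC server cannot resolve an ACL for ApplicationMasterProtocolPB. As a hedged sketch of the general mechanism (an illustration, not necessarily the attached patch), a Hadoop RPC server resolves such ACLs through a PolicyProvider that maps protocol classes to policy keys:

{code:java}
// Illustrative PolicyProvider sketch; the class name is hypothetical, but
// "security.applicationmaster.protocol.acl" is the standard hadoop-policy.xml
// key YARN uses for ApplicationMasterProtocol.
import org.apache.hadoop.security.authorize.PolicyProvider;
import org.apache.hadoop.security.authorize.Service;
import org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB;

public class AMRMProxyPolicyProvider extends PolicyProvider {
  @Override
  public Service[] getServices() {
    return new Service[] {
        new Service("security.applicationmaster.protocol.acl",
            ApplicationMasterProtocolPB.class)
    };
  }
}
{code}

If the RPC server started inside AMRMProxyService never calls {{Server#refreshServiceAcl(Configuration, PolicyProvider)}} with a provider covering this protocol, authorization fails with exactly the "Protocol interface ... is not known" error above.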






[jira] [Comment Edited] (YARN-7855) RMAppAttemptImpl throws Invalid event: CONTAINER_ALLOCATED at ALLOCATED_SAVING Exception

2018-01-31 Thread Zhizhen Hou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348122#comment-16348122
 ] 

Zhizhen Hou edited comment on YARN-7855 at 2/1/18 7:14 AM:
---

I have reproduced this error. I ran a MapReduce job, found the MRAppMaster 
process at runtime, and killed it. The NodeManager reports this to the 
ResourceManager, which reports it to the RMAppImpl object, and a new 
RMAppAttempt is created as the current RMAppAttempt. But during this period, 
the containers requested by the former MRAppMaster are allocated to the current 
RMAppAttempt, which cannot handle this message: the state machine defines no 
transition for CONTAINER_ALLOCATED in its current (ALLOCATED_SAVING) state.
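For reference, the corner described above can be mimicked with Hadoop's {{StateMachineFactory}}. The enums below are simplified stand-ins, and the extra self-transition is only one plausible mitigation for the race, not the committed fix:

{code:java}
import org.apache.hadoop.yarn.state.StateMachine;
import org.apache.hadoop.yarn.state.StateMachineFactory;

public class AttemptStateDemo {
  enum State { ALLOCATED_SAVING, ALLOCATED }
  enum EventType { ATTEMPT_NEW_SAVED, CONTAINER_ALLOCATED }
  static class Event { }

  // Without the second transition, CONTAINER_ALLOCATED at ALLOCATED_SAVING
  // throws InvalidStateTransitonException, matching the report; with it, a
  // late allocation from the previous attempt is simply absorbed.
  private static final StateMachineFactory<AttemptStateDemo, State, EventType, Event>
      FACTORY = new StateMachineFactory<AttemptStateDemo, State, EventType, Event>(
          State.ALLOCATED_SAVING)
      .addTransition(State.ALLOCATED_SAVING, State.ALLOCATED,
          EventType.ATTEMPT_NEW_SAVED)
      .addTransition(State.ALLOCATED_SAVING, State.ALLOCATED_SAVING,
          EventType.CONTAINER_ALLOCATED) // hypothetical tolerance for the race
      .installTopology();

  public static void main(String[] args) throws Exception {
    StateMachine<State, EventType, Event> sm = FACTORY.make(new AttemptStateDemo());
    sm.doTransition(EventType.CONTAINER_ALLOCATED, new Event()); // no exception
    System.out.println(sm.getCurrentState()); // still ALLOCATED_SAVING
  }
}
{code}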


was (Author: houzhizhen):
I have reproduce this error. I run a mapreduce job. At runtime, I find the 
MRAppMaster process and kill it. The NodeManager will report this to 
ResourceManager. The ResourceManager will report it to RMAppImpl object, and it 
will recreate a RMAppAttempt as current RMAppAttempt. But during this period, 
the containers request by former MRAppMaster will be allocated to current 
RMAppAttempt. The current RMAppAttempt can not deal this message . The state 
machine does not include transition from current to CONTAINER_ALLOCATED.

> RMAppAttemptImpl throws Invalid event: CONTAINER_ALLOCATED at 
> ALLOCATED_SAVING Exception
> 
>
> Key: YARN-7855
> URL: https://issues.apache.org/jira/browse/YARN-7855
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.5
>Reporter: Zhizhen Hou
>Priority: Major
>
> After upgrading Hadoop from 2.6 to 2.7.5, the ResourceManager occasionally 
> reports the following error:
>  
> {code:java}
> 2018-01-30 14:12:41,349 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> CONTAINER_ALLOCATED at ALLOCATED_SAVING
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:808)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:803)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:784)
>     at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>     at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>     at java.lang.Thread.run(Thread.java:745)
> 2018-01-30 14:12:41,351 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> CONTAINER_ALLOCATED at ALLOCATED_SAVING
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:808)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:803)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:784)
>     at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>     at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>     at java.lang.Thread.run(Thread.java:745){code}




[jira] [Commented] (YARN-7855) RMAppAttemptImpl throws Invalid event: CONTAINER_ALLOCATED at ALLOCATED_SAVING Exception

2018-01-31 Thread Zhizhen Hou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348122#comment-16348122
 ] 

Zhizhen Hou commented on YARN-7855:
---

I have reproduced this error. I ran a MapReduce job, found the MRAppMaster 
process at runtime, and killed it. The NodeManager reports this to the 
ResourceManager, which reports it to the RMAppImpl object, and a new 
RMAppAttempt is created as the current RMAppAttempt. But during this period, 
the containers requested by the former MRAppMaster are allocated to the current 
RMAppAttempt, which cannot handle this message: the state machine defines no 
transition for CONTAINER_ALLOCATED in its current (ALLOCATED_SAVING) state.

> RMAppAttemptImpl throws Invalid event: CONTAINER_ALLOCATED at 
> ALLOCATED_SAVING Exception
> 
>
> Key: YARN-7855
> URL: https://issues.apache.org/jira/browse/YARN-7855
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.5
>Reporter: Zhizhen Hou
>Priority: Major
>
> After upgrading Hadoop from 2.6 to 2.7.5, the ResourceManager occasionally 
> reports the following error:
>  
> {code:java}
> 2018-01-30 14:12:41,349 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> CONTAINER_ALLOCATED at ALLOCATED_SAVING
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:808)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:803)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:784)
>     at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>     at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>     at java.lang.Thread.run(Thread.java:745)
> 2018-01-30 14:12:41,351 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> CONTAINER_ALLOCATED at ALLOCATED_SAVING
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:808)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:803)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:784)
>     at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>     at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>     at java.lang.Thread.run(Thread.java:745){code}






[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2018-01-31 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348116#comment-16348116
 ] 

Eric Yang commented on YARN-7221:
-

[~shaneku...@gmail.com] [~ebadger] Thanks for the review.  I agree with Eric 
that a user without sudo privileges should not be allowed to run privileged 
containers.  This is somewhat stated in the [Docker 
security|https://docs.docker.com/engine/security/security/] document. A sudo 
check is the most common mechanism that avoids reinventing the user-management 
aspect of Linux.


> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7221.001.patch, YARN-7221.002.patch
>
>
> When a Docker container runs with privileges, the majority use case is to have 
> a program start as root and then drop privileges to another user, e.g. httpd 
> starting privileged to bind to port 80, then dropping privileges to the www 
> user.  
> # We should add a security check for submitting users, to verify that they 
> have "sudo" access to run privileged containers.  
> # We should remove --user=uid:gid for privileged containers.  
>  
> Docker can be launched with both the --privileged=true and --user=uid:gid 
> flags.  With this parameter combination, the user does not get access to 
> become root: every docker exec command drops to the uid:gid user instead of 
> being granted privileges.  A user can still gain root privileges if the 
> container file system contains files that grant extra power, but this type of 
> image is considered dangerous, and a non-privileged user could launch a 
> container with special bits to acquire the same level of root power.  Hence, 
> we lose control of which images should run with --privileged and who has sudo 
> rights to use privileged container images.  As a result, we should check for 
> sudo access and then decide whether to parameterize --privileged=true OR 
> --user=uid:gid.  This will avoid leading developers down the wrong path.
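A minimal sketch of the kind of check step 1 describes, assuming sudo membership can be approximated by Unix group membership (the class and group names are illustrative, not the attached patch):

{code:java}
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.security.UserGroupInformation;

public class PrivilegedContainerCheck {
  // Groups assumed here to imply sudo rights; a real deployment would make
  // this configurable rather than hard-coding group names.
  private static final List<String> SUDO_GROUPS = Arrays.asList("sudo", "wheel");

  /** Decide between --privileged=true and --user=uid:gid, per the proposal. */
  static String dockerRunFlag(String user, int uid, int gid) {
    UserGroupInformation ugi = UserGroupInformation.createRemoteUser(user);
    for (String group : ugi.getGroupNames()) { // resolved via the group mapping
      if (SUDO_GROUPS.contains(group)) {
        return "--privileged=true";            // trusted: no --user downgrade
      }
    }
    return "--user=" + uid + ":" + gid;        // untrusted: never privileged
  }
}
{code}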






[jira] [Commented] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread wangwj (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348106#comment-16348106
 ] 

wangwj commented on YARN-7859:
--

Hi [~yufeigu], thank you for your reply.
I think enabling this feature (deadline) does not violate the core part of the 
fair scheduler, because it only mitigates the phenomenon of a queue not being 
scheduled for a long time. When a queue's resources are sufficient but it has 
not been scheduled for a long time, we forcibly schedule that queue once. I do 
not modify the fairShare of any schedulable; the fairness of each schedulable 
still follows the previous algorithm.
It is also worth spelling out the scheduling-starvation phenomenon: the apps in 
a queue stay pending even though the queue's resource usage is smaller than its 
minResources.
Thanks...
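To make the proposal concrete, here is a hedged sketch of the selection logic as described above; the names and structure are illustrative only, not the attached patch:

{code:java}
import java.util.Comparator;
import java.util.List;

class QueueState {
  String name;
  long lastScheduledMs;       // updated whenever the queue gets an allocation
  long deadlineMs;            // the proposed per-queue deadline
  double usageOverFairShare;  // ratio driving the normal fair-share ordering
}

class DeadlineAwareQueuePicker {
  /**
   * Starved queues (past their deadline) are scheduled first, once; otherwise
   * the usual fair-share ordering applies, so fairShare itself never changes.
   */
  QueueState pick(List<QueueState> queues, long nowMs) {
    for (QueueState q : queues) {
      if (nowMs - q.lastScheduledMs > q.deadlineMs) {
        return q; // mandatory scheduling for the starved queue
      }
    }
    return queues.stream()
        .min(Comparator.comparingDouble(q -> q.usageOverFairShare))
        .orElse(null);
  }
}
{code}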

   

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Fix For: 3.0.0
>
> Attachments: YARN-7859-v1.patch, log, screenshot-1.png, 
> screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  As everyone knows, in FairScheduler the phenomenon of queue scheduling 
> starvation often occurs when the number of cluster jobs is large: the apps in 
> one or more queues stay pending. So I have thought of a way to solve this 
> problem: add a queue scheduling deadline to FairScheduler. When a queue has 
> not been scheduled by FairScheduler within a specified time, we forcibly 
> schedule it!
> Currently the community's way of addressing queue scheduling starvation is to 
> preempt containers, but that may increase the failure rate of jobs.
> On the basis of the above, I propose this issue...






[jira] [Commented] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348103#comment-16348103
 ] 

genericqa commented on YARN-7859:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 31 unchanged - 0 fixed = 32 total (was 31) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 6 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 41s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector 
|
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7859 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908722/YARN-7859-v1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 651d84d0b442 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0bee384 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19559/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/19559/arti

[jira] [Updated] (YARN-7778) Merging of constraints defined at different levels

2018-01-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7778:
--
Attachment: YARN-7778.004.patch

> Merging of constraints defined at different levels
> --
>
> Key: YARN-7778
> URL: https://issues.apache.org/jira/browse/YARN-7778
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: Merge Constraints Solution.pdf, 
> YARN-7778-YARN-7812.001.patch, YARN-7778-YARN-7812.002.patch, 
> YARN-7778.003.patch, YARN-7778.004.patch
>
>
> When we have multiple constraints defined for a given set of allocation tags 
> at different levels (i.e., at the cluster, the application or the scheduling 
> request level), we need to merge those constraints.
> Defining constraint levels as cluster > application > scheduling request, 
> constraints defined at lower levels should only be more restrictive than 
> those of higher levels. Otherwise the allocation should fail.
> For example, if there is an application level constraint that allows no more 
> than 5 HBase containers per rack, a scheduling request can further restrict 
> that to 3 containers per rack but not to 7 containers per rack.
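The HBase example maps to a simple rule for maximum cardinalities, sketched here with plain integers rather than the {{PlacementConstraint}} API:

{code:java}
// Toy illustration of the merge rule: a lower level may only tighten a
// higher-level maximum, never relax it; otherwise the allocation fails.
public class ConstraintMergeDemo {
  static int mergeMaxPerRack(int applicationLevel, int requestLevel) {
    if (requestLevel > applicationLevel) {
      throw new IllegalArgumentException("scheduling-request constraint ("
          + requestLevel + ") is less restrictive than application level ("
          + applicationLevel + ")");
    }
    return requestLevel; // the more restrictive bound wins
  }

  public static void main(String[] args) {
    System.out.println(mergeMaxPerRack(5, 3)); // OK: 3 HBase containers per rack
    System.out.println(mergeMaxPerRack(5, 7)); // throws: allocation must fail
  }
}
{code}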






[jira] [Commented] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348055#comment-16348055
 ] 

genericqa commented on YARN-7819:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 84 unchanged - 0 fixed = 86 total (was 84) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 51s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
10s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m  1s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 36s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Unchecked/unconfirmed cast from 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt
 to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt 
in 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptAllocationOnNode(SchedulerApplicationAttempt,
 SchedulingRequest, SchedulerNode)  At 
FairScheduler.java:org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt
 in 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptAllocationOnNode(SchedulerApplicationAttempt,
 SchedulingRequest, SchedulerNode)  At FairScheduler.java:[line 1882] |
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-781

[jira] [Commented] (YARN-5848) public/crossdomain.xml is problematic

2018-01-31 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348049#comment-16348049
 ] 

Allen Wittenauer commented on YARN-5848:


I'm raising this to a blocker, now that these cross domain files are making the 
nightly builds fail due to broken XML formatting.

> public/crossdomain.xml is problematic
> -
>
> Key: YARN-5848
> URL: https://issues.apache.org/jira/browse/YARN-5848
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0-alpha2, 3.1.0
>Reporter: Allen Wittenauer
>Priority: Blocker
>
> crossdomain.xml should really have an ASF header in it and be in the src 
> directory somewhere.  There's zero reason for it to have a RAT exception given 
> that comments are possible in XML files.  It's also not in a standard Maven 
> location, which should really be fixed.






[jira] [Updated] (YARN-5848) public/crossdomain.xml is problematic

2018-01-31 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-5848:
---
Affects Version/s: 3.1.0

> public/crossdomain.xml is problematic
> -
>
> Key: YARN-5848
> URL: https://issues.apache.org/jira/browse/YARN-5848
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0-alpha2, 3.1.0
>Reporter: Allen Wittenauer
>Priority: Major
>
> crossdomain.xml should really have an ASF header in it and be in the src 
> directory somewhere.  There's zero reason for it to have a RAT exception given 
> that comments are possible in XML files.  It's also not in a standard Maven 
> location, which should really be fixed.






[jira] [Updated] (YARN-5848) public/crossdomain.xml is problematic

2018-01-31 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-5848:
---
Priority: Blocker  (was: Major)

> public/crossdomain.xml is problematic
> -
>
> Key: YARN-5848
> URL: https://issues.apache.org/jira/browse/YARN-5848
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0-alpha2, 3.1.0
>Reporter: Allen Wittenauer
>Priority: Blocker
>
> crossdomain.xml should really have an ASF header in it and be in the src 
> directory somewhere.  There's zero reason for it to have a RAT exception given 
> that comments are possible in XML files.  It's also not in a standard Maven 
> location, which should really be fixed.






[jira] [Commented] (YARN-7864) YARN Federation document has error. spelling mistakes.

2018-01-31 Thread maobaolong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348048#comment-16348048
 ] 

maobaolong commented on YARN-7864:
--

What a big mistake. Thank you, Yiran, you did a great job.

> YARN Federation document has error. spelling mistakes.
> --
>
> Key: YARN-7864
> URL: https://issues.apache.org/jira/browse/YARN-7864
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.9.0, 3.0.0, 2.9.1
> Environment: 3.0.0
>Reporter: Yiran Wu
>Priority: Major
> Attachments: YARN-7864.001.patch, image-2018-01-31-19-01-12-739.png
>
>
> The YARN Federation document has a spelling mistake: 
> yarn.resourcemanger.scheduler.address -> 
> yarn.resourcemanager.scheduler.address
>  
> !image-2018-01-31-19-01-12-739.png!






[jira] [Commented] (YARN-7864) YARN Federation document has error. spelling mistakes.

2018-01-31 Thread Yiran Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348038#comment-16348038
 ] 

Yiran Wu commented on YARN-7864:


Cc [~Naganarasimha], [~sunilg], [~bibinchundatt], [~leftnoteasy] and 
[~LambertYe].

> YARN Federation document has error. spelling mistakes.
> --
>
> Key: YARN-7864
> URL: https://issues.apache.org/jira/browse/YARN-7864
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.9.0, 3.0.0, 2.9.1
> Environment: 3.0.0
>Reporter: Yiran Wu
>Priority: Major
> Attachments: YARN-7864.001.patch, image-2018-01-31-19-01-12-739.png
>
>
> The YARN Federation document has a spelling mistake: 
> yarn.resourcemanger.scheduler.address -> 
> yarn.resourcemanager.scheduler.address
>  
> !image-2018-01-31-19-01-12-739.png!






[jira] [Commented] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348019#comment-16348019
 ] 

Yufei Gu commented on YARN-7859:


Hi [~wangwj], not sure I understand your proposal. You seem to introduce one 
more property (deadline) for queues, which according to your code takes higher 
priority than the others, e.g. fair share. In that case, you can't really 
ensure any fairness of resource usage, which is the core part of the fair 
scheduler.

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Fix For: 3.0.0
>
> Attachments: YARN-7859-v1.patch, log, screenshot-1.png, 
> screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  As everyone knows, in FairScheduler the phenomenon of queue scheduling 
> starvation often occurs when the number of cluster jobs is large: the apps in 
> one or more queues stay pending. So I have thought of a way to solve this 
> problem: add a queue scheduling deadline to FairScheduler. When a queue has 
> not been scheduled by FairScheduler within a specified time, we forcibly 
> schedule it!
> Currently the community's way of addressing queue scheduling starvation is to 
> preempt containers, but that may increase the failure rate of jobs.
> On the basis of the above, I propose this issue...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7820) Fix the currentAppAttemptId error in AHS when an application is running

2018-01-31 Thread Jinjiang Ling (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348011#comment-16348011
 ] 

Jinjiang Ling commented on YARN-7820:
-

The test failures are caused by YARN-7817 and YARN-7860.

> Fix the currentAppAttemptId error in AHS when an application is running
> ---
>
> Key: YARN-7820
> URL: https://issues.apache.org/jira/browse/YARN-7820
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Jinjiang Ling
>Assignee: Jinjiang Ling
>Priority: Major
> Attachments: YARN-7820.001.patch, YARN-7820.003.patch, 
> YARN-7820.003.patch, image-2018-01-26-14-35-09-796.png
>
>
> When I use the REST API of the AHS to get a running app's latest attempt 
> id, it always returns an invalid id like 
> *appattempt_1516873125047_0013_{color:#FF}-01{color}*. 
> But when the app finishes, the RM pushes a finished event containing the 
> latest attempt id to the TimelineServer, so the id becomes correct at the 
> end of the application. 
> I think this value should be correct while the app is running, so I add the 
> latest attempt id to the other info of the app's entity when the app 
> transitions to the RUNNING state. The AHS then uses this value to set the 
> currentAppAttemptId.
>  
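A hedged sketch of the described approach against the ATS v1 entity API; the otherInfo key name here is an assumption for illustration, not necessarily the one used in the patch:

{code:java}
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;

public class LatestAttemptPublisher {
  static TimelineEntity appEntityWithLatestAttempt(String appId, String attemptId) {
    TimelineEntity entity = new TimelineEntity();
    entity.setEntityType("YARN_APPLICATION");
    entity.setEntityId(appId);
    // Published when the app transitions to RUNNING, so the AHS can derive a
    // valid currentAppAttemptId before the app-finished event arrives.
    entity.addOtherInfo("YARN_APPLICATION_LATEST_APP_ATTEMPT", attemptId);
    return entity;
  }
}
{code}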






[jira] [Updated] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread wangwj (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwj updated YARN-7859:
-
Attachment: YARN-7859-v1.patch

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Fix For: 3.0.0
>
> Attachments: YARN-7859-v1.patch, log, screenshot-1.png, 
> screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  As everyone knows, in FairScheduler the phenomenon of queue scheduling 
> starvation often occurs when the number of cluster jobs is large: the apps in 
> one or more queues stay pending. So I have thought of a way to solve this 
> problem: add a queue scheduling deadline to FairScheduler. When a queue has 
> not been scheduled by FairScheduler within a specified time, we forcibly 
> schedule it!
> Currently the community's way of addressing queue scheduling starvation is to 
> preempt containers, but that may increase the failure rate of jobs.
> On the basis of the above, I propose this issue...






[jira] [Updated] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread wangwj (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwj updated YARN-7859:
-
Attachment: (was: YARN-7859-v1.patch)

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Fix For: 3.0.0
>
> Attachments: log, screenshot-1.png, screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  As everyone knows, in FairScheduler the phenomenon of queue scheduling 
> starvation often occurs when the number of cluster jobs is large: the apps in 
> one or more queues stay pending. So I have thought of a way to solve this 
> problem: add a queue scheduling deadline to FairScheduler. When a queue has 
> not been scheduled by FairScheduler within a specified time, we forcibly 
> schedule it!
> Currently the community's way of addressing queue scheduling starvation is to 
> preempt containers, but that may increase the failure rate of jobs.
> On the basis of the above, I propose this issue...






[jira] [Commented] (YARN-7778) Merging of constraints defined at different levels

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347990#comment-16347990
 ] 

genericqa commented on YARN-7778:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 21s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 53s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 58s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}108m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7778 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908704/YARN-7778.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 025fb8e91705 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0bee384 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19557/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/19557/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yar

[jira] [Commented] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler

2018-01-31 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347983#comment-16347983
 ] 

Arun Suresh commented on YARN-7819:
---

Thanks for the review [~templedf] and [~haibochen]. Updated patch based on 
suggestions.

[~templedf],
bq. Seems like there should be a cleaner way to do this..
Yeah, I plan to move that code out of there anyway in YARN-7839, where we 
normalize upfront.





> Allow PlacementProcessor to be used with the FairScheduler
> --
>
> Key: YARN-7819
> URL: https://issues.apache.org/jira/browse/YARN-7819
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
> Attachments: YARN-7819-YARN-6592.001.patch, 
> YARN-7819-YARN-7812.001.patch, YARN-7819.002.patch
>
>
> The FairScheduler needs to implement the 
> {{ResourceScheduler#attemptAllocationOnNode}} function for the processor to 
> support the FairScheduler.
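For context on the FindBugs warning in the QA report above (the unchecked cast from {{SchedulerApplicationAttempt}} to {{FSAppAttempt}}), a guarded-cast pattern along these lines would satisfy it; the types below are simplified stand-ins, not the actual patch:

{code:java}
// Simplified stand-in types for illustration only.
class SchedulerApplicationAttempt { }
class FSAppAttempt extends SchedulerApplicationAttempt { }

class FairSchedulerSketch {
  boolean attemptAllocationOnNode(SchedulerApplicationAttempt attempt) {
    if (!(attempt instanceof FSAppAttempt)) {
      return false; // refuse foreign attempt types instead of casting blindly
    }
    FSAppAttempt fsAttempt = (FSAppAttempt) attempt;
    // ... FairScheduler-specific allocation logic would use fsAttempt here
    return true;
  }
}
{code}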






[jira] [Updated] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler

2018-01-31 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-7819:
--
Attachment: YARN-7819.002.patch

> Allow PlacementProcessor to be used with the FairScheduler
> --
>
> Key: YARN-7819
> URL: https://issues.apache.org/jira/browse/YARN-7819
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
> Attachments: YARN-7819-YARN-6592.001.patch, 
> YARN-7819-YARN-7812.001.patch, YARN-7819.002.patch
>
>
> The FairScheduler needs to implement the 
> {{ResourceScheduler#attemptAllocationOnNode}} function for the processor to 
> support the FairScheduler.






[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code

2018-01-31 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347942#comment-16347942
 ] 

Gour Saha commented on YARN-7781:
-

Thanks [~jianhe]. Patch 02 looks good. +1 for commit.

> Update YARN-Services-Examples.md to be in sync with the latest code
> ---
>
> Key: YARN-7781
> URL: https://issues.apache.org/jira/browse/YARN-7781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Jian He
>Priority: Major
> Attachments: YARN-7781.01.patch, YARN-7781.02.patch
>
>
> Update YARN-Services-Examples.md to make the following additions/changes:
> 1. Add an additional URL and PUT Request JSON to support flex:
> Update to flex up/down the number of containers (instances) of a component 
> of a service (a usage sketch follows after this list)
> PUT URL – http://localhost:8088/app/v1/services/hello-world
> PUT Request JSON
> {code}
> {
>   "components" : [ {
> "name" : "hello",
> "number_of_containers" : 3
>   } ]
> }
> {code}
> 2. Modify all occurrences of /ws/ to /app/
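For completeness, a hedged usage sketch of the flex request above using Java 11's built-in HTTP client; the endpoint and JSON body are taken verbatim from the example:

{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FlexHelloWorld {
  public static void main(String[] args) throws Exception {
    String body = "{ \"components\" : [ { \"name\" : \"hello\", "
        + "\"number_of_containers\" : 3 } ] }";
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8088/app/v1/services/hello-world"))
        .header("Content-Type", "application/json")
        .PUT(HttpRequest.BodyPublishers.ofString(body))
        .build();
    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.statusCode() + " " + response.body());
  }
}
{code}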






[jira] [Updated] (YARN-7816) YARN Service - Two different users are unable to launch a service of the same name

2018-01-31 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated YARN-7816:

Fix Version/s: 3.1.0

> YARN Service - Two different users are unable to launch a service of the same 
> name
> --
>
> Key: YARN-7816
> URL: https://issues.apache.org/jira/browse/YARN-7816
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: applications
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7816.001.patch, YARN-7816.002.patch, 
> YARN-7816.003.patch
>
>
> Now that YARN-7605 is committed, I am able to create a service in an 
> unsecured cluster from the command line as the logged-in user. However, after 
> creating an app named "myapp" as user A, when I log in as a different user B, 
> I am unable to create a service of the exact same name ("myapp" in this 
> case). This feature should be supported in a multi-user setup.






[jira] [Comment Edited] (YARN-7778) Merging of constraints defined at different levels

2018-01-31 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347916#comment-16347916
 ] 

Weiwei Yang edited comment on YARN-7778 at 2/1/18 2:24 AM:
---

Hi [~kkaranasos]

Thanks for your comments. I agree with both #1 and #2 and will get them done in 
the next patch. I also took a look at #3: that code fragment removes null or 
duplicate entries from a list of {{PlacementConstraint}} and then returns a 
list of {{AbstractConstraint}}; it was not straightforward to me how another 
place would reuse this. But let me know if you have a strong preference to do 
so (I tried to split out a method to "trim" a list of AbstractConstraint, but 
that's just a single line of Java :P, and besides calling this method in its 
original place I would still need to add the map function to convert each entry 
type, so it seems a bit of overhead to me). Thanks.
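For reference, the single-expression "trim" being discussed could look roughly like this (a hedged sketch; the dedup via {{distinct()}} assumes a usable {{equals()}} on the constraint objects):

{code:java}
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import org.apache.hadoop.yarn.api.resource.PlacementConstraint;
import org.apache.hadoop.yarn.api.resource.PlacementConstraint.AbstractConstraint;

class ConstraintTrim {
  static List<AbstractConstraint> trim(List<PlacementConstraint> constraints) {
    return constraints.stream()
        .filter(Objects::nonNull)                    // drop null entries
        .distinct()                                  // drop duplicate entries
        .map(PlacementConstraint::getConstraintExpr) // to AbstractConstraint
        .collect(Collectors.toList());
  }
}
{code}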


was (Author: cheersyang):
Hi [~kkaranasos]

Thanks for your comments, I agree with both #1 and #2 and get them done in next 
patch. I also took a look #3, that code fragment removes null or dup entries in 
a list of {{PlacementConstraint}} and then return us a list of 
{{AbstractConstraint}}, it was not straightforward to me how other place is 
going to re-use this. But let me know if you have a strong preference to do so 
(tried to split a method to "trim" a list of AbstractConstraint but that's just 
a single line code of java expression :P). Thanks.

> Merging of constraints defined at different levels
> --
>
> Key: YARN-7778
> URL: https://issues.apache.org/jira/browse/YARN-7778
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: Merge Constraints Solution.pdf, 
> YARN-7778-YARN-7812.001.patch, YARN-7778-YARN-7812.002.patch, 
> YARN-7778.003.patch
>
>
> When we have multiple constraints defined for a given set of allocation tags 
> at different levels (i.e., at the cluster, the application or the scheduling 
> request level), we need to merge those constraints.
> Defining constraint levels as cluster > application > scheduling request, 
> constraints defined at lower levels should only be more restrictive than 
> those of higher levels. Otherwise the allocation should fail.
> For example, if there is an application level constraint that allows no more 
> than 5 HBase containers per rack, a scheduling request can further restrict 
> that to 3 containers per rack but not to 7 containers per rack.






[jira] [Comment Edited] (YARN-7778) Merging of constraints defined at different levels

2018-01-31 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347916#comment-16347916
 ] 

Weiwei Yang edited comment on YARN-7778 at 2/1/18 2:23 AM:
---

Hi [~kkaranasos]

Thanks for your comments. I agree with both #1 and #2 and will get them done in 
the next patch. I also took a look at #3: that code fragment removes null or 
duplicate entries from a list of {{PlacementConstraint}} and then returns a 
list of {{AbstractConstraint}}; it was not straightforward to me how another 
place would reuse this. But let me know if you have a strong preference to do 
so (I tried to split out a method to "trim" a list of AbstractConstraint, but 
that's just a single line of Java :P). Thanks.


was (Author: cheersyang):
Hi [~kkaranasos]

Thanks for your comments, I agree with both #1 and #2 and will get them done 
in the next patch. I also took a look at #3: that code fragment removes null 
or duplicate entries from a list of {{PlacementConstraint}} and then returns a 
list of {{AbstractConstraint}}; it was not straightforward to me how any other 
place would reuse it. But let me know if you have a strong preference to do 
so. Thanks.

> Merging of constraints defined at different levels
> --
>
> Key: YARN-7778
> URL: https://issues.apache.org/jira/browse/YARN-7778
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: Merge Constraints Solution.pdf, 
> YARN-7778-YARN-7812.001.patch, YARN-7778-YARN-7812.002.patch, 
> YARN-7778.003.patch
>
>
> When we have multiple constraints defined for a given set of allocation tags 
> at different levels (i.e., at the cluster, the application or the scheduling 
> request level), we need to merge those constraints.
> Defining constraint levels as cluster > application > scheduling request, 
> constraints defined at lower levels should only be more restrictive than 
> those of higher levels. Otherwise the allocation should fail.
> For example, if there is an application level constraint that allows no more 
> than 5 HBase containers per rack, a scheduling request can further restrict 
> that to 3 containers per rack but not to 7 containers per rack.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7778) Merging of constraints defined at different levels

2018-01-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7778:
--
Attachment: YARN-7778.003.patch

> Merging of constraints defined at different levels
> --
>
> Key: YARN-7778
> URL: https://issues.apache.org/jira/browse/YARN-7778
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: Merge Constraints Solution.pdf, 
> YARN-7778-YARN-7812.001.patch, YARN-7778-YARN-7812.002.patch, 
> YARN-7778.003.patch
>
>
> When we have multiple constraints defined for a given set of allocation tags 
> at different levels (i.e., at the cluster, the application or the scheduling 
> request level), we need to merge those constraints.
> Defining constraint levels as cluster > application > scheduling request, 
> constraints defined at lower levels should only be more restrictive than 
> those of higher levels. Otherwise the allocation should fail.
> For example, if there is an application level constraint that allows no more 
> than 5 HBase containers per rack, a scheduling request can further restrict 
> that to 3 containers per rack but not to 7 containers per rack.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7778) Merging of constraints defined at different levels

2018-01-31 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347916#comment-16347916
 ] 

Weiwei Yang commented on YARN-7778:
---

Hi [~kkaranasos]

Thanks for your comments, I agree with both #1 and #2 and will get them done 
in the next patch. I also took a look at #3: that code fragment removes null 
or duplicate entries from a list of {{PlacementConstraint}} and then returns a 
list of {{AbstractConstraint}}; it was not straightforward to me how any other 
place would reuse it. But let me know if you have a strong preference to do 
so. Thanks.

> Merging of constraints defined at different levels
> --
>
> Key: YARN-7778
> URL: https://issues.apache.org/jira/browse/YARN-7778
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: Merge Constraints Solution.pdf, 
> YARN-7778-YARN-7812.001.patch, YARN-7778-YARN-7812.002.patch
>
>
> When we have multiple constraints defined for a given set of allocation tags 
> at different levels (i.e., at the cluster, the application or the scheduling 
> request level), we need to merge those constraints.
> Defining constraint levels as cluster > application > scheduling request, 
> constraints defined at lower levels should only be more restrictive than 
> those of higher levels. Otherwise the allocation should fail.
> For example, if there is an application level constraint that allows no more 
> than 5 HBase containers per rack, a scheduling request can further restrict 
> that to 3 containers per rack but not to 7 containers per rack.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347913#comment-16347913
 ] 

genericqa commented on YARN-7781:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 33m 16s{color} 
| {color:red} hadoop-yarn-services-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7781 |
| JIRA Patch URL | 
https://issues.apache.

[jira] [Updated] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread wangwj (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwj updated YARN-7859:
-
Attachment: (was: image-2018-02-01-10-16-33-248.png)

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Fix For: 3.0.0
>
> Attachments: YARN-7859-v1.patch, log, screenshot-1.png, 
> screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  As is well known, in FairScheduler the phenomenon of queue scheduling 
> starvation often occurs when the number of cluster jobs is large: the apps 
> in one or more queues stay pending. So I have thought of a way to solve this 
> problem: add a queue scheduling deadline to FairScheduler. When a queue has 
> not been scheduled by FairScheduler within a specified time, we schedule it 
> mandatorily!
> Currently the community's way of solving queue scheduling starvation is to 
> preempt containers, but that approach may increase the failure rate of jobs.
> On the basis of the above, I propose this issue...
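
For illustration, a hedged sketch of the proposed check (all names here are 
hypothetical, not the attached patch):
{code}
// Hypothetical: once per scheduling pass, force-schedule any queue that has
// not been scheduled within the configured deadline (e.g. 3000 ms).
void checkQueueDeadlines(long nowMs, long deadlineMs) {
  for (FSLeafQueue queue : queueMgr.getLeafQueues()) {
    if (nowMs - queue.getLastScheduledTimeMs() > deadlineMs) { // hypothetical getter
      attemptSchedulingForQueue(queue); // ahead of the fair-share ordering
    }
  }
}
{code}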



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread wangwj (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346762#comment-16346762
 ] 

wangwj edited comment on YARN-7859 at 2/1/18 2:17 AM:
--

In my cluster, I did an experiment.
There are two queues in my cluster:
 !screenshot-1.png! 
And the configuration associated with this issue is:
 !screenshot-3.png! 
I ran two jobs in each queue.
Of course, before the experiment I added some logging to the code.
After the two jobs completed, I captured some of the logs...
From the logs we can see that a queue is scheduled mandatorily if it has not 
been scheduled within 3 s.


was (Author: wangwj):
In my cluster, I did an experiment.
There are two queues in my cluster:
 !screenshot-1.png! 
And the configuration associated with this issue is:
 !image-2018-02-01-10-16-55-271.png! 
I ran two jobs in each queue.
Of course, before the experiment I added some logging to the code.
After the two jobs completed, I captured some of the logs...
From the logs we can see that a queue is scheduled mandatorily if it has not 
been scheduled within 3 s.

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Fix For: 3.0.0
>
> Attachments: YARN-7859-v1.patch, log, screenshot-1.png, 
> screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  As is well known, in FairScheduler the phenomenon of queue scheduling 
> starvation often occurs when the number of cluster jobs is large: the apps 
> in one or more queues stay pending. So I have thought of a way to solve this 
> problem: add a queue scheduling deadline to FairScheduler. When a queue has 
> not been scheduled by FairScheduler within a specified time, we schedule it 
> mandatorily!
> Currently the community's way of solving queue scheduling starvation is to 
> preempt containers, but that approach may increase the failure rate of jobs.
> On the basis of the above, I propose this issue...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread wangwj (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwj updated YARN-7859:
-
Attachment: (was: image-2018-02-01-10-16-55-271.png)

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Fix For: 3.0.0
>
> Attachments: YARN-7859-v1.patch, image-2018-02-01-10-16-33-248.png, 
> log, screenshot-1.png, screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  As is well known, in FairScheduler the phenomenon of queue scheduling 
> starvation often occurs when the number of cluster jobs is large: the apps 
> in one or more queues stay pending. So I have thought of a way to solve this 
> problem: add a queue scheduling deadline to FairScheduler. When a queue has 
> not been scheduled by FairScheduler within a specified time, we schedule it 
> mandatorily!
> Currently the community's way of solving queue scheduling starvation is to 
> preempt containers, but that approach may increase the failure rate of jobs.
> On the basis of the above, I propose this issue...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-01-31 Thread wangwj (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346762#comment-16346762
 ] 

wangwj edited comment on YARN-7859 at 2/1/18 2:16 AM:
--

In my cluster, I did an experiment.
There are two queues in my cluster:
 !screenshot-1.png! 
And the configuration associated with this issue is:
 !image-2018-02-01-10-16-55-271.png! 
I ran two jobs in each queue.
Of course, before the experiment I added some logging to the code.
After the two jobs completed, I captured some of the logs...
From the logs we can see that a queue is scheduled mandatorily if it has not 
been scheduled within 3 s.


was (Author: wangwj):
In my cluster, I did an experiment.
There are two queues in my cluster:
 !screenshot-1.png! 
And the configuration associated with this issue is:
 !screenshot-3.png! 
I ran two jobs in each queue.
Of course, before the experiment I added some logging to the code.
After the two jobs completed, I captured some of the logs...
From the logs we can see that a queue is scheduled mandatorily if it has not 
been scheduled within 3 s.

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Fix For: 3.0.0
>
> Attachments: YARN-7859-v1.patch, image-2018-02-01-10-16-33-248.png, 
> image-2018-02-01-10-16-55-271.png, log, screenshot-1.png, screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
>  As is well known, in FairScheduler the phenomenon of queue scheduling 
> starvation often occurs when the number of cluster jobs is large: the apps 
> in one or more queues stay pending. So I have thought of a way to solve this 
> problem: add a queue scheduling deadline to FairScheduler. When a queue has 
> not been scheduled by FairScheduler within a specified time, we schedule it 
> mandatorily!
> Currently the community's way of solving queue scheduling starvation is to 
> preempt containers, but that approach may increase the failure rate of jobs.
> On the basis of the above, I propose this issue...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7816) YARN Service - Two different users are unable to launch a service of the same name

2018-01-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347904#comment-16347904
 ] 

Hudson commented on YARN-7816:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13595 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13595/])
YARN-7816.  Allow same application name submitted by multiple users.  (eyang: 
rev 0bee3849e323bf71925024992f9e655aee2d75f9)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/http/HttpServer2.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/client/ServiceClient.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/TestYarnNativeServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/ServiceTestUtils.java


> YARN Service - Two different users are unable to launch a service of the same 
> name
> --
>
> Key: YARN-7816
> URL: https://issues.apache.org/jira/browse/YARN-7816
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: applications
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-7816.001.patch, YARN-7816.002.patch, 
> YARN-7816.003.patch
>
>
> Now that YARN-7605 is committed, I am able to create a service in an 
> unsecured cluster from the cmd line as the logged-in user. However, after 
> creating an app named "myapp" as user A and then logging in as a different 
> user B, I am unable to create a service of the exact same name ("myapp" in 
> this case). This feature should be supported in a multi-user setup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347892#comment-16347892
 ] 

genericqa commented on YARN-7840:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
 1s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
7s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
22s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
YARN-3409 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 10 unchanged - 1 fixed = 10 total (was 11) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 25s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
44s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
15s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7840 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908686/YARN-7840-YARN-3409.003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 60ebc4bd90cc 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/pre

[jira] [Assigned] (YARN-7614) [RESERVATION] Support Reservation APIs in Federation Router

2018-01-31 Thread Carlo Curino (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Curino reassigned YARN-7614:
--

Assignee: Giovanni Matteo Fumarola

> [RESERVATION] Support Reservation APIs in Federation Router
> ---
>
> Key: YARN-7614
> URL: https://issues.apache.org/jira/browse/YARN-7614
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, reservation system
>Reporter: Carlo Curino
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7404) [GQ] propagate to GPG queue-level utilization/pending information

2018-01-31 Thread Carlo Curino (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Curino reassigned YARN-7404:
--

Assignee: Jose Miguel Arreola

> [GQ] propagate to GPG queue-level utilization/pending information
> -
>
> Key: YARN-7404
> URL: https://issues.apache.org/jira/browse/YARN-7404
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Reporter: Carlo Curino
>Assignee: Jose Miguel Arreola
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7870) [PERF/TEST] Performance testing of ReservationSystem at high job submission rates

2018-01-31 Thread Carlo Curino (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Curino reassigned YARN-7870:
--

Assignee: Xiaohua (Victor) Liang

> [PERF/TEST] Performance testing of ReservationSystem at high job submission 
> rates
> -
>
> Key: YARN-7870
> URL: https://issues.apache.org/jira/browse/YARN-7870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Assignee: Xiaohua (Victor) Liang
>Priority: Major
>
> To leverage the ReservationSystem as a gang-semantics enforcer for all jobs 
> of a large federation, we need to evaluate whether it can sustain a large 
> number of job submissions (and replannings) per second. This Jira tracks 
> this validation effort.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7869) [PERF/TEST] Performance testing of CapacityScheudler at many-thousands of queues

2018-01-31 Thread Carlo Curino (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Curino reassigned YARN-7869:
--

Assignee: Abhishek Modi

> [PERF/TEST] Performance testing of CapacityScheudler at many-thousands of 
> queues
> 
>
> Key: YARN-7869
> URL: https://issues.apache.org/jira/browse/YARN-7869
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Assignee: Abhishek Modi
>Priority: Major
>
> The CapacityScheduler is known to work well at tens to hundreds of queues. 
> This Jira tracks performance testing at a much larger scale: thousands of 
> queues, and deep queue hierarchies (>10 levels).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7738) CapacityScheduler: Support refresh maximum allocation for multiple resource types

2018-01-31 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347883#comment-16347883
 ] 

Xiang Li commented on YARN-7738:


[~leftnoteasy] thanks for taking care of this! I see. I will study it and get 
back here if I have anything more.

> CapacityScheduler: Support refresh maximum allocation for multiple resource 
> types
> -
>
> Key: YARN-7738
> URL: https://issues.apache.org/jira/browse/YARN-7738
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: YARN-7738.001.patch, YARN-7738.002.patch, 
> YARN-7738.003.patch, YARN-7738.004.patch
>
>
> Currently CapacityScheduler fails to refresh maximum allocation for multiple 
> resource types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7816) YARN Service - Two different users are unable to launch a service of the same name

2018-01-31 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347882#comment-16347882
 ] 

Eric Yang commented on YARN-7816:
-

+1 looks good.  I just committed this.  Thank you [~gsaha].

> YARN Service - Two different users are unable to launch a service of the same 
> name
> --
>
> Key: YARN-7816
> URL: https://issues.apache.org/jira/browse/YARN-7816
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: applications
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-7816.001.patch, YARN-7816.002.patch, 
> YARN-7816.003.patch
>
>
> Now that YARN-7605 is committed, I am able to create a service in an 
> unsecured cluster from the cmd line as the logged-in user. However, after 
> creating an app named "myapp" as user A and then logging in as a different 
> user B, I am unable to create a service of the exact same name ("myapp" in 
> this case). This feature should be supported in a multi-user setup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-31 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347877#comment-16347877
 ] 

Sunil G commented on YARN-7840:
---

Had a quick offline sync-up with [~Naganarasimha]. I missed one thing: 
NodeAttributeProto is used in both cases
 # When the NM/admin defines a mapping of attributes to nodes, where only the 
type is mentioned (no values)
 # When the AM specifies attributes in placement constraints (both value and 
type are used here)

Since we reuse the same proto for both cases, my point is only valid for #2. 
Also, wrapping the value in NodeAttributeTypeProto is fine, but may not be the 
cleanest option going forward. [~Naganarasimha], please provide your thoughts 
here.

Overall we should give this more consideration. Going forward, if more entries 
come into NodeAttributeProto, we should not lose the relation between type and 
value. So if more params come, we should put them under a common sub-category.

[~Naganarasimha], please add anything I missed here. Thank you

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch, YARN-7840-YARN-3409.003.patch
>
>
> We need to support a prefix (namespace) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.
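
For illustration, a hedged sketch of what a prefixed attribute could look like 
(assuming a {{NodeAttribute.newInstance}} factory that takes a prefix; the 
prefixes and values are made up):
{code}
// Hypothetical: the prefix acts as a namespace, so two providers can both
// define "os" without conflicting, and ACLs can key off the prefix.
NodeAttribute distro = NodeAttribute.newInstance(
    "nm.yarn.io", "os", NodeAttributeType.STRING, "centos7");
NodeAttribute custom = NodeAttribute.newInstance(
    "com.example", "os", NodeAttributeType.STRING, "custom-build");
{code}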



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other

2018-01-31 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347865#comment-16347865
 ] 

Eric Yang commented on YARN-7446:
-

[~shaneku...@gmail.com] I understand that docker can run as the user defined 
in the image or as someone else.  The output generated by the user in the 
docker container will impact localized directory cleanup.

The described problem only exists in yarn mode (where we bind the localized 
directory into the docker container).
One way to solve the logging problem for yarn mode is to prevent multi-user 
containers and disallow privileged containers in yarn mode.  This aligns 
yarn mode with the same design as YARN in Hadoop 2.  The alternative is to 
tap into docker logs, and pipe (| tee /filename) the stdout and stderr from 
the launch command to localize the output.  The content is then written to 
disk using the end user's credentials instead of root or another user that 
exists in the docker image.

In docker mode (where we sandbox docker and drop all mounts for untrusted 
images), a trusted image must reflect uid/gid consistent with the host OS, 
hence writing to any remote volumes doesn't create security problems.  We can 
call the docker logs command to retrieve logs, which docker already buffers 
and manages properly.  The docker rm command will delete the logs in the 
sandbox without privilege issues, so log cleanup will not be a problem there.  
Let me know what you think about these approaches to solve the logging 
problem.  Thanks
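
A hedged sketch of that tee alternative (illustrative only, not the actual 
container-executor code; the variable names are assumptions):
{code}
// Hypothetical: wrap the user's launch command so stdout/stderr land in the
// localized log dir, written with the end user's credentials.
String wrapped = userLaunchCommand
    + " 2>" + containerLogDir + "/stderr.txt"
    + " | tee " + containerLogDir + "/stdout.txt";
{code}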


> Docker container privileged mode and --user flag contradict each other
> --
>
> Key: YARN-7446
> URL: https://issues.apache.org/jira/browse/YARN-7446
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7446.001.patch
>
>
> In the current implementation, when privileged=true, --user flag is also 
> passed to docker for launching container.  In reality, the container has no 
> way to use root privileges unless there is sticky bit or sudoers in the image 
> for the specified user to gain privileges again.  To avoid duplication of 
> dropping and reacquire root privileges, we can reduce the duplication of 
> specifying both flag.  When privileged mode is enabled, --user flag should be 
> omitted.  When non-privileged mode is enabled, --user flag is supplied.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers

2018-01-31 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347863#comment-16347863
 ] 

Weiwei Yang commented on YARN-7757:
---

Sure [~Naganarasimha], please share your feedback and I'll look at it right 
away. Thanks

> Refactor NodeLabelsProvider to be more generic and reusable for node 
> attributes providers
> -
>
> Key: YARN-7757
> URL: https://issues.apache.org/jira/browse/YARN-7757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-7757-YARN-3409.001.patch, 
> YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch, 
> YARN-7757-YARN-3409.004.patch, YARN-7757-YARN-3409.005.patch, 
> nodeLabelsProvider_refactor_class_hierarchy.pdf
>
>
> Propose to refactor {{NodeLabelsProvider}} and 
> {{AbstractNodeLabelsProvider}} to be more generic, so that node attribute 
> providers can reuse these interfaces/abstract classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers

2018-01-31 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347847#comment-16347847
 ] 

Sunil G commented on YARN-7757:
---

Sure [~Naganarasimha]. 

> Refactor NodeLabelsProvider to be more generic and reusable for node 
> attributes providers
> -
>
> Key: YARN-7757
> URL: https://issues.apache.org/jira/browse/YARN-7757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-7757-YARN-3409.001.patch, 
> YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch, 
> YARN-7757-YARN-3409.004.patch, YARN-7757-YARN-3409.005.patch, 
> nodeLabelsProvider_refactor_class_hierarchy.pdf
>
>
> Propose to refactor {{NodeLabelsProvider}} and 
> {{AbstractNodeLabelsProvider}} to be more generic, so that node attribute 
> providers can reuse these interfaces/abstract classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-31 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347845#comment-16347845
 ] 

Sunil G commented on YARN-7840:
---

Thanks [~Naganarasimha] for the detailed comment. I also had a concern about 
the map initially and was trying to correlate it.
{quote}If while adding the mapping we are just adding the type for the proto 
and no value is mapped and hence no type attached to the Name , so  while 
submitting the RR what operations are permitted  ? how to validate.
{quote}
So IIUC, you were mentioning the validation of type vs value (cases where one 
is empty and the other is not, etc.). This will also come up when they are 
given as two different entities. After thinking a bit, could we add this value 
to NodeAttributeTypeProto itself as an optional param?
{quote}Assume one node sends with No value and other nodes sends it with value. 
validations cannot be thought of
{quote}
Similar to the above, could we think of moving the value in with the type?

Ideally I was thinking of a case where more items come into the proto and the 
binding between type and value loosens a bit, hence I was trying to correlate 
them. [~Naganarasimha], do you see any other good options here?

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch, YARN-7840-YARN-3409.003.patch
>
>
> We need to support a prefix (namespace) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler

2018-01-31 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347834#comment-16347834
 ] 

Daniel Templeton commented on YARN-7819:


Comments:
* The write lock should be locked outside the {{try}} in {{allocateOnNode()}}
* Can we please move {{setQueueName()}} up to {{RMContainer}}?  Every single 
use of it involves a cast from {{RMContainer}} to {{RMContainerImpl}}, and 
there are no other subclasses of {{RMContainer}}.
* Will calling the assignment node-local in the metrics update confuse the 
metrics?  What if it's not actually node local?
* The log messages at FSLeafQueue:360,364 are pretty cryptic.  Either provide 
a clear explanation of what happened and what should be done about it, or 
consider making them info or debug messages (or both!)
* The log message at FairScheduler:1879 definitely shouldn't be a warn.  Same 
story as my previous point.
* Seems like there should be a cleaner way to do this:{code}Resource 
resource =
schedulingRequest.getResourceSizing().getResources();
schedulingRequest.getResourceSizing().setResources(
getNormalizedResource(resource));{code} Like maybe add a normalize 
method to the resource sizing and/or move this operation into the 
{{createRMContainer()}} method (see the sketch after this list).
* {{Resources.greaterThan(none)}} probably isn't what you want.  You probably 
want {{!Resources.isNone()}}.
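
For instance, a hedged sketch of the suggested normalize helper (the name and 
placement are hypothetical):
{code}
// Hypothetical: normalize the sizing in place rather than get/set at the
// call site.
private void normalize(SchedulingRequest schedulingRequest) {
  ResourceSizing sizing = schedulingRequest.getResourceSizing();
  sizing.setResources(getNormalizedResource(sizing.getResources()));
}
{code}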


On [~haibo.chen]'s synchronization concerns, the only thing that concerns me is 
the {{FSAppAttempt}} state.  I'd have to dig to see if it's really an issue, 
though.

> Allow PlacementProcessor to be used with the FairScheduler
> --
>
> Key: YARN-7819
> URL: https://issues.apache.org/jira/browse/YARN-7819
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
> Attachments: YARN-7819-YARN-6592.001.patch, 
> YARN-7819-YARN-7812.001.patch
>
>
> The FairScheduler needs to implement the 
> {{ResourceScheduler#attemptAllocationOnNode}} function for the processor to 
> support the FairScheduler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers

2018-01-31 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347829#comment-16347829
 ] 

Naganarasimha G R commented on YARN-7757:
-

I would like to take a look at the patch, please hold...

> Refactor NodeLabelsProvider to be more generic and reusable for node 
> attributes providers
> -
>
> Key: YARN-7757
> URL: https://issues.apache.org/jira/browse/YARN-7757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-7757-YARN-3409.001.patch, 
> YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch, 
> YARN-7757-YARN-3409.004.patch, YARN-7757-YARN-3409.005.patch, 
> nodeLabelsProvider_refactor_class_hierarchy.pdf
>
>
> Propose to refactor {{NodeLabelsProvider}} and 
> {{AbstractNodeLabelsProvider}} to be more generic, so that node attribute 
> providers can reuse these interfaces/abstract classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-31 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347825#comment-16347825
 ] 

Naganarasimha G R commented on YARN-7840:
-

[~bibinchundatt] & [~cheersyang] Uploaded a patch fixing the direct issues, 
but I doubt TestPBImplRecords is thorough enough. I had a copy-paste error 
using {{builder.setAttributeName(attributePrefix);}}, for which it should have 
failed, but it did not flag any error even in a manual run. Will look into 
it...

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch, YARN-7840-YARN-3409.003.patch
>
>
> We need to support a prefix (namespace) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7840) Update PB for prefix support of node attributes

2018-01-31 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-7840:

Attachment: YARN-7840-YARN-3409.003.patch

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch, YARN-7840-YARN-3409.003.patch
>
>
> We need to support a prefix (namespace) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code

2018-01-31 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-7781:
--
Attachment: YARN-7781.02.patch

> Update YARN-Services-Examples.md to be in sync with the latest code
> ---
>
> Key: YARN-7781
> URL: https://issues.apache.org/jira/browse/YARN-7781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Jian He
>Priority: Major
> Attachments: YARN-7781.01.patch, YARN-7781.02.patch
>
>
> Update YARN-Services-Examples.md to make the following additions/changes:
> 1. Add an additional URL and PUT Request JSON to support flex:
> Update to flex up/down the no of containers (instances) of a component of a 
> service
> PUT URL – http://localhost:8088/app/v1/services/hello-world
> PUT Request JSON
> {code}
> {
>   "components" : [ {
> "name" : "hello",
> "number_of_containers" : 3
>   } ]
> }
> {code}
> 2. Modify all occurrences of /ws/ to /app/
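
For reference, a hedged sketch of issuing the flex request from item 1 above 
with {{java.net.http}} (JDK 11+; illustrative only, the URL and JSON come from 
the example in this issue):
{code}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FlexExample {
  public static void main(String[] args) throws Exception {
    HttpRequest flex = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8088/app/v1/services/hello-world"))
        .header("Content-Type", "application/json")
        // JSON body from item 1 above: flex "hello" to 3 containers
        .PUT(HttpRequest.BodyPublishers.ofString(
            "{\"components\":[{\"name\":\"hello\",\"number_of_containers\":3}]}"))
        .build();
    HttpResponse<String> resp = HttpClient.newHttpClient()
        .send(flex, HttpResponse.BodyHandlers.ofString());
    System.out.println(resp.statusCode() + " " + resp.body());
  }
}
{code}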



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7781) Update YARN-Services-Examples.md to be in sync with the latest code

2018-01-31 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347820#comment-16347820
 ] 

Jian He commented on YARN-7781:
---

Thanks Gour for the comments,
bq. Don't we support a PUT URL – 
http://localhost:8088/app/v1/services/hello-world where we can pass a single 
JSON and flex 
This was recently added in YARN-7540 and YARN-7605. I updated the document and 
fixed the others too.

> Update YARN-Services-Examples.md to be in sync with the latest code
> ---
>
> Key: YARN-7781
> URL: https://issues.apache.org/jira/browse/YARN-7781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Jian He
>Priority: Major
> Attachments: YARN-7781.01.patch
>
>
> Update YARN-Services-Examples.md to make the following additions/changes:
> 1. Add an additional URL and PUT Request JSON to support flex:
> Update to flex up/down the no of containers (instances) of a component of a 
> service
> PUT URL – http://localhost:8088/app/v1/services/hello-world
> PUT Request JSON
> {code}
> {
>   "components" : [ {
> "name" : "hello",
> "number_of_containers" : 3
>   } ]
> }
> {code}
> 2. Modify all occurrences of /ws/ to /app/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7819) Allow PlacementProcessor to be used with the FairScheduler

2018-01-31 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347818#comment-16347818
 ] 

Haibo Chen commented on YARN-7819:
--

Thanks [~asuresh] for the patch. I took a quick look at YARN-6592 before 
reviewing this patch, but my understanding may be incomplete at best.

Two things I noticed here. First, there are multiple threads in the fair 
scheduler that try to update its state, so we probably need to add locking 
for safety. Second, in the current version a placed scheduling request is 
accepted as long as there are enough resources to fit the request. This may, 
however, not be desirable behavior from a fair-share point of view, for 
example. There may be other factors we have to take into account to decide 
whether a placed scheduling request is acceptable. +[~yufeigu], [~templedf]

 

> Allow PlacementProcessor to be used with the FairScheduler
> --
>
> Key: YARN-7819
> URL: https://issues.apache.org/jira/browse/YARN-7819
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
> Attachments: YARN-7819-YARN-6592.001.patch, 
> YARN-7819-YARN-7812.001.patch
>
>
> The FairScheduler needs to implement the 
> {{ResourceScheduler#attemptAllocationOnNode}} function for the processor to 
> support the FairScheduler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7865) Node attributes documentation

2018-01-31 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-7865:
---

Assignee: Naganarasimha G R

> Node attributes documentation
> -
>
> Key: YARN-7865
> URL: https://issues.apache.org/jira/browse/YARN-7865
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Major
>
> We need proper docs introducing how to enable node attributes, how to 
> configure providers, how to specify script paths and arguments in 
> configuration, what the proper permission of the script should be, and who 
> will run the script. It would also be good to add more info to the 
> descriptions of the configuration properties.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-31 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347810#comment-16347810
 ] 

Naganarasimha G R commented on YARN-7840:
-

Thanks for the comments [~sunilg], [~bibinchundatt] & [~cheersyang]. I agree 
with bibin's comment and will upload a patch for it.

As for Sunil's comment, I had thought about it and, based on the following 
points, was wondering whether it is required:
 # If, while adding the mapping, we are just adding the type for the proto 
and no value is mapped (and hence no type is attached to the name), then 
while submitting the RR what operations are permitted? How do we validate?
 # Assume one node sends it with no value and other nodes send it with a 
value; the validations cannot be reasoned about.
 # I did not want the interface to have maps, as we do not define the label 
separately from its mapping to a value; it is a single operation.

I felt we could keep the type with *NodeAttributeProto* and have a map to the 
String value. The only place I feel it is useful is when we are trying to get 
unique labels; then we have this object holding a value which does not 
signify anything. In all other places it suffices to have a Set/List instead 
of a Map.

 

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch
>
>
> We need to support a prefix (namespace) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7842) PB changes to carry node-attributes in NM heartbeat

2018-01-31 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347796#comment-16347796
 ] 

Naganarasimha G R commented on YARN-7842:
-

Oops, I just missed providing the comments in time. I had the same thoughts 
as [~sunilg], but it requires a broader approach where we need to decide how 
we plan to send the confirmation information back to the NM. OK with handling 
it in YARN-7856.

> PB changes to carry node-attributes in NM heartbeat
> ---
>
> Key: YARN-7842
> URL: https://issues.apache.org/jira/browse/YARN-7842
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Fix For: yarn-3409
>
> Attachments: YARN-7842-YARN-3409.001.patch, 
> YARN-7842-YARN-3409.002.patch
>
>
> PB changes to carry node-attributes in NM heartbeat. Split from a larger 
> patch for easier review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7868) Provide improved error message when YARN service is disabled

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347790#comment-16347790
 ] 

genericqa commented on YARN-7868:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
25s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7868 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908672/YARN-7868.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1c0e18880e5b 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5a725bb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19554/testReport/ |
| Max. process+thread count | 409 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19554/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Updated] (YARN-6578) Return container resource utilization from NM ContainerStatus call

2018-01-31 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6578:
-
Issue Type: Improvement  (was: New Feature)

> Return container resource utilization from NM ContainerStatus call
> --
>
> Key: YARN-6578
> URL: https://issues.apache.org/jira/browse/YARN-6578
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Yang Wang
>Priority: Major
> Attachments: YARN-6578.001.patch
>
>
> When the ApplicationMaster wants to change (increase/decrease) the resources 
> of an allocated container, resource utilization is an important reference 
> indicator for decision making. So, when the AM calls 
> NMClient.getContainerStatus, resource utilization needs to be returned.
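
For illustration, a hedged sketch of how an AM might consume this; the 
utilization accessor is the proposed addition, so the commented call below is 
hypothetical:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.client.api.NMClient;

// Sketch: query a container's status before deciding to resize it.
final class UtilizationProbe {
  static void probe(Configuration conf, ContainerId id, NodeId node)
      throws Exception {
    NMClient nm = NMClient.createNMClient();
    nm.init(conf);
    nm.start();
    try {
      ContainerStatus status = nm.getContainerStatus(id, node);
      // Proposed addition (hypothetical accessor name): the returned
      // status would also carry the measured resource utilization.
      // ResourceUtilization used = status.getContainersUtilization();
      System.out.println("state=" + status.getState());
    } finally {
      nm.stop();
    }
  }
}
{code}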



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5907) [Umbrella] [YARN-1042] add ability to specify affinity/anti-affinity in container requests

2018-01-31 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347787#comment-16347787
 ] 

Haibo Chen commented on YARN-5907:
--

[~leftnoteasy] While reading YARN-6592, I wonder whether this has been 
addressed there as part of those efforts, making this now a duplicate?

> [Umbrella] [YARN-1042] add ability to specify affinity/anti-affinity in 
> container requests
> --
>
> Key: YARN-5907
> URL: https://issues.apache.org/jira/browse/YARN-5907
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Steve Loughran
>Assignee: Wangda Tan
>Priority: Major
>
> Container requests to the AM should be able to request anti-affinity to 
> ensure that things like Region Servers don't come up on the same failure 
> zones. 
> Similarly, you may want to specify affinity to the same host or rack 
> without specifying which specific host/rack. Example: bringing up a small 
> giraph cluster in a large YARN cluster would benefit from having the 
> processes in the same rack purely for bandwidth reasons.
> {color:red}
> This JIRA is cloned umbrella JIRA of YARN-1042, discussions / designs / POC 
> patches, etc. please refer to YARN-1042.
> {color}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347785#comment-16347785
 ] 

Botong Huang commented on YARN-7849:


Actually, I think we can remove the async heartbeat to avoid confusion. The 
normal heartbeat will just pick up the new NodeStatus and carry the updated 
info to the RM. 

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849-branch-2.8.v1.patch, YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7870) [PERF/TEST] Performance testing of ReservationSystem at high job submission rates

2018-01-31 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347741#comment-16347741
 ] 

Carlo Curino commented on YARN-7870:


Yes! It is, in fact, already extended to support reservations (YARN-6363, if I 
am not mistaken), and to run a {{MetricsInvariantChecker}} (YARN-6451 and 
YARN-6547) to validate some of the performance/correctness properties. In this 
Jira (and others in the same umbrella and in the SLS umbrella, e.g., YARN-7798) 
we plan to build upon it to give us a solid testing and perf-testing platform 
for the various algorithmic/protocol additions that we are planning in 
YARN-7402 (and YARN in general). 

> [PERF/TEST] Performance testing of ReservationSystem at high job submission 
> rates
> -
>
> Key: YARN-7870
> URL: https://issues.apache.org/jira/browse/YARN-7870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Priority: Major
>
> To leverage the ReservationSystem as a gang-semantics enforcer for all jobs 
> of a large federation, we need to evaluate whether it can sustain a large 
> number of job submissions (and replannings) per second. This Jira tracks 
> this validation effort.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7868) Provide improved error message when YARN service is disabled

2018-01-31 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-7868:
---

Assignee: Eric Yang

> Provide improved error message when YARN service is disabled
> 
>
> Key: YARN-7868
> URL: https://issues.apache.org/jira/browse/YARN-7868
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7868.001.patch
>
>
> Some YARN CLI commands will throw a verbose error message when the YARN 
> service is disabled.  The error message looks like this:
> {code}
> Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity
> SEVERE: A message body reader for Java class 
> org.apache.hadoop.yarn.service.api.records.ServiceStatus, and Java type class 
> org.apache.hadoop.yarn.service.api.records.ServiceStatus, and MIME media type 
> application/octet-stream was not found
> Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity
> SEVERE: The registered message body readers compatible with the MIME media 
> type are:
> application/octet-stream ->
>   com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
>   com.sun.jersey.core.impl.provider.entity.FileProvider
>   com.sun.jersey.core.impl.provider.entity.InputStreamProvider
>   com.sun.jersey.core.impl.provider.entity.DataSourceProvider
>   com.sun.jersey.core.impl.provider.entity.RenderedImageProvider
> */* ->
>   com.sun.jersey.core.impl.provider.entity.FormProvider
>   com.sun.jersey.core.impl.provider.entity.StringProvider
>   com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
>   com.sun.jersey.core.impl.provider.entity.FileProvider
>   com.sun.jersey.core.impl.provider.entity.InputStreamProvider
>   com.sun.jersey.core.impl.provider.entity.DataSourceProvider
>   com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$General
>   com.sun.jersey.core.impl.provider.entity.ReaderProvider
>   com.sun.jersey.core.impl.provider.entity.DocumentProvider
>   com.sun.jersey.core.impl.provider.entity.SourceProvider$StreamSourceReader
>   com.sun.jersey.core.impl.provider.entity.SourceProvider$SAXSourceReader
>   com.sun.jersey.core.impl.provider.entity.SourceProvider$DOMSourceReader
>   com.sun.jersey.json.impl.provider.entity.JSONJAXBElementProvider$General
>   com.sun.jersey.json.impl.provider.entity.JSONArrayProvider$General
>   com.sun.jersey.json.impl.provider.entity.JSONObjectProvider$General
>   com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$General
>   com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$General
>   com.sun.jersey.core.impl.provider.entity.XMLRootObjectProvider$General
>   com.sun.jersey.core.impl.provider.entity.EntityHolderReader
>   com.sun.jersey.json.impl.provider.entity.JSONRootElementProvider$General
>   com.sun.jersey.json.impl.provider.entity.JSONListElementProvider$General
>   com.sun.jersey.json.impl.provider.entity.JacksonProviderProxy
>   com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider
> 2018-01-31 16:24:46,415 ERROR client.ApiServiceClient: 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7868) Provide improved error message when YARN service is disabled

2018-01-31 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7868:

Attachment: YARN-7868.001.patch

> Provide improved error message when YARN service is disabled
> 
>
> Key: YARN-7868
> URL: https://issues.apache.org/jira/browse/YARN-7868
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Priority: Major
> Attachments: YARN-7868.001.patch
>
>
> Some YARN CLI commands will throw a verbose error message when the YARN 
> service is disabled.  The error message looks like this:
> {code}
> Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity
> SEVERE: A message body reader for Java class 
> org.apache.hadoop.yarn.service.api.records.ServiceStatus, and Java type class 
> org.apache.hadoop.yarn.service.api.records.ServiceStatus, and MIME media type 
> application/octet-stream was not found
> Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity
> SEVERE: The registered message body readers compatible with the MIME media 
> type are:
> application/octet-stream ->
>   com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
>   com.sun.jersey.core.impl.provider.entity.FileProvider
>   com.sun.jersey.core.impl.provider.entity.InputStreamProvider
>   com.sun.jersey.core.impl.provider.entity.DataSourceProvider
>   com.sun.jersey.core.impl.provider.entity.RenderedImageProvider
> */* ->
>   com.sun.jersey.core.impl.provider.entity.FormProvider
>   com.sun.jersey.core.impl.provider.entity.StringProvider
>   com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
>   com.sun.jersey.core.impl.provider.entity.FileProvider
>   com.sun.jersey.core.impl.provider.entity.InputStreamProvider
>   com.sun.jersey.core.impl.provider.entity.DataSourceProvider
>   com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$General
>   com.sun.jersey.core.impl.provider.entity.ReaderProvider
>   com.sun.jersey.core.impl.provider.entity.DocumentProvider
>   com.sun.jersey.core.impl.provider.entity.SourceProvider$StreamSourceReader
>   com.sun.jersey.core.impl.provider.entity.SourceProvider$SAXSourceReader
>   com.sun.jersey.core.impl.provider.entity.SourceProvider$DOMSourceReader
>   com.sun.jersey.json.impl.provider.entity.JSONJAXBElementProvider$General
>   com.sun.jersey.json.impl.provider.entity.JSONArrayProvider$General
>   com.sun.jersey.json.impl.provider.entity.JSONObjectProvider$General
>   com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$General
>   com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$General
>   com.sun.jersey.core.impl.provider.entity.XMLRootObjectProvider$General
>   com.sun.jersey.core.impl.provider.entity.EntityHolderReader
>   com.sun.jersey.json.impl.provider.entity.JSONRootElementProvider$General
>   com.sun.jersey.json.impl.provider.entity.JSONListElementProvider$General
>   com.sun.jersey.json.impl.provider.entity.JacksonProviderProxy
>   com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider
> 2018-01-31 16:24:46,415 ERROR client.ApiServiceClient: 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7870) [PERF/TEST] Performance testing of ReservationSystem at high job submission rates

2018-01-31 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347717#comment-16347717
 ] 

Vinod Kumar Vavilapalli commented on YARN-7870:
---

Is there a way SLS can be enhanced to make this repeatable?

> [PERF/TEST] Performance testing of ReservationSystem at high job submission 
> rates
> -
>
> Key: YARN-7870
> URL: https://issues.apache.org/jira/browse/YARN-7870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Priority: Major
>
> To leverage the ReservationSystem as a gang-semantics enforcer for all jobs 
> of a large federation, we need to evaluate whether it can sustain a large 
> number of job submissions (and replannings) per second. This Jira tracks 
> this validation effort.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347707#comment-16347707
 ] 

Botong Huang edited comment on YARN-7849 at 1/31/18 10:39 PM:
--

I see your concern; let me explain the nuance here. In _setup()_, after the 
MiniYarnCluster starts running, a new NodeStatus object is swapped into 
CustomNodeManager. After that, all normal and injected heartbeats use the new 
NodeStatus object, which picks up the correct responseId every time in 
_getSimulatedNodeStatus()_. It is possible for a normal heartbeat to happen 
concurrently with the injected heartbeat, but they will be sharing the same 
NodeStatus object and thus the same responseId. On the RM side, one will be 
processed and the other will be treated as a duplicate heartbeat; both will 
receive the same response without exception. Note that only the normal 
heartbeat records and updates the lastResponseId in the response, changing the 
responseId to be used by the next heartbeat. 


was (Author: botong):
I see your concern; let me explain the nuance here. In _setup()_, after the 
MiniYarnCluster starts running, a new NodeStatus object is swapped into 
CustomNodeManager. After that, all normal and injected heartbeats use the new 
NodeStatus object, which picks up the correct responseId every time in 
_getSimulatedNodeStatus()_. It is possible for a normal heartbeat to happen 
concurrently with the injected heartbeat, but they will be sharing the same 
NodeStatus object and thus the same responseId. On the RM side, one will be 
processed and the other will be treated as a duplicate heartbeat; both will 
receive the same response without exception. Note that only the normal 
heartbeat records and updates the lastResponseId in the response and changes 
the responseId to be used by the next heartbeat. 

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849-branch-2.8.v1.patch, YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347707#comment-16347707
 ] 

Botong Huang edited comment on YARN-7849 at 1/31/18 10:38 PM:
--

I see your concern; let me explain the nuance here. In _setup()_, after the 
MiniYarnCluster starts running, a new NodeStatus object is swapped into 
CustomNodeManager. After that, all normal and injected heartbeats use the new 
NodeStatus object, which picks up the correct responseId every time in 
_getSimulatedNodeStatus()_. It is possible for a normal heartbeat to happen 
concurrently with the injected heartbeat, but they will be sharing the same 
NodeStatus object and thus the same responseId. On the RM side, one will be 
processed and the other will be treated as a duplicate heartbeat; both will 
receive the same response without exception. Note that only the normal 
heartbeat records and updates the lastResponseId in the response and changes 
the responseId to be used by the next heartbeat. 


was (Author: botong):
I see your concern; let me explain the nuance here. In _setup()_, after the 
MiniYarnCluster starts running, a new NodeStatus object is swapped into 
CustomNodeManager. After that, all normal and injected heartbeats use the new 
NodeStatus object, which picks up the correct responseId every time in 
_getSimulatedNodeStatus()_. It is possible for a normal heartbeat to happen 
concurrently with the injected heartbeat, but they will be sharing the same 
NodeStatus object and thus the same responseId. On the RM side, one will be 
processed and the other will be treated as a duplicate heartbeat; both will 
receive the same response without exception. Note that only the normal 
heartbeat records and updates the lastResponseId in the response. 

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849-branch-2.8.v1.patch, YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347707#comment-16347707
 ] 

Botong Huang edited comment on YARN-7849 at 1/31/18 10:34 PM:
--

I see your concern; let me explain the nuance here. In _setup()_, after the 
MiniYarnCluster starts running, a new NodeStatus object is swapped into 
CustomNodeManager. After that, all normal and injected heartbeats use the new 
NodeStatus object, which picks up the correct responseId every time in 
_getSimulatedNodeStatus()_. It is possible for a normal heartbeat to happen 
concurrently with the injected heartbeat, but they will be sharing the same 
NodeStatus object and thus the same responseId. On the RM side, one will be 
processed and the other will be treated as a duplicate heartbeat; both will 
receive the same response without exception. Note that only the normal 
heartbeat records and updates the lastResponseId in the response. 


was (Author: botong):
I see your concern; let me explain the nuance here. In _setup()_, after the 
MiniYarnCluster starts running, a new NodeStatus object is swapped into 
CustomNodeManager. After that, all normal and injected heartbeats use the new 
NodeStatus object, which picks up the correct responseId every time in 
_getSimulatedNodeStatus()_. Note that only the normal heartbeat records and 
updates the lastResponseId in the response. It is possible for a normal 
heartbeat to happen concurrently with the injected heartbeat, but they will be 
sharing the same NodeStatus object and thus the same responseId. On the RM 
side, one will be processed and the other will be treated as a duplicate 
heartbeat; both will receive the same response without exception.

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849-branch-2.8.v1.patch, YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347707#comment-16347707
 ] 

Botong Huang commented on YARN-7849:


I see your concern; let me explain the nuance here. In _setup()_, after the 
MiniYarnCluster starts running, a new NodeStatus object is swapped into 
CustomNodeManager. After that, all normal and injected heartbeats use the new 
NodeStatus object, which picks up the correct responseId every time in 
_getSimulatedNodeStatus()_. Note that only the normal heartbeat records and 
updates the lastResponseId in the response. It is possible for a normal 
heartbeat to happen concurrently with the injected heartbeat, but they will be 
sharing the same NodeStatus object and thus the same responseId. On the RM 
side, one will be processed and the other will be treated as a duplicate 
heartbeat; both will receive the same response without exception.
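
A condensed sketch of the mechanism described above (hypothetical names, not 
the test's actual code):

{code:java}
// Both heartbeat paths share one counter; only the normal path advances it.
final class SharedNodeStatus {
  private int responseId;

  // Read by both the normal and the test-injected heartbeat.
  synchronized int currentResponseId() {
    return responseId;
  }

  // Only the normal-heartbeat path records the RM's lastResponseId,
  // advancing the id used by the next heartbeat.
  synchronized void recordLastResponseId(int lastResponseId) {
    responseId = lastResponseId;
  }
}
{code}

If the two heartbeats race, the RM processes one and answers the other as a 
duplicate with the identical response, so neither side sees an exception.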

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849-branch-2.8.v1.patch, YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7833) [PERF/TEST] Extend SLS to support simulation of a Federated Environment

2018-01-31 Thread Carlo Curino (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Curino updated YARN-7833:
---
Summary: [PERF/TEST] Extend SLS to support simulation of a Federated 
Environment  (was: Extend SLS to support simulation of a Federated Environment)

> [PERF/TEST] Extend SLS to support simulation of a Federated Environment
> ---
>
> Key: YARN-7833
> URL: https://issues.apache.org/jira/browse/YARN-7833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Assignee: Jose Miguel Arreola
>Priority: Major
>
> To develop algorithms for federation, it would be of great help to have a 
> version of SLS that supports multiple RMs and the GPG.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7870) [PERF/TEST] Performance testing of ReservationSystem at high job submission rates

2018-01-31 Thread Carlo Curino (JIRA)
Carlo Curino created YARN-7870:
--

 Summary: [PERF/TEST] Performance testing of ReservationSystem at 
high job submission rates
 Key: YARN-7870
 URL: https://issues.apache.org/jira/browse/YARN-7870
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Carlo Curino


To leverage the ReservationSystem as a gang-semantics enforcer for all jobs of 
a large federation, we need to evaluate whether it can sustain a large number 
of job submissions (and replannings) per second. This Jira tracks this 
validation effort.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7869) [PERF/TEST] Performance testing of CapacityScheduler at many-thousands of queues

2018-01-31 Thread Carlo Curino (JIRA)
Carlo Curino created YARN-7869:
--

 Summary: [PERF/TEST] Performance testing of CapacityScheduler at 
many-thousands of queues
 Key: YARN-7869
 URL: https://issues.apache.org/jira/browse/YARN-7869
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Carlo Curino


The CapacityScheduler is known to work well at tens to hundreds of queues. This 
Jira tracks performance testing at a much larger scale: thousands of queues, 
and deep queue hierarchies (>10 levels). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7868) Provide improved error message when YARN service is disabled

2018-01-31 Thread Eric Yang (JIRA)
Eric Yang created YARN-7868:
---

 Summary: Provide improved error message when YARN service is 
disabled
 Key: YARN-7868
 URL: https://issues.apache.org/jira/browse/YARN-7868
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn-native-services
Affects Versions: 3.1.0
Reporter: Eric Yang


Some YARN CLI commands will throw a verbose error message when the YARN 
service is disabled.  The error message looks like this:

{code}
Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity
SEVERE: A message body reader for Java class 
org.apache.hadoop.yarn.service.api.records.ServiceStatus, and Java type class 
org.apache.hadoop.yarn.service.api.records.ServiceStatus, and MIME media type 
application/octet-stream was not found
Jan 31, 2018 4:24:46 PM com.sun.jersey.api.client.ClientResponse getEntity
SEVERE: The registered message body readers compatible with the MIME media type 
are:
application/octet-stream ->
  com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
  com.sun.jersey.core.impl.provider.entity.FileProvider
  com.sun.jersey.core.impl.provider.entity.InputStreamProvider
  com.sun.jersey.core.impl.provider.entity.DataSourceProvider
  com.sun.jersey.core.impl.provider.entity.RenderedImageProvider
*/* ->
  com.sun.jersey.core.impl.provider.entity.FormProvider
  com.sun.jersey.core.impl.provider.entity.StringProvider
  com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
  com.sun.jersey.core.impl.provider.entity.FileProvider
  com.sun.jersey.core.impl.provider.entity.InputStreamProvider
  com.sun.jersey.core.impl.provider.entity.DataSourceProvider
  com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$General
  com.sun.jersey.core.impl.provider.entity.ReaderProvider
  com.sun.jersey.core.impl.provider.entity.DocumentProvider
  com.sun.jersey.core.impl.provider.entity.SourceProvider$StreamSourceReader
  com.sun.jersey.core.impl.provider.entity.SourceProvider$SAXSourceReader
  com.sun.jersey.core.impl.provider.entity.SourceProvider$DOMSourceReader
  com.sun.jersey.json.impl.provider.entity.JSONJAXBElementProvider$General
  com.sun.jersey.json.impl.provider.entity.JSONArrayProvider$General
  com.sun.jersey.json.impl.provider.entity.JSONObjectProvider$General
  com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$General
  com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$General
  com.sun.jersey.core.impl.provider.entity.XMLRootObjectProvider$General
  com.sun.jersey.core.impl.provider.entity.EntityHolderReader
  com.sun.jersey.json.impl.provider.entity.JSONRootElementProvider$General
  com.sun.jersey.json.impl.provider.entity.JSONListElementProvider$General
  com.sun.jersey.json.impl.provider.entity.JacksonProviderProxy
  com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider

2018-01-31 16:24:46,415 ERROR client.ApiServiceClient: 
{code}
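
One plausible shape for the improvement (a hedged sketch, not the actual 
patch): check the HTTP status before asking Jersey to deserialize the entity, 
and emit a one-line hint when the service REST API is disabled. The class and 
message below are illustrative.

{code:java}
import com.sun.jersey.api.client.ClientResponse;

// Sketch: fail fast with a readable message instead of letting Jersey
// search for a message-body reader on an error response.
final class ServiceStatusReader {
  static <T> T readEntity(ClientResponse response, Class<T> type) {
    int status = response.getStatus();
    if (status == 404 || status == 503) {
      throw new IllegalStateException(
          "YARN service REST API appears to be disabled (HTTP " + status
              + "); enable it in yarn-site.xml and retry.");
    }
    return response.getEntity(type);
  }
}
{code}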



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7867) Enable YARN service by default

2018-01-31 Thread Eric Yang (JIRA)
Eric Yang created YARN-7867:
---

 Summary: Enable YARN service by default
 Key: YARN-7867
 URL: https://issues.apache.org/jira/browse/YARN-7867
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn-native-services
Affects Versions: 3.1.0
Reporter: Eric Yang


The YARN service REST API is disabled by default.  We will make the decision to 
turn this feature on by default once the code is mature enough to be consumed 
by the public.
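
For reference, a minimal sketch of flipping the switch explicitly in the 
meantime (the property name below is an assumption based on the 
native-services configuration, not confirmed here):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch: enabling the service REST API explicitly until it defaults to on.
final class EnableYarnService {
  static Configuration withServiceApiEnabled(Configuration conf) {
    // Assumed property name for the native-services REST API switch.
    conf.setBoolean("yarn.webapp.api-service.enable", true);
    return conf;
  }
}
{code}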



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7835) [Atsv2] Race condition in NM while publishing events if second attempt launched on same node

2018-01-31 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347584#comment-16347584
 ] 

Haibo Chen commented on YARN-7835:
--

Thanks [~rohithsharma] for the clarification! I agree that this needs to be 
addressed on the NM side.

It sounds like on the RM side a collector is mapped to an APP, whereas on the 
NM side a collector is mapped to an APP_ATTEMPT.

An alternative would be to clean up the collector only when the application 
finishes instead of when an AM container finishes, that is, when 
PerNodeTimelineCollectorsAuxService.stopApplication() is called by the NM, so 
that no additional state (the App -> set of AM containers) needs to be 
persisted for recovery. This does defer the collector cleanup relative to the 
current behavior, where the collector is cleaned up as soon as the app attempt 
finishes, and I am not sure how much impact the late cleanup will have. 
Thoughts? [~rohithsharma] [~vrushalic] [~varun_saxena]
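
A simplified sketch of that alternative (hypothetical names):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: collectors keyed by application, removed only when the NM reports
// the whole application finished, not per AM attempt.
final class PerNodeCollectorRegistry {
  private final Map<String, Object> collectors = new ConcurrentHashMap<>();

  void onAmContainerStarted(String appId) {
    // A second attempt on the same node reuses the existing collector.
    collectors.computeIfAbsent(appId, id -> new Object() /* collector */);
  }

  void onAmContainerFinished(String appId) {
    // Intentionally a no-op: removing here races with a relaunched attempt.
  }

  // Called from the aux service's stopApplication() path.
  void onApplicationFinished(String appId) {
    collectors.remove(appId);
  }
}
{code}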

> [Atsv2] Race condition in NM while publishing events if second attempt 
> launched on same node
> 
>
> Key: YARN-7835
> URL: https://issues.apache.org/jira/browse/YARN-7835
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-7835.001.patch
>
>
> A race condition is observed: if the master container is killed for some 
> reason and relaunched on the same node, NMTimelinePublisher doesn't add a 
> timelineClient. But once the completed container for the 1st attempt 
> arrives, NMTimelinePublisher removes the timelineClient. 
>  This causes all subsequent event publishing from different clients to fail 
> with the exception "Application is not found".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347559#comment-16347559
 ] 

Jason Lowe commented on YARN-7849:
--

Thanks for the patch!

I'm worried that this test is still fragile even after the fix.  The unit test 
stands up a minicluster and attempts to asynchronously simulate a node 
heartbeat.  Maybe I'm missing something, but the minicluster is going to be 
automatically heartbeating during this test, and at some random time some 
external heartbeat will be injected into the ResourceTrackerService in the RM.  
If the "normal" heartbeating of the minicluster node happens at just the right 
(or wrong) time then I think the sequence number could still be off and fail 
this test.

IMHO the test should not be using a full minicluster at all if it needs to 
carefully control the node heartbeats.  It should stand up as much of the RM as 
it needs then manually inject the mock heartbeats.  Letting a "real" 
nodemanager continue heartbeating asynchronously to the mocked heartbeats is 
going to be racy.
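
A condensed sketch of the suggested structure (hypothetical names; RM wiring 
and the real request/response types are elided):

{code:java}
// Sketch: drive heartbeats by hand against the RM side instead of racing
// with a live MiniYARNCluster nodemanager.
final class ManualHeartbeatDriver {
  interface Tracker {
    int heartbeat(int responseId); // stand-in for ResourceTrackerService
  }

  void driveUtilizationCheck(Tracker rm) {
    int responseId = 0;
    // First mock heartbeat carries the first simulated utilization.
    responseId = rm.heartbeat(responseId);
    // Second mock heartbeat carries the updated utilization; nothing else
    // is heartbeating, so the sequence number cannot race.
    responseId = rm.heartbeat(responseId);
  }
}
{code}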


> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849-branch-2.8.v1.patch, YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7572) Make the service status output more readable

2018-01-31 Thread Chandni Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347526#comment-16347526
 ] 

Chandni Singh commented on YARN-7572:
-

Should we change the status output to an Application Report, like the following?
{code:java}
Application Report :
Application-Id : application_1517426742923_0001
Application-Name : test2
Application-Type : yarn-service
User : cnisingh
Queue : default
Application Priority : 0
Start-Time : 1517426968527
Finish-Time : 0
Progress : 100%
State : RUNNING
Final-State : UNDEFINED
Tracking-URL : N/A
RPC Port : 62842
AM Host : 192.168.99.1
Aggregate Resource Allocation : 6166567 MB-seconds, 6019 vcore-seconds
Aggregate Resource Preempted : 0 MB-seconds, 0 vcore-seconds
Log Aggregation Status : DISABLED
Diagnostics :
Unmanaged Application : false
Application Node Label Expression : 
AM container Node Label Expression : 
TimeoutType : LIFETIME ExpiryTime : UNLIMITED RemainingTime : -1seconds{code}
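
For reference, such output can be assembled from YarnClient's application 
report; a hedged sketch of the rendering side (not the patch itself, and only 
a subset of the fields above):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;

// Sketch: render a human-readable report instead of raw JSON.
final class ServiceStatusPrinter {
  static void print(Configuration conf, ApplicationId appId) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(conf);
    client.start();
    try {
      ApplicationReport r = client.getApplicationReport(appId);
      System.out.println("Application-Id : " + r.getApplicationId());
      System.out.println("Application-Name : " + r.getName());
      System.out.println("Application-Type : " + r.getApplicationType());
      System.out.println("State : " + r.getYarnApplicationState());
      System.out.println("Progress : " + (int) (r.getProgress() * 100) + "%");
    } finally {
      client.stop();
    }
  }
}
{code}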
 

> Make the service status output more readable 
> -
>
> Key: YARN-7572
> URL: https://issues.apache.org/jira/browse/YARN-7572
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Chandni Singh
>Priority: Major
> Fix For: yarn-native-services
>
>
> Currently the service status output is just a JSON spec; we can make it more 
> human-readable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7778) Merging of constraints defined at different levels

2018-01-31 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347514#comment-16347514
 ] 

Konstantinos Karanasos commented on YARN-7778:
--

Thanks for the patch, [~cheersyang]. A couple of comments:
 * Let's rename getResourceConstraint() to something like 
getMultilevelConstraint()? Also, the initial part of the comment, "Retrieve 
the placement constraint of a given scheduling request from the PCM point of 
view", could read more like "Consider all levels of constraints (resource 
request, app, cluster) and return a merged constraint". This is because we can 
also return a merged constraint of the app and cluster levels, even if there 
are no resource-request constraints.
 * I would also change the signature of the method to directly take a 
Constraint coming from the resource request and a set of tags, rather than 
passing the SchedulingRequest object. This way we make it clear that the set 
of tags is separate from the specific constraint (a rough sketch of the 
suggested shape follows below).
 * Do we need the "Remove all null or duplicate constraints" part? I was 
thinking that this should belong to a separate step that does constraint 
simplification/minimization. If we keep it, I would suggest putting it in a 
separate method in the PCM, so that we can use it elsewhere too or easily 
extend it in the future.
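
A rough Java sketch of the suggested shape (the types are stand-ins for 
YARN's placement-constraint classes, and the method body is illustrative 
only):

{code:java}
import java.util.LinkedHashSet;
import java.util.Set;

// Stand-in for YARN's PlacementConstraint; a real one would implement
// equals/hashCode so the Set below actually de-duplicates.
final class Constraint {
  final String expr;
  Constraint(String expr) { this.expr = expr; }
}

interface ConstraintStore {
  Constraint clusterConstraint(Set<String> sourceTags);
  Constraint appConstraint(String appId, Set<String> sourceTags);
}

final class MultilevelConstraintResolver {
  private final ConstraintStore store;

  MultilevelConstraintResolver(ConstraintStore store) { this.store = store; }

  // Consider all levels (request, app, cluster) and return the merged set;
  // the request-level constraint and tags come in directly, so no
  // SchedulingRequest object is needed. Null constraints are skipped.
  Set<Constraint> getMultilevelConstraint(String appId, Set<String> sourceTags,
      Constraint requestConstraint) {
    Set<Constraint> merged = new LinkedHashSet<>();
    if (requestConstraint != null) {
      merged.add(requestConstraint);
    }
    Constraint app = store.appConstraint(appId, sourceTags);
    if (app != null) {
      merged.add(app);
    }
    Constraint cluster = store.clusterConstraint(sourceTags);
    if (cluster != null) {
      merged.add(cluster);
    }
    return merged;
  }
}
{code}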

> Merging of constraints defined at different levels
> --
>
> Key: YARN-7778
> URL: https://issues.apache.org/jira/browse/YARN-7778
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: Merge Constraints Solution.pdf, 
> YARN-7778-YARN-7812.001.patch, YARN-7778-YARN-7812.002.patch
>
>
> When we have multiple constraints defined for a given set of allocation tags 
> at different levels (i.e., at the cluster, the application or the scheduling 
> request level), we need to merge those constraints.
> Defining constraint levels as cluster > application > scheduling request, 
> constraints defined at lower levels should only be more restrictive than 
> those of higher levels. Otherwise the allocation should fail.
> For example, if there is an application level constraint that allows no more 
> than 5 HBase containers per rack, a scheduling request can further restrict 
> that to 3 containers per rack but not to 7 containers per rack.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347486#comment-16347486
 ] 

genericqa commented on YARN-7849:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
57s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} branch-2.8 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
28s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:c2d96dd |
| JIRA Issue | YARN-7849 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908629/YARN-7849-branch-2.8.v1.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a9388c9a 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.8 / d9132bf |
| maven | version: Apache Maven 3.0.5 |
| Default Java | 1.7.0_151 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19553/testReport/ |
| Max. process+thread count | 548 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19553/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


[jira] [Commented] (YARN-7516) Security check for trusted docker image

2018-01-31 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347469#comment-16347469
 ] 

Billie Rinaldi commented on YARN-7516:
--

bq.  For example, a spark image that can run YARN cluster mode as well as a 
standalone spark may require the image to be present in both lists. If we need 
to build two spark images for each programming paradigm, then we 
inconveniently double the work for our developers.

You wouldn't have to build two images; just tag the same image with a 
different name.

> Security check for trusted docker image
> ---
>
> Key: YARN-7516
> URL: https://issues.apache.org/jira/browse/YARN-7516
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7516.001.patch, YARN-7516.002.patch, 
> YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch, 
> YARN-7516.006.patch, YARN-7516.007.patch, YARN-7516.008.patch, 
> YARN-7516.009.patch, YARN-7516.010.patch, YARN-7516.011.patch, 
> YARN-7516.012.patch, YARN-7516.013.patch, YARN-7516.014.patch, 
> YARN-7516.015.patch
>
>
> Hadoop YARN Services can support using a private docker registry image or a 
> docker image from docker hub.  In the current implementation, Hadoop 
> security is enforced through username and group membership, enforcing 
> uid:gid consistency between the docker container and the distributed file 
> system.  There is a cloud use case for the ability to run untrusted docker 
> images on the same cluster for testing.  
> The basic requirement for untrusted containers is to ensure all kernel and 
> root privileges are dropped, and that there is no interaction with the 
> distributed file system, to avoid contamination.  We can probably enforce 
> detection of untrusted docker images by checking the following:
> # If the docker image is from the public docker hub repository, the 
> container is automatically flagged as insecure, disk volume mounts are 
> disabled automatically, and all kernel capabilities are dropped.
> # If the docker image is from a private repository in docker hub, and a 
> white list allows the private repository, disk volume mounts are allowed 
> and kernel capabilities follow the allowed list.
> # If the docker image is from a private trusted registry with an image name 
> like "private.registry.local:5000/centos", and the white list allows this 
> private trusted repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.
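
A minimal sketch of the whitelist decision described above (names are 
illustrative, not the container-executor's actual logic):

{code:java}
import java.util.Set;

// Sketch of the trust decision for a docker image reference.
final class DockerImageTrustPolicy {
  private final Set<String> trustedRegistries; // e.g. "private.registry.local:5000"

  DockerImageTrustPolicy(Set<String> trustedRegistries) {
    this.trustedRegistries = trustedRegistries;
  }

  // An image is trusted only if it names a whitelisted registry or
  // repository prefix; a bare docker-hub image ("centos") is untrusted.
  boolean isTrusted(String image) {
    int slash = image.indexOf('/');
    if (slash < 0) {
      return false; // public docker hub library image
    }
    String prefix = image.substring(0, slash);
    return trustedRegistries.contains(prefix);
  }

  // Untrusted images would run with volume mounts disabled and all kernel
  // capabilities dropped; trusted images follow the allowed lists.
}
{code}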



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347458#comment-16347458
 ] 

genericqa commented on YARN-7446:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
28m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
12s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7446 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12896220/YARN-7446.001.patch |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 77dae6f0de44 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 12eaae3 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19552/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19552/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Docker container privileged mode and --user flag contradict each other
> --
>
> Key: YARN-7446
> URL: https://issues.apache.org/jira/browse/YARN-7446
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7446.001.patch
>
>
> In the current implementation, when privileged=true, the --user flag is also 
> passed to docker when launching the container.  In reality, the container has 
> no way to use root privileges unless there is a sticky bit or sudoers entry 
> in the image for the specified user to gain privileges again.  To avoid 
> duplication of dropping and reacquiring root privileges, we can avoid 
> specifying both flags.  When privileged mode is enabled, the --user flag 
> should be omitted.  When non-privileged mode is enabled, the --user flag is 
> supplied.

[jira] [Commented] (YARN-7516) Security check for trusted docker image

2018-01-31 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347441#comment-16347441
 ] 

Shane Kumpf commented on YARN-7516:
---

{quote}How do we resolve the conflict of having the same image exist in both 
lists? For example, a spark image that can run YARN cluster mode as well as 
standalone spark may require the image to be present in both lists
{quote}
Wouldn't we have the same issue with this patch? If the image is in 
"docker.privileged-containers.registries" it would run in "yarn-mode", and if 
it is not, it runs in "default-mode". I can't really think of a reason why an 
image would want to regress to the stripped-down "default-mode" after the work 
has been done to enable "yarn-mode" for that application.
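
For concreteness, a sketch of the registry whitelist being discussed (section 
and key names as referenced in this thread; the value is illustrative, not 
from any posted patch):
{code:java}
# container-executor.cfg (sketch)
[docker]
  docker.privileged-containers.registries=private.registry.local:5000
{code}
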
{quote}I think that problem needs to be discussed in a separate JIRA 
(YARN-7654) to prevent morphing of the current one, if you don't mind.
{quote}
I believe the discussion is relevant here and very closely aligns to what you 
are doing. The current naming is very confusing and needs to change, IMO. I'm 
open to suggestions on the modes and naming, but thought "modes" might be an 
easy way to describe this to users.
{quote}We might want to allow the CHOWN,SETGID,SETUID,NET_BIND_SERVICE,KILL 
capabilities for multi-user docker images to function.
{quote}
While I'd prefer "no features" to make it easy to describe, if we need to add 
those capabilities to do anything useful, I think that makes sense. Do you 
think these should be the default capabilities for all containers, with admins 
then adding what they need? I believe the current defaults in YARN mirror the 
Docker defaults, but they may have diverged over time.
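
As a sketch, the minimal set mentioned above would translate to docker flags 
roughly like this (the image and command are illustrative):
{code:java}
# drop everything, then add back only the named capabilities
docker run --cap-drop=ALL \
  --cap-add=CHOWN --cap-add=SETGID --cap-add=SETUID \
  --cap-add=NET_BIND_SERVICE --cap-add=KILL \
  centos bash -c 'touch /tmp/f && chown nobody /tmp/f'
{code}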

As it stands, the patch makes it so no docker container can run. At a minimum 
I think we need to fix that by removing the launch_command. I think removing 
--user makes sense too.

> Security check for trusted docker image
> ---
>
> Key: YARN-7516
> URL: https://issues.apache.org/jira/browse/YARN-7516
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7516.001.patch, YARN-7516.002.patch, 
> YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch, 
> YARN-7516.006.patch, YARN-7516.007.patch, YARN-7516.008.patch, 
> YARN-7516.009.patch, YARN-7516.010.patch, YARN-7516.011.patch, 
> YARN-7516.012.patch, YARN-7516.013.patch, YARN-7516.014.patch, 
> YARN-7516.015.patch
>
>
> Hadoop YARN Services can support using a private docker registry image or a 
> docker image from docker hub.  In the current implementation, Hadoop security 
> is enforced through username and group membership, and enforces uid:gid 
> consistency between the docker container and the distributed file system.  
> There is a cloud use case for the ability to run untrusted docker images on 
> the same cluster for testing.  
> The basic requirement for an untrusted container is to ensure that all kernel 
> and root privileges are dropped, and that there is no interaction with the 
> distributed file system, to avoid contamination.  We can probably enforce 
> detection of untrusted docker images by checking the following:
> # If the docker image is from a public docker hub repository, the container 
> is automatically flagged as insecure, disk volume mounts are disabled 
> automatically, and all kernel capabilities are dropped.
> # If the docker image is from a private repository in docker hub, and a white 
> list allows the private repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.
> # If the docker image is from a private trusted registry, with an image name 
> like "private.registry.local:5000/centos", and the white list allows this 
> private trusted repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7815) Mount the filecache as read-only in Docker containers

2018-01-31 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347439#comment-16347439
 ] 

Eric Yang commented on YARN-7815:
-

{quote}
I think that leaves us with this proposal which should accomplish that and 
remove one of the mounts being made today:

1. nm-local-dir/filecache mounted read-only for access to localized public files
2. nm-local-dir/usercache/user/filecache mounted read-only for access to 
localized user-private files
3. nm-local-dir/usercache/user/appcache/applicationId mounted read-write for 
access to the application work area and underlying container working directory
{quote}

Looks good.
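
For reference, those three mounts would translate to docker flags roughly like 
this (a sketch; $NM_LOCAL_DIR, $USER, $APP_ID, and the image are placeholders):
{code:java}
docker run \
  -v $NM_LOCAL_DIR/filecache:$NM_LOCAL_DIR/filecache:ro \
  -v $NM_LOCAL_DIR/usercache/$USER/filecache:$NM_LOCAL_DIR/usercache/$USER/filecache:ro \
  -v $NM_LOCAL_DIR/usercache/$USER/appcache/$APP_ID:$NM_LOCAL_DIR/usercache/$USER/appcache/$APP_ID:rw \
  centos bash
{code}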

> Mount the filecache as read-only in Docker containers
> -
>
> Key: YARN-7815
> URL: https://issues.apache.org/jira/browse/YARN-7815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>
> Currently, when using the Docker runtime, the filecache directories are 
> mounted read-write into the Docker containers. Read write access is not 
> necessary. We should make this more restrictive by changing that mount to 
> read-only.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7142) Support placement policy in yarn native services

2018-01-31 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha reassigned YARN-7142:
---

Assignee: Gour Saha
Target Version/s: 3.1.0

YARN-6592 has been merged to trunk. I will work on a patch for this.

> Support placement policy in yarn native services
> 
>
> Key: YARN-7142
> URL: https://issues.apache.org/jira/browse/YARN-7142
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Billie Rinaldi
>Assignee: Gour Saha
>Priority: Major
>
> Placement policy exists in the API but is not implemented yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-7849:
---
Attachment: YARN-7849-branch-2.8.v1.patch

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849-branch-2.8.v1.patch, YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347401#comment-16347401
 ] 

Botong Huang commented on YARN-7849:


The unit test failure for the v1 patch is unrelated to this change. 
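
To double-check locally, the failing test can be re-run on its own, roughly 
like this (a sketch; assumes the rest of the tree is already built/installed):
{code:java}
mvn test -Dtest=TestMiniYarnClusterNodeUtilization \
  -pl hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests
{code}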

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7836) YARN Service component update PUT API should not use component name from JSON body

2018-01-31 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha reassigned YARN-7836:
---

Assignee: Gour Saha
Target Version/s: 3.1.0

I will work on a patch for this.
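
For reference, the update call under discussion looks roughly like this (a 
sketch using the host/port and names from the example in the description 
below):
{code:java}
curl -X PUT -H "Content-Type: application/json" \
  -d '{"number_of_containers": 3}' \
  http://localhost:9191/app/v1/services/hello-world/components/hello
{code}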

> YARN Service component update PUT API should not use component name from JSON 
> body
> --
>
> Key: YARN-7836
> URL: https://issues.apache.org/jira/browse/YARN-7836
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, yarn-native-services
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
>
> The YARN Service PUT API for component update should not use the component 
> name from the JSON body. The component update PUT URI is as follows -
>  [http://localhost:9191/app/v1/services/]/components/
> e.g. [http://localhost:9191/app/v1/services/hello-world/components/hello]
> The component name is already in the URI, hence the expected JSON body should 
> be only -
> {noformat}
> {
> "number_of_containers": 3
> }
> {noformat}
> It should not expect the name attribute in the JSON body. In fact, if the 
> JSON body contains a name attribute with a value other than the component 
> name in the path param, we should send a 400 bad request saying they do not 
> match. If they are the same, it should be okay and we can process the 
> request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7816) YARN Service - Two different users are unable to launch a service of the same name

2018-01-31 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated YARN-7816:

Issue Type: Sub-task  (was: Bug)
Parent: YARN-7054

> YARN Service - Two different users are unable to launch a service of the same 
> name
> --
>
> Key: YARN-7816
> URL: https://issues.apache.org/jira/browse/YARN-7816
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: applications
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-7816.001.patch, YARN-7816.002.patch, 
> YARN-7816.003.patch
>
>
> Now that YARN-7605 is committed, I am able to create a service in an 
> unsecured cluster from the cmd line as the logged-in user. However, after 
> creating an app named "myapp" as user A, and then logging in as a different 
> user B, I am unable to create a service of the exact same name ("myapp" in 
> this case). This feature should be supported in a multi-user setup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7516) Security check for trusted docker image

2018-01-31 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347388#comment-16347388
 ] 

Eric Yang commented on YARN-7516:
-

[~billie.rinaldi] [~shaneku...@gmail.com] How do we resolve the conflict of 
having the same image exist in both lists?  For example, a spark image that 
can run YARN cluster mode as well as standalone spark may require the image 
to be present in both lists.  If we need to build two spark images, one for 
each programming paradigm, then we inconveniently double the work for our 
developers.  IMHO, security enforcement should not interfere with the 
programming paradigm.  We might be digressing from the original goal, which 
is to sandbox untrusted images and provide a minimum amount of capabilities 
to test software from the Internet.  It might be preferable to build a 
user-controllable flag for switching between yarn-mode and default-mode, for 
better control over whether localizer directories and the launcher script 
should be mounted.  I think that problem needs to be discussed in a separate 
JIRA (YARN-7654) to prevent morphing of the current one, if you don't mind.

My concern with the current patch is that dropping all capabilities for 
untrusted images might limit too much.  We might want to allow the 
CHOWN,SETGID,SETUID,NET_BIND_SERVICE,KILL capabilities for multi-user docker 
images to function.

> Security check for trusted docker image
> ---
>
> Key: YARN-7516
> URL: https://issues.apache.org/jira/browse/YARN-7516
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7516.001.patch, YARN-7516.002.patch, 
> YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch, 
> YARN-7516.006.patch, YARN-7516.007.patch, YARN-7516.008.patch, 
> YARN-7516.009.patch, YARN-7516.010.patch, YARN-7516.011.patch, 
> YARN-7516.012.patch, YARN-7516.013.patch, YARN-7516.014.patch, 
> YARN-7516.015.patch
>
>
> Hadoop YARN Services can support using a private docker registry image or a 
> docker image from docker hub.  In the current implementation, Hadoop security 
> is enforced through username and group membership, and enforces uid:gid 
> consistency between the docker container and the distributed file system.  
> There is a cloud use case for the ability to run untrusted docker images on 
> the same cluster for testing.  
> The basic requirement for an untrusted container is to ensure that all kernel 
> and root privileges are dropped, and that there is no interaction with the 
> distributed file system, to avoid contamination.  We can probably enforce 
> detection of untrusted docker images by checking the following:
> # If the docker image is from a public docker hub repository, the container 
> is automatically flagged as insecure, disk volume mounts are disabled 
> automatically, and all kernel capabilities are dropped.
> # If the docker image is from a private repository in docker hub, and a white 
> list allows the private repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.
> # If the docker image is from a private trusted registry, with an image name 
> like "private.registry.local:5000/centos", and the white list allows this 
> private trusted repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other

2018-01-31 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347382#comment-16347382
 ] 

Shane Kumpf commented on YARN-7446:
---

Thanks for the explanation. I do agree that this model would require an extra 
step to get to root, which I'm not sure is a bad thing per se. "To grant root 
power or not to grant" is well put. :)

One benefit of setting --user is that in "yarn-mode" logging and localization 
will just work. Of course, it does work with root as well, but that impacts 
clean up and log aggregation. This raises the question: should --privileged 
disable/change what mounts are allowed, since the root user can write files 
that can't be cleaned up? My vote would be no, as I believe that is too 
restrictive. I would vote we explore letting the NM clean up those files 
regardless, as containers running as root will be needed.

If we are going to flat out say that --privileged containers should always run 
as root, then we still need to set --user, but hard coded to 0:0. Otherwise, 
the running user is left up to the image. Here is an example that shows you 
aren't guaranteed to run as root just by removing --user.

Dockerfile
{code:java}
FROM centos

RUN useradd http

USER http

CMD touch /tmp/file && ls -la /tmp/file{code}
Note that the USER directive is specified in the image.

No user:
{code:java}
[root@y7001 docker_image_user]# docker run usertest
-rw-r--r--. 1 http http 0 Jan 31 18:35 /tmp/file{code}
User set to root:
{code:java}
[root@y7001 docker_image_user]# docker run -u 0:0 --group-add 0 usertest
-rw-r--r--. 1 root root 0 Jan 31 18:46 /tmp/file{code}
 

> Docker container privileged mode and --user flag contradict each other
> --
>
> Key: YARN-7446
> URL: https://issues.apache.org/jira/browse/YARN-7446
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7446.001.patch
>
>
> In the current implementation, when privileged=true, the --user flag is also 
> passed to docker when launching the container.  In reality, the container has 
> no way to use root privileges unless there is a sticky bit or sudoers entry 
> in the image for the specified user to gain privileges again.  To avoid 
> duplication of dropping and reacquiring root privileges, we can avoid 
> specifying both flags.  When privileged mode is enabled, the --user flag 
> should be omitted.  When non-privileged mode is enabled, the --user flag is 
> supplied.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7863) Modify placement constraints to support node attributes

2018-01-31 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347372#comment-16347372
 ] 

Konstantinos Karanasos commented on YARN-7863:
--

Ideally there should be no API changes for this, right? Since we already allow 
node attributes in the target.

I agree with [~asuresh] that there are two aspects to this (supporting node 
attributes in the target and in the scope), and we should tackle them in 
separate JIRAs. So we can keep this one for the target only. For the scope 
case, I was wondering whether we should only support a subset of the node 
attributes, or whether it is no extra effort to support any attribute. But we 
can continue the discussion about this in YARN-7858.

> Modify placement constraints to support node attributes
> ---
>
> Key: YARN-7863
> URL: https://issues.apache.org/jira/browse/YARN-7863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Major
>
> This Jira will track the work to *modify existing placement constraints to 
> support node attributes*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347360#comment-16347360
 ] 

genericqa commented on YARN-7849:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 37s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 51s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 56s{color} 
| {color:red} hadoop-yarn-server-tests in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7849 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908618/YARN-7849.v1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 89410b6f3032 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 12eaae3 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| unit | 
https://builds.apache.org/job/P

[jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-01-31 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347329#comment-16347329
 ] 

Eric Payne commented on YARN-4606:
--

My understanding is that the user limit would use {{activeUsers}}, and for 
things like the max AM limit per user, we'd use {{activeUsers}} + 
{{activeUsersOfPendingApps}}.

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-4606.1.poc.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers 
> the user an active user. This could lead to starvation of active 
> applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 
> (belongs to user3) and app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources, 
> so the computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7446) Docker container privileged mode and --user flag contradict each other

2018-01-31 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347300#comment-16347300
 ] 

Eric Yang edited comment on YARN-7446 at 1/31/18 6:13 PM:
--

Hi [~shaneku...@gmail.com], to carry on the conversation from YARN-7516 
regarding the privileged and user flags being mutually exclusive.  Based on 
[~ebadger]'s comments on YARN-7516, privileged and cap-add/cap-drop are not 
additive.  When privileged is given and we drop the starting user to a normal 
uid/gid, the container instance is still running with root privileges; for the 
end user to regain kernel-level access, the image needs to have either a 
sudoers list with the sudo binary and sticky bits prebuilt, or some executable 
binary with sticky bits, to regain control of root privileges.  Once the user 
can regain root power in the image, it defeats the purpose of dropping 
privileges in the first place, from a security point of view.  "To grant root 
power, or not to grant" is the question.  When this question is asked upfront, 
there is little purpose in dropping to a normal uid/gid, because the normal 
user will need to spend more effort to resume root power, from a usability 
point of view.  The initial decision on the privileged flag makes the user 
parameter irrelevant from both the usability and security points of view.  
Thoughts?


was (Author: eyang):
Hi [~shaneku...@gmail.com], to carry on the conversation from YARN-7516 
regarding the --privileged and -u flags being mutually exclusive.  Based on 
[~ebadger]'s comments on YARN-7516, --privileged and --cap-add/--cap-drop are 
not additive.  When --privileged is given and we drop the starting user to a 
normal uid/gid, the container instance is still running with root privileges; 
for the end user to regain kernel-level access, the image needs to have either 
a sudoers list with the sudo binary and sticky bits prebuilt, or some 
executable binary with sticky bits, to regain control of root privileges.  
Once the user can regain root power in the image, it defeats the purpose of 
dropping privileges in the first place, from a security point of view.  "To 
grant root power, or not to grant" is the question.  When this question is 
asked upfront, there is little purpose in dropping to a normal uid/gid, 
because the normal user will need to spend more effort to resume root power, 
from a usability point of view.  The initial decision on the privileged flag 
makes the user parameter irrelevant from both the usability and security 
points of view.  Thoughts?

> Docker container privileged mode and --user flag contradict each other
> --
>
> Key: YARN-7446
> URL: https://issues.apache.org/jira/browse/YARN-7446
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7446.001.patch
>
>
> In the current implementation, when privileged=true, the --user flag is also 
> passed to docker when launching the container.  In reality, the container has 
> no way to use root privileges unless there is a sticky bit or sudoers entry 
> in the image for the specified user to gain privileges again.  To avoid 
> duplication of dropping and reacquiring root privileges, we can avoid 
> specifying both flags.  When privileged mode is enabled, the --user flag 
> should be omitted.  When non-privileged mode is enabled, the --user flag is 
> supplied.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7446) Docker container privileged mode and --user flag contradict each other

2018-01-31 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347300#comment-16347300
 ] 

Eric Yang commented on YARN-7446:
-

Hi [~shaneku...@gmail.com], to carry on the conversation from YARN-7516 
regarding the --privileged and -u flags being mutually exclusive.  Based on 
[~ebadger]'s comments on YARN-7516, --privileged and --cap-add/--cap-drop are 
not additive.  When --privileged is given and we drop the starting user to a 
normal uid/gid, the container instance is still running with root privileges; 
for the end user to regain kernel-level access, the image needs to have either 
a sudoers list with the sudo binary and sticky bits prebuilt, or some 
executable binary with sticky bits, to regain control of root privileges.  
Once the user can regain root power in the image, it defeats the purpose of 
dropping privileges in the first place, from a security point of view.  "To 
grant root power, or not to grant" is the question.  When this question is 
asked upfront, there is little purpose in dropping to a normal uid/gid, 
because the normal user will need to spend more effort to resume root power, 
from a usability point of view.  The initial decision on the privileged flag 
makes the user parameter irrelevant from both the usability and security 
points of view.  Thoughts?

> Docker container privileged mode and --user flag contradict each other
> --
>
> Key: YARN-7446
> URL: https://issues.apache.org/jira/browse/YARN-7446
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7446.001.patch
>
>
> In the current implementation, when privileged=true, the --user flag is also 
> passed to docker when launching the container.  In reality, the container has 
> no way to use root privileges unless there is a sticky bit or sudoers entry 
> in the image for the specified user to gain privileges again.  To avoid 
> duplication of dropping and reacquiring root privileges, we can avoid 
> specifying both flags.  When privileged mode is enabled, the --user flag 
> should be omitted.  When non-privileged mode is enabled, the --user flag is 
> supplied.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6492) Generate queue metrics for each partition

2018-01-31 Thread Manikandan R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-6492:
---
Attachment: PartitionQueueMetrics_y_partition.txt
PartitionQueueMetrics_x_partition.txt
PartitionQueueMetrics_default_partition.txt

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object, which captures metrics either in the 
> default partition or across all partitions (after YARN-6467 it will be the 
> default partition).
> But having the per-partition metrics would be very useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2018-01-31 Thread Manikandan R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347289#comment-16347289
 ] 

Manikandan R commented on YARN-6492:


[~bibinchundatt] Thanks for your comments. Will take care of them in the next 
patch.

Attaching JMX output from different runs for a detailed understanding (a 
sketch of the commands used follows the list). These were captured on a 
single-node setup with 8 GB & 8 vcores, with x and y labels added with 
exclusivity set to false. By default, the node has the "default" partition.
 # Ran a DS job without any -node_label_expression. Please refer to attachment 
PartitionQueueMetrics_default_partition.txt
 # Mapped the node to label 'x'. Ran a DS job with -node_label_expression "x". 
Please refer to attachment PartitionQueueMetrics_x_partition.txt
 # Mapped the node to label 'y'. Ran a DS job with -node_label_expression "y". 
Please refer to attachment PartitionQueueMetrics_y_partition.txt
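
The commands for these runs look roughly like this (a sketch; the jar path, 
node name, and shell command are illustrative):
{code:java}
yarn rmadmin -addToClusterNodeLabels "x(exclusive=false),y(exclusive=false)"
yarn rmadmin -replaceLabelsOnNode "node1=x"
yarn org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar \
  -shell_command "sleep 60" -node_label_expression x
{code}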

[~Naganarasimha] [~bibinchundatt]
{quote}I think the assumption that queue metrics are created only in the 
constructor is wrong, as partitions can be added dynamically. Hence, if the 
queue metrics objects are not present for a given partition, we need to create 
them.
{quote}
Currently, after RM start, I am able to see "default" partition metrics (for 
example, available memory, etc.) because of 
{{CSQueueUtils.updateQueueStatistics(resourceCalculator, clusterResource, this, 
labelManager, null)}} in {{AbstractCSQueue#setupQueueConfigs}}. I think we will 
need to do this for every node label used in a queue configuration with 
labels. With this, we can show metrics for all labels after RM start.

For partitions added later on, do we need to ensure PartitionQueueMetrics is 
updated in the "replaceLabelsOnNode" flow? Please share your suggestions.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object, which captures metrics either in the 
> default partition or across all partitions (after YARN-6467 it will be the 
> default partition).
> But having the per-partition metrics would be very useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) Docker image cannot set HADOOP_CONF_DIR

2018-01-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347283#comment-16347283
 ] 

Hudson commented on YARN-7677:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13591 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13591/])
YARN-7677. Docker image cannot set HADOOP_CONF_DIR. Contributed by Jim (jlowe: 
rev 12eaae383ad06de8f9959241b2451dec82cf9ceb)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DelegatingLinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/runtime/ContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java


> Docker image cannot set HADOOP_CONF_DIR
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Eric Badger
>Assignee: Jim Brennan
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: YARN-7677.001.patch, YARN-7677.002.patch
>
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-4599) Set OOM control for memory cgroups

2018-01-31 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi reassigned YARN-4599:


Assignee: Miklos Szegedi  (was: sandflee)

> Set OOM control for memory cgroups
> --
>
> Key: YARN-4599
> URL: https://issues.apache.org/jira/browse/YARN-4599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Miklos Szegedi
>Priority: Major
>  Labels: oct16-medium
> Attachments: YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroup enforcement support. We should also explicitly 
> set OOM control so that containers are not killed as soon as they go over 
> their usage. Today, one could set the swappiness to control this, but 
> clusters with swap turned off exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7862) YARN native service REST endpoint needs user.name as query param

2018-01-31 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347258#comment-16347258
 ] 

Sunil G commented on YARN-7862:
---

bq. If you have obtained delegation token somehow, then you can forward the 
cookie to:

In the case of a non-kerberized cluster, how can this delegation token be 
obtained?
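
For context, the workaround under discussion passes the user explicitly on the 
query string (a sketch; rm_ip and service.json are placeholders):
{code:java}
curl -X POST -H "Content-Type: application/json" \
  -d @service.json \
  "http://rm_ip:8088/app/v1/services?user.name=dr.who"
{code}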

> YARN native service REST endpoint needs user.name as query param
> 
>
> Key: YARN-7862
> URL: https://issues.apache.org/jira/browse/YARN-7862
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Sunil G
>Priority: Major
>
> While accessing the below yarn rest end point with the POST method type,
> {code:java}
> http://rm_ip:8088/app/v1/services{code}
> the below error occurs in a non-secure cluster.
> {noformat}
> {
> "diagnostics": "Null user"
> }{noformat}
> When *user.name* is provided as a query param with *dr.who*, we can see that 
> yarn started the service as the proxy user, not dr.who. 
> In a non-secure cluster, the native service should ideally take the user from 
> the remote UGI.
> In a secure cluster, it is better to derive the user from the kerberized 
> shell.
>  
> cc/  [~jianhe] [~eyang]
>  


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-01-31 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347259#comment-16347259
 ] 

Miklos Szegedi commented on YARN-4599:
--

Thank you, [~sandflee]. I am working on a patch now.

> Set OOM control for memory cgroups
> --
>
> Key: YARN-4599
> URL: https://issues.apache.org/jira/browse/YARN-4599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: sandflee
>Priority: Major
>  Labels: oct16-medium
> Attachments: YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroup enforcement support. We should also explicitly 
> set OOM control so that containers are not killed as soon as they go over 
> their usage. Today, one could set the swappiness to control this, but 
> clusters with swap turned off exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7862) YARN native service REST endpoint needs user.name as query param

2018-01-31 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347257#comment-16347257
 ] 

Eric Yang commented on YARN-7862:
-

By the way, the above statements apply to AuthenticationFilter (multi-user 
mode). With StaticUserWebFilter (single-user mode), the remote user is 
hard-coded to hadoop.http.staticuser.user in core-site.xml.
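
For reference, that property lives in core-site.xml (dr.who is the stock 
Hadoop default for it):
{noformat}
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>dr.who</value>
</property>
{noformat}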

> YARN native service REST endpoint needs user.name as query param
> 
>
> Key: YARN-7862
> URL: https://issues.apache.org/jira/browse/YARN-7862
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Sunil G
>Priority: Major
>
> While accessing the below yarn rest end point with the POST method type,
> {code:java}
> http://rm_ip:8088/app/v1/services{code}
> the below error occurs in a non-secure cluster.
> {noformat}
> {
> "diagnostics": "Null user"
> }{noformat}
> When *user.name* is provided as a query param with *dr.who*, we can see that 
> yarn started the service as the proxy user, not dr.who. 
> In a non-secure cluster, the native service should ideally take the user from 
> the remote UGI.
> In a secure cluster, it is better to derive the user from the kerberized 
> shell.
>  
> cc/  [~jianhe] [~eyang]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to heartbeat sync error

2018-01-31 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-7849:
---
Attachment: YARN-7849.v1.patch

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization fails due to 
> heartbeat sync error
> --
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7849.v1.patch
>
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


