[jira] [Updated] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-06-20 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9451:

Attachment: (was: YARN-9451-001.patch)

> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: Screen Shot 2019-06-20 at 7.49.46 PM.png, 
> YARN-9451-001.patch
>
>
> AggregatedLogsBlock shows the wrong NM HTTP port when the aggregated log file is not 
> available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/], the NM 
> RPC port, instead of the HTTP port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}
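
For context, the node string in that message is the NM ID, which carries the RPC port, while the link should point at the NM web address (yarn.nodemanager.webapp.address, default port 8042). A minimal, standalone sketch of building the log link from the HTTP address instead of the NM ID; the helper below is hypothetical and is not the actual AggregatedLogsBlock code:

{code:java}
public class NodeLogUrl {

  // Builds the per-application NM log URL from the node's HTTP address
  // (host:httpPort) rather than the NM ID (host:rpcPort).
  static String logUrl(String nodeHttpAddress, String appId) {
    return "http://" + nodeHttpAddress + "/node/application/" + appId;
  }

  public static void main(String[] args) {
    // 8042 is the default NM webapp port; 45454 in the report above is the RPC port.
    System.out.println(logUrl("yarn-ats-3:8042", "application_1554476304275_0003"));
  }
}
{code}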



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-06-20 Thread Tarun Parimi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869207#comment-16869207
 ] 

Tarun Parimi commented on YARN-9209:


The unit test failure is unrelated and is already tracked in YARN-9333.

> When nodePartition is not set in Placement Constraints, containers are 
> allocated only in default partition
> --
>
> Key: YARN-9209
> URL: https://issues.apache.org/jira/browse/YARN-9209
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-9209.001.patch, YARN-9209.002.patch, 
> YARN-9209.003.patch
>
>
> When an application sets a placement constraint without specifying a 
> nodePartition, the default partition is always chosen as the constraint when 
> allocating containers. This can be a problem when an application is 
> submitted to a queue that does not have enough capacity available on the 
> default partition.
>  This is a common scenario when node labels are configured for a particular 
> queue. The below sample sleeper service cannot get even a single container 
> allocated when it is submitted to a "labeled_queue", even though enough 
> capacity is available on the label/partition configured for the queue. Only 
> the AM container runs. 
> {code:java}
> {
>   "name": "sleeper-service",
>   "version": "1.0.0",
>   "queue": "labeled_queue",
>   "components": [
>     {
>       "name": "sleeper",
>       "number_of_containers": 2,
>       "launch_command": "sleep 9",
>       "resource": {
>         "cpus": 1,
>         "memory": "4096"
>       },
>       "placement_policy": {
>         "constraints": [
>           {
>             "type": "ANTI_AFFINITY",
>             "scope": "NODE",
>             "target_tags": [
>               "sleeper"
>             ]
>           }
>         ]
>       }
>     }
>   ]
> }
> {code}
> It runs fine if I specify the node_partition explicitly in the constraints 
> like below. 
> {code:java}
> {
>   "name": "sleeper-service",
>   "version": "1.0.0",
>   "queue": "labeled_queue",
>   "components": [
>     {
>       "name": "sleeper",
>       "number_of_containers": 2,
>       "launch_command": "sleep 9",
>       "resource": {
>         "cpus": 1,
>         "memory": "4096"
>       },
>       "placement_policy": {
>         "constraints": [
>           {
>             "type": "ANTI_AFFINITY",
>             "scope": "NODE",
>             "target_tags": [
>               "sleeper"
>             ],
>             "node_partitions": [
>               "label"
>             ]
>           }
>         ]
>       }
>     }
>   ]
> }
> {code}
> The problem seems to be that only the default partition "" is considered 
> when the node_partition constraint is not specified, as seen in the RM log below.
> {code:java}
> 2019-01-17 16:51:59,921 INFO placement.SingleConstraintAppPlacementAllocator 
> (SingleConstraintAppPlacementAllocator.java:validateAndSetSchedulingRequest(367))
>  - Successfully added SchedulingRequest to 
> app=appattempt_1547734161165_0010_01 targetAllocationTags=[sleeper]. 
> nodePartition= 
> {code} 
> However, I think it makes more sense to consider "*" or the 
> {{default-node-label-expression}} of the queue, if configured, when no 
> node_partition is specified in the placement constraint, since not specifying 
> a node_partition should ideally mean that placement constraints are not 
> enforced against any particular node_partition. At present, however, the 
> default partition is enforced instead.
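
The fallback described above can be illustrated with a small standalone example (plain Java with hypothetical names, not the SingleConstraintAppPlacementAllocator code): if the constraint carries no partition, prefer the queue's default-node-label-expression when configured, otherwise "*", instead of silently using the empty default partition.

{code:java}
public class PartitionFallback {

  static final String ANY_PARTITION = "*";

  // Hypothetical helper: decide which partition a constraint should target.
  static String resolvePartition(String constraintPartition,
                                 String queueDefaultLabelExpression) {
    if (constraintPartition != null && !constraintPartition.isEmpty()) {
      return constraintPartition;            // explicitly requested partition
    }
    if (queueDefaultLabelExpression != null
        && !queueDefaultLabelExpression.isEmpty()) {
      return queueDefaultLabelExpression;    // queue default-node-label-expression
    }
    return ANY_PARTITION;                    // no restriction, instead of "" (default)
  }

  public static void main(String[] args) {
    System.out.println(resolvePartition(null, "label"));  // label
    System.out.println(resolvePartition(null, null));     // *
    System.out.println(resolvePartition("gpu", "label")); // gpu
  }
}
{code}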






[jira] [Updated] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-06-20 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9451:

Attachment: YARN-9451-001.patch

> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: Screen Shot 2019-06-20 at 7.49.46 PM.png, 
> YARN-9451-001.patch
>
>
> AggregatedLogsBlock shows the wrong NM HTTP port when the aggregated log file is not 
> available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/], the NM 
> RPC port, instead of the HTTP port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}






[jira] [Comment Edited] (YARN-9632) No allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2019-06-20 Thread zhangjian (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869200#comment-16869200
 ] 

zhangjian edited comment on YARN-9632 at 6/21/19 5:50 AM:
--

Hi, Tao Yang.

I am running Spark 2.4.3 on YARN with Hadoop 2.7.7; Hadoop 2.7 is the recommended 
version for Spark. I tried to apply the patch from YARN-3467, but it does not work on 
Hadoop 2.7.7.


was (Author: lilingzj):
Hi, Tao Yang.

I am running Spark on YARN with Hadoop 2.7.7; Hadoop 2.7 is the recommended version 
for Spark. I tried to apply the patch from YARN-3467, but it does not work on 
Hadoop 2.7.7.

> No allocatedMB, allocatedVCores, and runningContainers metrics on running 
> Applications in RM Web UI
> ---
>
> Key: YARN-9632
> URL: https://issues.apache.org/jira/browse/YARN-9632
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: RM, webapp, yarn
>Affects Versions: 2.7.7
>Reporter: zhangjian
>Priority: Major
> Fix For: 2.8.5
>
>
> The YARN REST API can report on the following properties:
> *allocatedMB*: The sum of memory in MB allocated to the application's running 
> containers
> *allocatedVCores*: The sum of virtual cores allocated to the application's 
> running containers
> *runningContainers*: The number of containers currently running for the 
> application
> However, the RM Web UI does not report these items. I saw that it works in 
> Hadoop 2.8.5.
> It would be useful for YARN Application and Resource troubleshooting to have 
> these properties and their corresponding values exposed on the RM WebUI.
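
Until the UI exposes them, these values can be read from the RM REST API directly. A minimal sketch, assuming an RM at http://rm-host:8088 and no authentication; the field names are the ones listed in the description above:

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RunningAppMetrics {
  public static void main(String[] args) throws Exception {
    // Lists running applications; each app entry in the JSON response carries
    // allocatedMB, allocatedVCores and runningContainers.
    URL url = new URL("http://rm-host:8088/ws/v1/cluster/apps?states=RUNNING");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
{code}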






[jira] [Comment Edited] (YARN-9632) No allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2019-06-20 Thread zhangjian (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869200#comment-16869200
 ] 

zhangjian edited comment on YARN-9632 at 6/21/19 5:48 AM:
--

Hi, Tao Yang.

I am running Spark on YARN with Hadoop 2.7.7; Hadoop 2.7 is the recommended version 
for Spark. I tried to apply the patch from YARN-3467, but it does not work on 
Hadoop 2.7.7.


was (Author: lilingzj):
Hi, Tao Yang.

I am running Spark on YARN with Hadoop 2.7.7. I tried to apply the patch from 
YARN-3467, but it does not work on Hadoop 2.7.7.

> No allocatedMB, allocatedVCores, and runningContainers metrics on running 
> Applications in RM Web UI
> ---
>
> Key: YARN-9632
> URL: https://issues.apache.org/jira/browse/YARN-9632
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: RM, webapp, yarn
>Affects Versions: 2.7.7
>Reporter: zhangjian
>Priority: Major
> Fix For: 2.8.5
>
>
> The YARN REST API can report on the following properties:
> *allocatedMB*: The sum of memory in MB allocated to the application's running 
> containers
> *allocatedVCores*: The sum of virtual cores allocated to the application's 
> running containers
> *runningContainers*: The number of containers currently running for the 
> application
> However, the RM Web UI does not report these items. I saw that it works in 
> Hadoop 2.8.5.
> It would be useful for YARN Application and Resource troubleshooting to have 
> these properties and their corresponding values exposed on the RM WebUI.






[jira] [Commented] (YARN-9632) No allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2019-06-20 Thread zhangjian (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869200#comment-16869200
 ] 

zhangjian commented on YARN-9632:
-

Hi, Tao Yang.

I am running Spark on YARN with Hadoop 2.7.7. I tried to apply the patch from 
YARN-3467, but it does not work on Hadoop 2.7.7.

> No allocatedMB, allocatedVCores, and runningContainers metrics on running 
> Applications in RM Web UI
> ---
>
> Key: YARN-9632
> URL: https://issues.apache.org/jira/browse/YARN-9632
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: RM, webapp, yarn
>Affects Versions: 2.7.7
>Reporter: zhangjian
>Priority: Major
> Fix For: 2.8.5
>
>
> The YARN REST API can report on the following properties:
> *allocatedMB*: The sum of memory in MB allocated to the application's running 
> containers
> *allocatedVCores*: The sum of virtual cores allocated to the application's 
> running containers
> *runningContainers*: The number of containers currently running for the 
> application
> However, the RM Web UI does not report these items. I saw that it works in 
> Hadoop 2.8.5.
> It would be useful for YARN Application and Resource troubleshooting to have 
> these properties and their corresponding values exposed on the RM WebUI.






[jira] [Commented] (YARN-9634) Make yarn submit dir and log aggregation dir more evenly distributed

2019-06-20 Thread zhuqi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869151#comment-16869151
 ] 

zhuqi commented on YARN-9634:
-

Hi [~cheersyang],

Yes, I mean the space quota. Also, even without quotas, in our large cluster binding 
the submit dir and log aggregation dir to a single fixed namespace affects that 
namespace's RPC performance. I think we can add configuration for a round-robin or 
hash policy that distributes these dirs across the configured namespaces of the 
HDFS federation.

> Make yarn submit dir and log aggregation dir more evenly distributed
> 
>
> Key: YARN-9634
> URL: https://issues.apache.org/jira/browse/YARN-9634
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> When the cluster is large, the directory to which users submit jobs, the 
> directory used for container log aggregation, and other information can fill up 
> the HDFS directory, because an HDFS directory has a default storage limit. To 
> address this, we can distribute these directories more evenly, with a 
> configurable policy such as a hash policy or a round-robin policy.
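
A standalone sketch of the kind of policy being proposed (the class and method names are hypothetical, not an existing YARN configuration): pick one of the configured federation namespaces per application, either by hashing the application ID or round-robin.

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class NamespaceChooser {

  private final List<String> namespaces;       // e.g. hdfs://ns1, hdfs://ns2, ...
  private final AtomicLong counter = new AtomicLong();

  NamespaceChooser(List<String> namespaces) {
    this.namespaces = namespaces;
  }

  // Hash policy: the same application always maps to the same namespace.
  String byHash(String applicationId) {
    return namespaces.get(Math.floorMod(applicationId.hashCode(), namespaces.size()));
  }

  // Round-robin policy: spread successive submissions evenly.
  String roundRobin() {
    return namespaces.get((int) (counter.getAndIncrement() % namespaces.size()));
  }

  public static void main(String[] args) {
    NamespaceChooser c = new NamespaceChooser(Arrays.asList("hdfs://ns1", "hdfs://ns2"));
    System.out.println(c.byHash("application_1554476304275_0003") + "/user/alice/.staging");
    System.out.println(c.roundRobin() + "/app-logs");
  }
}
{code}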






[jira] [Commented] (YARN-9633) Support doas parameter at rest api of yarn-service

2019-06-20 Thread KWON BYUNGCHANG (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869072#comment-16869072
 ] 

KWON BYUNGCHANG commented on YARN-9633:
---

[~eyang] thank you for your feedback

> Support doas parameter at rest api of yarn-service
> --
>
> Key: YARN-9633
> URL: https://issues.apache.org/jira/browse/YARN-9633
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Major
> Attachments: YARN-9633.001.patch
>
>
> A user can submit a yarn-service application through a web UI that is more 
> user-friendly than RM UI2.
> Such a web UI needs proxy-user privileges and must be able to submit jobs 
> as the end user.
> The yarn-service REST API therefore needs a doas function.






[jira] [Updated] (YARN-9582) Port YARN-8569 to RuncContainerRuntime

2019-06-20 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9582:
--
Summary: Port YARN-8569 to RuncContainerRuntime  (was: Port YARN-8569 to 
FSImageContainerRuntime)

> Port YARN-8569 to RuncContainerRuntime
> --
>
> Key: YARN-9582
> URL: https://issues.apache.org/jira/browse/YARN-9582
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Priority: Major
>
> After YARN-9562 is merged, we should add in the yarn sysfs to the new runtime.






[jira] [Commented] (YARN-9633) Support doas parameter at rest api of yarn-service

2019-06-20 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868930#comment-16868930
 ] 

Eric Yang commented on YARN-9633:
-

[~magnum] Thank you for the patch.  However, this feature is already supported; 
there is no need to duplicate web impersonation logic in the YARN service API.  In 
core-site.xml, set:

{code}
<property>
  <name>hadoop.http.filter.initializers</name>
  <value>org.apache.hadoop.security.authentication.server.ProxyUserAuthenticationFilterInitializer,org.apache.hadoop.security.HttpCrossOriginFilterInitializer</value>
</property>
{code}

This will enable web impersonation globally to all web end points.
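
With that filter enabled, a proxy user can call the YARN service REST API on behalf of an end user through the doas query parameter discussed in this issue. A rough sketch; the host, service name and exact parameter casing are assumptions here, not taken from the patch:

{code:java}
import java.net.HttpURLConnection;
import java.net.URL;

public class DoAsCall {
  public static void main(String[] args) throws Exception {
    // The proxy user (authentication such as SPNEGO is omitted here)
    // asks the RM to act as end user "alice".
    URL url = new URL(
        "http://rm-host:8088/app/v1/services/sleeper-service?doas=alice");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}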

> Support doas parameter at rest api of yarn-service
> --
>
> Key: YARN-9633
> URL: https://issues.apache.org/jira/browse/YARN-9633
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Major
> Attachments: YARN-9633.001.patch
>
>
> A user can submit a yarn-service application through a web UI that is more 
> user-friendly than RM UI2.
> Such a web UI needs proxy-user privileges and must be able to submit jobs 
> as the end user.
> The yarn-service REST API therefore needs a doas function.






[jira] [Commented] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime

2019-06-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868803#comment-16868803
 ] 

Hadoop QA commented on YARN-9560:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 20s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 22 unchanged - 3 fixed = 23 total (was 25) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 40s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 74m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9560 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972357/YARN-9560.008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f668e73c3747 4.4.0-144-generic #170~14.04.1-Ubuntu SMP Mon Mar 
18 15:02:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9c4b15d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24296/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| unit | 

[jira] [Updated] (YARN-9636) SLS fails with supplied example files

2019-06-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erkin Alp Güney updated YARN-9636:
--
Priority: Blocker  (was: Major)

> SLS fails with supplied example files
> -
>
> Key: YARN-9636
> URL: https://issues.apache.org/jira/browse/YARN-9636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator, security
>Affects Versions: 3.3.0
>Reporter: Erkin Alp Güney
>Priority: Blocker
>  Labels: blocker
>
> {{Exception in thread "Listener at localhost/18033" 
> java.lang.NoSuchMethodError: 
> java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
> {{  at 
> org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)
>  }}
> {{  at 
> org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
>  }}
> {{  at java.security.AccessController.doPrivileged(Native Method) }}
> {{  at javax.security.auth.Subject.doAs(Subject.java:422) }}
> {{  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:293) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:240) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:977) }}
> {{  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:984) }}
> Likely caused by covariant return types used by one of the dependent JARs. 
> This is not valid in JVM 8.
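
This NoSuchMethodError pattern usually appears when bytecode compiled on JDK 9 or later, where ByteBuffer.rewind() has a covariant ByteBuffer return type, is run on a Java 8 JVM, whose rewind() still returns Buffer. A small standalone illustration of the common source-level workaround (not the fix for this JIRA): call rewind() through the Buffer type so the emitted descriptor links on both runtimes.

{code:java}
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class RewindCompat {
  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
    buf.putLong(42L);
    // The cast pins the call to the Java 8 signature Buffer.rewind(),
    // so the same class file runs on JDK 8 and on JDK 9+.
    ((Buffer) buf).rewind();
    System.out.println(buf.getLong());   // prints 42
  }
}
{code}

Building with javac --release 8, rather than -source/-target alone, also avoids emitting the covariant descriptor in the first place.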






[jira] [Updated] (YARN-9636) SLS fails with supplied example files

2019-06-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erkin Alp Güney updated YARN-9636:
--
Component/s: security

> SLS fails with supplied example files
> -
>
> Key: YARN-9636
> URL: https://issues.apache.org/jira/browse/YARN-9636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator, security
>Affects Versions: 3.3.0
>Reporter: Erkin Alp Güney
>Priority: Major
>
> {{Exception in thread "Listener at localhost/18033" 
> java.lang.NoSuchMethodError: 
> java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
> {{  at 
> org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)
>  }}
> {{  at 
> org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
>  }}
> {{  at java.security.AccessController.doPrivileged(Native Method) }}
> {{  at javax.security.auth.Subject.doAs(Subject.java:422) }}
> {{  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:293) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:240) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:977) }}
> {{  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:984) }}
> Likely caused by covariant return types used by one of the dependent JARs. 
> This is not valid in JVM 8.






[jira] [Updated] (YARN-9636) SLS fails with supplied example files

2019-06-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erkin Alp Güney updated YARN-9636:
--
Labels: blocker  (was: )

> SLS fails with supplied example files
> -
>
> Key: YARN-9636
> URL: https://issues.apache.org/jira/browse/YARN-9636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator, security
>Affects Versions: 3.3.0
>Reporter: Erkin Alp Güney
>Priority: Major
>  Labels: blocker
>
> {{Exception in thread "Listener at localhost/18033" 
> java.lang.NoSuchMethodError: 
> java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
> {{  at 
> org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)
>  }}
> {{  at 
> org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
>  }}
> {{  at java.security.AccessController.doPrivileged(Native Method) }}
> {{  at javax.security.auth.Subject.doAs(Subject.java:422) }}
> {{  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)
>  }}
> {{  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)
>  }}
> {{  at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:293) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:240) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:977) }}
> {{  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) }}
> {{  at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:984) }}
> Likely caused by covariant return types used by one of the dependent JARs. 
> This is not valid in JVM 8.






[jira] [Updated] (YARN-9636) SLS fails with supplied example files

2019-06-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erkin Alp Güney updated YARN-9636:
--
Description: 
{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{  at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)
 }}
{{  at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)
 }}
{{  at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)
 }}
{{  at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)
 }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)
 }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)
 }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)
 }}
{{  at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
{{  at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
 }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
 }}
{{  at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
 }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
 }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
 }}
{{  at java.security.AccessController.doPrivileged(Native Method) }}
{{  at javax.security.auth.Subject.doAs(Subject.java:422) }}
{{  at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
 }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)
 }}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)
 }}
{{  at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) }}
{{  at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:293) }}
{{  at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:240) }}
{{  at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:977) }}
{{  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) }}
{{  at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:984) }}

Likely caused by covariant return types used by one of the dependent JARs. This 
is not valid in JVM 8.

  was:
{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{  at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)}}
{{  at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)}}
{{  at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)}}
{{  at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)}}
{{  at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{  at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)}}
{{  at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)}}
{{  at java.security.AccessController.doPrivileged(Native Method)}}
{{  at javax.security.auth.Subject.doAs(Subject.java:422)}}
{{  at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)}}
{{  at 

[jira] [Updated] (YARN-9636) SLS fails with supplied example files

2019-06-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erkin Alp Güney updated YARN-9636:
--
Description: 
{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{  at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)}}
{{  at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)}}
{{  at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)}}
{{  at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)}}
{{  at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{  at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)}}
{{  at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)}}
{{  at java.security.AccessController.doPrivileged(Native Method)}}
{{  at javax.security.auth.Subject.doAs(Subject.java:422)}}
{{  at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)}}
{{  at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)}}
{{  at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{  at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:293)}}
{{  at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:240)}}
{{  at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:977)}}
{{  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)}}
{{  at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:984)}}

Likely caused by covariant return types used by one of the dependent JARs. This 
is not valid in JVM 8.

  was:
{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)
{{ \{{ {{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{ {{ {{     at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
{{ \{{ {{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
{{ \{{ {{ {{     at java.security.AccessController.doPrivileged(Native 
Method)
{{ \{{ {{ {{     at 

[jira] [Updated] (YARN-9636) SLS fails with supplied example files

2019-06-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erkin Alp Güney updated YARN-9636:
--
Description: 
{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)
{{ \{{ {{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{ {{ {{     at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
{{ \{{ {{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
{{ \{{ {{ {{     at java.security.AccessController.doPrivileged(Native 
Method)
{{ \{{ {{ {{     at javax.security.auth.Subject.doAs(Subject.java:422)
{{ \{{ {{ {{     at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)
{{ \{{ {{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:293)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:240)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:977)
{{ \{{ {{ {{     at 
org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
{{ \{{ {{ {{     at 
org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:984)

{{Likely caused by covariant return types used by one of the dependent JARs.}}

  was:
{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)}}
{{ \{{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{ \{{ {{     at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)}}
{{ \{{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)}}
{{ \{{ {{     at 

[jira] [Created] (YARN-9636) SLS fails with supplied example files

2019-06-20 Thread JIRA
Erkin Alp Güney created YARN-9636:
-

 Summary: SLS fails with supplied example files
 Key: YARN-9636
 URL: https://issues.apache.org/jira/browse/YARN-9636
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler-load-simulator
Affects Versions: 3.3.0
Reporter: Erkin Alp Güney


{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{ \{{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)
{{ \{{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)
{{ \{{     at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)
{{ \{{     at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)
{{ \{{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{     at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
{{ \{{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
{{ \{{     at java.security.AccessController.doPrivileged(Native Method)
{{ \{{     at javax.security.auth.Subject.doAs(Subject.java:422)
{{ \{{     at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)
{{ \{{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{     at 
org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:293)
{{ \{{     at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:240)
{{ \{{     at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:977)
{{ \{{     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
{{ \{{     at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:984)

Likely caused by covariant return types used by one of the dependent JARs.






[jira] [Updated] (YARN-9636) SLS fails with supplied example files

2019-06-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erkin Alp Güney updated YARN-9636:
--
Description: 
{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)}}
{{ \{{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{ \{{ {{     at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)}}
{{ \{{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)}}
{{ \{{ {{     at java.security.AccessController.doPrivileged(Native 
Method)}}
{{ \{{ {{     at javax.security.auth.Subject.doAs(Subject.java:422)}}
{{ \{{ {{     at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)}}
{{ \{{ {{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:293)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:240)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:977)}}
{{ \{{ {{     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)}}
{{ \{{ {{     at 
org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:984)}}{{Likely 
caused by covariant return types used by one of the dependent JARs.}}

  was:
{{Exception in thread "Listener at localhost/18033" 
java.lang.NoSuchMethodError: java.nio.ByteBuffer.rewind()Ljava/nio/ByteBuffer;}}
{{ \{{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(ProtoUtils.java:276)
{{ \{{     at 
org.apache.hadoop.yarn.api.records.impl.pb.ProtoBase.convertToProtoFormat(ProtoBase.java:63)
{{ \{{     at 
org.apache.hadoop.yarn.server.api.records.impl.pb.MasterKeyPBImpl.setBytes(MasterKeyPBImpl.java:77)
{{ \{{     at 
org.apache.hadoop.yarn.server.security.MasterKeyData.(MasterKeyData.java:40)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createNewMasterKey(AMRMTokenSecretManager.java:185)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.start(AMRMTokenSecretManager.java:106)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.serviceStart(RMSecretManagerService.java:78)
{{ \{{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{     at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
{{ \{{     at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
{{ \{{     at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
{{ \{{     at 

[jira] [Commented] (YARN-9596) QueueMetrics has incorrect metrics when labelled partitions are involved

2019-06-20 Thread Muhammad Samir Khan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868693#comment-16868693
 ] 

Muhammad Samir Khan commented on YARN-9596:
---

[~eepayne] yes, the patch applies cleanly with the --3way option on git apply. 
For branch-2.8 though the unit test fails because of a race condition in 
AsyncDispatcher (see 
[YARN-3878|https://issues.apache.org/jira/browse/YARN-3878], 
[YARN-5436|https://issues.apache.org/jira/browse/YARN-5436], and 
[YARN-5375|https://issues.apache.org/jira/browse/YARN-5375]).

> QueueMetrics has incorrect metrics when labelled partitions are involved
> 
>
> Key: YARN-9596
> URL: https://issues.apache.org/jira/browse/YARN-9596
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.8.0, 3.3.0
>Reporter: Muhammad Samir Khan
>Assignee: Muhammad Samir Khan
>Priority: Major
> Attachments: Screen Shot 2019-06-03 at 4.41.45 PM.png, Screen Shot 
> 2019-06-03 at 4.44.15 PM.png, YARN-9596.001.patch, YARN-9596.002.patch
>
>
> After YARN-6467, QueueMetrics should only be tracking metrics for the default 
> partition. However, the metrics are incorrect when labelled partitions are 
> involved.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Add label "test" to cluster and replace label on node1 to be "test"
>  # Note down "totalMB" at 
> /ws/v1/cluster/metrics
>  # Start first job on test queue.
>  # Start second job on default queue (does not work if the order of two jobs 
> is swapped).
>  # While the two applications are running, the "totalMB" at 
> /ws/v1/cluster/metrics will go down by 
> the amount of MB used by the first job (screenshots attached).
> Alternately:
> In 
> TestNodeLabelContainerAllocation.testQueueMetricsWithLabelsOnDefaultLabelNode(),
>  add the following line at the end of the test before rm1.close():
> CSQueue rootQueue = cs.getRootQueue();
> assertEquals(10*GB,
>  rootQueue.getMetrics().getAvailableMB() + 
> rootQueue.getMetrics().getAllocatedMB());
> There are two nodes of 10GB each and only one of them have a non-default 
> label. The test will also fail against 20*GB check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9596) QueueMetrics has incorrect metrics when labelled partitions are involved

2019-06-20 Thread Muhammad Samir Khan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868693#comment-16868693
 ] 

Muhammad Samir Khan edited comment on YARN-9596 at 6/20/19 4:34 PM:


[~eepayne] yes, the patch applies cleanly with the --3way option on git apply. 
For branch-2.8, though, the unit test fails because of a race condition in 
AsyncDispatcher (see YARN-3878, YARN-5436, and YARN-5375).


was (Author: samkhan):
[~eepayne] yes, the patch applies cleanly with the --3way option on git apply. 
For branch-2.8 though the unit test fails because of a race condition in 
AsyncDispatcher (see 
[YARN-3878|[https://issues.apache.org/jira/browse/]YARN-3878], 
[YARN-5436|[https://issues.apache.org/jira/browse/]YARN-5436], and 
[YARN-5375|[https://issues.apache.org/jira/browse/]YARN-5375])

> QueueMetrics has incorrect metrics when labelled partitions are involved
> 
>
> Key: YARN-9596
> URL: https://issues.apache.org/jira/browse/YARN-9596
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.8.0, 3.3.0
>Reporter: Muhammad Samir Khan
>Assignee: Muhammad Samir Khan
>Priority: Major
> Attachments: Screen Shot 2019-06-03 at 4.41.45 PM.png, Screen Shot 
> 2019-06-03 at 4.44.15 PM.png, YARN-9596.001.patch, YARN-9596.002.patch
>
>
> After YARN-6467, QueueMetrics should only be tracking metrics for the default 
> partition. However, the metrics are incorrect when labelled partitions are 
> involved.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Add label "test" to cluster and replace label on node1 to be "test"
>  # Note down "totalMB" at 
> /ws/v1/cluster/metrics
>  # Start first job on test queue.
>  # Start second job on default queue (does not work if the order of two jobs 
> is swapped).
>  # While the two applications are running, the "totalMB" at 
> /ws/v1/cluster/metrics will go down by 
> the amount of MB used by the first job (screenshots attached).
> Alternately:
> In 
> TestNodeLabelContainerAllocation.testQueueMetricsWithLabelsOnDefaultLabelNode(),
>  add the following line at the end of the test before rm1.close():
> CSQueue rootQueue = cs.getRootQueue();
> assertEquals(10*GB,
>  rootQueue.getMetrics().getAvailableMB() + 
> rootQueue.getMetrics().getAllocatedMB());
> There are two nodes of 10GB each and only one of them have a non-default 
> label. The test will also fail against 20*GB check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime

2019-06-20 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868664#comment-16868664
 ] 

Eric Badger commented on YARN-9560:
---

Attaching patch 008 to address checkstyle. There will still be 1 checkstyle 
warning from OCIContainerRuntime.java:88, but I don't know how to fix that 
warning without splitting up the config, which doesn't seem right.

> Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
> ---
>
> Key: YARN-9560
> URL: https://issues.apache.org/jira/browse/YARN-9560
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
> Attachments: YARN-9560.001.patch, YARN-9560.002.patch, 
> YARN-9560.003.patch, YARN-9560.004.patch, YARN-9560.005.patch, 
> YARN-9560.006.patch, YARN-9560.007.patch, YARN-9560.008.patch
>
>
> Since the new OCI/squashFS/runc runtime will be using a lot of the same code 
> as DockerLinuxContainerRuntime, it would be good to move a bunch of the 
> DockerLinuxContainerRuntime code up a level to an abstract class that both of 
> the runtimes can extend. 
> The new structure will look like:
> {noformat}
> OCIContainerRuntime (abstract class)
>   - DockerLinuxContainerRuntime
>   - FSImageContainerRuntime (name negotiable)
> {noformat}
> This JIRA should only change the structure of the code, not the actual 
> semantics



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime

2019-06-20 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9560:
--
Attachment: YARN-9560.008.patch

> Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
> ---
>
> Key: YARN-9560
> URL: https://issues.apache.org/jira/browse/YARN-9560
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
> Attachments: YARN-9560.001.patch, YARN-9560.002.patch, 
> YARN-9560.003.patch, YARN-9560.004.patch, YARN-9560.005.patch, 
> YARN-9560.006.patch, YARN-9560.007.patch, YARN-9560.008.patch
>
>
> Since the new OCI/squashFS/runc runtime will be using a lot of the same code 
> as DockerLinuxContainerRuntime, it would be good to move a bunch of the 
> DockerLinuxContainerRuntime code up a level to an abstract class that both of 
> the runtimes can extend. 
> The new structure will look like:
> {noformat}
> OCIContainerRuntime (abstract class)
>   - DockerLinuxContainerRuntime
>   - FSImageContainerRuntime (name negotiable)
> {noformat}
> This JIRA should only change the structure of the code, not the actual 
> semantics



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-06-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868633#comment-16868633
 ] 

Hadoop QA commented on YARN-9451:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-9451 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9451 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24295/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: Screen Shot 2019-06-20 at 7.49.46 PM.png, 
> YARN-9451-001.patch
>
>
> AggregatedLogsBlock shows wrong NM http port when aggregated file is not 
> available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/] - NM 
> rpc port instead of http port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-06-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868628#comment-16868628
 ] 

Hadoop QA commented on YARN-9209:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 10s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 34s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}144m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.5 Server=18.09.5 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972335/YARN-9209.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7347c11e028e 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 
08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e02eb24 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/24294/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24294/testReport/ |
| Max. process+thread count | 878 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Comment Edited] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-06-20 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868611#comment-16868611
 ] 

Prabhu Joseph edited comment on YARN-9451 at 6/20/19 3:15 PM:
--

After the patch, the error message shows the right NM http port.

 !Screen Shot 2019-06-20 at 7.49.46 PM.png|width=300, height=100!

 


was (Author: prabhu joseph):
After the patch, the error message shows the right NM http port.

 !Screen Shot 2019-06-20 at 7.49.46 PM.png! 

> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: Screen Shot 2019-06-20 at 7.49.46 PM.png, 
> YARN-9451-001.patch
>
>
> AggregatedLogsBlock shows wrong NM http port when aggregated file is not 
> available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/] - NM 
> rpc port instead of http port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-06-20 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9451:

Attachment: YARN-9451-001.patch

> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: YARN-9451-001.patch
>
>
> AggregatedLogsBlock shows wrong NM http port when aggregated file is not 
> available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/] - NM 
> rpc port instead of http port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-06-20 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9451:

Attachment: Screen Shot 2019-06-20 at 7.49.46 PM.png

> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: Screen Shot 2019-06-20 at 7.49.46 PM.png, 
> YARN-9451-001.patch
>
>
> AggregatedLogsBlock shows wrong NM http port when aggregated file is not 
> available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/] - NM 
> rpc port instead of http port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-06-20 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868611#comment-16868611
 ] 

Prabhu Joseph commented on YARN-9451:
-

After the patch, the error message shows the right NM http port.

 !Screen Shot 2019-06-20 at 7.49.46 PM.png! 

> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: Screen Shot 2019-06-20 at 7.49.46 PM.png, 
> YARN-9451-001.patch
>
>
> AggregatedLogsBlock shows wrong NM http port when aggregated file is not 
> available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/] - NM 
> rpc port instead of http port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9634) Make yarn submit dir and log aggregation dir more evenly distributed

2019-06-20 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868547#comment-16868547
 ] 

Weiwei Yang commented on YARN-9634:
---

Hi [~zhuqi], what do you mean by the default storage limit? Is that the space 
quota? Can you give an example of how these dirs would be distributed, and why 
that helps? Thanks.

> Make yarn submit dir and log aggregation dir more evenly distributed
> 
>
> Key: YARN-9634
> URL: https://issues.apache.org/jira/browse/YARN-9634
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> When the cluster size is large, the dir which user submits the job, and the 
> dir which container log aggregate, and other information will fill the HDFS 
> directory, because the HDFS directory has a default storage limit. In 
> response to this situation, we can change these dirs more distributed, with 
> some policy to choose, such as hash policy and round robin policy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption

2019-06-20 Thread zhoukang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868546#comment-16868546
 ] 

zhoukang commented on YARN-9537:


In our production cluster, we have had problems when the AM is preempted, since 
the whole job restarts once its AM is preempted.
Thanks [~yufeigu]

> Add configuration to disable AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.2.0, 3.1.2
>Reporter: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9537.001.patch
>
>
> In this issue, i will add a configuration to support disable AM preemption.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-06-20 Thread Tarun Parimi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated YARN-9209:
---
Attachment: YARN-9209.003.patch

> When nodePartition is not set in Placement Constraints, containers are 
> allocated only in default partition
> --
>
> Key: YARN-9209
> URL: https://issues.apache.org/jira/browse/YARN-9209
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-9209.001.patch, YARN-9209.002.patch, 
> YARN-9209.003.patch
>
>
> When application sets a placement constraint without specifying a 
> nodePartition, the default partition is always chosen as the constraint when 
> allocating containers. This can be a problem. when an application is 
> submitted to a queue which has doesn't have enough capacity available on the 
> default partition.
>  This is a common scenario when node labels are configured for a particular 
> queue. The below sample sleeper service cannot get even a single container 
> allocated when it is submitted to a "labeled_queue", even though enough 
> capacity is available on the label/partition configured for the queue. Only 
> the AM container runs. 
> {code:java}{
> "name": "sleeper-service",
> "version": "1.0.0",
> "queue": "labeled_queue",
> "components": [
> {
> "name": "sleeper",
> "number_of_containers": 2,
> "launch_command": "sleep 9",
> "resource": {
> "cpus": 1,
> "memory": "4096"
> },
> "placement_policy": {
> "constraints": [
> {
> "type": "ANTI_AFFINITY",
> "scope": "NODE",
> "target_tags": [
> "sleeper"
> ]
> }
> ]
> }
> }
> ]
> }
> {code}
> It runs fine if I specify the node_partition explicitly in the constraints 
> like below. 
> {code:java}
> {
> "name": "sleeper-service",
> "version": "1.0.0",
> "queue": "labeled_queue",
> "components": [
> {
> "name": "sleeper",
> "number_of_containers": 2,
> "launch_command": "sleep 9",
> "resource": {
> "cpus": 1,
> "memory": "4096"
> },
> "placement_policy": {
> "constraints": [
> {
> "type": "ANTI_AFFINITY",
> "scope": "NODE",
> "target_tags": [
> "sleeper"
> ],
> "node_partitions": [
> "label"
> ]
> }
> ]
> }
> }
> ]
> }
> {code} 
> The problem seems to be because only the default partition "" is considered 
> when node_partition constraint is not specified as seen in below RM log. 
> {code:java}
> 2019-01-17 16:51:59,921 INFO placement.SingleConstraintAppPlacementAllocator 
> (SingleConstraintAppPlacementAllocator.java:validateAndSetSchedulingRequest(367))
>  - Successfully added SchedulingRequest to 
> app=appattempt_1547734161165_0010_01 targetAllocationTags=[sleeper]. 
> nodePartition= 
> {code} 
> However, I think it makes more sense to consider "*" or the 
> {{default-node-label-expression}} of the queue if configured, when no 
> node_partition is specified in the placement constraint. Since not specifying 
> any node_partition should ideally mean we don't enforce placement constraints 
> on any node_partition. However we are enforcing the default partition instead 
> now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic

2019-06-20 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868490#comment-16868490
 ] 

Adam Antal commented on YARN-9559:
--

Hi,

Thanks for the patch [~jhung]. 

Generally I think this logic would be better suited to an interface. The interface 
would have {{handle}} and probably another function (e.g. {{init}}) taking the 
other parameters, so those would not have to be supplied at construction time. 
Also, by using implements we don't force the class hierarchy, so your class can 
still extend some other class that may already contain part of the logic you 
want to reuse here.

If an interface is not compatible with the {{Configuration.getClass}} call, 
then please ignore my comment. In that case, though, the public constructor 
with the String-only parameter should not be there (nothing would prevent users 
of this class from calling that constructor).
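A rough sketch of the interface shape suggested above (the type, method, and config names here are illustrative assumptions, not the actual YARN-9559 patch). Loading it through {{Configuration.getClass}} plus {{ReflectionUtils.newInstance}} would only need a no-arg constructor and an {{init}}-style hook:

{code:java}
import org.apache.hadoop.yarn.event.Dispatcher;
import org.apache.hadoop.yarn.event.EventHandler;
import org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor;
import org.apache.hadoop.yarn.server.nodemanager.Context;
import org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent;

// Hypothetical pluggable launcher contract; handle(ContainersLauncherEvent)
// is inherited from EventHandler.
public interface PluggableContainersLauncher
    extends EventHandler<ContainersLauncherEvent> {

  // Late initialization instead of a wide constructor, so an implementation
  // stays free to extend whatever base class already holds part of its logic.
  void init(Context nmContext, Dispatcher dispatcher, ContainerExecutor exec,
      LocalDirsHandlerService dirsHandler, ContainerManagerImpl containerManager);
}
{code}

The NodeManager side could then load it roughly as follows (the config key is an assumption for illustration):

{code:java}
// Class<? extends PluggableContainersLauncher> clazz = conf.getClass(
//     "yarn.nodemanager.containers-launcher.class",
//     ContainersLauncher.class, PluggableContainersLauncher.class);
// PluggableContainersLauncher launcher = ReflectionUtils.newInstance(clazz, conf);
// launcher.init(context, dispatcher, exec, dirsHandler, containerManager);
{code}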

> Create AbstractContainersLauncher for pluggable ContainersLauncher logic
> 
>
> Key: YARN-9559
> URL: https://issues.apache.org/jira/browse/YARN-9559
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9559.001.patch, YARN-9559.002.patch, 
> YARN-9559.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868454#comment-16868454
 ] 

zhuqi commented on YARN-8995:
-

Hi, [~Tao Yang]

TestYarnConfigurationFields failed because yarn-default.xml is missing the new 
configuration I added. Should I add it to yarn-default.xml?

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch
>
>
> In our growing cluster,there are unexpected situations that cause some event 
> queues to block the performance of the cluster, such as the bug of  
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event type of the too big event queue size, and add the information 
> to the metrics, and the threshold of queue size is a parametor which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868424#comment-16868424
 ] 

Hadoop QA commented on YARN-8995:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 14s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 7 new + 220 unchanged - 0 fixed = 227 total (was 220) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 57s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
42s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 81m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | TEST-TestYarnConfigurationFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-8995 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972314/YARN-8995.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 468a8c397988 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e02eb24 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 

[jira] [Updated] (YARN-9635) [UI2] Nodes page displayed duplicate nodes

2019-06-20 Thread Wanqiang Ji (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wanqiang Ji updated YARN-9635:
--
Affects Version/s: 3.2.0

> [UI2] Nodes page displayed duplicate nodes
> --
>
> Key: YARN-9635
> URL: https://issues.apache.org/jira/browse/YARN-9635
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.2.0
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: UI2-nodes.jpg
>
>
> Steps:
>  * shutdown nodes
>  * start nodes
> Nodes Page:
> !UI2-nodes.jpg!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9635) [UI2] Nodes page displayed duplicate nodes

2019-06-20 Thread Wanqiang Ji (JIRA)
Wanqiang Ji created YARN-9635:
-

 Summary: [UI2] Nodes page displayed duplicate nodes
 Key: YARN-9635
 URL: https://issues.apache.org/jira/browse/YARN-9635
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wanqiang Ji
Assignee: Wanqiang Ji
 Attachments: UI2-nodes.jpg

Steps:
 * shutdown nodes
 * start nodes

Nodes Page:

!UI2-nodes.jpg!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9635) [UI2] Nodes page displayed duplicate nodes

2019-06-20 Thread Wanqiang Ji (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wanqiang Ji updated YARN-9635:
--
Component/s: yarn-ui-v2

> [UI2] Nodes page displayed duplicate nodes
> --
>
> Key: YARN-9635
> URL: https://issues.apache.org/jira/browse/YARN-9635
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: UI2-nodes.jpg
>
>
> Steps:
>  * shutdown nodes
>  * start nodes
> Nodes Page:
> !UI2-nodes.jpg!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-06-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868418#comment-16868418
 ] 

Hadoop QA commented on YARN-9209:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 32s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 11 unchanged - 0 fixed = 12 total (was 11) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 41s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}144m 32s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.5 Server=18.09.5 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972304/YARN-9209.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c6d230e6c706 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 
08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5bfdf62 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24291/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 

[jira] [Commented] (YARN-9633) Support doas parameter at rest api of yarn-service

2019-06-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868415#comment-16868415
 ] 

Hadoop QA commented on YARN-9633:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 11s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
18s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
48s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9633 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972312/YARN-9633.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e95a5b4a2acd 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e02eb24 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 

[jira] [Commented] (YARN-9634) Make yarn submit dir and log aggregation dir more evenly distributed

2019-06-20 Thread zhuqi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868405#comment-16868405
 ] 

zhuqi commented on YARN-9634:
-

cc [~Tao Yang], [~cheersyang], [~leftnoteasy], [~sunilg]

> Make yarn submit dir and log aggregation dir more evenly distributed
> 
>
> Key: YARN-9634
> URL: https://issues.apache.org/jira/browse/YARN-9634
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> When the cluster size is large, the dir which user submits the job, and the 
> dir which container log aggregate, and other information will fill the HDFS 
> directory, because the HDFS directory has a default storage limit. In 
> response to this situation, we can change these dirs more distributed, with 
> some policy to choose, such as hash policy and round robin policy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9634) Make yarn submit dir and log aggregation dir more evenly distributed

2019-06-20 Thread zhuqi (JIRA)
zhuqi created YARN-9634:
---

 Summary: Make yarn submit dir and log aggregation dir more evenly 
distributed
 Key: YARN-9634
 URL: https://issues.apache.org/jira/browse/YARN-9634
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: zhuqi
Assignee: zhuqi


When the cluster is large, the dir to which users submit jobs, the dir into which 
container logs are aggregated, and other such information will fill up a single 
HDFS directory, because an HDFS directory has a default storage limit. To address 
this, we can spread these dirs across multiple locations using a selectable 
policy, such as a hash policy or a round-robin policy.
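A minimal sketch of the idea (the class and path names below are hypothetical, not from any existing patch): given a list of candidate base directories, a hash policy keys the choice off the application id so an app always maps to the same dir, while a round-robin policy simply cycles through them:

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical selector for spreading staging / log-aggregation dirs
// over several HDFS base directories.
public class BaseDirSelector {
  public enum Policy { HASH, ROUND_ROBIN }

  private final List<String> baseDirs;
  private final Policy policy;
  private final AtomicLong counter = new AtomicLong();

  public BaseDirSelector(List<String> baseDirs, Policy policy) {
    this.baseDirs = baseDirs;
    this.policy = policy;
  }

  // Pick a base dir for the given application id.
  public String select(String applicationId) {
    int n = baseDirs.size();
    int idx = (policy == Policy.HASH)
        // stable mapping: the same app always lands in the same base dir
        ? Math.floorMod(applicationId.hashCode(), n)
        // round robin: spread writes evenly regardless of app id
        : (int) (counter.getAndIncrement() % n);
    return baseDirs.get(idx) + "/" + applicationId;
  }

  public static void main(String[] args) {
    BaseDirSelector s = new BaseDirSelector(
        Arrays.asList("/tmp/logs-0", "/tmp/logs-1", "/tmp/logs-2"),
        Policy.HASH);
    System.out.println(s.select("application_1554476304275_0003"));
  }
}
{code}

A hash policy keeps an application's files findable without extra bookkeeping, while round robin balances better when a few users submit most of the jobs.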



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868380#comment-16868380
 ] 

Tao Yang commented on YARN-8995:


I can see it's PA (Patch Available) now; you can wait for the Jenkins report in 
a few hours.
It still needs to be reviewed by at least one committer, who can help commit 
this patch if approved.
cc: [~leftnoteasy], [~cheersyang], [~sunil.g].

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch
>
>
> In our growing cluster,there are unexpected situations that cause some event 
> queues to block the performance of the cluster, such as the bug of  
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event type of the too big event queue size, and add the information 
> to the metrics, and the threshold of queue size is a parametor which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9633) Support doas parameter at rest api of yarn-service

2019-06-20 Thread KWON BYUNGCHANG (JIRA)
KWON BYUNGCHANG created YARN-9633:
-

 Summary: Support doas parameter at rest api of yarn-service
 Key: YARN-9633
 URL: https://issues.apache.org/jira/browse/YARN-9633
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn-native-services
Affects Versions: 3.1.2
Reporter: KWON BYUNGCHANG


Users can submit yarn-service applications through a web UI that is more 
user-friendly than RM UI2. Such a web UI needs proxy-user privileges, and it 
must be able to submit jobs as the end user.

The REST API of yarn-service therefore needs a doas function.
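A minimal sketch of the server-side mechanics this asks for, using Hadoop's standard proxy-user support (the doAs parameter name and the submit hook are assumptions for illustration; the real change may differ):

{code:java}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class DoAsSubmitSketch {
  /**
   * Run the service submission as the end user named in the (assumed) doAs
   * request parameter, on behalf of the authenticated web-UI user. Hadoop's
   * proxy-user ACLs (hadoop.proxyuser.<ui-user>.hosts / .groups) still decide
   * whether the impersonation is allowed.
   */
  static void submitAs(String doAsUser, Runnable submitCall) throws Exception {
    UserGroupInformation proxyUgi = UserGroupInformation.createProxyUser(
        doAsUser, UserGroupInformation.getLoginUser());
    proxyUgi.doAs((PrivilegedExceptionAction<Void>) () -> {
      submitCall.run();   // e.g. invoke the yarn-service submit path here
      return null;
    });
  }
}
{code}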




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868371#comment-16868371
 ] 

zhuqi commented on YARN-8995:
-

Hi, [~Tao Yang]

I have now renamed the new patch to YARN-8995.007.patch. How does the patch 
finally get merged?

Thanks.

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch
>
>
> In our growing cluster,there are unexpected situations that cause some event 
> queues to block the performance of the cluster, such as the bug of  
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event type of the too big event queue size, and add the information 
> to the metrics, and the threshold of queue size is a parametor which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-8995:

Attachment: YARN-8995.007.patch

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch
>
>
> In our growing cluster,there are unexpected situations that cause some event 
> queues to block the performance of the cluster, such as the bug of  
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event type of the too big event queue size, and add the information 
> to the metrics, and the threshold of queue size is a parametor which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-8995:

Attachment: (was: YARN-8995.trunk-001.patch)

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch
>
>
> In our growing cluster,there are unexpected situations that cause some event 
> queues to block the performance of the cluster, such as the bug of  
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event type of the too big event queue size, and add the information 
> to the metrics, and the threshold of queue size is a parametor which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-8995:

Attachment: (was: YARN-8995.trunk-001.patch)

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.trunk-001.patch
>
>
> In our growing cluster, there are unexpected situations where some event 
> queues block the performance of the cluster, such as the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event types when the event queue grows too big, add that information 
> to the metrics, and make the queue-size threshold a parameter which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868363#comment-16868363
 ] 

Tao Yang commented on YARN-8995:


Thanks [~zhuqi] for updating the patch. It's much better now.
Can you rename the new patch to YARN-8995.007.patch? 
There's no need to add "trunk" to the patch name since trunk is the default 
target branch. Additionally, you can name the patch 
"YARN-8995.<branch-name>.001.patch" for non-trunk branches.

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.trunk-001.patch
>
>
> In our growing cluster, there are unexpected situations where some event 
> queues block the performance of the cluster, such as the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event types when the event queue grows too big, add that information 
> to the metrics, and make the queue-size threshold a parameter which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868350#comment-16868350
 ] 

zhuqi commented on YARN-8995:
-

Hi, [~Tao Yang]

Now the new patch can be applied to trunk. 

Thanks.

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.trunk-001.patch
>
>
> In our growing cluster, there are unexpected situations where some event 
> queues block the performance of the cluster, such as the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event types when the event queue grows too big, add that information 
> to the metrics, and make the queue-size threshold a parameter which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-8995:

Attachment: YARN-8995.trunk-001.patch

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.trunk-001.patch
>
>
> In our growing cluster, there are unexpected situations where some event 
> queues block the performance of the cluster, such as the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event types when the event queue grows too big, add that information 
> to the metrics, and make the queue-size threshold a parameter which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9632) No allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2019-06-20 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868346#comment-16868346
 ] 

Tao Yang commented on YARN-9632:


Hi, [~lilingzj].
These metrics have been supported since 2.8.0 via YARN-3467, and 2.7.7 was 
released over a year ago.
I think it's unnecessary to open a new issue for this; I suggest you upgrade the 
cluster or backport the patch from YARN-3467 yourself. Of course, if there's 
a release plan for 2.7.8, you can propose backporting it in YARN-3467.

> No allocatedMB, allocatedVCores, and runningContainers metrics on running 
> Applications in RM Web UI
> ---
>
> Key: YARN-9632
> URL: https://issues.apache.org/jira/browse/YARN-9632
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: RM, webapp, yarn
>Affects Versions: 2.7.7
>Reporter: zhangjian
>Priority: Major
> Fix For: 2.8.5
>
>
> The YARN REST API can report on the following properties:
> *allocatedMB*: The sum of memory in MB allocated to the application's running 
> containers
> *allocatedVCores*: The sum of virtual cores allocated to the application's 
> running containers
> *runningContainers*: The number of containers currently running for the 
> application
> But the RM Web UI does not report these items. I saw it works in 
> Hadoop 2.8.5.
> It would be useful for YARN Application and Resource troubleshooting to have 
> these properties and their corresponding values exposed on the RM WebUI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-8995:

Attachment: (was: YARN-8995.trunk-001.patch)

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch
>
>
> In our growing cluster, there are unexpected situations where some event 
> queues block the performance of the cluster, such as the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event types when the event queue grows too big, add that information 
> to the metrics, and make the queue-size threshold a parameter which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-06-20 Thread zhuqi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-8995:

Attachment: YARN-8995.trunk-001.patch

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch
>
>
> In our growing cluster, there are unexpected situations where some event 
> queues block the performance of the cluster, such as the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to 
> log the event types when the event queue grows too big, add that information 
> to the metrics, and make the queue-size threshold a parameter which can be 
> changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9607) Auto-configuring rollover-size of IFile format for non-appendable filesystems

2019-06-20 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868326#comment-16868326
 ] 

Adam Antal commented on YARN-9607:
--

Thanks for the comment, [~ste...@apache.org]. 

HADOOP-15691 would be really convenient. I'm reviewing it. As I can see, you have 
been working on the PR lately, so I'll take a look at that patch.

> Auto-configuring rollover-size of IFile format for non-appendable filesystems
> -
>
> Key: YARN-9607
> URL: https://issues.apache.org/jira/browse/YARN-9607
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9607.001.patch, YARN-9607.002.patch
>
>
> In YARN-9525, we made the IFile format compatible with remote folders using the 
> s3a scheme. In rolling-fashioned log aggregation, IFile still fails with the 
> "append is not supported" error message, which is a known limitation of the 
> format by design. 
> There is a workaround though: by setting the rollover size in the configuration 
> of the IFile format, a new aggregated log file is created in each rolling cycle, 
> thus eliminating the append from the process. Setting this config 
> globally would cause performance problems in regular log aggregation, so 
> I'm suggesting enforcing this config to zero if the scheme of the URI is 
> s3a (or any other non-appendable filesystem).
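> 
> A rough sketch of the auto-configuration being suggested (names and the zero 
> convention are illustrative, not the attached patch): pick the rollover size 
> based on the scheme of the remote log directory, forcing zero so every rolling 
> cycle starts a fresh aggregated file on non-appendable stores.
> {code:java}
> import java.net.URI;
> import java.util.Collections;
> import java.util.Set;
> 
> public final class RolloverSizePolicy {
> 
>   /** s3a is the known non-appendable case from YARN-9525; others could be added. */
>   private static final Set<String> NON_APPENDABLE_SCHEMES =
>       Collections.singleton("s3a");
> 
>   /**
>    * Returns the rollover size to use: the configured value on appendable
>    * filesystems, or 0 to roll a new aggregated log file on every cycle.
>    * (The PathCapabilities probe from HADOOP-15691 would be a cleaner check
>    * than matching schemes once it is available.)
>    */
>   static long effectiveRolloverSize(URI remoteLogDir, long configuredSize) {
>     String scheme = remoteLogDir.getScheme();
>     if (scheme != null && NON_APPENDABLE_SCHEMES.contains(scheme.toLowerCase())) {
>       return 0L;
>     }
>     return configuredSize;
>   }
> }
> {code}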



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-06-20 Thread Tarun Parimi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868320#comment-16868320
 ] 

Tarun Parimi commented on YARN-9209:


Modified unit tests to reflect the change intended by the patch.

> When nodePartition is not set in Placement Constraints, containers are 
> allocated only in default partition
> --
>
> Key: YARN-9209
> URL: https://issues.apache.org/jira/browse/YARN-9209
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-9209.001.patch, YARN-9209.002.patch
>
>
> When an application sets a placement constraint without specifying a 
> nodePartition, the default partition is always chosen as the constraint when 
> allocating containers. This can be a problem when an application is 
> submitted to a queue which doesn't have enough capacity available on the 
> default partition.
>  This is a common scenario when node labels are configured for a particular 
> queue. The below sample sleeper service cannot get even a single container 
> allocated when it is submitted to a "labeled_queue", even though enough 
> capacity is available on the label/partition configured for the queue. Only 
> the AM container runs. 
> {code:java}{
> "name": "sleeper-service",
> "version": "1.0.0",
> "queue": "labeled_queue",
> "components": [
> {
> "name": "sleeper",
> "number_of_containers": 2,
> "launch_command": "sleep 9",
> "resource": {
> "cpus": 1,
> "memory": "4096"
> },
> "placement_policy": {
> "constraints": [
> {
> "type": "ANTI_AFFINITY",
> "scope": "NODE",
> "target_tags": [
> "sleeper"
> ]
> }
> ]
> }
> }
> ]
> }
> {code}
> It runs fine if I specify the node_partition explicitly in the constraints 
> like below. 
> {code:java}
> {
> "name": "sleeper-service",
> "version": "1.0.0",
> "queue": "labeled_queue",
> "components": [
> {
> "name": "sleeper",
> "number_of_containers": 2,
> "launch_command": "sleep 9",
> "resource": {
> "cpus": 1,
> "memory": "4096"
> },
> "placement_policy": {
> "constraints": [
> {
> "type": "ANTI_AFFINITY",
> "scope": "NODE",
> "target_tags": [
> "sleeper"
> ],
> "node_partitions": [
> "label"
> ]
> }
> ]
> }
> }
> ]
> }
> {code} 
> The problem seems to be that only the default partition "" is considered 
> when the node_partition constraint is not specified, as seen in the RM log below. 
> {code:java}
> 2019-01-17 16:51:59,921 INFO placement.SingleConstraintAppPlacementAllocator 
> (SingleConstraintAppPlacementAllocator.java:validateAndSetSchedulingRequest(367))
>  - Successfully added SchedulingRequest to 
> app=appattempt_1547734161165_0010_01 targetAllocationTags=[sleeper]. 
> nodePartition= 
> {code} 
> However, I think it makes more sense to consider "*" or the 
> {{default-node-label-expression}} of the queue, if configured, when no 
> node_partition is specified in the placement constraint. Not specifying 
> any node_partition should ideally mean we don't enforce placement constraints 
> on any particular node_partition; however, we are enforcing the default 
> partition now.
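> 
> For reference, a sketch of the same two requests through the Java 
> PlacementConstraints builders (assuming the 3.1+ API; the partition name 
> "label" is just the one from the spec above):
> {code:java}
> import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.NODE;
> import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.build;
> import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.targetNotIn;
> import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.PlacementTargets.allocationTag;
> import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.PlacementTargets.nodePartition;
> 
> import org.apache.hadoop.yarn.api.resource.PlacementConstraint;
> 
> public class SleeperConstraints {
> 
>   // Anti-affinity only: today this ends up targeting the default partition.
>   static PlacementConstraint antiAffinityOnly() {
>     return build(targetNotIn(NODE, allocationTag("sleeper")));
>   }
> 
>   // Workaround from the description: also name the partition explicitly, so
>   // the containers can be placed on the queue's labelled nodes.
>   static PlacementConstraint antiAffinityOnPartition() {
>     return build(
>         targetNotIn(NODE, allocationTag("sleeper"), nodePartition("label")));
>   }
> }
> {code}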



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-06-20 Thread Tarun Parimi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated YARN-9209:
---
Attachment: YARN-9209.002.patch

> When nodePartition is not set in Placement Constraints, containers are 
> allocated only in default partition
> --
>
> Key: YARN-9209
> URL: https://issues.apache.org/jira/browse/YARN-9209
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-9209.001.patch, YARN-9209.002.patch
>
>
> When an application sets a placement constraint without specifying a 
> nodePartition, the default partition is always chosen as the constraint when 
> allocating containers. This can be a problem when an application is 
> submitted to a queue which doesn't have enough capacity available on the 
> default partition.
>  This is a common scenario when node labels are configured for a particular 
> queue. The below sample sleeper service cannot get even a single container 
> allocated when it is submitted to a "labeled_queue", even though enough 
> capacity is available on the label/partition configured for the queue. Only 
> the AM container runs. 
> {code:java}{
> "name": "sleeper-service",
> "version": "1.0.0",
> "queue": "labeled_queue",
> "components": [
> {
> "name": "sleeper",
> "number_of_containers": 2,
> "launch_command": "sleep 9",
> "resource": {
> "cpus": 1,
> "memory": "4096"
> },
> "placement_policy": {
> "constraints": [
> {
> "type": "ANTI_AFFINITY",
> "scope": "NODE",
> "target_tags": [
> "sleeper"
> ]
> }
> ]
> }
> }
> ]
> }
> {code}
> It runs fine if I specify the node_partition explicitly in the constraints 
> like below. 
> {code:java}
> {
> "name": "sleeper-service",
> "version": "1.0.0",
> "queue": "labeled_queue",
> "components": [
> {
> "name": "sleeper",
> "number_of_containers": 2,
> "launch_command": "sleep 9",
> "resource": {
> "cpus": 1,
> "memory": "4096"
> },
> "placement_policy": {
> "constraints": [
> {
> "type": "ANTI_AFFINITY",
> "scope": "NODE",
> "target_tags": [
> "sleeper"
> ],
> "node_partitions": [
> "label"
> ]
> }
> ]
> }
> }
> ]
> }
> {code} 
> The problem seems to be that only the default partition "" is considered 
> when the node_partition constraint is not specified, as seen in the RM log below. 
> {code:java}
> 2019-01-17 16:51:59,921 INFO placement.SingleConstraintAppPlacementAllocator 
> (SingleConstraintAppPlacementAllocator.java:validateAndSetSchedulingRequest(367))
>  - Successfully added SchedulingRequest to 
> app=appattempt_1547734161165_0010_01 targetAllocationTags=[sleeper]. 
> nodePartition= 
> {code} 
> However, I think it makes more sense to consider "*" or the 
> {{default-node-label-expression}} of the queue, if configured, when no 
> node_partition is specified in the placement constraint. Not specifying 
> any node_partition should ideally mean we don't enforce placement constraints 
> on any particular node_partition; however, we are enforcing the default 
> partition now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9521) RM failed to start due to system services

2019-06-20 Thread kyungwan nam (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868310#comment-16868310
 ] 

kyungwan nam commented on YARN-9521:


{code:java}
2019-06-18 18:47:38,634 INFO  nodelabels.CommonNodeLabelsManager 
(CommonNodeLabelsManager.java:internalUpdateLabelsOnNodes(664)) - REPLACE 
labels on nodes:
2019-06-18 18:47:38,634 INFO  nodelabels.CommonNodeLabelsManager 
(CommonNodeLabelsManager.java:internalUpdateLabelsOnNodes(666)) -   
NM=test.nm1.com:0, labels=[test]
2019-06-18 18:47:38,635 INFO  allocator.AbstractContainerAllocator 
(AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(129)) - 
assignedContainer application attempt=appattempt_1560841031202_0111_01 
container=null queue=dev clusterResource= 
type=OFF_SWITCH requestedPartition=
2019-06-18 18:47:38,635 INFO  allocator.AbstractContainerAllocator 
(AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(129)) - 
assignedContainer application attempt=appattempt_1560841031202_0111_01 
container=null queue=dev clusterResource= 
type=OFF_SWITCH requestedPartition=
2019-06-18 18:47:38,635 INFO  allocator.AbstractContainerAllocator 
(AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(129)) - 
assignedContainer application attempt=appattempt_1560841031202_0111_01 
container=null queue=dev clusterResource= 
type=OFF_SWITCH requestedPartition=
2019-06-18 18:47:38,635 INFO  allocator.AbstractContainerAllocator 
(AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(129)) - 
assignedContainer application attempt=appattempt_1560841031202_0111_01 
container=null queue=dev clusterResource= 
type=OFF_SWITCH requestedPartition=
2019-06-18 18:47:38,636 INFO  rmcontainer.RMContainerImpl 
(RMContainerImpl.java:handle(480)) - container_e48_1560841031202_0111_01_002020 
Container Transitioned from NEW to ALLOCATED
2019-06-18 18:47:38,636 ERROR nodelabels.CommonNodeLabelsManager 
(CommonNodeLabelsManager.java:handleStoreEvent(201)) - Failed to store label 
modification to storage
2019-06-18 18:47:38,637 INFO  fica.FiCaSchedulerNode 
(FiCaSchedulerNode.java:allocateContainer(169)) - Assigned container 
container_e48_1560841031202_0111_01_002020 of capacity  
on host test.nm3.com:8454, which has 3 containers,  
used and  available after allocation
2019-06-18 18:47:38,637 FATAL event.AsyncDispatcher 
(AsyncDispatcher.java:dispatch(203)) - Error in dispatcher thread
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: 
Filesystem closed
at 
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.handleStoreEvent(CommonNodeLabelsManager.java:202)
at 
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler.handle(CommonNodeLabelsManager.java:174)
at 
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler.handle(CommonNodeLabelsManager.java:169)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:473)
at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1412)
at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1383)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$5.doCall(DistributedFileSystem.java:427)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$5.doCall(DistributedFileSystem.java:423)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:435)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:404)
at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1379)
at 
org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.ensureAppendEditlogFile(FileSystemNodeLabelsStore.java:107)
at 
org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.updateNodeToLabelsMappings(FileSystemNodeLabelsStore.java:118)
at 
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.handleStoreEvent(CommonNodeLabelsManager.java:196)
... 5 more
2019-06-18 18:47:38,637 INFO  capacity.ParentQueue 
(ParentQueue.java:apply(1340)) - assignedContainer queue=root 
usedCapacity=0.08724866 absoluteUsedCapacity=0.08724866 used= cluster=
2019-06-18 18:47:38,637 INFO  capacity.CapacityScheduler 
(CapacityScheduler.java:tryCommit(2894)) - Allocation proposal accepted
2019-06-18 18:47:38,637 INFO  capacity.CapacityScheduler 
(CapacityScheduler.java:tryCommit(2900)) - Failed to accept allocation proposal
2019-06-18 18:47:38,637 INFO  capacity.CapacityScheduler 
{code}

[jira] [Updated] (YARN-9632) No allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2019-06-20 Thread zhangjian (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjian updated YARN-9632:

Priority: Major  (was: Critical)

> No allocatedMB, allocatedVCores, and runningContainers metrics on running 
> Applications in RM Web UI
> ---
>
> Key: YARN-9632
> URL: https://issues.apache.org/jira/browse/YARN-9632
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: RM, webapp, yarn
>Affects Versions: 2.7.7
>Reporter: zhangjian
>Priority: Major
> Fix For: 2.8.5
>
>
> The YARN REST API can report on the following properties:
> *allocatedMB*: The sum of memory in MB allocated to the application's running 
> containers
> *allocatedVCores*: The sum of virtual cores allocated to the application's 
> running containers
> *runningContainers*: The number of containers currently running for the 
> application
> But the RM Web UI does not report these items. I saw it works in 
> Hadoop 2.8.5.
> It would be useful for YARN Application and Resource troubleshooting to have 
> these properties and their corresponding values exposed on the RM WebUI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9632) No allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2019-06-20 Thread zhangjian (JIRA)
zhangjian created YARN-9632:
---

 Summary: No allocatedMB, allocatedVCores, and runningContainers 
metrics on running Applications in RM Web UI
 Key: YARN-9632
 URL: https://issues.apache.org/jira/browse/YARN-9632
 Project: Hadoop YARN
  Issue Type: Bug
  Components: RM, webapp, yarn
Affects Versions: 2.7.7
Reporter: zhangjian
 Fix For: 2.8.5


The YARN REST API can report on the following properties:

*allocatedMB*: The sum of memory in MB allocated to the application's running 
containers
*allocatedVCores*: The sum of virtual cores allocated to the application's 
running containers
*runningContainers*: The number of containers currently running for the 
application

But the RM Web UI does not report these items. I saw it works in 
Hadoop 2.8.5.

It would be useful for YARN Application and Resource troubleshooting to have 
these properties and their corresponding values exposed on the RM WebUI.
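
For what it's worth, the same three values can also be read programmatically; a 
minimal sketch via the YARN client API (assuming a 2.8+ client, where 
Resource#getMemorySize is available):
{code:java}
import java.util.EnumSet;

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RunningAppUsage {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new YarnConfiguration());
    client.start();
    try {
      // Same data that backs the REST fields allocatedMB / allocatedVCores /
      // runningContainers, one line per RUNNING application.
      for (ApplicationReport app
          : client.getApplications(EnumSet.of(YarnApplicationState.RUNNING))) {
        ApplicationResourceUsageReport usage = app.getApplicationResourceUsageReport();
        System.out.printf("%s allocatedMB=%d allocatedVCores=%d runningContainers=%d%n",
            app.getApplicationId(),
            usage.getUsedResources().getMemorySize(),
            usage.getUsedResources().getVirtualCores(),
            usage.getNumUsedContainers());
      }
    } finally {
      client.stop();
    }
  }
}
{code}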



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org