[jira] [Created] (YARN-7856) Validation and error handling when handling NM-RM with regard to node-attributes

2018-01-29 Thread Weiwei Yang (JIRA)
Weiwei Yang created YARN-7856:
-

 Summary: Validation and error handling when handling NM-RM with 
regard to node-attributes
 Key: YARN-7856
 URL: https://issues.apache.org/jira/browse/YARN-7856
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, RM
Reporter: Weiwei Yang
Assignee: Weiwei Yang


When an NM reports its distributed attributes to the RM, the RM needs to 
properly validate the received attributes. If attributes are invalid or fail 
to update, the RM needs to notify the NM about such failures.
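
To make the intent concrete, here is a minimal sketch of such RM-side 
validation; every class, method, and rule below is illustrative only, not the 
eventual patch:

{code:java}
// Illustrative sketch only: RM-side validation of NM-reported attributes,
// producing per-attribute errors the RM could carry back to the NM in the
// heartbeat response. All names and rules here are hypothetical.
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;

public final class NodeAttributeValidator {

  // Hypothetical naming rule: leading alphanumeric, then [A-Za-z0-9_-], <= 255 chars.
  private static final Pattern NAME =
      Pattern.compile("[A-Za-z0-9][A-Za-z0-9_-]{0,254}");

  /** Minimal stand-in for a node-attribute record. */
  public static final class Attribute {
    final String name;
    final String value;
    public Attribute(String name, String value) { this.name = name; this.value = value; }
  }

  /** Returns human-readable errors; an empty list means everything validated. */
  public static List<String> validate(Collection<Attribute> attributes) {
    List<String> errors = new ArrayList<>();
    for (Attribute a : attributes) {
      if (a.name == null || !NAME.matcher(a.name).matches()) {
        errors.add("invalid attribute name: " + a.name);
      } else if (a.value != null && a.value.length() > 255) {
        errors.add("attribute value too long for: " + a.name);
      }
    }
    return errors; // non-empty -> RM rejects the update and reports back via heartbeat
  }
}
{code}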






[jira] [Updated] (YARN-5148) [UI2] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

2018-01-29 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-5148:
-
Summary: [UI2] Add page to new YARN UI to view server side 
configurations/logs/JVM-metrics  (was: [YARN-3368] Add page to new YARN UI to 
view server side configurations/logs/JVM-metrics)

> [UI2] Add page to new YARN UI to view server side 
> configurations/logs/JVM-metrics
> -
>
> Key: YARN-5148
> URL: https://issues.apache.org/jira/browse/YARN-5148
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: webapp, yarn-ui-v2
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
>Priority: Major
>  Labels: oct16-medium
> Attachments: Screen Shot 2016-09-11 at 23.28.31.png, Screen Shot 
> 2016-09-13 at 22.27.00.png, Screen Shot 2018-01-29 at 9.38.53 PM.png, Screen 
> Shot 2018-01-29 at 9.39.07 PM.png, UsingStringifyPrint.png, 
> YARN-5148-YARN-3368.01.patch, YARN-5148-YARN-3368.02.patch, 
> YARN-5148-YARN-3368.03.patch, YARN-5148-YARN-3368.04.patch, 
> YARN-5148-YARN-3368.05.patch, YARN-5148-YARN-3368.06.patch, 
> YARN-5148.07.patch, YARN-5148.08.patch, YARN-5148.09.patch, 
> YARN-5148.10.patch, YARN-5148.11.patch, YARN-5148.12.patch, 
> YARN-5148.13.patch, YARN-5148.14.patch, YARN-5148.15.patch, 
> YARN-5148.16.patch, pretty-json-metrics.png, yarn-conf.png, yarn-tools.png
>
>







[jira] [Commented] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

2018-01-29 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344621#comment-16344621
 ] 

Wangda Tan commented on YARN-5148:
--

Thanks [~sunilg], committing now.

> [YARN-3368] Add page to new YARN UI to view server side 
> configurations/logs/JVM-metrics
> ---
>
> Key: YARN-5148
> URL: https://issues.apache.org/jira/browse/YARN-5148
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: webapp, yarn-ui-v2
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
>Priority: Major
>  Labels: oct16-medium
> Attachments: Screen Shot 2016-09-11 at 23.28.31.png, Screen Shot 
> 2016-09-13 at 22.27.00.png, Screen Shot 2018-01-29 at 9.38.53 PM.png, Screen 
> Shot 2018-01-29 at 9.39.07 PM.png, UsingStringifyPrint.png, 
> YARN-5148-YARN-3368.01.patch, YARN-5148-YARN-3368.02.patch, 
> YARN-5148-YARN-3368.03.patch, YARN-5148-YARN-3368.04.patch, 
> YARN-5148-YARN-3368.05.patch, YARN-5148-YARN-3368.06.patch, 
> YARN-5148.07.patch, YARN-5148.08.patch, YARN-5148.09.patch, 
> YARN-5148.10.patch, YARN-5148.11.patch, YARN-5148.12.patch, 
> YARN-5148.13.patch, YARN-5148.14.patch, YARN-5148.15.patch, 
> YARN-5148.16.patch, pretty-json-metrics.png, yarn-conf.png, yarn-tools.png
>
>







[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-29 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344615#comment-16344615
 ] 

genericqa commented on YARN-7840:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
50s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
19s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
38s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
16s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
11s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
YARN-3409 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 18s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
35s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
15s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7840 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908270/YARN-7840-YARN-3409.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux cb7d21e8856c 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Pe

[jira] [Assigned] (YARN-7854) Attach prefixes to different type of node attributes

2018-01-29 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned YARN-7854:
-

Assignee: LiangYe  (was: Weiwei Yang)

> Attach prefixes to different type of node attributes
> 
>
> Key: YARN-7854
> URL: https://issues.apache.org/jira/browse/YARN-7854
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: RM
>Reporter: Weiwei Yang
>Assignee: LiangYe
>Priority: Major
>
> There are multiple types of node attributes depending on the source they 
> come from, including:
>  # Centralized: attributes set by users (admin or normal users)
>  # Distributed: attributes collected by a certain attribute provider on each 
> NM
>  # System: built-in attributes in YARN, set by YARN internal components, 
> e.g. the scheduler
> To better manage these attributes, we introduce the prefix (namespace) 
> concept for an attribute. This JIRA is opened to figure out how to attach 
> prefixes (automatically/implicitly or explicitly) to the different types of 
> attributes.
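
To illustrate the idea (the prefix strings below are invented for this 
sketch, not something this JIRA has settled on), attaching a namespace per 
attribute type could look like:

{code:java}
// Illustrative sketch only: derive a namespace prefix from the attribute's
// source and build a fully qualified name. The prefix strings are invented
// for this example; deciding the real scheme is the point of the JIRA.
public final class AttributePrefixes {

  enum AttributeType { CENTRALIZED, DISTRIBUTED, SYSTEM }

  static String prefixOf(AttributeType type) {
    switch (type) {
      case CENTRALIZED: return "rm.yarn.io";     // set by users through the RM
      case DISTRIBUTED: return "nm.yarn.io";     // collected by NM-side providers
      case SYSTEM:      return "system.yarn.io"; // set by YARN internals, e.g. scheduler
      default: throw new IllegalArgumentException("unknown type: " + type);
    }
  }

  /** e.g. qualify(AttributeType.DISTRIBUTED, "hostname") -> "nm.yarn.io/hostname" */
  static String qualify(AttributeType type, String name) {
    return prefixOf(type) + "/" + name;
  }
}
{code}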






[jira] [Commented] (YARN-2185) Use pipes when localizing archives

2018-01-29 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344605#comment-16344605
 ] 

Rohith Sharma K S commented on YARN-2185:
-

This is breaking in a secured cluster with a permission-denied exception while 
localizing archives. [~miklos.szeg...@cloudera.com] could you help check this, 
as it is blocking in secure clusters?
{noformat}
18/01/29 09:27:19 INFO mapreduce.Job:  map 0% reduce 0%
18/01/29 09:27:19 INFO mapreduce.Job: Job job_1517214437819_0004 failed with 
state FAILED due to: Application application_1517214437819_0004 failed 20 times 
due to AM Container for appattempt_1517214437819_0004_20 exited with  
exitCode: 1
Failing this attempt.Diagnostics: [2018-01-29 09:27:17.501]Exception from 
container-launch.
Container id: container_e20_1517214437819_0004_20_01
Exit code: 1
Shell output: main : command provided 1
main : run as user is hrt_qa
main : requested yarn user is hrt_qa
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file 
/grid/0/hadoop/yarn/local/nmPrivate/application_1517214437819_0004/container_e20_1517214437819_0004_20_01/container_e20_1517214437819_0004_20_01.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...


[2018-01-29 09:27:17.503]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
find: ‘./mr-framework’: Permission denied

[2018-01-29 09:27:17.503]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
find: ‘./mr-framework’: Permission denied

For more detailed output, check the application tracking page: 
http://ctr-e137-1514896590304-41842-01-06.hwx.site:8088/cluster/app/application_1517214437819_0004
 Then click on links to logs of each attempt.
. Failing the application.
18/01/29 09:27:19 INFO mapreduce.Job: Counters: 0
Job job_1517214437819_0004 failed!
{noformat}

> Use pipes when localizing archives
> --
>
> Key: YARN-2185
> URL: https://issues.apache.org/jira/browse/YARN-2185
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Jason Lowe
>Assignee: Miklos Szegedi
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-2185.000.patch, YARN-2185.001.patch, 
> YARN-2185.002.patch, YARN-2185.003.patch, YARN-2185.004.patch, 
> YARN-2185.005.patch, YARN-2185.006.patch, YARN-2185.007.patch, 
> YARN-2185.008.patch, YARN-2185.009.patch, YARN-2185.010.patch, 
> YARN-2185.011.patch, YARN-2185.012.patch, YARN-2185.012.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it, 
> and then removes it.  It would be more efficient to stream the data as it's 
> being unpacked to avoid both the extra disk space requirements and the 
> additional disk activity from storing the archive.
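
The gist of the improvement, as a minimal JDK-only sketch (the actual patch 
works through the NM localizer code paths; the helper below is illustrative 
only): unpack straight from the download stream so the archive is never 
staged on local disk.

{code:java}
// Minimal JDK-only sketch of the idea: unpack a zip directly from the
// download stream, so the archive never needs to be written to local disk.
// The real patch integrates with the NM localizer; this is illustrative.
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public final class StreamingUnpack {
  public static void unpack(InputStream download, Path destDir) throws IOException {
    try (ZipInputStream zin = new ZipInputStream(download)) {
      for (ZipEntry e; (e = zin.getNextEntry()) != null; ) {
        Path out = destDir.resolve(e.getName()).normalize();
        if (!out.startsWith(destDir)) {          // guard against zip-slip paths
          throw new IOException("entry escapes destination: " + e.getName());
        }
        if (e.isDirectory()) {
          Files.createDirectories(out);
        } else {
          Files.createDirectories(out.getParent());
          Files.copy(zin, out);                  // streams entry bytes straight to disk
        }
      }
    }
  }
}
{code}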






[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-29 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344598#comment-16344598
 ] 

Weiwei Yang commented on YARN-7840:
---

[~bibinchundatt] could you please take a look at the v2 patch? I can't recall 
that we ever discussed replacing the type as your second comment suggests (I 
might have missed something?); please share your idea. If #2 is not critical, 
I suggest getting this in first and discussing the improvements in a follow-up 
JIRA, as this is a blocker for the rest of the patches. What do you think?

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch
>
>
> We need to support prefixes (namespaces) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.






[jira] [Commented] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers

2018-01-29 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344594#comment-16344594
 ] 

Weiwei Yang commented on YARN-7757:
---

A few notes for reviewers:
 # This patch refactors the existing code as described in 
[^nodeLabelsProvider_refactor_class_hierarchy.pdf]
 # This patch adds a {{ScriptBasedNodeAttributesProvider}}, which collects 
node attributes from a configured script (see the sketch below). It reuses 
much of the code from the general classes.
 # Added {{TestScriptBasedNodeAttributesProvider}} to test the attribute 
script provider in different cases.
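
A rough sketch of the core of a script-based provider, under the assumption 
that the script prints one name=value pair per line (the real provider adds 
interval scheduling, timeouts, and error handling; names here are not the 
patch's contract):

{code:java}
// Rough sketch of the script-provider idea: run a configured script and
// parse one "name=value" attribute per output line. Illustrative only.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public final class ScriptAttributeCollector {
  public static Map<String, String> collect(String scriptPath)
      throws IOException, InterruptedException {
    Process p = new ProcessBuilder(scriptPath).redirectErrorStream(true).start();
    Map<String, String> attrs = new HashMap<>();
    try (BufferedReader r = new BufferedReader(
        new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
      for (String line; (line = r.readLine()) != null; ) {
        int eq = line.indexOf('=');
        if (eq > 0) {                       // ignore lines that are not name=value
          attrs.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
      }
    }
    if (p.waitFor() != 0) {
      throw new IOException("attribute script exited with " + p.exitValue());
    }
    return attrs;
  }
}
{code}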

> Refactor NodeLabelsProvider to be more generic and reusable for node 
> attributes providers
> -
>
> Key: YARN-7757
> URL: https://issues.apache.org/jira/browse/YARN-7757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7757-YARN-3409.001.patch, 
> YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch, 
> YARN-7757-YARN-3409.004.patch, nodeLabelsProvider_refactor_class_hierarchy.pdf
>
>
> Propose to refactor {{NodeLabelsProvider}} and {{AbstractNodeLabelsProvider}} 
> to be more generic, so node attribute providers can reuse these 
> interfaces/abstract classes.






[jira] [Commented] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers

2018-01-29 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344589#comment-16344589
 ] 

Weiwei Yang commented on YARN-7757:
---

The UT failure is related; fixed it, along with the checkstyle issues, in the 
v4 patch.

> Refactor NodeLabelsProvider to be more generic and reusable for node 
> attributes providers
> -
>
> Key: YARN-7757
> URL: https://issues.apache.org/jira/browse/YARN-7757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7757-YARN-3409.001.patch, 
> YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch, 
> YARN-7757-YARN-3409.004.patch, nodeLabelsProvider_refactor_class_hierarchy.pdf
>
>
> Propose to refactor {{NodeLabelsProvider}} and {{AbstractNodeLabelsProvider}} 
> to be more generic, so node attribute providers can reuse these 
> interfaces/abstract classes.






[jira] [Updated] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers

2018-01-29 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7757:
--
Attachment: YARN-7757-YARN-3409.004.patch

> Refactor NodeLabelsProvider to be more generic and reusable for node 
> attributes providers
> -
>
> Key: YARN-7757
> URL: https://issues.apache.org/jira/browse/YARN-7757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7757-YARN-3409.001.patch, 
> YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch, 
> YARN-7757-YARN-3409.004.patch, nodeLabelsProvider_refactor_class_hierarchy.pdf
>
>
> Propose to refactor {{NodeLabelsProvider}} and {{AbstractNodeLabelsProvider}} 
> to be more generic, so node attribute providers can reuse these 
> interfaces/abstract classes.






[jira] [Commented] (YARN-7844) Expose metrics for scheduler operation (allocate, schedulerEvent) to JMX

2018-01-29 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344581#comment-16344581
 ] 

Wei Yan commented on YARN-7844:
---

After enabling the metrics, JMX will expose one more item like:
{code:java}
{
"name" : "Hadoop:service=ResourceManager,name=SchedulerOpMetrics",
"modelerType" : "SchedulerOpMetrics",
"tag.Hostname" : "hostname",
"AllocateCallNumOps" : 0,
"AllocateCallAvgTime" : 0.0,
"AllocateCallStdevTime" : 0.0,
"AllocateCallIMinTime" : 3.4028234663852886E38,
"AllocateCallIMaxTime" : 1.401298464324817E-45,
"AllocateCallMinTime" : 3.4028234663852886E38,
"AllocateCallMaxTime" : 1.401298464324817E-45,
"AllocateCallINumOps" : 0,
"NodeAddedCallNumOps" : 1,
"NodeAddedCallAvgTime" : 7.0,
"NodeRemovedCallNumOps" : 0,
"NodeRemovedCallAvgTime" : 0.0,
"NodeUpdateCallNumOps" : 19,
"NodeUpdateCallAvgTime" : 0.1,
"NodeResourceUpdateCallNumOps" : 0,
"NodeResourceUpdateCallAvgTime" : 0.0,
"NodeLabelsUpdateCallNumOps" : 1,
"NodeLabelsUpdateCallAvgTime" : 2.0,
"AppAddedCallNumOps" : 0,
"AppAddedCallAvgTime" : 0.0,
"AppRemovedCallNumOps" : 0,
"AppRemovedCallAvgTime" : 0.0,
"AppAttemptAddedCallNumOps" : 0,
"AppAttemptAddedCallAvgTime" : 0.0,
"AppAttemptRemovedCallNumOps" : 0,
"AppAttemptRemovedCallAvgTime" : 0.0,
"ContainerExpiredCallNumOps" : 0,
"ContainerExpiredCallAvgTime" : 0.0,
"ReleaseContainerCallNumOps" : 0,
"ReleaseContainerCallAvgTime" : 0.0,
"KillReservedContainerCallNumOps" : 0,
"KillReservedContainerCallAvgTime" : 0.0,
"MarkContainerForPreemptionCallNumOps" : 0,
"MarkContainerForPreemptionCallAvgTime" : 0.0,
"MarkContainerForKillableCallNumOps" : 0,
"MarkContainerForKillableCallAvgTime" : 0.0,
"MarkContainerForNonKillableCallNumOps" : 0,
"MarkContainerForNonKillableCallAvgTime" : 0.0,
"ManageQueueCallNumOps" : 0,
"ManageQueueCallAvgTime" : 0.0
  }, {{code}
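
The NumOps/AvgTime pairs above are the shape Hadoop's metrics2 {{MutableRate}} 
emits, so a source along the following lines would produce such output. This 
is a minimal sketch of the registration pattern, not the attached patch:

{code:java}
// Minimal metrics2 sketch: a source whose MutableRate fields surface in JMX
// as <Name>NumOps / <Name>AvgTime, matching the output above. Illustrative,
// not the attached patch.
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableRate;

@Metrics(about = "Scheduler operation metrics", context = "yarn")
public class SchedulerOpMetrics {
  @Metric("Time spent in allocate calls") MutableRate allocateCall;
  @Metric("Time spent handling NODE_UPDATE events") MutableRate nodeUpdateCall;

  public static SchedulerOpMetrics create() {
    // Registration makes the source visible under the ResourceManager JMX bean.
    return DefaultMetricsSystem.instance()
        .register("SchedulerOpMetrics", "Scheduler operation metrics",
            new SchedulerOpMetrics());
  }

  // Call sites time the operation and record the elapsed milliseconds.
  public void addAllocateDuration(long millis) { allocateCall.add(millis); }
  public void addNodeUpdateDuration(long millis) { nodeUpdateCall.add(millis); }
}
{code}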

> Expose metrics for scheduler operation (allocate, schedulerEvent) to JMX
> 
>
> Key: YARN-7844
> URL: https://issues.apache.org/jira/browse/YARN-7844
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-7844.000.patch
>
>
> Currently FairScheduler's FSOpDurations records some scheduler operation 
> metrics: nodeUpdateCall, preemptCall, etc. We may need something similar for 
> CapacityScheduler, plus additional metrics there. This could help monitor 
> RM scheduler performance and give more insight into whether the scheduler 
> is under pressure.






[jira] [Updated] (YARN-7844) Expose metrics for scheduler operation (allocate, schedulerEvent) to JMX

2018-01-29 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-7844:
--
Attachment: YARN-7844.000.patch

> Expose metrics for scheduler operation (allocate, schedulerEvent) to JMX
> 
>
> Key: YARN-7844
> URL: https://issues.apache.org/jira/browse/YARN-7844
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-7844.000.patch
>
>
> Currently FairScheduler's FSOpDurations records some scheduler operation 
> metrics: nodeUpdateCall, preemptCall, etc. We may need something similar for 
> CapacityScheduler, plus additional metrics there. This could help monitor 
> RM scheduler performance and give more insight into whether the scheduler 
> is under pressure.






[jira] [Commented] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers

2018-01-29 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344575#comment-16344575
 ] 

genericqa commented on YARN-7757:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
55s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
51s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
31s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
YARN-3409 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  4s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 299 unchanged - 21 fixed = 301 total (was 320) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 41s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 47s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
13s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m  
1s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}110m 32s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7757 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908261/YARN-7757-YARN-3409.003.patch
 |
| Optional Tests |  asflicense  c

[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-29 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344570#comment-16344570
 ] 

Weiwei Yang commented on YARN-7840:
---

Hi [~Naganarasimha]

The v2 patch addresses my comment; I am +1 on the patch once Jenkins is happy.

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch
>
>
> We need to support prefixes (namespaces) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.






[jira] [Commented] (YARN-7840) Update PB for prefix support of node attributes

2018-01-29 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344564#comment-16344564
 ] 

Naganarasimha G R commented on YARN-7840:
-

Thanks for the quick review, [~cheersyang] & [~bibinchundatt]. I hope my latest 
patch takes care of the points you mentioned, and I hope to see further review 
comments at the earliest so that I can address them immediately.

TestPBImplRecords should suffice for the test case, and the findbugs issue is 
not related to my fix.

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch
>
>
> We need to support prefixes (namespaces) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.






[jira] [Assigned] (YARN-7854) Attach prefixes to different type of node attributes

2018-01-29 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned YARN-7854:
-

Assignee: Weiwei Yang

> Attach prefixes to different type of node attributes
> 
>
> Key: YARN-7854
> URL: https://issues.apache.org/jira/browse/YARN-7854
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: RM
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>
> There are multiple types of node attributes depending on the source they 
> come from, including:
>  # Centralized: attributes set by users (admin or normal users)
>  # Distributed: attributes collected by a certain attribute provider on each 
> NM
>  # System: built-in attributes in YARN, set by YARN internal components, 
> e.g. the scheduler
> To better manage these attributes, we introduce the prefix (namespace) 
> concept for an attribute. This JIRA is opened to figure out how to attach 
> prefixes (automatically/implicitly or explicitly) to the different types of 
> attributes.






[jira] [Assigned] (YARN-7854) Attach prefixes to different type of node attributes

2018-01-29 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned YARN-7854:
-

Assignee: (was: Weiwei Yang)

> Attach prefixes to different type of node attributes
> 
>
> Key: YARN-7854
> URL: https://issues.apache.org/jira/browse/YARN-7854
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: RM
>Reporter: Weiwei Yang
>Priority: Major
>
> There are multiple types of node attributes depending on the source they 
> come from, including:
>  # Centralized: attributes set by users (admin or normal users)
>  # Distributed: attributes collected by a certain attribute provider on each 
> NM
>  # System: built-in attributes in YARN, set by YARN internal components, 
> e.g. the scheduler
> To better manage these attributes, we introduce the prefix (namespace) 
> concept for an attribute. This JIRA is opened to figure out how to attach 
> prefixes (automatically/implicitly or explicitly) to the different types of 
> attributes.






[jira] [Updated] (YARN-7840) Update PB for prefix support of node attributes

2018-01-29 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-7840:

Attachment: YARN-7840-YARN-3409.002.patch

> Update PB for prefix support of node attributes
> ---
>
> Key: YARN-7840
> URL: https://issues.apache.org/jira/browse/YARN-7840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-7840-YARN-3409.001.patch, 
> YARN-7840-YARN-3409.002.patch
>
>
> We need to support prefixes (namespaces) for node attributes; this will add 
> the flexibility to do proper ACLs, avoid naming conflicts, etc.






[jira] [Created] (YARN-7855) RMAppAttemptImpl throws Invalid event: CONTAINER_ALLOCATED at ALLOCATED_SAVING Exception

2018-01-29 Thread Zhizhen Hou (JIRA)
Zhizhen Hou created YARN-7855:
-

 Summary: RMAppAttemptImpl throws Invalid event: 
CONTAINER_ALLOCATED at ALLOCATED_SAVING Exception
 Key: YARN-7855
 URL: https://issues.apache.org/jira/browse/YARN-7855
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.5
Reporter: Zhizhen Hou


After upgrading Hadoop from 2.6 to 2.7.5, the ResourceManager occasionally 
reports the following error:
{code:java}
2018-01-30 14:12:41,349 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
CONTAINER_ALLOCATED at ALLOCATED_SAVING
    at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:808)
    at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
    at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:803)
    at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:784)
    at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
    at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
    at java.lang.Thread.run(Thread.java:745)
2018-01-30 14:12:41,351 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
CONTAINER_ALLOCATED at ALLOCATED_SAVING
    at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:808)
    at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
    at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:803)
    at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:784)
    at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
    at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
    at java.lang.Thread.run(Thread.java:745){code}
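
For context, YARN state machines throw this exception whenever an event 
arrives in a state that has no registered transition arc for it. Below is a 
self-contained sketch that reproduces the failure mode with illustrative 
enums; these are stand-ins, not the actual RMAppAttemptImpl internals:

{code:java}
// Self-contained sketch of the failure mode using YARN's state machine
// utility: an event with no registered arc for the current state makes
// doTransition() throw, which is exactly "CONTAINER_ALLOCATED at
// ALLOCATED_SAVING" above. Enum names are illustrative stand-ins.
import org.apache.hadoop.yarn.state.InvalidStateTransitonException;
import org.apache.hadoop.yarn.state.StateMachine;
import org.apache.hadoop.yarn.state.StateMachineFactory;

public class AttemptStateDemo {
  enum State { SCHEDULED, ALLOCATED_SAVING, ALLOCATED }
  enum Event { CONTAINER_ALLOCATED, ATTEMPT_NEW_SAVED }

  static final StateMachineFactory<AttemptStateDemo, State, Event, Void> FACTORY =
      new StateMachineFactory<AttemptStateDemo, State, Event, Void>(State.SCHEDULED)
          .addTransition(State.SCHEDULED, State.ALLOCATED_SAVING,
              Event.CONTAINER_ALLOCATED)
          .addTransition(State.ALLOCATED_SAVING, State.ALLOCATED,
              Event.ATTEMPT_NEW_SAVED)
          // A tolerance arc like the following (stay put, ignore the duplicate
          // event) would avoid the exception; without it the second event fails:
          // .addTransition(State.ALLOCATED_SAVING, State.ALLOCATED_SAVING,
          //     Event.CONTAINER_ALLOCATED)
          .installTopology();

  public static void main(String[] args) {
    StateMachine<State, Event, Void> sm = FACTORY.make(new AttemptStateDemo());
    sm.doTransition(Event.CONTAINER_ALLOCATED, null); // SCHEDULED -> ALLOCATED_SAVING
    try {
      sm.doTransition(Event.CONTAINER_ALLOCATED, null); // no arc registered here
    } catch (InvalidStateTransitonException e) {
      System.out.println("rejected: " + e.getMessage()); // same error as the RM log
    }
  }
}
{code}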






[jira] [Assigned] (YARN-7854) Attach prefixes to different type of node attributes

2018-01-29 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned YARN-7854:
-

Assignee: Weiwei Yang

> Attach prefixes to different type of node attributes
> 
>
> Key: YARN-7854
> URL: https://issues.apache.org/jira/browse/YARN-7854
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: RM
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>
> There are multiple types of node attributes depending on the source they 
> come from, including:
>  # Centralized: attributes set by users (admin or normal users)
>  # Distributed: attributes collected by a certain attribute provider on each 
> NM
>  # System: built-in attributes in YARN, set by YARN internal components, 
> e.g. the scheduler
> To better manage these attributes, we introduce the prefix (namespace) 
> concept for an attribute. This JIRA is opened to figure out how to attach 
> prefixes (automatically/implicitly or explicitly) to the different types of 
> attributes.






[jira] [Created] (YARN-7854) Attach prefixes to different type of node attributes

2018-01-29 Thread Weiwei Yang (JIRA)
Weiwei Yang created YARN-7854:
-

 Summary: Attach prefixes to different type of node attributes
 Key: YARN-7854
 URL: https://issues.apache.org/jira/browse/YARN-7854
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: RM
Reporter: Weiwei Yang


There are multiple types of node attributes depending on the source they come 
from, including:
 # Centralized: attributes set by users (admin or normal users)
 # Distributed: attributes collected by a certain attribute provider on each NM
 # System: built-in attributes in YARN, set by YARN internal components, 
e.g. the scheduler

To better manage these attributes, we introduce the prefix (namespace) concept 
for an attribute. This JIRA is opened to figure out how to attach prefixes 
(automatically/implicitly or explicitly) to the different types of attributes.






[jira] [Commented] (YARN-5592) Add support for dynamic resource updates with multiple resource types

2018-01-29 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344532#comment-16344532
 ] 

Varun Vasudev commented on YARN-5592:
-

[~maniraj...@gmail.com] - this ticket was created to extend the functionality 
added by YARN-291 to all resource types.

> Add support for dynamic resource updates with multiple resource types
> -
>
> Key: YARN-5592
> URL: https://issues.apache.org/jira/browse/YARN-5592
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Manikandan R
>Priority: Major
>







[jira] [Assigned] (YARN-5592) Add support for dynamic resource updates with multiple resource types

2018-01-29 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev reassigned YARN-5592:
---

Assignee: Manikandan R  (was: Varun Vasudev)

> Add support for dynamic resource updates with multiple resource types
> -
>
> Key: YARN-5592
> URL: https://issues.apache.org/jira/browse/YARN-5592
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Manikandan R
>Priority: Major
>







[jira] [Assigned] (YARN-5590) Add support for increase and decrease of container resources with resource profiles

2018-01-29 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev reassigned YARN-5590:
---

Assignee: Manikandan R  (was: Varun Vasudev)

> Add support for increase and decrease of container resources with resource 
> profiles
> ---
>
> Key: YARN-5590
> URL: https://issues.apache.org/jira/browse/YARN-5590
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Manikandan R
>Priority: Major
>







[jira] [Commented] (YARN-5590) Add support for increase and decrease of container resources with resource profiles

2018-01-29 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344526#comment-16344526
 ] 

Varun Vasudev commented on YARN-5590:
-

[~maniraj...@gmail.com] - please feel free to take it up. This is to add 
support for increasing and decreasing container resources, extending the 
functionality added by YARN-1197 to all resource types.

> Add support for increase and decrease of container resources with resource 
> profiles
> ---
>
> Key: YARN-5590
> URL: https://issues.apache.org/jira/browse/YARN-5590
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>Priority: Major
>







[jira] [Comment Edited] (YARN-7842) PB changes to carry node-attributes in NM heartbeat

2018-01-29 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344523#comment-16344523
 ] 

Bibin A Chundatt edited comment on YARN-7842 at 1/30/18 5:33 AM:
-

Thank you, [~cheersyang], for the quick patch.

Overall the patch looks good to me; +1 from my side.

Will wait for reviews from others too.


was (Author: bibinchundatt):
Thank you [~cheersyang] for quick patch

 

Overall patch looks good to me. +1 from my side

 

> PB changes to carry node-attributes in NM heartbeat
> ---
>
> Key: YARN-7842
> URL: https://issues.apache.org/jira/browse/YARN-7842
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7842-YARN-3409.001.patch, 
> YARN-7842-YARN-3409.002.patch
>
>
> PB changes to carry node-attributes in NM heartbeat. Split from a larger 
> patch for easier review.






[jira] [Commented] (YARN-7842) PB changes to carry node-attributes in NM heartbeat

2018-01-29 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344523#comment-16344523
 ] 

Bibin A Chundatt commented on YARN-7842:


Thank you, [~cheersyang], for the quick patch.

Overall the patch looks good to me; +1 from my side.

> PB changes to carry node-attributes in NM heartbeat
> ---
>
> Key: YARN-7842
> URL: https://issues.apache.org/jira/browse/YARN-7842
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7842-YARN-3409.001.patch, 
> YARN-7842-YARN-3409.002.patch
>
>
> PB changes to carry node-attributes in NM heartbeat. Split from a larger 
> patch for easier review.






[jira] [Commented] (YARN-7780) Documentation for Placement Constraints

2018-01-29 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344516#comment-16344516
 ] 

genericqa commented on YARN-7780:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} YARN-6592 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  4m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
31s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
37s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
12s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
YARN-6592 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} YARN-6592 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
42s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 17s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7780 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908258/YARN-7780-YARN-6592.003.patc

[jira] [Commented] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

2018-01-29 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344513#comment-16344513
 ] 

genericqa commented on YARN-5148:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
30m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
37s{color} | {color:green} hadoop-yarn-ui in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-5148 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908259/YARN-5148.16.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  |
| uname | Linux 65db93547e4e 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / dbb9dde |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19516/testReport/ |
| Max. process+thread count | 441 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19516/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> [YARN-3368] Add page to new YARN UI to view server side 
> configurations/logs/JVM-metrics
> ---
>
> Key: YARN-5148
> URL: https://issues.apache.org/jira/browse/YARN-5148
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: webapp, yarn-ui-v2
>Reporter: Wangda Tan
>

[jira] [Comment Edited] (YARN-7842) PB changes to carry node-attributes in NM heartbeat

2018-01-29 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344431#comment-16344431
 ] 

Weiwei Yang edited comment on YARN-7842 at 1/30/18 4:51 AM:


Hi [~Naganarasimha], [~sunilg], [~bibinchundatt]

This patch only includes the PB changes to the HB request, in order to unblock 
other tasks. Could you help to review? Thanks!


was (Author: cheersyang):
Hi [~Naganarasimha], [~sunilg], [~bibinchundatt]

Could you help to review this patch, thanks?

> PB changes to carry node-attributes in NM heartbeat
> ---
>
> Key: YARN-7842
> URL: https://issues.apache.org/jira/browse/YARN-7842
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7842-YARN-3409.001.patch, 
> YARN-7842-YARN-3409.002.patch
>
>
> PB changes to carry node-attributes in NM heartbeat. Split from a larger 
> patch for easier review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers

2018-01-29 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7757:
--
Attachment: YARN-7757-YARN-3409.003.patch

> Refactor NodeLabelsProvider to be more generic and reusable for node 
> attributes providers
> -
>
> Key: YARN-7757
> URL: https://issues.apache.org/jira/browse/YARN-7757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7757-YARN-3409.001.patch, 
> YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch, 
> nodeLabelsProvider_refactor_class_hierarchy.pdf
>
>
> Propose to do refactor on {{NodeLabelsProvider}}, 
> {{AbstractNodeLabelsProvider}} to be more generic, so node attributes 
> providers can reuse these interface/abstract classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

2018-01-29 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5148:
--
Attachment: YARN-5148.16.patch

> [YARN-3368] Add page to new YARN UI to view server side 
> configurations/logs/JVM-metrics
> ---
>
> Key: YARN-5148
> URL: https://issues.apache.org/jira/browse/YARN-5148
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: webapp, yarn-ui-v2
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
>Priority: Major
>  Labels: oct16-medium
> Attachments: Screen Shot 2016-09-11 at 23.28.31.png, Screen Shot 
> 2016-09-13 at 22.27.00.png, Screen Shot 2018-01-29 at 9.38.53 PM.png, Screen 
> Shot 2018-01-29 at 9.39.07 PM.png, UsingStringifyPrint.png, 
> YARN-5148-YARN-3368.01.patch, YARN-5148-YARN-3368.02.patch, 
> YARN-5148-YARN-3368.03.patch, YARN-5148-YARN-3368.04.patch, 
> YARN-5148-YARN-3368.05.patch, YARN-5148-YARN-3368.06.patch, 
> YARN-5148.07.patch, YARN-5148.08.patch, YARN-5148.09.patch, 
> YARN-5148.10.patch, YARN-5148.11.patch, YARN-5148.12.patch, 
> YARN-5148.13.patch, YARN-5148.14.patch, YARN-5148.15.patch, 
> YARN-5148.16.patch, pretty-json-metrics.png, yarn-conf.png, yarn-tools.png
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

2018-01-29 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344480#comment-16344480
 ] 

Sunil G commented on YARN-5148:
---

Thanks [~leftnoteasy]. Fixed the name of the logs page to "YARN Daemon Logs".

> [YARN-3368] Add page to new YARN UI to view server side 
> configurations/logs/JVM-metrics
> ---
>
> Key: YARN-5148
> URL: https://issues.apache.org/jira/browse/YARN-5148
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: webapp, yarn-ui-v2
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
>Priority: Major
>  Labels: oct16-medium
> Attachments: Screen Shot 2016-09-11 at 23.28.31.png, Screen Shot 
> 2016-09-13 at 22.27.00.png, Screen Shot 2018-01-29 at 9.38.53 PM.png, Screen 
> Shot 2018-01-29 at 9.39.07 PM.png, UsingStringifyPrint.png, 
> YARN-5148-YARN-3368.01.patch, YARN-5148-YARN-3368.02.patch, 
> YARN-5148-YARN-3368.03.patch, YARN-5148-YARN-3368.04.patch, 
> YARN-5148-YARN-3368.05.patch, YARN-5148-YARN-3368.06.patch, 
> YARN-5148.07.patch, YARN-5148.08.patch, YARN-5148.09.patch, 
> YARN-5148.10.patch, YARN-5148.11.patch, YARN-5148.12.patch, 
> YARN-5148.13.patch, YARN-5148.14.patch, YARN-5148.15.patch, 
> YARN-5148.16.patch, pretty-json-metrics.png, yarn-conf.png, yarn-tools.png
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7780) Documentation for Placement Constraints

2018-01-29 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344470#comment-16344470
 ] 

Konstantinos Karanasos commented on YARN-7780:
--

Attaching new version addressing comments.

> Documentation for Placement Constraints
> ---
>
> Key: YARN-7780
> URL: https://issues.apache.org/jira/browse/YARN-7780
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Konstantinos Karanasos
>Priority: Major
> Attachments: YARN-7780-YARN-6592.001.patch, 
> YARN-7780-YARN-6592.002.patch, YARN-7780-YARN-6592.003.patch
>
>
> JIRA to track documentation for the feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7780) Documentation for Placement Constraints

2018-01-29 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-7780:
-
Attachment: YARN-7780-YARN-6592.003.patch

> Documentation for Placement Constraints
> ---
>
> Key: YARN-7780
> URL: https://issues.apache.org/jira/browse/YARN-7780
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Konstantinos Karanasos
>Priority: Major
> Attachments: YARN-7780-YARN-6592.001.patch, 
> YARN-7780-YARN-6592.002.patch, YARN-7780-YARN-6592.003.patch
>
>
> JIRA to track documentation for the feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7842) PB changes to carry node-attributes in NM heartbeat

2018-01-29 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344431#comment-16344431
 ] 

Weiwei Yang commented on YARN-7842:
---

Hi [~Naganarasimha], [~sunilg], [~bibinchundatt]

Could you help to review this patch, thanks?

> PB changes to carry node-attributes in NM heartbeat
> ---
>
> Key: YARN-7842
> URL: https://issues.apache.org/jira/browse/YARN-7842
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7842-YARN-3409.001.patch, 
> YARN-7842-YARN-3409.002.patch
>
>
> PB changes to carry node-attributes in NM heartbeat. Split from a larger 
> patch for easier review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7842) PB changes to carry node-attributes in NM heartbeat

2018-01-29 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344427#comment-16344427
 ] 

genericqa commented on YARN-7842:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
48s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
6s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7842 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908247/YARN-7842-YARN-3409.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 98d5c76eb41f 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | YARN-3409 / 3472700 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19514/testReport/ |
| Max. process+thread count | 474 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19514/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Commented] (YARN-7853) SLS failed to startup due to java.lang.NoClassDefFoundError

2018-01-29 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344395#comment-16344395
 ] 

Yufei Gu commented on YARN-7853:


Thanks [~rohithsharma]. Closing it as a duplicate.

> SLS failed to startup due to java.lang.NoClassDefFoundError
> ---
>
> Key: YARN-7853
> URL: https://issues.apache.org/jira/browse/YARN-7853
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
>Priority: Major
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorManager
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>   at java.lang.Class.getDeclaredMethods(Class.java:1975)
>   at 
> com.google.inject.spi.InjectionPoint.getInjectionPoints(InjectionPoint.java:688)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:380)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:399)
>   at 
> com.google.inject.internal.BindingBuilder.toInstance(BindingBuilder.java:84)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebApp.setup(RMWebApp.java:60)
>   at 
> org.apache.hadoop.yarn.webapp.WebApp.configureServlets(WebApp.java:160)
>   at 
> com.google.inject.servlet.ServletModule.configure(ServletModule.java:55)
>   at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
>   at 
> com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
>   at com.google.inject.spi.Elements.getElements(Elements.java:110)
>   at 
> com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
>   at 
> com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
>   at com.google.inject.Guice.createInjector(Guice.java:96)
>   at com.google.inject.Guice.createInjector(Guice.java:73)
>   at com.google.inject.Guice.createInjector(Guice.java:62)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:379)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1126)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1236)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:284)
>   at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:231)
>   at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:943)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:950)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 40 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7853) SLS failed to startup due to java.lang.NoClassDefFoundError

2018-01-29 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu resolved YARN-7853.

Resolution: Duplicate

> SLS failed to startup due to java.lang.NoClassDefFoundError
> ---
>
> Key: YARN-7853
> URL: https://issues.apache.org/jira/browse/YARN-7853
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
>Priority: Major
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorManager
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>   at java.lang.Class.getDeclaredMethods(Class.java:1975)
>   at 
> com.google.inject.spi.InjectionPoint.getInjectionPoints(InjectionPoint.java:688)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:380)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:399)
>   at 
> com.google.inject.internal.BindingBuilder.toInstance(BindingBuilder.java:84)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebApp.setup(RMWebApp.java:60)
>   at 
> org.apache.hadoop.yarn.webapp.WebApp.configureServlets(WebApp.java:160)
>   at 
> com.google.inject.servlet.ServletModule.configure(ServletModule.java:55)
>   at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
>   at 
> com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
>   at com.google.inject.spi.Elements.getElements(Elements.java:110)
>   at 
> com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
>   at 
> com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
>   at com.google.inject.Guice.createInjector(Guice.java:96)
>   at com.google.inject.Guice.createInjector(Guice.java:73)
>   at com.google.inject.Guice.createInjector(Guice.java:62)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:379)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1126)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1236)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:284)
>   at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:231)
>   at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:943)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:950)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 40 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7842) PB changes to carry node-attributes in NM heartbeat

2018-01-29 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7842:
--
Attachment: YARN-7842-YARN-3409.002.patch

> PB changes to carry node-attributes in NM heartbeat
> ---
>
> Key: YARN-7842
> URL: https://issues.apache.org/jira/browse/YARN-7842
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7842-YARN-3409.001.patch, 
> YARN-7842-YARN-3409.002.patch
>
>
> PB changes to carry node-attributes in NM heartbeat. Split from a larger 
> patch for easier review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7853) SLS failed to startup due to java.lang.NoClassDefFoundError

2018-01-29 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344393#comment-16344393
 ] 

Rohith Sharma K S commented on YARN-7853:
-

It is a dup of YARN-7794.

> SLS failed to startup due to java.lang.NoClassDefFoundError
> ---
>
> Key: YARN-7853
> URL: https://issues.apache.org/jira/browse/YARN-7853
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
>Priority: Major
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorManager
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>   at java.lang.Class.getDeclaredMethods(Class.java:1975)
>   at 
> com.google.inject.spi.InjectionPoint.getInjectionPoints(InjectionPoint.java:688)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:380)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:399)
>   at 
> com.google.inject.internal.BindingBuilder.toInstance(BindingBuilder.java:84)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebApp.setup(RMWebApp.java:60)
>   at 
> org.apache.hadoop.yarn.webapp.WebApp.configureServlets(WebApp.java:160)
>   at 
> com.google.inject.servlet.ServletModule.configure(ServletModule.java:55)
>   at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
>   at 
> com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
>   at com.google.inject.spi.Elements.getElements(Elements.java:110)
>   at 
> com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
>   at 
> com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
>   at com.google.inject.Guice.createInjector(Guice.java:96)
>   at com.google.inject.Guice.createInjector(Guice.java:73)
>   at com.google.inject.Guice.createInjector(Guice.java:62)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:379)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1126)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1236)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:284)
>   at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:231)
>   at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:943)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:950)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 40 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7790) Improve Capacity Scheduler Async Scheduling to better handle node failures

2018-01-29 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344394#comment-16344394
 ] 

Wangda Tan commented on YARN-7790:
--

Thanks [~sunilg], it's better to commit to branch-2.9 as well.

> Improve Capacity Scheduler Async Scheduling to better handle node failures
> --
>
> Key: YARN-7790
> URL: https://issues.apache.org/jira/browse/YARN-7790
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>Priority: Critical
> Fix For: 3.1.0, 3.0.1
>
> Attachments: YARN-7790.001.patch, YARN-7790.002.patch, 
> YARN-7790.003.patch
>
>
> This is not a new issue, but async scheduling makes it worse:
> In sync scheduling, if an AM container is allocated to a node, it is assumed 
> the node just heartbeated to the RM, and the AM launcher will connect to the 
> NM to launch the container. It is still possible that the NM crashes right 
> after the heartbeat, which causes the AM to hang for a while, but that is 
> relatively rare.
> In the async scheduling world, multiple AM containers can be placed on a 
> problematic NM, which can easily cause applications to hang. Discussed with 
> [~sunilg] and [~jianhe], we need one fix:
> When async scheduling is enabled:
>  - Skip nodes which have missed X node heartbeats.
> In addition, it's better to reduce the wait time by setting the following 
> configs, to fail a container being launched at an NM with connectivity issues 
> earlier.
> {code:java}
> RetryPolicy retryPolicy =
> createRetryPolicy(conf,
>   YarnConfiguration.CLIENT_NM_CONNECT_MAX_WAIT_MS,
>   YarnConfiguration.DEFAULT_CLIENT_NM_CONNECT_MAX_WAIT_MS,
>   YarnConfiguration.CLIENT_NM_CONNECT_RETRY_INTERVAL_MS,
>   YarnConfiguration.DEFAULT_CLIENT_NM_CONNECT_RETRY_INTERVAL_MS);
> {code}
> The second part is not covered by the patch.
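
A hedged illustration of that second part (the two keys are real YarnConfiguration constants; the values below are arbitrary, not recommendations from the patch):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class FastFailNMConnect {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    // Default max wait is 3 minutes; shrink the window so launching a
    // container on an unreachable NM fails after ~1 minute instead.
    conf.setLong(YarnConfiguration.CLIENT_NM_CONNECT_MAX_WAIT_MS, 60_000L);
    conf.setLong(YarnConfiguration.CLIENT_NM_CONNECT_RETRY_INTERVAL_MS,
        10_000L);
    // conf would then be used wherever the client-side NM proxy is created.
  }
}
{code}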



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7853) SLS failed to startup due to java.lang.NoClassDefFoundError

2018-01-29 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344382#comment-16344382
 ] 

Yufei Gu edited comment on YARN-7853 at 1/30/18 1:56 AM:
-

TimelineCollectorManager lives in 
hadoop-yarn-server-timelineservice-3.1.0-SNAPSHOT.jar. We probably want to put 
that jar into 
/hadoop-tools/hadoop-tools-dist/target/hadoop-tools-dist-3.1.0-SNAPSHOT/share/hadoop/tools/lib
 so that it is on the classpath of SLS. 


was (Author: yufeigu):
We probably want to put hadoop-yarn-server-timelineservice-3.1.0-SNAPSHOT.jar 
into 
/hadoop-tools/hadoop-tools-dist/target/hadoop-tools-dist-3.1.0-SNAPSHOT//share/hadoop/tools/lib
  so that hadoop-yarn-server-timelineservice-3.1.0-SNAPSHOT.jar could be in the 
classpath of SLS. 
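
As a quick sanity check that the jar actually made it onto the SLS classpath (a diagnostic sketch only, not part of any patch):

{code:java}
// Fail fast with a clear message when the timelineservice classes are
// missing from share/hadoop/tools/lib.
public class CheckSlsClasspath {
  public static void main(String[] args) {
    try {
      Class.forName("org.apache.hadoop.yarn.server.timelineservice"
          + ".collector.TimelineCollectorManager");
      System.out.println("timelineservice classes found");
    } catch (ClassNotFoundException e) {
      System.err.println("hadoop-yarn-server-timelineservice jar is not "
          + "on the classpath; SLS web app startup will fail");
    }
  }
}
{code}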

> SLS failed to startup due to java.lang.NoClassDefFoundError
> ---
>
> Key: YARN-7853
> URL: https://issues.apache.org/jira/browse/YARN-7853
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
>Priority: Major
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorManager
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>   at java.lang.Class.getDeclaredMethods(Class.java:1975)
>   at 
> com.google.inject.spi.InjectionPoint.getInjectionPoints(InjectionPoint.java:688)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:380)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:399)
>   at 
> com.google.inject.internal.BindingBuilder.toInstance(BindingBuilder.java:84)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebApp.setup(RMWebApp.java:60)
>   at 
> org.apache.hadoop.yarn.webapp.WebApp.configureServlets(WebApp.java:160)
>   at 
> com.google.inject.servlet.ServletModule.configure(ServletModule.java:55)
>   at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
>   at 
> com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
>   at com.google.inject.spi.Elements.getElements(Elements.java:110)
>   at 
> com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
>   at 
> com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
>   at com.google.inject.Guice.createInjector(Guice.java:96)
>   at com.google.inject.Guice.createInjector(Guice.java:73)
>   at com.google.inject.Guice.createInjector(Guice.java:62)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:379)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1126)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1236)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:284)
>   at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:231)
>   at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:943)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:950)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 40 more
> {code}

[jira] [Commented] (YARN-7853) SLS failed to startup due to java.lang.NoClassDefFoundError

2018-01-29 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344382#comment-16344382
 ] 

Yufei Gu commented on YARN-7853:


We probably want to put hadoop-yarn-server-timelineservice-3.1.0-SNAPSHOT.jar 
into 
/hadoop-tools/hadoop-tools-dist/target/hadoop-tools-dist-3.1.0-SNAPSHOT//share/hadoop/tools/lib
  so that hadoop-yarn-server-timelineservice-3.1.0-SNAPSHOT.jar could be in the 
classpath of SLS. 

> SLS failed to startup due to java.lang.NoClassDefFoundError
> ---
>
> Key: YARN-7853
> URL: https://issues.apache.org/jira/browse/YARN-7853
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
>Priority: Major
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorManager
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>   at java.lang.Class.getDeclaredMethods(Class.java:1975)
>   at 
> com.google.inject.spi.InjectionPoint.getInjectionPoints(InjectionPoint.java:688)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:380)
>   at 
> com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:399)
>   at 
> com.google.inject.internal.BindingBuilder.toInstance(BindingBuilder.java:84)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebApp.setup(RMWebApp.java:60)
>   at 
> org.apache.hadoop.yarn.webapp.WebApp.configureServlets(WebApp.java:160)
>   at 
> com.google.inject.servlet.ServletModule.configure(ServletModule.java:55)
>   at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
>   at 
> com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
>   at com.google.inject.spi.Elements.getElements(Elements.java:110)
>   at 
> com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
>   at 
> com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
>   at com.google.inject.Guice.createInjector(Guice.java:96)
>   at com.google.inject.Guice.createInjector(Guice.java:73)
>   at com.google.inject.Guice.createInjector(Guice.java:62)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:379)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:424)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1126)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1236)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:284)
>   at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:231)
>   at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:943)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:950)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 40 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7778) Merging of constraints defined at different levels

2018-01-29 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344379#comment-16344379
 ] 

Weiwei Yang commented on YARN-7778:
---

Thanks for the feedback, [~kkaranasos]. I will work on a patch once YARN-7822 
is done.

> Merging of constraints defined at different levels
> --
>
> Key: YARN-7778
> URL: https://issues.apache.org/jira/browse/YARN-7778
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: Merge Constraints Solution.pdf
>
>
> When we have multiple constraints defined for a given set of allocation tags 
> at different levels (i.e., at the cluster, the application or the scheduling 
> request level), we need to merge those constraints.
> Defining constraint levels as cluster > application > scheduling request, 
> constraints defined at lower levels should only be more restrictive than 
> those of higher levels. Otherwise the allocation should fail.
> For example, if there is an application level constraint that allows no more 
> than 5 HBase containers per rack, a scheduling request can further restrict 
> that to 3 containers per rack but not to 7 containers per rack.
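
To make the HBase example concrete, a sketch using the placement-constraints DSL from YARN-6592 (method names as in that API; treat this as illustrative, not an excerpt from any patch):

{code:java}
import org.apache.hadoop.yarn.api.resource.PlacementConstraint;
import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.RACK;
import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.build;
import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.maxCardinality;

public class MergeExample {
  public static void main(String[] args) {
    // Application level: at most 5 HBase containers per rack.
    PlacementConstraint appLevel = build(maxCardinality(RACK, 5, "hbase"));
    // Request level: at most 3 per rack is more restrictive, so the
    // merge is valid.
    PlacementConstraint valid = build(maxCardinality(RACK, 3, "hbase"));
    // At most 7 per rack would relax the application-level constraint,
    // so the merge (and hence the allocation) should fail.
    PlacementConstraint invalid = build(maxCardinality(RACK, 7, "hbase"));
  }
}
{code}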



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7853) SLS failed to startup due to java.lang.NoClassDefFoundError

2018-01-29 Thread Yufei Gu (JIRA)
Yufei Gu created YARN-7853:
--

 Summary: SLS failed to startup due to 
java.lang.NoClassDefFoundError
 Key: YARN-7853
 URL: https://issues.apache.org/jira/browse/YARN-7853
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler-load-simulator
Affects Versions: 3.1.0
Reporter: Yufei Gu


{code}
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorManager
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.getDeclaredMethods(Class.java:1975)
at 
com.google.inject.spi.InjectionPoint.getInjectionPoints(InjectionPoint.java:688)
at 
com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:380)
at 
com.google.inject.spi.InjectionPoint.forInstanceMethodsAndFields(InjectionPoint.java:399)
at 
com.google.inject.internal.BindingBuilder.toInstance(BindingBuilder.java:84)
at 
org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebApp.setup(RMWebApp.java:60)
at 
org.apache.hadoop.yarn.webapp.WebApp.configureServlets(WebApp.java:160)
at 
com.google.inject.servlet.ServletModule.configure(ServletModule.java:55)
at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
at 
com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
at com.google.inject.spi.Elements.getElements(Elements.java:110)
at 
com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
at 
com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
at com.google.inject.Guice.createInjector(Guice.java:96)
at com.google.inject.Guice.createInjector(Guice.java:73)
at com.google.inject.Guice.createInjector(Guice.java:62)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:379)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:424)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1126)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1236)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.sls.SLSRunner.startRM(SLSRunner.java:284)
at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:231)
at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:943)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:950)
Caused by: java.lang.ClassNotFoundException: 
org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 40 more
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7765) [Atsv2] GSSException: No valid credentials provided - Failed to find any Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM

2018-01-29 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-7765:

Fix Version/s: 3.0.1
   2.10.0
   3.1.0

> [Atsv2] GSSException: No valid credentials provided - Failed to find any 
> Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM
> 
>
> Key: YARN-7765
> URL: https://issues.apache.org/jira/browse/YARN-7765
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Blocker
> Fix For: 3.1.0, 2.10.0, 3.0.1
>
> Attachments: YARN-7765.01.patch, YARN-7765.02.patch
>
>
> A secure cluster is deployed and all YARN services are started successfully. 
> When an application is submitted, the app collectors, which are started as an 
> aux-service, throw the below exception. But this exception is *NOT* observed 
> from the RM TimelineCollector. 
> The cluster is deployed with Hadoop-3.0 and HBase-1.2.6 in secure mode. All 
> the YARN and HBase services are started and working perfectly fine. After 24 
> hours, i.e. when the token lifetime has expired, the HBaseClient in the NM 
> and the HDFSClient in the HMaster and HRegionServer start getting this error. 
> After some time, the HBase daemons shut down. In the NM, the JVM didn't shut 
> down, but none of the events got published.
> {noformat}
> 2018-01-17 11:04:48,017 FATAL ipc.RpcClientImpl (RpcClientImpl.java:run(684)) 
> - SASL authentication failed. The most likely cause is missing or invalid 
> credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> {noformat}
> cc :/ [~vrushalic] [~varun_saxena] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7835) [Atsv2] Race condition in NM while publishing events if second attempt launched on same node

2018-01-29 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344367#comment-16344367
 ] 

Rohith Sharma K S commented on YARN-7835:
-

[~haibochen] As per the design, 
PerNodeTimelineCollectorsAuxService#initializeContainer adds a collector for 
the master container and removes it in 
PerNodeTimelineCollectorsAuxService#stopContainer. Let's take a case where the 
order of events is:
# Application-1 -> appAttempt-1 master container launched on node-1. The app 
collector got added to node-1's timeline collector manager for application-1.
# For some reason, the master container got killed and appAttempt-2 is 
scheduled on the same node, i.e. node-1.
# Application-1 -> appAttempt-2 master container launched on node-1. This 
checks whether a collector already exists and doesn't add one, since it does. 
You can see this from the above log trace.
# Application-1 -> appAttempt-1 stopContainer request comes in. This removes 
the existing collector for the app, which leaves the timeline collector 
without any info for this app.

The other option to fix this issue was to change the TimelineCollector#collectors 
map from  to . But we can't make that 
change because in many places we add the application without an attempt, e.g. 
in the RM. 
The best way to handle the out-of-order event is from the NodeManager itself!
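
A minimal sketch of that NM-side handling, with a hypothetical tracker class (none of these names come from the actual patch):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationId;

// Illustrative only: ignore a stopContainer event from a superseded
// attempt instead of removing the app-level collector that a newer
// attempt on the same node is still using.
class MasterAttemptTracker {
  private final Map<ApplicationId, ApplicationAttemptId> liveMasters =
      new ConcurrentHashMap<>();

  void onMasterLaunched(ApplicationAttemptId attempt) {
    liveMasters.put(attempt.getApplicationId(), attempt);
  }

  boolean shouldRemoveCollector(ApplicationAttemptId stopping) {
    ApplicationAttemptId current =
        liveMasters.get(stopping.getApplicationId());
    // If a newer master attempt is live here, the stop event is out of
    // order: keep the collector.
    return current == null
        || current.getAttemptId() <= stopping.getAttemptId();
  }
}
{code}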

> [Atsv2] Race condition in NM while publishing events if second attempt 
> launched on same node
> 
>
> Key: YARN-7835
> URL: https://issues.apache.org/jira/browse/YARN-7835
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-7835.001.patch
>
>
> It is observed race condition that if master container is killed for some 
> reason and launched on same node then NMTimelinePublisher doesn't add 
> timelineClient. But once completed container for 1st attempt has come then 
> NMTimelinePublisher removes the timelineClient. 
>  It causes all subsequent event publishing from different client fails to 
> publish with exception Application is not found. !



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks

2018-01-29 Thread Steven Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344335#comment-16344335
 ] 

Steven Rand commented on YARN-7655:
---

Sounds good, thanks!

> avoid AM preemption caused by RRs for specific nodes or racks
> -
>
> Key: YARN-7655
> URL: https://issues.apache.org/jira/browse/YARN-7655
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: YARN-7655-001.patch
>
>
> We frequently see AM preemptions when 
> {{starvedApp.getStarvedResourceRequests()}} in 
> {{FSPreemptionThread#identifyContainersToPreempt}} includes one or more RRs 
> that request containers on a specific node. Since this causes us to only 
> consider one node to preempt containers on, the really good work that was 
> done in YARN-5830 doesn't save us from AM preemption. Even though there might 
> be multiple nodes on which we could preempt enough non-AM containers to 
> satisfy the app's starvation, we often wind up preempting one or more AM 
> containers on the single node that we're considering.
> A proposed solution is that if we're going to preempt one or more AM 
> containers for an RR that specifies a node or rack, then we should instead 
> expand the search space to consider all nodes. That way we take advantage of 
> YARN-5830, and only preempt AMs if there's no alternative. I've attached a 
> patch with an initial implementation of this. We've been running it on a few 
> clusters, and have seen AM preemptions drop from double-digit occurrences on 
> many days to zero.
> Of course, the tradeoff is some loss of locality, since the starved app is 
> less likely to be allocated resources at the most specific locality level 
> that it asked for. My opinion is that this tradeoff is worth it, but 
> interested to hear what others think as well.
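
A rough sketch of that fallback; planPreemption, candidateNodes, allNodes and containsAMContainer are hypothetical stand-ins for FSPreemptionThread internals, not methods from the attached patch:

{code:java}
import java.util.List;
import org.apache.hadoop.yarn.api.records.ResourceRequest;
import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer;

// Illustrative shape of the idea: only widen the search beyond the RR's
// node/rack when the narrow plan would preempt an AM container.
abstract class AmAwarePreemption<N> {
  abstract List<RMContainer> planPreemption(ResourceRequest rr, List<N> nodes);
  abstract List<N> candidateNodes(ResourceRequest rr);
  abstract List<N> allNodes();
  abstract boolean containsAMContainer(List<RMContainer> plan);

  List<RMContainer> identifyContainersToPreempt(ResourceRequest rr) {
    List<RMContainer> plan = planPreemption(rr, candidateNodes(rr));
    if (containsAMContainer(plan)
        && !ResourceRequest.ANY.equals(rr.getResourceName())) {
      // The node/rack-specific RR pinned the search to few nodes; widen
      // it before sacrificing an AM container.
      List<RMContainer> widerPlan = planPreemption(rr, allNodes());
      if (!containsAMContainer(widerPlan)) {
        plan = widerPlan; // trade locality for keeping AMs alive
      }
    }
    return plan;
  }
}
{code}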



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7811) Service AM should use configured default docker network

2018-01-29 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344318#comment-16344318
 ] 

Eric Yang commented on YARN-7811:
-

+1 This patch works on my cluster.  I will commit this tomorrow if no 
objections.

> Service AM should use configured default docker network
> ---
>
> Key: YARN-7811
> URL: https://issues.apache.org/jira/browse/YARN-7811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-7811.01.patch
>
>
> Currently the DockerProviderService used by the Service AM hardcodes a 
> default of bridge for the docker network. We already have a YARN 
> configuration property for default network, so the Service AM should honor 
> that.
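
A hedged sketch of honoring that property instead of the hardcoded value (the property name is the NM docker runtime's default-network key; the actual patch may plumb it differently):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch: read the cluster-wide default docker network, falling back to
// "bridge" only when nothing is configured.
public class DefaultDockerNetwork {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    String network = conf.get(
        "yarn.nodemanager.runtime.linux.docker.default-container-network",
        "bridge"); // the value the Service AM currently hardcodes
    System.out.println("docker network = " + network);
  }
}
{code}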



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7851) Graph view does not show all AM attempts

2018-01-29 Thread Yesha Vora (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yesha Vora updated YARN-7851:
-
Description: 
Scenario:

1) Run an application where all AM attempts fail
2) Go to the Graph view for the application

Here, the application started 10 AM attempts. However, the Graph view has a 
pictorial representation of 4 AM attempts. 
It should show all 10 attempts in the Graph view.

  was:
Scenario:

1) Run an application where all AM attempt fails
2) Go to Graph view for application

Here, The application started 10 attempt of AM. However, Graph view has 
pictorial representation of only has 4 AM attempts. 
It should show all 10 attempts in Graph


> Graph view does not show all AM attempts
> 
>
> Key: YARN-7851
> URL: https://issues.apache.org/jira/browse/YARN-7851
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Yesha Vora
>Priority: Major
> Attachments: Screen Shot 2018-01-29 at 4.16.47 PM.png
>
>
> Scenario:
> 1) Run an application where all AM attempts fail
> 2) Go to the Graph view for the application
> Here, the application started 10 AM attempts. However, the Graph view has a 
> pictorial representation of 4 AM attempts. 
> It should show all 10 attempts in the Graph view.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7852) FlowRunReader constructs min_start_time filter for both createdtimestart and createdtimeend.

2018-01-29 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7852:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-7055

> FlowRunReader constructs min_start_time filter for both createdtimestart and 
> createdtimeend.
> 
>
> Key: YARN-7852
> URL: https://issues.apache.org/jira/browse/YARN-7852
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
>
> {code:java}
> protected FilterList constructFilterListBasedOnFilters() throws IOException {
>   FilterList listBasedOnFilters = new FilterList();
>   // Filter based on created time range.
>   Long createdTimeBegin = getFilters().getCreatedTimeBegin();
>   Long createdTimeEnd = getFilters().getCreatedTimeEnd();
>   if (createdTimeBegin != 0 || createdTimeEnd != Long.MAX_VALUE) {
>     listBasedOnFilters.addFilter(TimelineFilterUtils
>         .createSingleColValueFiltersByRange(FlowRunColumn.MIN_START_TIME,
>             createdTimeBegin, createdTimeEnd));
>   }
>   // Filter based on metric filters.
>   TimelineFilterList metricFilters = getFilters().getMetricFilters();
>   if (metricFilters != null && !metricFilters.getFilterList().isEmpty()) {
>     listBasedOnFilters.addFilter(TimelineFilterUtils.createHBaseFilterList(
>         FlowRunColumnPrefix.METRIC, metricFilters));
>   }
>   return listBasedOnFilters;
> }
> {code}
>  
> createdTimeEnd is used as an upper bound for MIN_START_TIME.  We should 
> create one filter based on createdTimeBegin and another based on 
> createdTimeEnd.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7852) FlowRunReader constructs min_start_time filter for both createdtimestart and createdtimeend.

2018-01-29 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7852:
-
Description: 
{code:java}
protected FilterList constructFilterListBasedOnFilters() throws IOException {
  FilterList listBasedOnFilters = new FilterList();
  // Filter based on created time range.
  Long createdTimeBegin = getFilters().getCreatedTimeBegin();
  Long createdTimeEnd = getFilters().getCreatedTimeEnd();
  if (createdTimeBegin != 0 || createdTimeEnd != Long.MAX_VALUE) {
    listBasedOnFilters.addFilter(TimelineFilterUtils
        .createSingleColValueFiltersByRange(FlowRunColumn.MIN_START_TIME,
            createdTimeBegin, createdTimeEnd));
  }
  // Filter based on metric filters.
  TimelineFilterList metricFilters = getFilters().getMetricFilters();
  if (metricFilters != null && !metricFilters.getFilterList().isEmpty()) {
    listBasedOnFilters.addFilter(TimelineFilterUtils.createHBaseFilterList(
        FlowRunColumnPrefix.METRIC, metricFilters));
  }
  return listBasedOnFilters;
}
{code}
 

createdTimeEnd is used as an upper bound for MIN_START_TIME.  We should create 
one filter based on createdTimeBegin and another based on createdTimeEnd.
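
One possible reading of that split, as a hedged sketch (the helper 
{{singleColValueFilter}} is hypothetical, not necessarily the actual 
TimelineFilterUtils API, and the committed fix may differ):

{code:java}
// Guard each created-time bound independently instead of always building one
// combined range on MIN_START_TIME; a default value on one side then no
// longer drags the other into an unintended range filter.
if (createdTimeBegin != 0) {
  listBasedOnFilters.addFilter(singleColValueFilter(
      FlowRunColumn.MIN_START_TIME, CompareOp.GREATER_OR_EQUAL,
      createdTimeBegin));
}
if (createdTimeEnd != Long.MAX_VALUE) {
  listBasedOnFilters.addFilter(singleColValueFilter(
      FlowRunColumn.MIN_START_TIME, CompareOp.LESS_OR_EQUAL,
      createdTimeEnd));
}
{code}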

> FlowRunReader constructs min_start_time filter for both createdtimestart and 
> createdtimeend.
> 
>
> Key: YARN-7852
> URL: https://issues.apache.org/jira/browse/YARN-7852
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelinereader
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
>
> {code:java}
> protected FilterList constructFilterListBasedOnFilters() throws IOException {
>   FilterList listBasedOnFilters = new FilterList();
>   // Filter based on created time range.
>   Long createdTimeBegin = getFilters().getCreatedTimeBegin();
>   Long createdTimeEnd = getFilters().getCreatedTimeEnd();
>   if (createdTimeBegin != 0 || createdTimeEnd != Long.MAX_VALUE) {
>     listBasedOnFilters.addFilter(TimelineFilterUtils
>         .createSingleColValueFiltersByRange(FlowRunColumn.MIN_START_TIME,
>             createdTimeBegin, createdTimeEnd));
>   }
>   // Filter based on metric filters.
>   TimelineFilterList metricFilters = getFilters().getMetricFilters();
>   if (metricFilters != null && !metricFilters.getFilterList().isEmpty()) {
>     listBasedOnFilters.addFilter(TimelineFilterUtils.createHBaseFilterList(
>         FlowRunColumnPrefix.METRIC, metricFilters));
>   }
>   return listBasedOnFilters;
> }
> {code}
>  
> createdTimeEnd is used as an upper bound for MIN_START_TIME.  We should 
> create one filter based on createdTimeBegin and another based on 
> createdTimeEnd.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5366) Improve handling of the Docker container life cycle

2018-01-29 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-5366:

Affects Version/s: 3.1.0
 Target Version/s: 3.1.0
Fix Version/s: 3.1.0

> Improve handling of the Docker container life cycle
> ---
>
> Key: YARN-5366
> URL: https://issues.apache.org/jira/browse/YARN-5366
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>  Labels: oct16-medium
> Fix For: 3.1.0
>
> Attachments: YARN-5366.001.patch, YARN-5366.002.patch, 
> YARN-5366.003.patch, YARN-5366.004.patch, YARN-5366.005.patch, 
> YARN-5366.006.patch, YARN-5366.007.patch, YARN-5366.008.patch, 
> YARN-5366.009.patch, YARN-5366.010.patch
>
>
> There are several paths that need to be improved with regard to the Docker 
> container lifecycle when running Docker containers on YARN.
> 1) Provide the ability to keep a container on the NodeManager for a set 
> period of time for debugging purposes.
> 2) Support sending signals to the process in the container to allow for 
> triggering stack traces, heap dumps, etc.
> 3) Support for Docker's live restore, which means moving away from the use of 
> {{docker wait}}. (YARN-5818)
> 4) Improve the resiliency of liveliness checks (kill -0) by adding retries.
> 5) Improve the resiliency of container removal by adding retries.
> 6) Only attempt to stop, kill, and remove containers if the current container 
> state allows for it.
> 7) Better handling of short lived containers when the container is stopped 
> before the PID can be retrieved. (YARN-6305)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7852) FlowRunReader constructs min_start_time filter for both createdtimestart and createdtimeend.

2018-01-29 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-7852:


 Summary: FlowRunReader constructs min_start_time filter for both 
createdtimestart and createdtimeend.
 Key: YARN-7852
 URL: https://issues.apache.org/jira/browse/YARN-7852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelinereader
Affects Versions: 3.0.0
Reporter: Haibo Chen
Assignee: Haibo Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7528) Resource types that use units need to be defined at RM level and NM level or when using small units you will overflow max_allocation calculation

2018-01-29 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344290#comment-16344290
 ] 

Daniel Templeton commented on YARN-7528:


Sorry it took me a while to respond.  Did you try requesting 
{{-Dmapreduce.map.resource.gpu=5k}} with that setup?  I don't remember the 
details, but I think that if you now ask for a quantity without a unit, it will 
assume the resource's default unit.  If you force the unit to something larger 
than millis, it should expose the issue.  As far as I know, the issue should 
still exist.
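
To make the failure mode concrete, a self-contained sketch (plain arithmetic, 
not the actual Hadoop units converter; the numbers are illustrative):

{code:java}
public class UnitOverflowSketch {
  public static void main(String[] args) {
    long maxAllocation = Long.MAX_VALUE; // default max when no bound is configured
    long kiloToMilli = 1_000_000L;       // converting "k" down to "m" multiplies by 10^6
    long converted = maxAllocation * kiloToMilli; // silently wraps around
    System.out.println(converted);       // prints -1000000
    // Math.multiplyExact(maxAllocation, kiloToMilli) would instead throw
    // java.lang.ArithmeticException: long overflow
  }
}
{code}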

> Resource types that use units need to be defined at RM level and NM level or 
> when using small units you will overflow max_allocation calculation
> 
>
> Key: YARN-7528
> URL: https://issues.apache.org/jira/browse/YARN-7528
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation, resourcemanager
>Affects Versions: 3.0.0
>Reporter: Grant Sohn
>Assignee: Szilard Nemeth
>Priority: Major
>
> When the unit is not defined in the RM, the LONG_MAX default will overflow in 
> the conversion step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization

2018-01-29 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344261#comment-16344261
 ] 

Botong Huang commented on YARN-7849:


Yeah, I recall hitting an issue in this Test file during YARN-7102, but it 
didn't surface in later iterations. I will take a look soon. Thanks [~jlowe] 
for reporting it. 

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization
> 
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization

2018-01-29 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344261#comment-16344261
 ] 

Botong Huang edited comment on YARN-7849 at 1/30/18 12:25 AM:
--

Yeah, I recall hitting an issue in this test file during YARN-7102, but it 
didn't surface in later iterations. I will take a look soon. Thanks [~jlowe] 
for reporting it. 


was (Author: botong):
Yeah, I recall hitting an issue in this Test file during YARN-7102, but it 
didn't surface in later iterations. I will take a look soon. Thanks [~jlowe] 
for reporting it. 

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization
> 
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization

2018-01-29 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang reassigned YARN-7849:
--

Assignee: Botong Huang

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization
> 
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Assignee: Botong Huang
>Priority: Major
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7851) Graph view does not show all AM attempts

2018-01-29 Thread Yesha Vora (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yesha Vora updated YARN-7851:
-
Attachment: Screen Shot 2018-01-29 at 4.16.47 PM.png

> Graph view does not show all AM attempts
> 
>
> Key: YARN-7851
> URL: https://issues.apache.org/jira/browse/YARN-7851
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Yesha Vora
>Priority: Major
> Attachments: Screen Shot 2018-01-29 at 4.16.47 PM.png
>
>
> Scenario:
> 1) Run an application where all AM attempts fail
> 2) Go to the Graph view for the application
> Here, the application started 10 AM attempts. However, the Graph view has a 
> pictorial representation of only 4 AM attempts. 
> It should show all 10 attempts in the Graph view.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7851) Graph view does not show all AM attempts

2018-01-29 Thread Yesha Vora (JIRA)
Yesha Vora created YARN-7851:


 Summary: Graph view does not show all AM attempts
 Key: YARN-7851
 URL: https://issues.apache.org/jira/browse/YARN-7851
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Reporter: Yesha Vora


Scenario:

1) Run an application where all AM attempts fail
2) Go to the Graph view for the application

Here, the application started 10 AM attempts. However, the Graph view has a 
pictorial representation of only 4 AM attempts. 
It should show all 10 attempts in the Graph view.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7556) Fair scheduler configuration should allow resource types in the minResources and maxResources properties

2018-01-29 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344252#comment-16344252
 ] 

Daniel Templeton commented on YARN-7556:


During my break I realized that I need to make sure that resource types in 
{{maxResources}} don't get set to 0.  Otherwise it can screw up the shares 
calculations.  I'll update the patch for that and your doc issue.

> Fair scheduler configuration should allow resource types in the minResources 
> and maxResources properties
> 
>
> Key: YARN-7556
> URL: https://issues.apache.org/jira/browse/YARN-7556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 3.0.0-beta1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
> Attachments: YARN-7556.001.patch, YARN-7556.002.patch, 
> YARN-7556.003.patch, YARN-7556.004.patch, YARN-7556.005.patch, 
> YARN-7556.006.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization

2018-01-29 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344214#comment-16344214
 ] 

Jason Lowe commented on YARN-7849:
--

Looks like this started failing after YARN-7102.  I think the unit test is 
trying to inject a node heartbeat asynchronously on a "live" cluster, which is 
not going to go well given the stricter response ID matching after YARN-7102.  
From the unit test output:
{noformat}
2018-01-29 17:34:32,505 WARN  [Node Status Updater] 
nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:run(782)) - Node 
is out of sync with ResourceManager, hence resyncing.
2018-01-29 17:34:32,505 WARN  [Node Status Updater] 
nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:run(784)) - 
Message from ResourceManager: Too far behind rm response id:0 nm response id:2
{noformat}

Pinging [~botong] if there are spare cycles to look at this.

> TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization
> 
>
> Key: YARN-7849
> URL: https://issues.apache.org/jira/browse/YARN-7849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
>Reporter: Jason Lowe
>Priority: Major
>
> testUpdateNodeUtilization is failing.  From a branch-2.8 run:
> {noformat}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
> testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
>   Time elapsed: 12.961 sec  <<< FAILURE!
> java.lang.AssertionError: Containers Utillization not propagated to RMNode 
> expected:<> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
>   at 
> org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7850) New UI does not show status for Log Aggregation

2018-01-29 Thread Yesha Vora (JIRA)
Yesha Vora created YARN-7850:


 Summary: New UI does not show status for Log Aggregation
 Key: YARN-7850
 URL: https://issues.apache.org/jira/browse/YARN-7850
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Reporter: Yesha Vora


The status of log aggregation is not specified anywhere.

The new UI should show the log aggregation status for finished applications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7723) Avoid using docker volume --format option to compatible to older docker releases

2018-01-29 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344211#comment-16344211
 ] 

Eric Badger commented on YARN-7723:
---

Sorry for the long hiatus. Patch looks good to me. +1 (non-binding)

 

[~sunilg], I'm fine with you committing this whenever you get a chance

> Avoid using docker volume --format option to compatible to older docker 
> releases
> 
>
> Key: YARN-7723
> URL: https://issues.apache.org/jira/browse/YARN-7723
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
> Attachments: YARN-7723.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7849) TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization

2018-01-29 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-7849:


 Summary: 
TestMiniYarnClusterNodeUtilization#testUpdateNodeUtilization
 Key: YARN-7849
 URL: https://issues.apache.org/jira/browse/YARN-7849
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 3.1.0, 2.9.1, 3.0.1, 2.8.4
Reporter: Jason Lowe


testUpdateNodeUtilization is failing.  From a branch-2.8 run:
{noformat}
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.013 sec <<< 
FAILURE! - in org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization)
  Time elapsed: 12.961 sec  <<< FAILURE!
java.lang.AssertionError: Containers Utillization not propagated to RMNode 
expected:<> but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at 
org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:227)
at 
org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116)
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7778) Merging of constraints defined at different levels

2018-01-29 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344154#comment-16344154
 ] 

Konstantinos Karanasos commented on YARN-7778:
--

Hi [~cheersyang], just checked your proposal. I like the idea of taking 
advantage of the AND constraints, instead of actually merging the constraints. 
So +1 to your approach.
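
For concreteness, a hedged sketch of that composition (the {{and}} builder is 
from org.apache.hadoop.yarn.api.resource.PlacementConstraints; the three 
operands are placeholders for the per-level constraints):

{code:java}
// Rather than computing a single merged constraint, compose the levels with
// an AND and let the satisfaction checker enforce all of them together.
PlacementConstraint effective =
    PlacementConstraints.and(clusterLevel, appLevel, requestLevel).build();
{code}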

> Merging of constraints defined at different levels
> --
>
> Key: YARN-7778
> URL: https://issues.apache.org/jira/browse/YARN-7778
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: Merge Constraints Solution.pdf
>
>
> When we have multiple constraints defined for a given set of allocation tags 
> at different levels (i.e., at the cluster, the application or the scheduling 
> request level), we need to merge those constraints.
> Defining constraint levels as cluster > application > scheduling request, 
> constraints defined at lower levels should only be more restrictive than 
> those of higher levels. Otherwise the allocation should fail.
> For example, if there is an application level constraint that allows no more 
> than 5 HBase containers per rack, a scheduling request can further restrict 
> that to 3 containers per rack but not to 7 containers per rack.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-29 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344137#comment-16344137
 ] 

Jim Brennan commented on YARN-7677:
---

Based on the discussion here and in YARN-7226, and after discussing with 
[~jlowe] and [~ebadger], I have put up a patch for this.

The change is as follows:
 # Remove the line in ContainerLaunch.sh that explicitly adds HADOOP_CONF_DIR 
to the environment (as noted above).
 # Instead of ignoring the whitelist in the case of docker, always add the 
whitelisted environment variables that are not already defined in the 
container's context, using the {{var:-default}} variable expansion syntax.

In the docker case, where a whitelisted environment variable is defined in the 
image, this will prevent the launch script from overwriting it with the one 
from the NodeManager's environment.

The non-docker case behaves the same as before, except that whitelisted 
environment variables not defined by the container context are also set using 
the {{var:-default}} syntax; in this case the default value is always used.
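
For illustration, a minimal sketch of the export line this produces 
(hypothetical helper; the real emission happens in the NM's launch-script 
writer, and the default path here is made up):

{code:java}
public class WhitelistExportSketch {
  /** An image-provided value of {@code name} survives; otherwise the NM's
   *  value is used as the shell default. */
  static String whitelistedExport(String name, String nmValue) {
    return String.format("export %s=${%s:-\"%s\"}", name, name, nmValue);
  }

  public static void main(String[] args) {
    System.out.println(whitelistedExport("HADOOP_CONF_DIR", "/etc/hadoop/conf"));
    // -> export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop/conf"}
  }
}
{code}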


> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7677.001.patch
>
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7848) Force removal of docker containers that do not get removed on first try

2018-01-29 Thread Eric Badger (JIRA)
Eric Badger created YARN-7848:
-

 Summary: Force removal of docker containers that do not get 
removed on first try
 Key: YARN-7848
 URL: https://issues.apache.org/jira/browse/YARN-7848
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Eric Badger


After the addition of YARN-5366, containers will get removed after a certain 
debug delay. However, removal is attempted only once. If the removal fails for 
whatever reason, the container will persist. We need to add a mechanism to 
force removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7598) Document how to use classpath isolation for aux-services in YARN

2018-01-29 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344130#comment-16344130
 ] 

Xuan Gong commented on YARN-7598:
-

[~djp] Thanks for the comment. Attached a new patch for this

> Document how to use classpath isolation for aux-services in YARN
> 
>
> Key: YARN-7598
> URL: https://issues.apache.org/jira/browse/YARN-7598
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7598.2.patch, YARN-7598.trunk.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7598) Document how to use classpath isolation for aux-services in YARN

2018-01-29 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-7598:

Attachment: YARN-7598.2.patch

> Document how to use classpath isolation for aux-services in YARN
> 
>
> Key: YARN-7598
> URL: https://issues.apache.org/jira/browse/YARN-7598
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7598.2.patch, YARN-7598.trunk.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7815) Mount the filecache as read-only in Docker containers

2018-01-29 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344123#comment-16344123
 ] 

Jason Lowe commented on YARN-7815:
--

bq. Can we break anything if we move localized user-private files from 
nm-local-dir/usercache/user to nm-local-dir/usercache/user/filecache during 
upgrade?

Moving files on running apps is definitely going to break some of them. IIUC 
there's no proposal to move any files as part of this, just to change whether 
or not containers have read-write access to certain local paths, even if they 
try to explicitly change the permissions (as they could today with 
user-private files, since they own them).  Right now we mount 
nm-local-dir/usercache/user to get access to its underlying filecache 
directory, and this simply proposes to directly mount 
nm-local-dir/usercache/user/filecache rather than the parent, as the parent 
cannot be mounted read-only due to the other read-write directories we are 
trying to mount underneath it (i.e.: the application's appcache directory).

bq. Shouldn't we remove this comment and code in this case?

I think this is still useful. The intent of that code is not to lock down and 
completely prevent AM-RM token access by any means.  It's there to prevent 
_accidental_ use of the AM-RM token. For example, if some task code ended up 
calling an API that requires contacting the RM (e.g.: acting like a client and 
trying to get job status) then that could easily DDoS the RM for a large job. 
The lack of AM-RM token for tasks means a connection to an RM will not work by 
default.  It can still be done (e.g.: Oozie launcher tasks that launch other 
jobs), but it doesn't do this by default.

Sure, a task could try really hard to go hunting for one if it happened to be 
running on the same node as the AM. If we're worried about that, the simple 
fix is to have the AM delete the token file after it's been consumed and before 
it starts launching tasks.
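
A minimal sketch of that mitigation (the env var name is the standard Hadoop 
one; everything else, including doing this in the AM's startup path, is 
illustrative):

{code:java}
import java.io.File;

public class TokenFileCleanupSketch {
  public static void main(String[] args) {
    // After the AM has loaded its credentials, remove the on-disk token file
    // so co-located task containers cannot go hunting for it.
    String tokenFile = System.getenv("HADOOP_TOKEN_FILE_LOCATION");
    if (tokenFile != null) {
      System.out.println("token file removed: " + new File(tokenFile).delete());
    }
  }
}
{code}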

> Mount the filecache as read-only in Docker containers
> -
>
> Key: YARN-7815
> URL: https://issues.apache.org/jira/browse/YARN-7815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>
> Currently, when using the Docker runtime, the filecache directories are 
> mounted read-write into the Docker containers. Read write access is not 
> necessary. We should make this more restrictive by changing that mount to 
> read-only.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-29 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-7677:
--
Attachment: YARN-7677.001.patch

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7677.001.patch
>
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7847) Provide permalinks for container logs

2018-01-29 Thread Gera Shegalov (JIRA)
Gera Shegalov created YARN-7847:
---

 Summary: Provide permalinks for container logs
 Key: YARN-7847
 URL: https://issues.apache.org/jira/browse/YARN-7847
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: amrmproxy
Reporter: Gera Shegalov


YARN doesn't offer a service similar to the AM proxy URL for container logs, 
even if log aggregation is enabled. The current mechanism of having the NM 
redirect to yarn.log.server.url fails once the node is down. Workarounds like 
the one in MR JobHistory, rewriting URIs on the fly, are possible, but do not 
represent a good long-term solution for onboarding new apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7815) Mount the filecache as read-only in Docker containers

2018-01-29 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344082#comment-16344082
 ] 

Miklos Szegedi commented on YARN-7815:
--

Thank you, for the replies [~jlowe] and [~eyang]. I understand now that 
container level isolation is not possible.

I have one last question. Shouldn't we remove this comment and code in this 
case?

[https://github.com/apache/hadoop/blob/7fd287b4af5a191f18ea92850b7d904e4b4fb693/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L670]

Based on what you said, removing the AM token is misleading, since a neighbor 
container can grab it anyway by design.

> Mount the filecache as read-only in Docker containers
> -
>
> Key: YARN-7815
> URL: https://issues.apache.org/jira/browse/YARN-7815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>
> Currently, when using the Docker runtime, the filecache directories are 
> mounted read-write into the Docker containers. Read write access is not 
> necessary. We should make this more restrictive by changing that mount to 
> read-only.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7815) Mount the filecache as read-only in Docker containers

2018-01-29 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344071#comment-16344071
 ] 

Eric Yang commented on YARN-7815:
-

[~miklos.szeg...@cloudera.com] It will be hard to enforce read-only access on 
other containers' directories because they might be spawned much later than 
the current container's launch.  I like [~jlowe]'s proposal to keep the 
read/write access scoped to the targeted app.  Can we break anything if we 
move localized user-private files from nm-local-dir/usercache/_user_ to 
nm-local-dir/usercache/_user_/filecache during upgrade?

> Mount the filecache as read-only in Docker containers
> -
>
> Key: YARN-7815
> URL: https://issues.apache.org/jira/browse/YARN-7815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>
> Currently, when using the Docker runtime, the filecache directories are 
> mounted read-write into the Docker containers. Read write access is not 
> necessary. We should make this more restrictive by changing that mount to 
> read-only.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7815) Mount the filecache as read-only in Docker containers

2018-01-29 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344067#comment-16344067
 ] 

Jason Lowe commented on YARN-7815:
--

This would break a framework where containers on the same node act as 
co-processors and read (or even write) each other's directories directly.

I guess I am missing the use-case for this.  All the application frameworks I 
know of don't really have the concept of separate security tokens across 
containers.  Once you compromise a single container you have essentially 
compromised the entire app as far as secrets are concerned.  If we really need 
extreme separation across containers within the same application then I would 
argue that's a separate runtime model than what YARN provides today.

> Mount the filecache as read-only in Docker containers
> -
>
> Key: YARN-7815
> URL: https://issues.apache.org/jira/browse/YARN-7815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>
> Currently, when using the Docker runtime, the filecache directories are 
> mounted read-write into the Docker containers. Read write access is not 
> necessary. We should make this more restrictive by changing that mount to 
> read-only.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7815) Mount the filecache as read-only in Docker containers

2018-01-29 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344052#comment-16344052
 ] 

Miklos Szegedi commented on YARN-7815:
--

Thank you [~jlowe] for the response. I agree with 1. and 2. above. Since 3. 
would expose the container tokens of other containers to the current 
container, how about mounting the app dir as read-write and mounting an empty 
directory over the directories of containers other than the current one? This 
is a bit more work (yes, a bit hackier...) but it would achieve the accepted 
level of security with backward compatibility.
{code:java}
# mkdir app
# mkdir /empty
# mkdir app/container1
# mkdir app/container2
# mkdir app/container3
# docker run -t -i -v /root/app:/app:rw -v /empty:/app/container1:ro -v 
/root/app/container2:/app/container2:rw -v /empty:/app/container3:ro -bash
bash-4.4# touch /app/a.txt
bash-4.4# touch /app/container1/a.txt
touch: /app/container1/a.txt: Read-only file system
bash-4.4# touch /app/container2/a.txt
bash-4.4# touch /app/container3/a.txt
touch: /app/container3/a.txt: Read-only file system
# {code}

> Mount the filecache as read-only in Docker containers
> -
>
> Key: YARN-7815
> URL: https://issues.apache.org/jira/browse/YARN-7815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>
> Currently, when using the Docker runtime, the filecache directories are 
> mounted read-write into the Docker containers. Read write access is not 
> necessary. We should make this more restrictive by changing that mount to 
> read-only.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7811) Service AM should use configured default docker network

2018-01-29 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344022#comment-16344022
 ] 

Shane Kumpf commented on YARN-7811:
---

There has been discussion, but no work has started that I am aware of. It might 
be good to open an issue to track that feature.

> Service AM should use configured default docker network
> ---
>
> Key: YARN-7811
> URL: https://issues.apache.org/jira/browse/YARN-7811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-7811.01.patch
>
>
> Currently the DockerProviderService used by the Service AM hardcodes a 
> default of bridge for the docker network. We already have a YARN 
> configuration property for default network, so the Service AM should honor 
> that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7811) Service AM should use configured default docker network

2018-01-29 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344011#comment-16344011
 ] 

Gour Saha commented on YARN-7811:
-

/cc [~shaneku...@gmail.com]. I remember a discussion where there were plans to 
expose a simple read-only API in container-executor, which the RM and NM can 
call to get a list of (ok-to-expose) properties. If the list of allowed docker 
networks falls in that list, then the client and REST API can fail early and 
provide a better experience for the end-user.

[~shaneku...@gmail.com] , is anything like this planned?

> Service AM should use configured default docker network
> ---
>
> Key: YARN-7811
> URL: https://issues.apache.org/jira/browse/YARN-7811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-7811.01.patch
>
>
> Currently the DockerProviderService used by the Service AM hardcodes a 
> default of bridge for the docker network. We already have a YARN 
> configuration property for default network, so the Service AM should honor 
> that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7815) Mount the filecache as read-only in Docker containers

2018-01-29 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344007#comment-16344007
 ] 

Jason Lowe commented on YARN-7815:
--

bq. Would it make sense to detach the appcache and mount a separate appcache 
dir for each container? AFAIK it is not for sharing between containers, since 
they might get scheduled to other nodes anyways.

It is used for sharing in some circumstances, e.g.: Tez shared fetch where a 
task can avoid fetching a broadcast output that another task already fetched, 
or Tez local fetch where a downstream task that runs on the same node fetches 
an output directly from local disk rather than having it copied through the 
shuffle server.  Besides those existing use-cases, having a separate appcache 
directory per container would add significant load to the shuffle handler, 
since it would add another dimension to the search matrix for shuffle data.

Bottom line is we have to mount the application's appcache directory read/write 
for backwards compatibility.  I don't see that as being a big concern, as 
compromising a single container is already compromising the entire application 
(due to the application secrets available within that container).  The key is 
preventing access/corruption to other applications even from the same user.

I think that leaves us with this proposal which should accomplish that and 
remove one of the mounts being made today:

1. nm-local-dir/filecache mounted read-only for access to localized public files
2. nm-local-dir/usercache/_user_/filecache mounted read-only for access to 
localized user-private files
3. nm-local-dir/usercache/_user_/appcache/_applicationId_ mounted read-write 
for access to the application work area and underlying container working 
directory
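
For concreteness, a hedged sketch of that mount set as docker -v arguments 
(helper and paths are illustrative only, not the actual Docker runtime code):

{code:java}
import java.util.Arrays;
import java.util.List;

public class ProposedMountsSketch {
  static List<String> proposedMounts(String nmLocalDir, String user, String appId) {
    String publicCache = nmLocalDir + "/filecache";
    String userCache = nmLocalDir + "/usercache/" + user + "/filecache";
    String appDir = nmLocalDir + "/usercache/" + user + "/appcache/" + appId;
    return Arrays.asList(
        "-v", publicCache + ":" + publicCache + ":ro", // 1. public files
        "-v", userCache + ":" + userCache + ":ro",     // 2. user-private files
        "-v", appDir + ":" + appDir + ":rw");          // 3. app work area
  }

  public static void main(String[] args) {
    proposedMounts("/nm-local-dir", "alice", "application_1")
        .forEach(System.out::println);
  }
}
{code}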




> Mount the filecache as read-only in Docker containers
> -
>
> Key: YARN-7815
> URL: https://issues.apache.org/jira/browse/YARN-7815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>
> Currently, when using the Docker runtime, the filecache directories are 
> mounted read-write into the Docker containers. Read write access is not 
> necessary. We should make this more restrictive by changing that mount to 
> read-only.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2175) Container localization has no timeouts and tasks can be stuck there for a long time

2018-01-29 Thread Tao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343997#comment-16343997
 ] 

Tao Zhang commented on YARN-2175:
-

We're facing this issue too. Localization for some containers may take a long 
time due to underlying HDFS issues, machine network conditions, etc. The AM 
will request more containers when it doesn't see enough available containers 
(i.e., containers that have finished localization). However, YARN keeps those 
containers stuck in localization, and there is no good automatic way to kill 
them. A dynamically adjusting timeout for the "localizing" state would help 
here.

Compared to a "pre-configured" timeout value, it'd be better to have a 
"dynamically adjusting" timeout. E.g., we calculate the average localization 
time of the first 50% of an app's containers, then set *2 * 
avg_localizing_time_of_half_containers* as the timeout threshold for the 
remaining containers. This requires localization-time information for all 
containers, hence the *RM* would be the appropriate component to implement 
this feature (container localization timeout). The AM may not be a good 
choice, since "localization" is a common YARN process and we don't want to 
reimplement this feature in every type of ApplicationMaster.
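
As a toy illustration of the proposed policy (all names are hypothetical; the 
2x factor is from the suggestion above):

{code:java}
import java.util.List;

public class LocalizationTimeoutSketch {
  /**
   * Returns a timeout in ms once at least half of the app's containers have
   * localized; -1 means "not enough data yet, do not enforce a timeout".
   */
  static long localizationTimeoutMs(List<Long> finishedMs, int totalContainers) {
    if (finishedMs.size() * 2 < totalContainers) {
      return -1; // fewer than 50% have finished localization
    }
    double avg = finishedMs.stream().mapToLong(Long::longValue)
        .average().orElse(0);
    return (long) (2 * avg); // 2 * avg_localizing_time_of_half_containers
  }

  public static void main(String[] args) {
    // 3 of 6 containers localized in 30s, 45s and 60s -> threshold 90s.
    System.out.println(localizationTimeoutMs(List.of(30_000L, 45_000L, 60_000L), 6));
  }
}
{code}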

 

> Container localization has no timeouts and tasks can be stuck there for a 
> long time
> ---
>
> Key: YARN-2175
> URL: https://issues.apache.org/jira/browse/YARN-2175
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Anubhav Dhoot
>Priority: Major
>
> There are no timeouts that can be used to limit the time taken by various 
> container startup operations. Localization for example could take a long time 
> and there is no automated way to kill an task if its stuck in these states. 
> These may have nothing to do with the task itself and could be an issue 
> within the platform.
> Ideally there should be configurable limits for various states within the 
> NodeManager to limit various states. The RM does not care about most of these 
> and its only between AM and the NM. We can start by making these global 
> configurable defaults and in future we can make it fancier by letting AM 
> override them in the start container request. 
> This jira will be used to limit localization time and we can open others if 
> we feel we need to limit other operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7780) Documentation for Placement Constraints

2018-01-29 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343904#comment-16343904
 ] 

Konstantinos Karanasos commented on YARN-7780:
--

Thanks for the comments, [~sunilg]. I will address them in the next version of 
the patch.

Re: SchedulingRequest, I guess we can keep it, given that it is an API change, 
as Arun says.

> Documentation for Placement Constraints
> ---
>
> Key: YARN-7780
> URL: https://issues.apache.org/jira/browse/YARN-7780
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Konstantinos Karanasos
>Priority: Major
> Attachments: YARN-7780-YARN-6592.001.patch, 
> YARN-7780-YARN-6592.002.patch
>
>
> JIRA to track documentation for the feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7822) Constraint satisfaction checker support for composite OR and AND constraints

2018-01-29 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343897#comment-16343897
 ] 

Konstantinos Karanasos commented on YARN-7822:
--

Patch looks good, thanks [~cheersyang].

I have only one ask, which I don't think is a big change. Can we support DNF? 
This is just ORs of ANDs. This means that we can allow the following:
 * AND constraints with no nesting (your patch does that already);
 * OR constraints, where each child is either a simple constraint or an AND 
constraint with no nesting. So essentially, you only have to allow this 
additional case in your patch.

If we do this, then we can transform any compound constraint with any nesting 
level to an equivalent DNF form (we can tackle this in a separate JIRA), and we 
will be done with canSatisfyConstraint for all possible constraints.
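
To pin down the allowed shape, a hedged sketch of a DNF constraint under this 
proposal (the operands are placeholders; {{or}}/{{and}} are the 
PlacementConstraints builders):

{code:java}
// ORs of ANDs with no deeper nesting -- the shape the checker would accept.
PlacementConstraint dnf = PlacementConstraints.or(
    PlacementConstraints.and(constraintA, constraintB), // AND, no nesting
    simpleConstraintC)                                  // or a simple constraint
    .build();
{code}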

 

Regarding the max cardinality, you are right that it is a bit confusing in the 
design doc, but it is not an artifact of the code implementation.

The idea is that the constraint has to be satisfied at the moment of scheduling:
 * When the source and target tags are different, this happens to be the same 
both before and after the placement. Assume a constraint that hb-m should be 
placed at a node with at most 2 hb-rs. Say node n1 has 2 hb-rs before 
scheduling, so we can place hb-m there. After placement, it will still have 2 
hb-rs.
 * When the source and target tags are the same, it is more complicated. 
Assume a constraint that hb-rs should be placed at a node with at most 2 
hb-rs. Say node n1 again has 2 hb-rs before scheduling. After scheduling, it 
will have 3.

So to unify the two cases above, we say that the constraints should be 
satisfied before the placement of the new container happens. As [~asuresh] 
mentioned, at some point we tried to create a special case for when 
source==target, but the semantics were not straightforward (e.g., if a user 
said max cardinality 2, they expected 2 after placement; when a user says max 
cardinality 0, they think anti-affinity, but it would actually have to be 1 if 
we considered cardinality after placement). To unify things, we opted for 
checking cardinality before placement in all cases.
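
As a toy sketch of that convention (names are hypothetical, not the 
PlacementConstraintsUtil API):

{code:java}
public class CardinalitySketch {
  /**
   * The container being scheduled is NOT counted: the constraint must hold
   * at the moment of scheduling, whether or not source == target tag.
   */
  static boolean canPlace(long targetTagCountOnNode, long minCard, long maxCard) {
    return targetTagCountOnNode >= minCard && targetTagCountOnNode <= maxCard;
  }

  public static void main(String[] args) {
    // Max cardinality 2: a node already holding 2 hb-rs still accepts the
    // new container, whether it is an hb-m or a 3rd hb-rs.
    System.out.println(canPlace(2, 0, 2)); // true
    // Max cardinality 0 expresses anti-affinity under these semantics.
    System.out.println(canPlace(0, 0, 0)); // true only on an empty node
  }
}
{code}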

 

> Constraint satisfaction checker support for composite OR and AND constraints
> 
>
> Key: YARN-7822
> URL: https://issues.apache.org/jira/browse/YARN-7822
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-7822-YARN-6592.001.patch, 
> YARN-7822-YARN-6592.002.patch, YARN-7822-YARN-6592.003.patch, 
> YARN-7822-YARN-6592.004.patch
>
>
> JIRA to track changes to {{PlacementConstraintsUtil#canSatisfyConstraints}} 
> handle OR and AND Composite constaints



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7841) Cleanup AllocationFileLoaderService's reloadAllocations method

2018-01-29 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343887#comment-16343887
 ] 

genericqa commented on YARN-7841:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 38s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 24s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 25 new + 30 unchanged - 17 fixed = 55 total (was 47) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 54s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 38s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 56s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7841 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908173/YARN-7841-002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 1dff3c7626bf 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7fd287b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/19512/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/19512/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hado

[jira] [Commented] (YARN-7598) Document how to use classpath isolation for aux-services in YARN

2018-01-29 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343874#comment-16343874
 ] 

Junping Du commented on YARN-7598:
--

Thanks [~vinodkv] for the comments. I agree that's a reasonable suggestion. 
[~xgong], would you incorporate Vinod's comments above? Thanks!

> Document how to use classpath isolation for aux-services in YARN
> 
>
> Key: YARN-7598
> URL: https://issues.apache.org/jira/browse/YARN-7598
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7598.trunk.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7835) [Atsv2] Race condition in NM while publishing events if second attempt launched on same node

2018-01-29 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343871#comment-16343871
 ] 

Haibo Chen commented on YARN-7835:
--

[~rohithsharma] Trying to understand the issue here. It seems like a collector 
is populated upon the APP creation event, whereas it is removed upon the APP 
attempt finish event. Ideally, a collector should be bound to either an APP or 
an APP_ATTEMPT.

Should we make this consistent, that is, tie a collector to either APP 
lifecycle events or APP_ATTEMPT lifecycle events?
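
To make the app-level option concrete, here is a minimal, self-contained 
sketch. The types ({{AppLevelCollectorRegistry}}, {{TimelineClientStub}}) are 
hypothetical stand-ins, not the actual NMTimelinePublisher API: the timeline 
client is kept per application and reference-counted per active attempt on the 
node, so attempt 1 finishing cannot remove the client that attempt 2 is still 
using.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class AppLevelCollectorRegistry {

  // Hypothetical stand-in for the per-app timeline client; not the real API.
  static final class TimelineClientStub {
    final String appId;
    TimelineClientStub(String appId) { this.appId = appId; }
    void stop() { System.out.println("stopped timeline client for " + appId); }
  }

  private static final class Entry {
    final TimelineClientStub client;
    final AtomicInteger activeAttempts = new AtomicInteger();
    Entry(TimelineClientStub client) { this.client = client; }
  }

  private final Map<String, Entry> clients = new ConcurrentHashMap<>();

  // Called whenever an attempt of the app launches its master container here.
  public void attemptStarted(String appId) {
    clients.computeIfAbsent(appId, id -> new Entry(new TimelineClientStub(id)))
        .activeAttempts.incrementAndGet();
  }

  // Called on attempt finish; the client is removed only when the last
  // active attempt on this node is gone, so a relaunched attempt keeps it.
  public void attemptFinished(String appId) {
    clients.computeIfPresent(appId, (id, e) -> {
      if (e.activeAttempts.decrementAndGet() > 0) {
        return e;          // another attempt still alive: keep the client
      }
      e.client.stop();     // last attempt done: safe to remove
      return null;         // returning null removes the map entry
    });
  }

  public static void main(String[] args) {
    AppLevelCollectorRegistry r = new AppLevelCollectorRegistry();
    r.attemptStarted("app_1");   // attempt 1 starts
    r.attemptStarted("app_1");   // attempt 2 relaunched on the same node
    r.attemptFinished("app_1");  // attempt 1 completes: client survives
    r.attemptFinished("app_1");  // attempt 2 completes: client removed
  }
}
{code}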

> [Atsv2] Race condition in NM while publishing events if second attempt 
> launched on same node
> 
>
> Key: YARN-7835
> URL: https://issues.apache.org/jira/browse/YARN-7835
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-7835.001.patch
>
>
> A race condition is observed: if the master container is killed for some 
> reason and relaunched on the same node, NMTimelinePublisher doesn't add a 
> timelineClient. But once the completed container for the 1st attempt 
> arrives, NMTimelinePublisher removes the timelineClient. 
>  As a result, all subsequent event publishing from different clients fails 
> with the exception "Application is not found".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7839) Check node capacity before placing in the Algorithm

2018-01-29 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343850#comment-16343850
 ] 

Konstantinos Karanasos commented on YARN-7839:
--

I think the change makes sense, [~asuresh].

However, what about the case where a node seems full but a container is about 
to finish (and will have finished by the time the allocate is done)? Should we 
completely reject such nodes, or simply give higher priority to nodes that 
already have available resources?
{quote}getPreferredNodeIterator(CandidateNodeSet candidateNodeSet)
{quote}
[~cheersyang], despite the naming, as far as I know, the candidateNodeSet is 
currently always only a single node...

> Check node capacity before placing in the Algorithm
> ---
>
> Key: YARN-7839
> URL: https://issues.apache.org/jira/browse/YARN-7839
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Priority: Major
>
> Currently, the Algorithm assigns a node to a request purely based on whether 
> the constraints are met. Only later, in the scheduling phase, are the Queue 
> capacity and Node capacity checked. If the request cannot be placed because 
> of unavailable Queue/Node capacity, the request is retried by the Algorithm.
> For clusters that are running at high utilization, we can reduce the retries 
> if we perform the Node capacity check in the Algorithm as well. The Queue 
> capacity check and the other user-limit checks can still be handled by the 
> scheduler (since queues and other limits are tied to the scheduler, and are 
> not scheduler-agnostic).
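
As a rough illustration of the proposed pre-check, here is a sketch with 
made-up {{Node}}/{{Request}} types (not the actual placement Algorithm 
classes): nodes without enough free capacity are filtered out before 
placement, so the request no longer bounces back from the scheduler just to 
be retried.

{code:java}
import java.util.List;
import java.util.Optional;

public class CapacityAwarePlacement {

  // Made-up node model; not the actual SchedulerNode class.
  static final class Node {
    final String id;
    final long totalMb, allocatedMb;
    final int totalVcores, allocatedVcores;
    Node(String id, long totalMb, long allocatedMb,
         int totalVcores, int allocatedVcores) {
      this.id = id;
      this.totalMb = totalMb;
      this.allocatedMb = allocatedMb;
      this.totalVcores = totalVcores;
      this.allocatedVcores = allocatedVcores;
    }
    // The proposed pre-check: does the node have room right now?
    boolean canFit(long mb, int vcores) {
      return totalMb - allocatedMb >= mb
          && totalVcores - allocatedVcores >= vcores;
    }
  }

  // Made-up request model; constraints are assumed to be checked already.
  static final class Request {
    final long mb;
    final int vcores;
    Request(long mb, int vcores) { this.mb = mb; this.vcores = vcores; }
  }

  // Returns the first constraint-satisfying node that also passes the
  // capacity pre-check, so the scheduler sees fewer doomed placements.
  static Optional<Node> place(Request req, List<Node> constraintMatches) {
    return constraintMatches.stream()
        .filter(n -> n.canFit(req.mb, req.vcores))
        .findFirst();
  }

  public static void main(String[] args) {
    Node full = new Node("n1", 8192, 8192, 8, 8);   // fully allocated
    Node free = new Node("n2", 8192, 2048, 8, 2);   // has headroom
    Request r = new Request(4096, 2);
    System.out.println(
        place(r, List.of(full, free)).map(n -> n.id).orElse("retry"));
    // Prints "n2": the full node is rejected up front instead of later.
  }
}
{code}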



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7846) Ss

2018-01-29 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf resolved YARN-7846.
---
Resolution: Invalid

> Ss
> --
>
> Key: YARN-7846
> URL: https://issues.apache.org/jira/browse/YARN-7846
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Said Ali Alshwqbi
>Priority: Minor
>
> {quote}:(
> 
> h3.  
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7846) Ss

2018-01-29 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf updated YARN-7846:
--
Flags:   (was: Patch,Important)

> Ss
> --
>
> Key: YARN-7846
> URL: https://issues.apache.org/jira/browse/YARN-7846
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Said Ali Alshwqbi
>Priority: Minor
>
> {quote}:(
> 
> h3.  
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7846) Ss

2018-01-29 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343825#comment-16343825
 ] 

Shane Kumpf commented on YARN-7846:
---

Another one that I'm going to close as invalid. If you intended to open this 
issue, feel free to reopen it with an updated subject and description.

> Ss
> --
>
> Key: YARN-7846
> URL: https://issues.apache.org/jira/browse/YARN-7846
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Said Ali Alshwqbi
>Priority: Minor
>
> {quote}:(
> 
> h3.  
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7846) Ss

2018-01-29 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf updated YARN-7846:
--
Hadoop Flags:   (was: Incompatible change,Reviewed)

> Ss
> --
>
> Key: YARN-7846
> URL: https://issues.apache.org/jira/browse/YARN-7846
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Said Ali Alshwqbi
>Priority: Minor
>
> {quote}:(
> 
> h3.  
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7846) Ss

2018-01-29 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf updated YARN-7846:
--
Affects Version/s: (was: 2.7.5)
 Target Version/s:   (was: 3.0.0-alpha2)
Fix Version/s: (was: YARN-321)
  Component/s: (was: applications)

> Ss
> --
>
> Key: YARN-7846
> URL: https://issues.apache.org/jira/browse/YARN-7846
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Said Ali Alshwqbi
>Priority: Minor
>
> {quote}:(
> 
> h3.  
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7845) H

2018-01-29 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf resolved YARN-7845.
---
Resolution: Invalid

> H
> -
>
> Key: YARN-7845
> URL: https://issues.apache.org/jira/browse/YARN-7845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Said Ali Alshwqbi
>Priority: Minor
>
> h1.  
> ||Heading 1||Heading 2||
> |Col A1|Col A2|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7845) H

2018-01-29 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf updated YARN-7845:
--
Affects Version/s: (was: 2.7.5)
 Target Version/s:   (was: 3.0.0-alpha2)
Fix Version/s: (was: YARN-321)
  Component/s: (was: applications)

> H
> -
>
> Key: YARN-7845
> URL: https://issues.apache.org/jira/browse/YARN-7845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Said Ali Alshwqbi
>Priority: Minor
>
> h1.  
> ||Heading 1||Heading 2||
> |Col A1|Col A2|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7845) H

2018-01-29 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343823#comment-16343823
 ] 

Shane Kumpf commented on YARN-7845:
---

I'm going to close this as invalid. If you intended to open this issue, feel 
free to reopen with an updated subject and description.

> H
> -
>
> Key: YARN-7845
> URL: https://issues.apache.org/jira/browse/YARN-7845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Said Ali Alshwqbi
>Priority: Minor
>
> h1.  
> ||Heading 1||Heading 2||
> |Col A1|Col A2|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7829) Rebalance UI2 cluster overview page

2018-01-29 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343819#comment-16343819
 ] 

Eric Yang commented on YARN-7829:
-

[~GergelyNovak] Thank you for the patch.  The dashboard looks better with it.  
The original proposal wasn't strictly bound to two rows; the second-row charts 
can auto-flow based on browser width, and new charts can be added to fill the 
empty space.  Bootstrap can key the auto-flow to the screen resolution, so we 
could probably flow two charts onto a third row on narrower browsers.  I agree 
with you that if someone maximizes the browser on a 16:9 TV, two rows 
stretched across the horizontal real estate don't help; in that case we 
probably want either more charts to fill the empty space, or the second-row 
charts increased to the same height as the first row. 

 

Small nits: Memory and VCore are also system resources, so they are more 
closely related to Cluster Resource on the top row.  Would it be better to 
move Finished Apps and Running Apps to the third row?  Memory-mb and Vcores 
probably need correct capitalization as well.  Thoughts?

> Rebalance UI2 cluster overview page
> ---
>
> Key: YARN-7829
> URL: https://issues.apache.org/jira/browse/YARN-7829
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Assignee: Gergely Novák
>Priority: Major
> Attachments: YARN-7829.001.patch, ui2-cluster-overview.png
>
>
> The cluster overview page looks like a upside down triangle.  It would be 
> nice to rebalance the charts to ensure horizontal real estate are utilized 
> properly.  The screenshot attachment includes some suggestion for rebalance.  
> Node Manager status and cluster resource are closely related, it would be 
> nice to promote the chart to first row.  Application Status, and Resource 
> Availability are closely related.  It would be nice to promote Resource usage 
> to side by side with Application Status to fill up the horizontal real 
> estates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks

2018-01-29 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343803#comment-16343803
 ] 

Yufei Gu commented on YARN-7655:


Yes, let's move forward with this patch. We have recently been trying to 
introduce node labeling, or a similar idea, into the scheduler. Node labeling 
(or whatever its alternative turns out to be) behaves much like data locality, 
so if we want AM preemption to get around locality somehow, we will need to do 
the same thing for node labeling. I will try to review your patch soon. 

> avoid AM preemption caused by RRs for specific nodes or racks
> -
>
> Key: YARN-7655
> URL: https://issues.apache.org/jira/browse/YARN-7655
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: YARN-7655-001.patch
>
>
> We frequently see AM preemptions when 
> {{starvedApp.getStarvedResourceRequests()}} in 
> {{FSPreemptionThread#identifyContainersToPreempt}} includes one or more RRs 
> that request containers on a specific node. Since this causes us to only 
> consider one node to preempt containers on, the really good work that was 
> done in YARN-5830 doesn't save us from AM preemption. Even though there might 
> be multiple nodes on which we could preempt enough non-AM containers to 
> satisfy the app's starvation, we often wind up preempting one or more AM 
> containers on the single node that we're considering.
> A proposed solution is that if we're going to preempt one or more AM 
> containers for an RR that specifies a node or rack, then we should instead 
> expand the search space to consider all nodes. That way we take advantage of 
> YARN-5830, and only preempt AMs if there's no alternative. I've attached a 
> patch with an initial implementation of this. We've been running it on a few 
> clusters, and have seen AM preemptions drop from double-digit occurrences on 
> many days to zero.
> Of course, the tradeoff is some loss of locality, since the starved app is 
> less likely to be allocated resources at the most specific locality level 
> that it asked for. My opinion is that this tradeoff is worth it, but 
> interested to hear what others think as well.
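
A minimal sketch of the proposed fallback, with made-up types rather than the 
actual {{FSPreemptionThread}} code: if satisfying a node-specific RR would 
preempt an AM container, the search space widens to all nodes and prefers any 
node where only non-AM containers need to be preempted, falling back to the 
requested node only when no alternative exists.

{code:java}
import java.util.List;

public class AmFriendlyPreemption {

  // Made-up container model; only the AM flag matters here.
  static final class Container {
    final boolean isAm;
    Container(boolean isAm) { this.isAm = isAm; }
  }

  // Made-up node model: the containers that would have to be preempted
  // on this node to satisfy the starved request.
  static final class Node {
    final String id;
    final List<Container> toPreempt;
    Node(String id, List<Container> toPreempt) {
      this.id = id;
      this.toPreempt = toPreempt;
    }
    boolean wouldPreemptAm() {
      return toPreempt.stream().anyMatch(c -> c.isAm);
    }
  }

  // If honoring the node-specific request would kill an AM, widen the
  // search to all nodes and prefer one where no AM has to die.
  static Node chooseNode(Node requested, List<Node> allNodes) {
    if (!requested.wouldPreemptAm()) {
      return requested;               // locality preserved, no AM harmed
    }
    return allNodes.stream()
        .filter(n -> !n.wouldPreemptAm())
        .findFirst()
        .orElse(requested);           // no alternative: accept the AM kill
  }

  public static void main(String[] args) {
    Node n1 = new Node("n1", List.of(new Container(true)));   // only an AM
    Node n2 = new Node("n2", List.of(new Container(false)));  // non-AM work
    System.out.println(chooseNode(n1, List.of(n1, n2)).id);   // prints "n2"
  }
}
{code}

This trades some locality for AM survival, which matches the tradeoff 
described above.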



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6675) Add NM support to launch opportunistic containers based on overallocation

2018-01-29 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6675:
-
Attachment: YARN-6675-YARN-1011.prelim0.patch

> Add NM support to launch opportunistic containers based on overallocation
> -
>
> Key: YARN-6675
> URL: https://issues.apache.org/jira/browse/YARN-6675
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-6675-YARN-1011.prelim0.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7846) Ss

2018-01-29 Thread Said Ali Alshwqbi (JIRA)
Said Ali Alshwqbi created YARN-7846:
---

 Summary: Ss
 Key: YARN-7846
 URL: https://issues.apache.org/jira/browse/YARN-7846
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: applications
Affects Versions: 2.7.5
Reporter: Said Ali Alshwqbi
 Fix For: YARN-321


{quote}:(

h3.  
{quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


