[jira] [Commented] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-08 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261774#comment-17261774
 ] 

Wangda Tan commented on YARN-10504:
---

1) AbstractCSQueue: lots of format changes. Can we revert all the format 
changes (whitespace changes)? They will cause issues during backports and 
reviews.

2) LeafQueue: 
There are two identical TODOs: 
  //TODO recalculate max applications because they can depend on capacity
  //TODO recalculate max applications because they can depend on capacity

I think this is taken care of by {{updateAbsoluteCapacitiesAndRelatedFields}}; 
please double-check, and if that is correct, we should remove the TODOs.
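
For reference, a rough sketch of why this recalculation is needed (illustrative 
only, not code from the patch; it assumes the usual Capacity Scheduler 
derivation where the limit scales with absolute capacity):

{code:java}
// Illustrative sketch: when no per-queue limit is configured, the
// max-applications limit is derived from the queue's absolute capacity,
// so it goes stale whenever capacities (or weights) change.
int computeMaxApplications(CapacitySchedulerConfiguration conf,
    String queuePath, float absoluteCapacity) {
  int configured = conf.getMaximumApplicationsPerQueue(queuePath);
  if (configured > 0) {
    return configured; // an explicit per-queue setting wins
  }
  return (int) (conf.getMaximumSystemApplications() * absoluteCapacity);
}
{code}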

3) TestAbsoluteResourceWithAutoQueue: 

I added a TODO: 
  // TODO: Wangda: I think this test case is not correct, Sunil could help
  // look into details.

4) TestCapacitySchedulerAutoCreatedQueueBase.java

I also added a TODO
// TODO: Wangda, I think this is a wrong test, it doesn't consider rounding
...

[~bteke], [~gandras], [~zhuqi]: if any of you plan to make changes to the 
patch, please add a comment to the JIRA so the others will know; otherwise the 
patch will become hard to merge.

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, 
> YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, 
> YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
>
> To allow flexible queue creation in Capacity Scheduler, a weight mode should 
> be introduced. The existing {{capacity}} property should be used with a 
> different syntax, e.g.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>  
> The new functionality should: 
>  * accept and validate the new weight values
>  * enforce a singular mode on the whole queue tree
>  * (re)calculate the relative (percentage-based) capacities based on the 
> weights during launch and every time the queue structure changes
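
To make the proposed syntax concrete, a purely hypothetical example (queue 
names invented; it assumes the {{w}} suffix variant is the one chosen):

{code}
# Hypothetical weight-mode configuration: sibling capacities resolve to
# weight / sum(sibling weights), i.e. 3/(3+1) = 75% and 1/(3+1) = 25%.
yarn.scheduler.capacity.root.users.alice.capacity = 3.0w
yarn.scheduler.capacity.root.users.bob.capacity = 1.0w
{code}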






[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used

2021-01-08 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261718#comment-17261718
 ] 

zhuqi commented on YARN-10532:
--

[~wangda] [~gandras]

The patch is based on the original capacity auto creation. Once the template 
policy logic that supports multiple auto-created levels is finished in 
YARN-10506, I will add the parentQueue and leafQueue auto deletion logic.

Thanks.

> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is 
> not being used
> 
>
> Key: YARN-10532
> URL: https://issues.apache.org/jira/browse/YARN-10532
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: zhuqi
>Priority: Major
> Attachments: YARN-10532.001.patch
>
>
> It's better if we can delete auto-created queues when they have not been in 
> use for a period of time (like 5 mins). This will be helpful when we have a 
> large number of auto-created queues (e.g. from 500 users) but only a small 
> subset of them is actively used.
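
As an illustration only (the property names below are invented, not actual 
configuration keys):

{code}
# Hypothetical sketch of such a policy: remove an auto-created queue
# after it has been idle for 300 seconds.
yarn.scheduler.capacity.root.users.auto-queue-deletion.enabled = true
yarn.scheduler.capacity.root.users.auto-queue-deletion.expired-time = 300
{code}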






[jira] [Resolved] (YARN-10553) Refactor TestDistributedShell

2021-01-08 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki resolved YARN-10553.
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Refactor TestDistributedShell
> -
>
> Key: YARN-10553
> URL: https://issues.apache.org/jira/browse/YARN-10553
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available, refactoring, test
> Fix For: 3.4.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> TestDistributedShell has grown very large over time; it has 29 tests.
>  This runs the risk of exceeding the 30-minute limit for a single unit test 
> class.
>  * The implementation has lots of code redundancy.
>  * This Jira splits TestDistributedShell into three separate unit tests, one 
> per TimeLine version: v1.0, v1.5, and v2.0.
>  * It also fixes the broken test {{testDSShellWithEnforceExecutionType}}.
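
A sketch of the resulting structure (class names assumed, bodies elided):

{code:java}
// Hypothetical outline of the split: one shared base class plus one
// subclass per Timeline version, so no single class risks the time limit.
public abstract class DistributedShellBaseTest {
  // common MiniYARNCluster setup/teardown and shared helpers live here
}

public class TestDSTimelineV10 extends DistributedShellBaseTest {
  // only Timeline v1.0 cases; v1.5 and v2.0 get their own subclasses
}
{code}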






[jira] [Commented] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-08 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261549#comment-17261549
 ] 

Benjamin Teke commented on YARN-10504:
--

Sharing my latest findings on the TestAbsoluteResourceWithAutoQueue failure: 
{{AutoCreatedLeafQueue#reinitializeFromTemplate}} was refactored, so getting 
and merging the QueueCapacities now happens *before* calling 
{{ParentQueue#updateClusterResource}} (and 
{{LeafQueue#updateClusterResource}}). In {{LeafQueue#updateClusterResource}}, 
{{AbstractCSQueue#updateEffectiveResources}} is called, where the 
effectiveMinResource of the created queue is overridden with the template's 
effectiveMinResources, which is exactly the value the test is getting in the 
asserts.


{code:java}
  void updateEffectiveResources(Resource clusterResource) {
    Set<String> configuredNodelabels =
        csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
    for (String label : configuredNodelabels) {
      Resource resourceByLabel = labelManager.getResourceByLabel(label,
          clusterResource);
      Resource minResource = queueResourceQuotas.getConfiguredMinResource(
          label);
      // Update effective resource (min/max) to each child queue.
      if (getCapacityConfigType().equals(
          CapacityConfigType.ABSOLUTE_RESOURCE)) {
        queueResourceQuotas.setEffectiveMinResource(label,
            getMinResourceNormalized(queuePath,
                ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
                minResource));
...{code}
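
A condensed view of the ordering described above (a sketch with assumed method 
names, not the actual patch code):

{code:java}
// Hypothetical summary of the refactored flow:
void reinitializeFromTemplate(AutoCreatedLeafQueueConfig template) {
  // (1) the template's QueueCapacities are fetched and merged first ...
  mergeCapacitiesFromTemplate(template);
  // (2) ... then updateClusterResource runs, which reaches
  // AbstractCSQueue#updateEffectiveResources and overrides this queue's
  // effectiveMinResource with the template's value - exactly the value
  // the asserts in TestAbsoluteResourceWithAutoQueue observe.
  parent.updateClusterResource(clusterResource, resourceLimits);
}
{code}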

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, 
> YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, 
> YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
>
> To allow flexible queue creation in Capacity Scheduler, a weight mode should 
> be introduced. The existing {{capacity}} property should be used with a 
> different syntax, e.g.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>  
> The new functionality should: 
>  * accept and validate the new weight values
>  * enforce a singular mode on the whole queue tree
>  * (re)calculate the relative (percentage-based) capacities based on the 
> weights during launch and every time the queue structure changes






[jira] [Commented] (YARN-10562) Alternate fix for DirectoryCollection.checkDirs() race

2021-01-08 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261543#comment-17261543
 ] 

Peter Bacsko commented on YARN-10562:
-

Thanks for the patch [~Jim_Brennan]. cc [~leftnoteasy] [~snemeth] this might be 
interesting / important for us, you guys might want to look at it. I'll try to 
find some free cycles to go through the changes.

> Alternate fix for DirectoryCollection.checkDirs() race
> --
>
> Key: YARN-10562
> URL: https://issues.apache.org/jira/browse/YARN-10562
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-10562.001.patch, YARN-10562.002.patch, 
> YARN-10562.003.patch
>
>
> YARN-9833 fixed a race condition in DirectoryCollection: {{getGoodDirs()}} and 
> related methods were returning an unmodifiable view of the lists. These 
> accesses were protected by read/write locks, but because the lists are 
> CopyOnWriteArrayLists, subsequent changes to the lists, even when done under 
> the write lock, were exposed when a caller started iterating the list view. 
> CopyOnWriteArrayLists cache the current underlying list in the iterator, so 
> it is safe to iterate them even while they are being changed - at least the 
> view will be consistent.
> The problem was that checkDirs() was clearing the lists and rebuilding them 
> from scratch every time, so if a caller called getGoodDirs() just before 
> checkDirs cleared it, and then started iterating right after the clear, they 
> could get an empty list.
> The fix in YARN-9833 was to change {{getGoodDirs()}} and related methods to 
> return a copy of the list, which definitely fixes the race condition. The 
> disadvantage is that now we create a new copy of these lists every time we 
> launch a container. The advantage of using CopyOnWriteArrayList was that the 
> lists should rarely ever change, so we could avoid all the copying. 
> Unfortunately, the way checkDirs() was written, it guaranteed that it would 
> modify those lists multiple times every time.
> So this Jira proposes an alternate solution for YARN-9833, which mainly just 
> rewrites checkDirs() to minimize the changes to the underlying lists. There 
> are still some small windows where a disk will have been added to one list 
> but not yet removed from another if you hit it just right, but I think these 
> should be pretty rare and relatively harmless, and in the vast majority of 
> cases I suspect only one disk will be moving from one list to another at any 
> time. The question is whether this type of inconsistency (which was always 
> there before YARN-9833) is worth accepting in order to reduce all the copying.
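
A minimal, self-contained sketch of the pre-YARN-9833 behavior described above 
(simplified; the real DirectoryCollection holds more state):

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CheckDirsRaceSketch {
  public static void main(String[] args) {
    // Stand-in for DirectoryCollection's internal good-dirs list.
    List<String> goodDirs = new CopyOnWriteArrayList<>();
    goodDirs.add("/disk1");
    goodDirs.add("/disk2");

    // What getGoodDirs() returned before YARN-9833: a live, unmodifiable view.
    List<String> view = Collections.unmodifiableList(goodDirs);

    // checkDirs() used to clear and rebuild the list from scratch; a caller
    // holding the view that starts iterating after the clear sees nothing.
    goodDirs.clear();
    System.out.println(view.size()); // 0 - the empty-list race

    goodDirs.add("/disk1");
    goodDirs.add("/disk2");
    System.out.println(view.size()); // 2 again once rebuilt
  }
}
{code}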






[jira] [Updated] (YARN-10541) capture the performance metrics of ZKRMStateStore

2021-01-08 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated YARN-10541:
---
Fix Version/s: 3.3.1

> capture the performance metrics of ZKRMStateStore
> -
>
> Key: YARN-10541
> URL: https://issues.apache.org/jira/browse/YARN-10541
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Max  Xie
>Assignee: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In a production environment, most YARN ResourceManagers run in HA mode and 
> rely on ZooKeeper. It is therefore very useful to capture the performance 
> metrics of ZKRMStateStore.
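
A sketch of the usual Hadoop metrics2 pattern for this kind of measurement 
(names invented for illustration, not necessarily what the patch does):

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.lib.MutableRate;
import org.apache.hadoop.util.Time;

// Hypothetical holder for ZKRMStateStore operation latencies.
class ZKRMStateStoreOpDurations {
  @Metric("Duration of storing application state (ms)")
  MutableRate storeApplicationStateCall;

  // Callers take start = Time.monotonicNow() before the ZooKeeper write
  // and report the elapsed time here afterwards.
  void addStoreApplicationStateDuration(long startMs) {
    storeApplicationStateCall.add(Time.monotonicNow() - startMs);
  }
}
{code}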






[jira] [Assigned] (YARN-10541) capture the performance metrics of ZKRMStateStore

2021-01-08 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned YARN-10541:
--

Assignee: Max  Xie

> capture the performance metrics of ZKRMStateStore
> -
>
> Key: YARN-10541
> URL: https://issues.apache.org/jira/browse/YARN-10541
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Max  Xie
>Assignee: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In a production environment, most YARN ResourceManagers run in HA mode and 
> rely on ZooKeeper. It is therefore very useful to capture the performance 
> metrics of ZKRMStateStore.






[jira] [Resolved] (YARN-10541) capture the performance metrics of ZKRMStateStore

2021-01-08 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved YARN-10541.

Resolution: Fixed

> capture the performance metrics of ZKRMStateStore
> -
>
> Key: YARN-10541
> URL: https://issues.apache.org/jira/browse/YARN-10541
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In a production environment, most YARN ResourceManagers run in HA mode and 
> rely on ZooKeeper. It is therefore very useful to capture the performance 
> metrics of ZKRMStateStore.






[jira] [Updated] (YARN-10541) capture the performance metrics of ZKRMStateStore

2021-01-08 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated YARN-10541:
---
Fix Version/s: 3.4.0

> capture the performance metrics of ZKRMStateStore
> -
>
> Key: YARN-10541
> URL: https://issues.apache.org/jira/browse/YARN-10541
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In a production environment, most YARN ResourceManagers run in HA mode and 
> rely on ZooKeeper. It is therefore very useful to capture the performance 
> metrics of ZKRMStateStore.






[jira] [Commented] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261510#comment-17261510
 ] 

Hadoop QA commented on YARN-10504:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
19s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 8 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
19s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 57s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
56s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 42s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/447/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 24 new + 758 unchanged - 13 fixed = 782 total (was 771) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 17s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {

[jira] [Commented] (YARN-10562) Alternate fix for DirectoryCollection.checkDirs() race

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261490#comment-17261490
 ] 

Hadoop QA commented on YARN-10562:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
56s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 29s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
19s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 35s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green}{color} | {c

[jira] [Commented] (YARN-10525) Add weight mode conversion to fs2cs

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261472#comment-17261472
 ] 

Hadoop QA commented on YARN-10525:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m  
4s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 7 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
52s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 13s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
48s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
51s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/446/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 6 unchanged - 2 fixed = 7 total (was 8) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  9s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} 

[jira] [Updated] (YARN-10562) Alternate fix for DirectoryCollection.checkDirs() race

2021-01-08 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-10562:
---
Attachment: YARN-10562.003.patch

> Alternate fix for DirectoryCollection.checkDirs() race
> --
>
> Key: YARN-10562
> URL: https://issues.apache.org/jira/browse/YARN-10562
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-10562.001.patch, YARN-10562.002.patch, 
> YARN-10562.003.patch
>
>
> YARN-9833 fixed a race condition in DirectoryCollection: {{getGoodDirs()}} and 
> related methods were returning an unmodifiable view of the lists. These 
> accesses were protected by read/write locks, but because the lists are 
> CopyOnWriteArrayLists, subsequent changes to the lists, even when done under 
> the write lock, were exposed when a caller started iterating the list view. 
> CopyOnWriteArrayLists cache the current underlying list in the iterator, so 
> it is safe to iterate them even while they are being changed - at least the 
> view will be consistent.
> The problem was that checkDirs() was clearing the lists and rebuilding them 
> from scratch every time, so if a caller called getGoodDirs() just before 
> checkDirs cleared it, and then started iterating right after the clear, they 
> could get an empty list.
> The fix in YARN-9833 was to change {{getGoodDirs()}} and related methods to 
> return a copy of the list, which definitely fixes the race condition. The 
> disadvantage is that now we create a new copy of these lists every time we 
> launch a container. The advantage of using CopyOnWriteArrayList was that the 
> lists should rarely ever change, so we could avoid all the copying. 
> Unfortunately, the way checkDirs() was written, it guaranteed that it would 
> modify those lists multiple times every time.
> So this Jira proposes an alternate solution for YARN-9833, which mainly just 
> rewrites checkDirs() to minimize the changes to the underlying lists. There 
> are still some small windows where a disk will have been added to one list 
> but not yet removed from another if you hit it just right, but I think these 
> should be pretty rare and relatively harmless, and in the vast majority of 
> cases I suspect only one disk will be moving from one list to another at any 
> time. The question is whether this type of inconsistency (which was always 
> there before YARN-9833) is worth accepting in order to reduce all the copying.






[jira] [Commented] (YARN-10562) Alternate fix for DirectoryCollection.checkDirs() race

2021-01-08 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261403#comment-17261403
 ] 

Jim Brennan commented on YARN-10562:


Patch 003 fixes the new checkstyle issues.


> Alternate fix for DirectoryCollection.checkDirs() race
> --
>
> Key: YARN-10562
> URL: https://issues.apache.org/jira/browse/YARN-10562
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-10562.001.patch, YARN-10562.002.patch, 
> YARN-10562.003.patch
>
>
> YARN-9833 fixed a race condition in DirectoryCollection: {{getGoodDirs()}} and 
> related methods were returning an unmodifiable view of the lists. These 
> accesses were protected by read/write locks, but because the lists are 
> CopyOnWriteArrayLists, subsequent changes to the lists, even when done under 
> the write lock, were exposed when a caller started iterating the list view. 
> CopyOnWriteArrayLists cache the current underlying list in the iterator, so 
> it is safe to iterate them even while they are being changed - at least the 
> view will be consistent.
> The problem was that checkDirs() was clearing the lists and rebuilding them 
> from scratch every time, so if a caller called getGoodDirs() just before 
> checkDirs cleared it, and then started iterating right after the clear, they 
> could get an empty list.
> The fix in YARN-9833 was to change {{getGoodDirs()}} and related methods to 
> return a copy of the list, which definitely fixes the race condition. The 
> disadvantage is that now we create a new copy of these lists every time we 
> launch a container. The advantage of using CopyOnWriteArrayList was that the 
> lists should rarely ever change, so we could avoid all the copying. 
> Unfortunately, the way checkDirs() was written, it guaranteed that it would 
> modify those lists multiple times every time.
> So this Jira proposes an alternate solution for YARN-9833, which mainly just 
> rewrites checkDirs() to minimize the changes to the underlying lists. There 
> are still some small windows where a disk will have been added to one list 
> but not yet removed from another if you hit it just right, but I think these 
> should be pretty rare and relatively harmless, and in the vast majority of 
> cases I suspect only one disk will be moving from one list to another at any 
> time. The question is whether this type of inconsistency (which was always 
> there before YARN-9833) is worth accepting in order to reduce all the copying.






[jira] [Commented] (YARN-10563) Fix dependency exclusion problem in poms

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261396#comment-17261396
 ] 

Hadoop QA commented on YARN-10563:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to 
include any new or modified tests. Please justify why no new tests are needed 
for this patch. Also please list what manual steps were performed to verify 
this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
56s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
 6s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
28s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m  
5s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
80m 23s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
24s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 5s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 
29s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m 
29s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 
16s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 18m 
16s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
45s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green}{color} | {color:green} The patch has no ill-formed 
XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 59s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green}{color} | {color:green} the patch passed w

[jira] [Commented] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-08 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261381#comment-17261381
 ] 

Benjamin Teke commented on YARN-10504:
--

Hey [~zhuqi],

Thanks for your updates. I took the liberty of refactoring and including your 
changes in my latest patch. Also, based on our offline discussion with 
[~gandras], I removed some parts of his 002 patch, as that will be handled 
differently. As you mentioned, the AutoQueueCreation should now work, because 
the leafQueueTemplates should now have the correct absolute capacities. I will 
now turn to the AbsoluteResourceWithAutoQueue issue and handle 
maxApplications in the auto-created cases.

If you upload a new patch, please use the 005 patch as the base, because it 
contains the latest from our side and also includes your changes from the 
003/004 patches.

Thanks!

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, 
> YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, 
> YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
>
> To allow flexible queue creation in Capacity Scheduler, a weight mode should 
> be introduced. The existing {{capacity}} property should be used with a 
> different syntax, e.g.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>  
> The new functionality should: 
>  * accept and validate the new weight values
>  * enforce a singular mode on the whole queue tree
>  * (re)calculate the relative (percentage-based) capacities based on the 
> weights during launch and every time the queue structure changes






[jira] [Updated] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-08 Thread Benjamin Teke (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Teke updated YARN-10504:
-
Attachment: YARN-10504.005.patch

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, 
> YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, 
> YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
>
> To allow flexible queue creation in Capacity Scheduler, a weight mode should 
> be introduced. The existing {{capacity}} property should be used with a 
> different syntax, e.g.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>  
> The new functionality should: 
>  * accept and validate the new weight values
>  * enforce a singular mode on the whole queue tree
>  * (re)calculate the relative (percentage-based) capacities based on the 
> weights during launch and every time the queue structure changes






[jira] [Updated] (YARN-10525) Add weight mode conversion to fs2cs

2021-01-08 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10525:

Description: 
Weight mode will be added to Capacity Scheduler.

Currently we convert FS weights to percentages; however, it would be more 
useful to keep those values and use them in CS as well.
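
For illustration (hypothetical queues and numbers):

{code}
# FS input:                    weight(root.a) = 3, weight(root.b) = 1
# current fs2cs output:        root.a.capacity = 75.000, root.b.capacity = 25.000
# proposed weight-mode output: root.a.capacity = 3.0w,   root.b.capacity = 1.0w
# Keeping the weights preserves the 3:1 ratio exactly, even if more
# siblings are added later, instead of baking in rounded percentages.
{code}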

> Add weight mode conversion to fs2cs
> ---
>
> Key: YARN-10525
> URL: https://issues.apache.org/jira/browse/YARN-10525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: zhuqi
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10525-001.patch, YARN-10525-002.patch, 
> YARN-10525-003.patch
>
>
> Weight mode will be added to Capacity Scheduler.
> Currently we convert FS weights to percentages; however, it would be more 
> useful to keep those values and use them in CS as well.






[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261335#comment-17261335
 ] 

Hadoop QA commented on YARN-10559:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 3 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
20s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 52s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
54s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 55s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/444/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt{color}
 | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 1 new + 41 
unchanged - 1 fixed = 42 total (was 42) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 44s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/444/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 86 new + 507 unchanged - 0 fixed = 593 total (was 507) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:

[jira] [Updated] (YARN-10525) Add weight mode conversion to fs2cs

2021-01-08 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10525:

Attachment: YARN-10525-003.patch

> Add weight mode conversion to fs2cs
> ---
>
> Key: YARN-10525
> URL: https://issues.apache.org/jira/browse/YARN-10525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: zhuqi
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10525-001.patch, YARN-10525-002.patch, 
> YARN-10525-003.patch
>
>







[jira] [Commented] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261321#comment-17261321
 ] 

Hadoop QA commented on YARN-10504:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
27s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 
10 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
57s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 47s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
53s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 47s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/442/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 36 new + 878 unchanged - 13 fixed = 914 total (was 891) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 30s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. 

[jira] [Updated] (YARN-10264) Add container launch related env / classpath debug info to container logs when a container fails

2021-01-08 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10264:
--
Parent: YARN-10323
Issue Type: Sub-task  (was: Task)

> Add container launch related env / classpath debug info to container logs 
> when a container fails
> 
>
> Key: YARN-10264
> URL: https://issues.apache.org/jira/browse/YARN-10264
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
>
> Sometimes when a container fails to launch, it can be pretty hard to figure 
> out why it has failed.
> Similar to YARN-4309, we can add a switch to control whether the environment 
> variables and the Java classpath should be printed.
> As a bonus, 
> [jdeps|https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html] 
> could also be utilized to print some verbose info about the classpath. 
> When log aggregation occurs, all this information will automatically get 
> collected and make debugging such container launch failures much easier.
> Below is an example output when the user faces a classpath configuration 
> issue while launching an application: 
> {code:java}
> End of LogType:prelaunch.err
> **
> 2020-04-19 05:49:12,145 DEBUG:app_info:Diagnostics of the failed app
> 2020-04-19 05:49:12,145 DEBUG:app_info:Application 
> application_1587300264561_0001 failed 2 times due to AM Container for 
> appattempt_1587300264561_0001_02 exited with  exitCode: 1
> Failing this attempt.Diagnostics: [2020-04-19 12:45:01.955]Exception from 
> container-launch.
> Container id: container_e60_1587300264561_0001_02_01
> Exit code: 1
> Exception message: Launch container failed
> Shell output: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /dataroot/ycloud/yarn/nm/nmPrivate/application_1587300264561_0001/container_e60_1587300264561_0001_02_01/container_e60_1587300264561_0001_02_01.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> [2020-04-19 12:45:01.984]Container exited with a non-zero exit code 1. Error 
> file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> Error: Could not find or load main class 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> Please check whether your etc/hadoop/mapred-site.xml contains the below 
> configuration:
> <property>
>   <name>yarn.app.mapreduce.am.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> <property>
>   <name>mapreduce.map.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> <property>
>   <name>mapreduce.reduce.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> [2020-04-19 12:45:01.985]Container exited with a non-zero exit code 1. Error 
> file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> Error: Could not find or load main class 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> Please check whether your etc/hadoop/mapred-site.xml contains the below 
> configuration:
> <property>
>   <name>yarn.app.mapreduce.am.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> <property>
>   <name>mapreduce.map.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> <property>
>   <name>mapreduce.reduce.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> For more detailed output, check the application tracking page: 
> http://quasar-plnefj-2.quasar-plnefj.root.hwx.site:8088/cluster/app/application_1587300264561_0001
>  Then click on links to logs of each attempt.
> ...
> 2020-04-19 05:49:12,148 INFO:util:* End test_app_API 
> (yarn.suite.YarnAPITests) *
> {code}
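
The switch the description proposes can mirror the YARN-4309 one, which is a 
boolean NodeManager property (yarn.nodemanager.log-container-debug-info.enabled). 
A minimal sketch of the idea follows; the key name and helper method here are 
hypothetical illustrations, not taken from the actual patch:

{code:java}
import java.util.Map;

/** Sketch only: dump launch env / classpath when a debug switch is on. */
final class ContainerLaunchDebugInfo {
  // Hypothetical key, invented for illustration; the real key would be
  // defined in YarnConfiguration by the patch.
  static final String PRINT_LAUNCH_DEBUG_INFO =
      "yarn.nodemanager.container-launch-debug-info.enabled";

  /** Appends the environment and CLASSPATH to the container launch log. */
  static void appendDebugInfo(boolean enabled, Map<String, String> env,
      StringBuilder launchLog) {
    if (!enabled) {
      return; // switch off: container logs stay exactly as they are today
    }
    launchLog.append("Container launch environment:\n");
    for (Map.Entry<String, String> e : env.entrySet()) {
      launchLog.append(e.getKey()).append('=')
          .append(e.getValue()).append('\n');
    }
    // The classpath is usually the interesting part for failures like the
    // MRAppMaster one above, so call it out separately.
    launchLog.append("CLASSPATH=")
        .append(env.getOrDefault("CLASSPATH", "<not set>")).append('\n');
  }
}
{code}

Since this output lands in the container's log directory, log aggregation 
collects it with the other logs, which is exactly the benefit the description 
points out.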



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10525) Add weight mode conversion to fs2cs

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261308#comment-17261308
 ] 

Hadoop QA commented on YARN-10525:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
20s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 7 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
38s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m  6s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m  
1s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; 
considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/441/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 6 unchanged - 2 fixed = 7 total (was 8) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 48s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |

[jira] [Commented] (YARN-10427) Duplicate Job IDs in SLS output

2021-01-08 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261287#comment-17261287
 ] 

Szilard Nemeth commented on YARN-10427:
---

Hi [~werd.up],
I'm glad that you found it useful.


> Duplicate Job IDs in SLS output
> ---
>
> Key: YARN-10427
> URL: https://issues.apache.org/jira/browse/YARN-10427
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator
>Affects Versions: 3.0.0, 3.3.0, 3.2.1, 3.4.0
> Environment: I ran the attached inputs on my MacBook Pro, using 
> Hadoop compiled from the latest trunk (as of commit 139a43e98e). I also 
> tested against 3.2.1 and 3.3.0 release branches.
>  
>Reporter: Drew Merrill
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-10427-sls-scriptsandlogs.tar.gz, 
> YARN-10427.001.patch, YARN-10427.002.patch, YARN-10427.003.patch, 
> fair-scheduler.xml, inputsls.json, jobruntime.csv, jobruntime.csv, 
> mapred-site.xml, sls-runner.xml, yarn-site.xml
>
>
> Hello, I'm hoping someone can help me resolve or understand some issues I've 
> been having with the YARN Scheduler Load Simulator (SLS). I've been 
> experimenting with SLS for several months now at work as we're trying to 
> build a simulation model to characterize our enterprise Hadoop infrastructure 
> for purposes of future capacity planning. In the process of attempting to 
> verify and validate the SLS output, I've encountered a number of issues 
> including runtime exceptions and bad output. The focus of this issue is the 
> bad output. In all my simulation runs, the jobruntime.csv output seems to 
> have one or more of the following problems: no output, duplicate job ids, 
> and/or missing job ids.
>  
> Because of where I work, I'm unable to provide the exact inputs I typically 
> use, but I'm able to reproduce the problem of the duplicate Job IDs using 
> some simplified inputs and configuration files, which I've attached, along 
> with the output I obtained.
>  
> The command I used to run the simulation:
> {{./runsls.sh --tracetype=SLS --tracelocation=./inputsls.json 
> --output-dir=sls-run-1 --print-simulation 
> --track-jobs=job_1,job_2,job_3,job_4,job_5,job_6,job_7,job_8,job_9,job_10}}
>  
> Can anyone help me understand what would cause the duplicate Job IDs in the 
> output? Is this a bug in Hadoop or a problem with my inputs? Thanks in 
> advance.
>  
> PS: This is my first issue I've ever opened so please be kind if I've missed 
> something or am not understanding something obvious about the way Hadoop 
> works. I'll gladly follow up with more info as requested.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10507) Add the capability to fs2cs to write the converted placement rules inside capacity-scheduler.xml

2021-01-08 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10507:
--
Fix Version/s: 3.4.0

> Add the capability to fs2cs to write the converted placement rules inside 
> capacity-scheduler.xml
> 
>
> Key: YARN-10507
> URL: https://issues.apache.org/jira/browse/YARN-10507
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: fs2cs
> Fix For: 3.4.0
>
> Attachments: YARN-10507-001.patch, YARN-10507-002.patch, 
> YARN-10507-003.patch, YARN-10507-004.patch, YARN-10507-005.patch, 
> YARN-10507-006.patch, YARN-10507-007.patch, YARN-10507-008.patch
>
>
> Currently, fs2cs tool generates a separate {{mapping-rules.json}} file when 
> it converts the placement rules.
> However, we also support having the JSON inlined inside 
> {{capacity-scheduler.xml}}.  Add a command line switch so that we can choose 
> the desired output.
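
For reference, the inlined variant would carry the same JSON in a property of 
capacity-scheduler.xml, roughly like the sketch below; the property name 
yarn.scheduler.capacity.mapping-rule-json and the rule body are assumptions 
for illustration, not the tool's verified output:

{code:xml}
<property>
  <!-- Assumed key for the inline JSON; the rule body is a placeholder. -->
  <name>yarn.scheduler.capacity.mapping-rule-json</name>
  <value>
    {"rules": [
      {"type": "user", "matches": "*", "policy": "user",
       "parentQueue": "root.users", "fallbackResult": "skip"}
    ]}
  </value>
</property>
{code}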



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10507) Add the capability to fs2cs to write the converted placement rules inside capacity-scheduler.xml

2021-01-08 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261285#comment-17261285
 ] 

Szilard Nemeth commented on YARN-10507:
---

Thanks [~pbacsko] for working on this.
Checked the latest patch.
Patch LGTM, committed to trunk.
Resolving this jira.

> Add the capability to fs2cs to write the converted placement rules inside 
> capacity-scheduler.xml
> 
>
> Key: YARN-10507
> URL: https://issues.apache.org/jira/browse/YARN-10507
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: fs2cs
> Attachments: YARN-10507-001.patch, YARN-10507-002.patch, 
> YARN-10507-003.patch, YARN-10507-004.patch, YARN-10507-005.patch, 
> YARN-10507-006.patch, YARN-10507-007.patch, YARN-10507-008.patch
>
>
> Currently, fs2cs tool generates a separate {{mapping-rules.json}} file when 
> it converts the placement rules.
> However, we also support having the JSON inlined inside 
> {{capacity-scheduler.xml}}.  Add a command line switch so that we can choose 
> the desired output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.
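
A minimal sketch of that guard (not the attached patch; the exact 
AppAttemptInfo constructor after the change is an assumption here):

{code:java}
// Sketch, inside RMWebServices#getAppAttempts, after resolving the app:
RMApp app = rm.getRMContext().getRMApps()
    .get(ApplicationId.fromString(appId));
// The same filtering that getApp/getApps already apply.
boolean hasAccess = hasAccess(app, hsr);
AppAttemptsInfo appAttemptsInfo = new AppAttemptsInfo();
for (RMAppAttempt attempt : app.getAppAttempts().values()) {
  // Pass the flag down so AppAttemptInfo can omit the logsLink for users
  // who do not own the application; the parameter position is illustrative.
  appAttemptsInfo.add(new AppAttemptInfo(rm, attempt, hasAccess,
      app.getUser(), hsr.getScheme() + "://"));
}
return appAttemptsInfo;
{code}

getApp and getApps already build their AppInfo objects from the same flag, so 
this keeps the three endpoints consistent.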

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that there is an access check in its caller, so for now I pass 
"true" to AppAttemptInfo in the patch.

 

  was:
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that there is an access check in its caller, so for now I pass 
"true" to AppAttemptInfo.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts, see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> Thus we can get some sensitive information, such as the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user2
> user1@h

[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that there is an access check in its caller, so for now I pass 
"true" to AppAttemptInfo.

 

  was:
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that there is an access check, so for now I pass "true" to 
AppAttemptInfo.
 

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts, see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> Thus we can get some sensitive information, such as the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user2
> user1@hadoop11$ curl --negotiate -

[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that there is an access check, so for now I pass "true" to 
AppAttemptInfo.
 

 

  was:
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we are also missing an access check, but I did not trigger it, 
so for now I pass "true" to AppAttemptInfo.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts, see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> Thus we can get some sensitive information, such as the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user2
> user1@hadoop11$ curl

[jira] [Comment Edited] (YARN-10295) CapacityScheduler NPE can cause apps to get stuck without resources

2021-01-08 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261274#comment-17261274
 ] 

Szilard Nemeth edited comment on YARN-10295 at 1/8/21, 12:25 PM:
-

Hi [~wangda],
There's a dedicated jira to deal with test failures on branch-3.2: YARN-10249.

There are other commits that went in under similar circumstances: 
https://issues.apache.org/jira/browse/YARN-10194?focusedCommentId=17093663&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17093663

Nowadays test results are looking better for branch-3.2, see this for example: 
https://issues.apache.org/jira/browse/YARN-10528?focusedCommentId=17252715&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#
But this doesn't mean all is good for branch-3.2; as I said, tests are quite 
unreliable / flaky there.


was (Author: snemeth):
Hi @wangda,
There's a dedicated jira to deal with test failures on branch-3.2: YARN-10249.

There are other commits that went in under similar circumstances: 
https://issues.apache.org/jira/browse/YARN-10194?focusedCommentId=17093663&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17093663

Nowadays test results are looking better for branch-3.2, see this for example: 
https://issues.apache.org/jira/browse/YARN-10528?focusedCommentId=17252715&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#
But this doesn't mean all is good for branch-3.2; as I said, tests are quite 
unreliable / flaky there.

> CapacityScheduler NPE can cause apps to get stuck without resources
> ---
>
> Key: YARN-10295
> URL: https://issues.apache.org/jira/browse/YARN-10295
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.2.2, 3.1.5
>
> Attachments: YARN-10295.001.branch-3.1.patch, 
> YARN-10295.001.branch-3.2.patch, YARN-10295.002.branch-3.1.patch, 
> YARN-10295.002.branch-3.2.patch
>
>
> When the CapacityScheduler Asynchronous scheduling is enabled and log level 
> is set to DEBUG there is an edge-case where a NullPointerException can cause 
> the scheduler thread to exit and the apps to get stuck without allocated 
> resources. Consider the following log:
> {code:java}
> 2020-05-27 10:13:49,106 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:apply(681)) - Reserved 
> container=container_e10_1590502305306_0660_01_000115, on node=host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used= with 
> resource=
> 2020-05-27 10:13:49,134 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:internalUnreserve(743)) - Application 
> application_1590502305306_0660 unreserved  on node host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used=, currently 
> has 0 at priority 11; currentReservation  on node-label=
> 2020-05-27 10:13:49,134 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(3042)) - Allocation proposal accepted
> 2020-05-27 10:13:49,163 ERROR yarn.YarnUncaughtExceptionHandler 
> (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
> Thread[Thread-4953,5,main] threw an Exception.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1580)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1767)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1505)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:546)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:593)
> {code}
> A container gets allocated on a host, but the host doesn't have enough 
> memory, so after a short while it gets unreserved. However because the 
> scheduler thread is running asynchronously it might have entered into the 
> following if block located in 
> [CapacityScheduler.java#L1602|https://github.com/apache/hadoop/blob/7136ebbb7aa197717619c23a841d28f1c46ad40b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L1602],
>  because at the time _node.getReservedContainer()_ wasn't null. Calling it a 
> se

[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we are also missing an access check, but I did not trigger it, 
so for now I pass "true" to AppAttemptInfo.

 

  was:
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we are also missing an access check, but I did not trigger it, 
so for now I pass "true" to AppAttemptInfo.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts, see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> Thus we can get some sensitive information, such as the logs link.  
> {code:java}
> application_1609318368700_0002 belong to u

[jira] [Comment Edited] (YARN-10295) CapacityScheduler NPE can cause apps to get stuck without resources

2021-01-08 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261274#comment-17261274
 ] 

Szilard Nemeth edited comment on YARN-10295 at 1/8/21, 12:24 PM:
-

Hi @wangda,
There's a dedicated jira to deal with test failures on branch-3.2: YARN-10249.

There are other commits that went in under similar circumstances: 
https://issues.apache.org/jira/browse/YARN-10194?focusedCommentId=17093663&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17093663

Nowadays test results are looking better for branch-3.2, see this for example: 
https://issues.apache.org/jira/browse/YARN-10528?focusedCommentId=17252715&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#
But this doesn't mean all is good for branch-3.2; as I said, tests are quite 
unreliable / flaky there.


was (Author: snemeth):
Hi @wangda,
There's a dedicated jira to deal with test failures on branch-3.2: YARN-10249.

There are other commits that went in under similar circumstances: 
https://issues.apache.org/jira/browse/YARN-10194?focusedCommentId=17093663&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17093663

Nowadays test results are looking better for branch-3.2, see this for example: 
https://issues.apache.org/jira/browse/YARN-10528?focusedCommentId=17252715&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#



> CapacityScheduler NPE can cause apps to get stuck without resources
> ---
>
> Key: YARN-10295
> URL: https://issues.apache.org/jira/browse/YARN-10295
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.2.2, 3.1.5
>
> Attachments: YARN-10295.001.branch-3.1.patch, 
> YARN-10295.001.branch-3.2.patch, YARN-10295.002.branch-3.1.patch, 
> YARN-10295.002.branch-3.2.patch
>
>
> When the CapacityScheduler Asynchronous scheduling is enabled and log level 
> is set to DEBUG there is an edge-case where a NullPointerException can cause 
> the scheduler thread to exit and the apps to get stuck without allocated 
> resources. Consider the following log:
> {code:java}
> 2020-05-27 10:13:49,106 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:apply(681)) - Reserved 
> container=container_e10_1590502305306_0660_01_000115, on node=host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used= with 
> resource=
> 2020-05-27 10:13:49,134 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:internalUnreserve(743)) - Application 
> application_1590502305306_0660 unreserved  on node host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used=, currently 
> has 0 at priority 11; currentReservation  on node-label=
> 2020-05-27 10:13:49,134 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(3042)) - Allocation proposal accepted
> 2020-05-27 10:13:49,163 ERROR yarn.YarnUncaughtExceptionHandler 
> (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
> Thread[Thread-4953,5,main] threw an Exception.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1580)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1767)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1505)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:546)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:593)
> {code}
> A container gets allocated on a host, but the host doesn't have enough 
> memory, so after a short while it gets unreserved. However because the 
> scheduler thread is running asynchronously it might have entered into the 
> following if block located in 
> [CapacityScheduler.java#L1602|https://github.com/apache/hadoop/blob/7136ebbb7aa197717619c23a841d28f1c46ad40b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L1602],
>  because at the time _node.getReservedContainer()_ wasn't null. Calling it a 
> second time for getting the ApplicationAttemptId would be an NPE, as the 
> container got unreserved in th

[jira] [Commented] (YARN-10295) CapacityScheduler NPE can cause apps to get stuck without resources

2021-01-08 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261274#comment-17261274
 ] 

Szilard Nemeth commented on YARN-10295:
---

Hi @wangda,
There's a dedicated jira to deal with test failures on branch-3.2: YARN-10249.

There are other commits that went in under similar circumstances: 
https://issues.apache.org/jira/browse/YARN-10194?focusedCommentId=17093663&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17093663

Nowadays test results are looking better for branch-3.2, see this for example: 
https://issues.apache.org/jira/browse/YARN-10528?focusedCommentId=17252715&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#



> CapacityScheduler NPE can cause apps to get stuck without resources
> ---
>
> Key: YARN-10295
> URL: https://issues.apache.org/jira/browse/YARN-10295
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.2.2, 3.1.5
>
> Attachments: YARN-10295.001.branch-3.1.patch, 
> YARN-10295.001.branch-3.2.patch, YARN-10295.002.branch-3.1.patch, 
> YARN-10295.002.branch-3.2.patch
>
>
> When the CapacityScheduler Asynchronous scheduling is enabled and log level 
> is set to DEBUG there is an edge-case where a NullPointerException can cause 
> the scheduler thread to exit and the apps to get stuck without allocated 
> resources. Consider the following log:
> {code:java}
> 2020-05-27 10:13:49,106 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:apply(681)) - Reserved 
> container=container_e10_1590502305306_0660_01_000115, on node=host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used= with 
> resource=
> 2020-05-27 10:13:49,134 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:internalUnreserve(743)) - Application 
> application_1590502305306_0660 unreserved  on node host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used=, currently 
> has 0 at priority 11; currentReservation  on node-label=
> 2020-05-27 10:13:49,134 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(3042)) - Allocation proposal accepted
> 2020-05-27 10:13:49,163 ERROR yarn.YarnUncaughtExceptionHandler 
> (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
> Thread[Thread-4953,5,main] threw an Exception.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1580)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1767)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1505)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:546)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:593)
> {code}
> A container gets allocated on a host, but the host doesn't have enough 
> memory, so after a short while it gets unreserved. However because the 
> scheduler thread is running asynchronously it might have entered into the 
> following if block located in 
> [CapacityScheduler.java#L1602|https://github.com/apache/hadoop/blob/7136ebbb7aa197717619c23a841d28f1c46ad40b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L1602],
>  because at the time _node.getReservedContainer()_ wasn't null. Calling it a 
> second time for getting the ApplicationAttemptId would be an NPE, as the 
> container got unreserved in the meantime.
> {code:java}
> // Do not schedule if there are any reservations to fulfill on the node
> if (node.getReservedContainer() != null) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("Skipping scheduling since node " + node.getNodeID()
>         + " is reserved by application " + node.getReservedContainer()
>         .getContainerId().getApplicationAttemptId());
>   }
>   return null;
> }
> {code}
> A fix would be to store the container object before the if block. 
> Only branch-3.1/3.2 is affected, because the newer branches have YARN-9664 
> which indirectly fixed this.
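
The fix is exactly what the last paragraph says: read the reserved container 
into a local before the null check, so the debug branch cannot dereference a 
container that was unreserved in the meantime. A sketch of the corrected 
block (not the committed branch-3.1/3.2 patch verbatim):

{code:java}
// Do not schedule if there are any reservations to fulfill on the node.
// Reading the container once makes the null check and the debug log use
// the same snapshot, even if an unreserve happens concurrently.
RMContainer reservedContainer = node.getReservedContainer();
if (reservedContainer != null) {
  if (LOG.isDebugEnabled()) {
    LOG.debug("Skipping scheduling since node " + node.getNodeID()
        + " is reserved by application "
        + reservedContainer.getContainerId().getApplicationAttemptId());
  }
  return null;
}
{code}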



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we are also missing an access check, but I did not trigger it, 
so for now I pass "true" to AppAttemptInfo.

 

  was:
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we are also missing an access check, but I did not trigger it, 
so for now I pass "true" to AppAttemptInfo.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts, see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> Thus we can get some sensitive information, such as the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user

[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like 
"hasAccess(app, hsr)"; they hide the logs link if the appid does not belong 
to the querying user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we are also missing an access check, but I did not trigger it, 
so for now I pass "true" to AppAttemptInfo.

 

  was:
It seems that we are missing a security check before getAppAttempts, see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

Thus we can get some sensitive information, such as the logs link.
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we also miss an access check, but I did not trigger it, so for 
now I pass "true" to AppAttemptInfo.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts; see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> thus we can get some sensitive information, like the logs link.  
> {code:java}
> application_1609318368700_0002 belong to us

[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing a security check before getAppAttempts; see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

thus we can get some sensitive information, like the logs link.  
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have an access check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we also miss an access check, but I did not trigger it, so for 
now I pass "true" to AppAttemptInfo.

 

  was:
It seems that we are missing a security check before getAppAttempts; see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

thus we can get some sensitive information, like the logs link.  
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we also miss an access check, but I did not trigger it, so for 
now I pass "true" to AppAttemptInfo.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts; see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> thus we can get some sensitive information, like the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user

[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing a security check before getAppAttempts; see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

thus we can get some sensitive information, like the logs link.  
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

Besides, at 
[https://github.com/apache/hadoop/blob/580a6a75a3e3d3b7918edeffd6e93fc211166884/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java#L145]

it seems that we also miss an access check, but I did not trigger it, so for 
now I pass "true" to AppAttemptInfo.

 

  was:
It seems that we are missing an access check before getAppAttempts; see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

thus anyone can get sensitive information that they should not, like the logs 
link.  
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts; see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> thus we can get some sensitive information, like the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user2
> user1@hadoop11$ curl --negotiate -u  : 
> http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
> {
>   "appAttempts": {
> "appAttempt": [
>   {
> "id": 1,
> "startTime": 1609318411566,
> "containerId": "container_1609318368700_0002_01_01",
> "nodeHttpAddress": "hadoop12:8044

[jira] [Commented] (YARN-10506) Update queue creation logic to use weight mode and allow the flexible static/dynamic creation

2021-01-08 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261259#comment-17261259
 ] 

Andras Gyori commented on YARN-10506:
-

Added support for placementContexts. As of now, there are two ways to enable 
auto queue creation under a specific ParentQueue:
1. Set the isCreateParentQueue and isCreateLeafQueue flags of the 
ApplicationPlacementContext (which will be done by the mapping rules)
2. If they are not set, the logic falls back to checking whether the option is 
enabled for the ParentQueue:
{noformat}
yarn.scheduler.capacity.root..auto-queue-creation.enabled{noformat}
Also, by its nature, a dynamically created ParentQueue allows auto queue 
creation by default, so it is possible to create a new queue under a previously 
auto created ParentQueue without setting anything.
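
A minimal sketch of that fallback order (helper names such as isDynamicQueue 
and isAutoQueueCreationEnabled are hypothetical, not the actual patch):
{code:java}
// Sketch only: prefer the flags the mapping rules set on the placement
// context; otherwise fall back to the per-parent configuration option.
private boolean shouldAutoCreate(ApplicationPlacementContext ctx,
    ParentQueue parent, CapacitySchedulerConfiguration conf) {
  if (ctx.isCreateParentQueue() || ctx.isCreateLeafQueue()) {
    return true;  // explicitly requested by the mapping rules
  }
  // Dynamically created parents allow auto queue creation by default.
  if (parent.isDynamicQueue()) {
    return true;
  }
  // yarn.scheduler.capacity.<parent-path>.auto-queue-creation.enabled
  return conf.isAutoQueueCreationEnabled(parent.getQueuePath());
}
{code}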

> Update queue creation logic to use weight mode and allow the flexible 
> static/dynamic creation
> -
>
> Key: YARN-10506
> URL: https://issues.apache.org/jira/browse/YARN-10506
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-10506.001.patch, YARN-10506.002.patch, 
> YARN-10506.003.patch, YARN-10506.004.patch, YARN-10506.005.patch
>
>
> The queue creation logic should be updated to use weight mode and support the 
> flexible creation. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10506) Update queue creation logic to use weight mode and allow the flexible static/dynamic creation

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261258#comment-17261258
 ] 

Hadoop QA commented on YARN-10506:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 11s{color} 
| {color:red}{color} | {color:red} YARN-10506 does not apply to trunk. Rebase 
required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for 
help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-10506 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13018367/YARN-10506.005.patch |
| Console output | 
https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/443/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |


This message was automatically generated.



> Update queue creation logic to use weight mode and allow the flexible 
> static/dynamic creation
> -
>
> Key: YARN-10506
> URL: https://issues.apache.org/jira/browse/YARN-10506
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-10506.001.patch, YARN-10506.002.patch, 
> YARN-10506.003.patch, YARN-10506.004.patch, YARN-10506.005.patch
>
>
> The queue creation logic should be updated to use weight mode and support the 
> flexible creation. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10563) Fix dependency exclusion problem in poms

2021-01-08 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10563:

Attachment: YARN-10563-001.patch

> Fix dependency exclusion problem in poms
> 
>
> Key: YARN-10563
> URL: https://issues.apache.org/jira/browse/YARN-10563
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: YARN-10563-001.patch
>
>
> Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
> {{jsonschema2pojo-core}}. It turned out that when a project imports either 
> hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
> can actually show up as active dependencies. This probably has to do with 
> Maven not handling exclusions properly when they are listed inside a 
> {{dependencyManagement}} section. I confirmed it with some local builds (e.g. 
> Oozie uses hadoop-minicluster and the dependency tree showed a bunch of items 
> that were not supposed to be there).
> So all excludable artifacts should be moved to 
> {{hadoop-yarn-server-resourcemanager/pom.xml}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10563) Fix dependency exclusion problem in poms

2021-01-08 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10563:

Description: 
Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
{{jsonschema2pojo-core}}. It turned out that when a project imports either 
hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
can actually show up as active dependencies. This probably has to do with 
Maven not handling exclusions properly when they are listed inside a 
{{dependencyManagement}} section. I confirmed it with some local builds (e.g. 
Oozie uses hadoop-minicluster and the dependency tree showed a bunch of items 
that were not supposed to be displayed as either a compile or runtime 
dependency).

So all excludable artifacts should be moved to 
{{hadoop-yarn-server-resourcemanager/pom.xml}}.
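
A minimal sketch of the intended move (the exclusion below is a placeholder; 
the real list comes from hadoop-project/pom.xml):
{code:xml}
<!-- In hadoop-yarn-server-resourcemanager/pom.xml: declare the exclusions on
     the concrete dependency instead of in dependencyManagement, so importers
     of this module do not pick the excluded artifacts back up. -->
<dependency>
  <groupId>org.jsonschema2pojo</groupId>
  <artifactId>jsonschema2pojo-core</artifactId>
  <exclusions>
    <exclusion>
      <groupId>com.example</groupId>           <!-- placeholder -->
      <artifactId>unwanted-transitive</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}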

  was:
Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
{{jsonschema2pojo-core}}. It turned out that when a project imports either 
hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
can actually show up as active dependencies. This probably has to do with 
Maven not handling exclusions properly.

So all excludable artifacts should be moved to 
{{hadoop-yarn-server-resourcemanager/pom.xml}}.


> Fix dependency exclusion problem in poms
> 
>
> Key: YARN-10563
> URL: https://issues.apache.org/jira/browse/YARN-10563
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
>
> Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
> {{jsonschema2pojo-core}}. It turned out that when a project imports either 
> hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
> can actually show up as active dependencies. This probably has to do with 
> Maven not handling exclusions properly when they are listed inside a 
> {{dependencyManagement}} section. I confirmed it with some local builds (e.g. 
> Oozie uses hadoop-minicluster and the dependency tree showed a bunch of items 
> that were not supposed to be displayed as either a compile or runtime 
> dependency).
> So all excludable artifacts should be moved to 
> {{hadoop-yarn-server-resourcemanager/pom.xml}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10563) Fix dependency exclusion problem in poms

2021-01-08 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10563:

Description: 
Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
{{jsonschema2pojo-core}}. It turned out that when a project imports either 
hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
can actually show up as active dependencies. This probably has to do with 
Maven not handling exclusions properly when they are listed inside a 
{{dependencyManagement}} section. I confirmed it with some local builds (e.g. 
Oozie uses hadoop-minicluster and the dependency tree showed a bunch of items 
that were not supposed to be there).

So all excludable artifacts should be moved to 
{{hadoop-yarn-server-resourcemanager/pom.xml}}.

  was:
Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
{{jsonschema2pojo-core}}. It turned out that when a project imports either 
hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
can actually show up as active dependencies. This probably has to do with 
Maven not handling exclusions properly when they are listed inside a 
{{dependencyManagement}} section. I confirmed it with some local builds (e.g. 
Oozie uses hadoop-minicluster and the dependency tree showed a bunch of items 
that were not supposed to be displayed as either a compile or runtime 
dependency).

So all excludable artifacts should be moved to 
{{hadoop-yarn-server-resourcemanager/pom.xml}}.


> Fix dependency exclusion problem in poms
> 
>
> Key: YARN-10563
> URL: https://issues.apache.org/jira/browse/YARN-10563
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
>
> Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
> {{jsonschema2pojo-core}}. It turned out that when a project imports either 
> hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
> can actually show up as active dependencies. This probably has to do with 
> Maven not handling exclusions properly when they are listed inside a 
> {{dependencyManagement}} section. I confirmed it with some local builds (e.g. 
> Oozie uses hadoop-minicluster and the dependency tree showed a bunch of items 
> that were not supposed to be there).
> So all excludable artifacts should be moved to 
> {{hadoop-yarn-server-resourcemanager/pom.xml}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10528) maxAMShare should only be accepted for leaf queues, not parent queues

2021-01-08 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10528:
--
Fix Version/s: 3.2.3
   3.1.5
   3.3.1
   3.4.0

> maxAMShare should only be accepted for leaf queues, not parent queues
> -
>
> Key: YARN-10528
> URL: https://issues.apache.org/jira/browse/YARN-10528
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Major
> Fix For: 3.4.0, 3.3.1, 3.1.5, 3.2.3
>
> Attachments: YARN-10528-branch-3.1.001.patch, 
> YARN-10528-branch-3.2.001.patch, YARN-10528-branch-3.3.001.patch, 
> YARN-10528.001.patch, YARN-10528.002.patch, maxAMShare for root.users (parent 
> queue) has no effect as child queue does not inherit it.png
>
>
> Based on [Hadoop 
> documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html],
>  it is clear that the {{maxAMShare}} property can only be used for *leaf queues*. 
> This is similar to the {{reservation}} setting.
> However, existing code only ensures that the reservation setting is not 
> accepted for "parent" queues (see 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/allocation/AllocationFileQueueParser.java#L226
>  and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/allocation/AllocationFileQueueParser.java#L233)
>  but it is missing the checks for {{maxAMShare}}. Due to this, it is 
> currently possible to have an allocation similar to the one below:
> {code}
> <?xml version="1.0"?>
> <allocations>
>   <queue name="root">
>     <weight>1.0</weight>
>     <schedulingPolicy>drf</schedulingPolicy>
>     <aclSubmitApps>*</aclSubmitApps>
>     <aclAdministerApps>*</aclAdministerApps>
>     <queue name="default">
>       <weight>1.0</weight>
>       <schedulingPolicy>drf</schedulingPolicy>
>     </queue>
>     <queue name="users" type="parent">
>       <weight>1.0</weight>
>       <schedulingPolicy>drf</schedulingPolicy>
>       <maxAMShare>1.0</maxAMShare>
>     </queue>
>   </queue>
>   <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
>   <!-- Element names reconstructed; the archive stripped the XML tags and the
>        placement rules that followed. Child queue names are representative. -->
> </allocations>
> {code}
> where {{maxAMShare}} is 1.0f, meaning it is possible to allocate 100% of the 
> queue's resources to Application Masters. Notice above that root.users is a 
> parent queue; however, it still gladly accepts {{maxAMShare}}. This is 
> contrary to the documentation and in fact very misleading, because the 
> child queues like root.users. actually do not inherit this setting at 
> all and still go on to use the default of 0.5 instead of 1.0; see the 
> attached screenshot as an example.
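
A minimal sketch of the missing validation (parser internals are simplified; 
AllocationFileQueueParser may traverse the elements differently):
{code:java}
// Sketch only: reject maxAMShare on parent queues, the same way the parser
// already rejects the reservation element for parent queues.
if (isParent && "maxAMShare".equals(element.getTagName())) {
  throw new AllocationConfigurationException(
      "The configuration setting maxAMShare is only valid for leaf queues,"
      + " not for parent queue " + queueName);
}
{code}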



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10528) maxAMShare should only be accepted for leaf queues, not parent queues

2021-01-08 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261255#comment-17261255
 ] 

Szilard Nemeth commented on YARN-10528:
---

Thanks [~sahuja] for the new patches.

LGTM, committed to trunk, branch-3.3, branch-3.2 and branch-3.1.

Resolving this jira.

> maxAMShare should only be accepted for leaf queues, not parent queues
> -
>
> Key: YARN-10528
> URL: https://issues.apache.org/jira/browse/YARN-10528
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Major
> Attachments: YARN-10528-branch-3.1.001.patch, 
> YARN-10528-branch-3.2.001.patch, YARN-10528-branch-3.3.001.patch, 
> YARN-10528.001.patch, YARN-10528.002.patch, maxAMShare for root.users (parent 
> queue) has no effect as child queue does not inherit it.png
>
>
> Based on [Hadoop 
> documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html],
>  it is clear that the {{maxAMShare}} property can only be used for *leaf queues*. 
> This is similar to the {{reservation}} setting.
> However, existing code only ensures that the reservation setting is not 
> accepted for "parent" queues (see 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/allocation/AllocationFileQueueParser.java#L226
>  and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/allocation/AllocationFileQueueParser.java#L233)
>  but it is missing the checks for {{maxAMShare}}. Due to this, it is 
> currently possible to have an allocation similar to the one below:
> {code}
> <?xml version="1.0"?>
> <allocations>
>   <queue name="root">
>     <weight>1.0</weight>
>     <schedulingPolicy>drf</schedulingPolicy>
>     <aclSubmitApps>*</aclSubmitApps>
>     <aclAdministerApps>*</aclAdministerApps>
>     <queue name="default">
>       <weight>1.0</weight>
>       <schedulingPolicy>drf</schedulingPolicy>
>     </queue>
>     <queue name="users" type="parent">
>       <weight>1.0</weight>
>       <schedulingPolicy>drf</schedulingPolicy>
>       <maxAMShare>1.0</maxAMShare>
>     </queue>
>   </queue>
>   <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
>   <!-- Element names reconstructed; the archive stripped the XML tags and the
>        placement rules that followed. Child queue names are representative. -->
> </allocations>
> {code}
> where {{maxAMShare}} is 1.0f, meaning it is possible to allocate 100% of the 
> queue's resources to Application Masters. Notice above that root.users is a 
> parent queue; however, it still gladly accepts {{maxAMShare}}. This is 
> contrary to the documentation and in fact very misleading, because the 
> child queues like root.users. actually do not inherit this setting at 
> all and still go on to use the default of 0.5 instead of 1.0; see the 
> attached screenshot as an example.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10563) Fix dependency exclusion problem in poms

2021-01-08 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10563:

Description: 
Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
{{jsonschema2pojo-core}}. It turned out that when a project imports either 
hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
can actually show up as active dependencies. This probably has to do with 
Maven not handling exclusions properly.

So all excludable artifacts should be moved to 
{{hadoop-yarn-server-resourcemanager/pom.xml}}.

> Fix dependency exclusion problem in poms
> 
>
> Key: YARN-10563
> URL: https://issues.apache.org/jira/browse/YARN-10563
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
>
> Currently, {{hadoop-project/pom.xml}} contains a bunch of exclusions for 
> {{jsonschema2pojo-core}}. It turned out that when a project imports either 
> hadoop-minicluster or hadoop-yarn-server-resourcemanager, excluded artifacts 
> can actually show up as active dependencies. This probably has to do with 
> Maven not handling exclusions properly.
> So all excludable artifacts should be moved to 
> {{hadoop-yarn-server-resourcemanager/pom.xml}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing an access check before getAppAttempts; see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

thus anyone can get sensitive information that they should not, like the logs 
link.  
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

  was:
It seems that we are missing an access check before getAppAttempts; see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

thus we can get some sensitive information, like the logs link.  
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing an access check before getAppAttempts; see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> thus anyone can get sensitive information that they should not, like the 
> logs link.  
> {code:java}
> application_1609318368700_0002 belong to user2
> user1@hadoop11$ curl --negotiate -u  : 
> http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
> {
>   "appAttempts": {
> "appAttempt": [
>   {
> "id": 1,
> "startTime": 1609318411566,
> "containerId": "container_1609318368700_0002_01_01",
> "nodeHttpAddress": "hadoop12:8044",
> "nodeId": "hadoop12:36831",
> "logsLink": 
> "http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
> "blacklistedNodes": "",
> "nodesBlacklistedBySystem": ""
>   }
> ]
>   }
> }
> {code}
> Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
> hsr)", th

[jira] [Updated] (YARN-10506) Update queue creation logic to use weight mode and allow the flexible static/dynamic creation

2021-01-08 Thread Andras Gyori (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Gyori updated YARN-10506:

Attachment: YARN-10506.005.patch

> Update queue creation logic to use weight mode and allow the flexible 
> static/dynamic creation
> -
>
> Key: YARN-10506
> URL: https://issues.apache.org/jira/browse/YARN-10506
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-10506.001.patch, YARN-10506.002.patch, 
> YARN-10506.003.patch, YARN-10506.004.patch, YARN-10506.005.patch
>
>
> The queue creation logic should be updated to use weight mode and support the 
> flexible creation. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-08 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0002.patch

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Fix For: 3.1.4
>
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Use case:
> Due to the way Capacity Scheduler preemption works, if a single user submits 
> a large application to a queue (using 100% of its resources), that job will 
> not be preempted by future applications from the same user within the same 
> queue. This implies that the later applications are forced to wait for the 
> long-running application to complete. This prevents multiple long-running, 
> large applications from running concurrently.
> Support fair sharing among apps while preempting applications from the same 
> queue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Description: 
It seems that we are missing an access check before getAppAttempts; see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

thus we can get some sensitive information, like the logs link.  
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 

  was:
It seems that we are missing a security check before getAppAttempts; see 
[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]

thus we can get some sensitive information, like the logs link.  
{code:java}
application_1609318368700_0002 belong to user2

user1@hadoop11$ curl --negotiate -u  : 
http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
{
  "appAttempts": {
"appAttempt": [
  {
"id": 1,
"startTime": 1609318411566,
"containerId": "container_1609318368700_0002_01_01",
"nodeHttpAddress": "hadoop12:8044",
"nodeId": "hadoop12:36831",
"logsLink": 
"http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
"blacklistedNodes": "",
"nodesBlacklistedBySystem": ""
  }
]
  }
}

{code}
Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
hsr)"; they hide the logs link if the appid does not belong to the querying 
user, see 

[https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]

We need to add hasAccess(app, hsr) to getAppAttempts.

 


>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing an access check before getAppAttempts; see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> thus we can get some sensitive information, like the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user2
> user1@hadoop11$ curl --negotiate -u  : 
> http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
> {
>   "appAttempts": {
> "appAttempt": [
>   {
> "id": 1,
> "startTime": 1609318411566,
> "containerId": "container_1609318368700_0002_01_01",
> "nodeHttpAddress": "hadoop12:8044",
> "nodeId": "hadoop12:36831",
> "logsLink": 
> "http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
> "blacklistedNodes": "",
> "nodesBlacklistedBySystem": ""
>   }
> ]
>   }
> }
> {code}
> Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
> hsr)"; they hide the logs link if the appid does not 

[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-08 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: (was: YARN-10559.0002.patch)

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Fix For: 3.1.4
>
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Use case:
> Due to the way Capacity Scheduler preemption works, if a single user submits 
> a large application to a queue (using 100% of its resources), that job will 
> not be preempted by future applications from the same user within the same 
> queue. This implies that the later applications are forced to wait for the 
> long-running application to complete. This prevents multiple long-running, 
> large applications from running concurrently.
> Support fair sharing among apps while preempting applications from the same 
> queue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YARN-10555:
--
Labels: pull-request-available security  (was: security)

>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available, security
> Attachments: YARN-10555_1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that we are missing a security check before getAppAttempts; see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> thus we can get some sensitive information, like the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user2
> user1@hadoop11$ curl --negotiate -u  : 
> http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
> {
>   "appAttempts": {
> "appAttempt": [
>   {
> "id": 1,
> "startTime": 1609318411566,
> "containerId": "container_1609318368700_0002_01_01",
> "nodeHttpAddress": "hadoop12:8044",
> "nodeId": "hadoop12:36831",
> "logsLink": 
> "http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
> "blacklistedNodes": "",
> "nodesBlacklistedBySystem": ""
>   }
> ]
>   }
> }
> {code}
> Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
> hsr)"; they hide the logs link if the appid does not belong to the querying 
> user, see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]
> We need to add hasAccess(app, hsr) to getAppAttempts.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10555) missing access check before getAppAttempts

2021-01-08 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-10555:
-
Summary:  missing access check before getAppAttempts  (was:  missing 
security check before getAppAttempts)

>  missing access check before getAppAttempts
> ---
>
> Key: YARN-10555
> URL: https://issues.apache.org/jira/browse/YARN-10555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: security
> Attachments: YARN-10555_1.patch
>
>
> It seems that we are missing a security check before getAppAttempts; see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1127]
> thus we can get some sensitive information, like the logs link.  
> {code:java}
> application_1609318368700_0002 belong to user2
> user1@hadoop11$ curl --negotiate -u  : 
> http://hadoop11:8088/ws/v1/cluster/apps/application_1609318368700_0002/appattempts/|jq
> {
>   "appAttempts": {
> "appAttempt": [
>   {
> "id": 1,
> "startTime": 1609318411566,
> "containerId": "container_1609318368700_0002_01_01",
> "nodeHttpAddress": "hadoop12:8044",
> "nodeId": "hadoop12:36831",
> "logsLink": 
> "http://hadoop12:8044/node/containerlogs/container_1609318368700_0002_01_01/user2";,
> "blacklistedNodes": "",
> "nodesBlacklistedBySystem": ""
>   }
> ]
>   }
> }
> {code}
> Other APIs, like getApps and getApp, have a security check like "hasAccess(app, 
> hsr)"; they hide the logs link if the appid does not belong to the querying 
> user, see 
> [https://github.com/apache/hadoop/blob/513f1995adc9b73f9c7f4c7beb89725b51b313ac/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java#L1098]
> We need to add hasAccess(app, hsr) to getAppAttempts.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-08 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261232#comment-17261232
 ] 

zhuqi commented on YARN-10504:
--

Fixed the max absolute update in AutoCreatedLeafQueue, so 
TestQueueManagementDynamicEditPolicy will now pass.

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, 
> YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.ver-1.patch, 
> YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
>
> To allow the possibility to flexibly create queues in Capacity Scheduler a 
> weight mode should be introduced. The existing {{capacity}} property should 
> be used with a different syntax, i.e.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>  
> The new functionality should: 
>  * accept and validate the new weight values
>  * enforce a singular mode on the whole queue tree
>  * (re)calculate the relative (percentage-based) capacities based on the 
> weights during launch and every time the queue structure changes



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-08 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-10504:
-
Attachment: YARN-10504.004.patch

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, 
> YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.ver-1.patch, 
> YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
>
> To allow the possibility to flexibly create queues in Capacity Scheduler a 
> weight mode should be introduced. The existing {{capacity}} property should 
> be used with a different syntax, i.e.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>  
> The new functionality should: 
>  * accept and validate the new weight values
>  * enforce a singular mode on the whole queue tree
>  * (re)calculate the relative (percentage-based) capacities based on the 
> weights during launch and every time the queue structure changes



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10525) Add weight mode conversion to fs2cs

2021-01-08 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10525:

Attachment: YARN-10525-002.patch

> Add weight mode conversion to fs2cs
> ---
>
> Key: YARN-10525
> URL: https://issues.apache.org/jira/browse/YARN-10525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: zhuqi
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10525-001.patch, YARN-10525-002.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261196#comment-17261196
 ] 

Hadoop QA commented on YARN-10559:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
31s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 3 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
32s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 26s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
47s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/440/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 83 new + 508 unchanged - 0 fixed = 591 total (was 508) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 39s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {c

[jira] [Commented] (YARN-10507) Add the capability to fs2cs to write the converted placement rules inside capacity-scheduler.xml

2021-01-08 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261167#comment-17261167
 ] 

Andras Gyori commented on YARN-10507:
-

Thank you [~pbacsko] for the updated patch. LGTM now, +1.

> Add the capability to fs2cs to write the converted placement rules inside 
> capacity-scheduler.xml
> 
>
> Key: YARN-10507
> URL: https://issues.apache.org/jira/browse/YARN-10507
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: fs2cs
> Attachments: YARN-10507-001.patch, YARN-10507-002.patch, 
> YARN-10507-003.patch, YARN-10507-004.patch, YARN-10507-005.patch, 
> YARN-10507-006.patch, YARN-10507-007.patch, YARN-10507-008.patch
>
>
> Currently, the fs2cs tool generates a separate {{mapping-rules.json}} file
> when it converts the placement rules.
> However, we also support having the JSON inlined inside
> {{capacity-scheduler.xml}}. Add a command line switch so that we can choose
> the desired output.
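
For illustration, a minimal sketch of what the inlined variant could look like. It assumes the JSON mapping rule properties keep their current names ({{yarn.scheduler.capacity.mapping-rule-format}} and {{yarn.scheduler.capacity.mapping-rule-json}}); the rule body is a made-up placeholder, not actual fs2cs output:

{code:xml}
<!-- Sketch only: property names assumed from the JSON mapping rule
     support; the rule below is a hypothetical placeholder. -->
<property>
  <name>yarn.scheduler.capacity.mapping-rule-format</name>
  <value>json</value>
</property>
<property>
  <name>yarn.scheduler.capacity.mapping-rule-json</name>
  <value>
    {"rules": [
      {"type": "user", "matches": "*", "policy": "user",
       "fallbackResult": "skip"}
    ]}
  </value>
</property>
{code}

A switch like this would let clusters that manage a single capacity-scheduler.xml avoid distributing a second file alongside it.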



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-10538) Add recommissioning nodes to the list of updated nodes returned to the AM

2021-01-08 Thread Bibin Chundatt (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin Chundatt resolved YARN-10538.
---
Fix Version/s: 3.3.1
   3.4.0
   Resolution: Fixed

Committed to trunk and branch-3.3.

> Add recommissioning nodes to the list of updated nodes returned to the AM
> -
>
> Key: YARN-10538
> URL: https://issues.apache.org/jira/browse/YARN-10538
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.9.1, 3.1.1
>Reporter: Srinivas S T
>Assignee: Srinivas S T
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.1
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> YARN-6483 added nodes that transition to the DECOMMISSIONING state to the
> list of updated nodes returned to the AM. This allows the Spark application
> master to gracefully decommission its containers on the decommissioning
> node. But if the node is later recommissioned, the Spark application master
> is not made aware of it. We propose to add recommissioned nodes to the list
> of updated nodes sent to the AM when a recommission transition occurs.
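
On the consumer side, a rough sketch (not part of this patch) of how an AM might react to these node reports, whether they arrive via {{AllocateResponse#getUpdatedNodes()}} or the {{AMRMClientAsync}} {{onNodesUpdated}} callback. It assumes a recommissioned node is reported with state RUNNING again:

{code:java}
import java.util.List;

import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;

// Sketch only: NodeUpdateSketch is a hypothetical AM-side helper,
// not YARN code.
public final class NodeUpdateSketch {

  // Called with the updated-node reports from the allocate response.
  public void onNodesUpdated(List<NodeReport> updatedNodes) {
    for (NodeReport report : updatedNodes) {
      NodeState state = report.getNodeState();
      if (state == NodeState.DECOMMISSIONING) {
        // YARN-6483 behavior: start draining containers gracefully.
        System.out.println("Draining node " + report.getNodeId());
      } else if (state == NodeState.RUNNING) {
        // Behavior proposed here: the recommissioned node shows up
        // again, so the AM can cancel any in-progress draining.
        System.out.println("Node back in service: " + report.getNodeId());
      }
    }
  }
}
{code}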



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10507) Add the capability to fs2cs to write the converted placement rules inside capacity-scheduler.xml

2021-01-08 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261144#comment-17261144
 ] 

Peter Bacsko commented on YARN-10507:
-

[~gandras], could you check it again?

> Add the capability to fs2cs to write the converted placement rules inside 
> capacity-scheduler.xml
> 
>
> Key: YARN-10507
> URL: https://issues.apache.org/jira/browse/YARN-10507
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: fs2cs
> Attachments: YARN-10507-001.patch, YARN-10507-002.patch, 
> YARN-10507-003.patch, YARN-10507-004.patch, YARN-10507-005.patch, 
> YARN-10507-006.patch, YARN-10507-007.patch, YARN-10507-008.patch
>
>
> Currently, the fs2cs tool generates a separate {{mapping-rules.json}} file
> when it converts the placement rules.
> However, we also support having the JSON inlined inside
> {{capacity-scheduler.xml}}. Add a command line switch so that we can choose
> the desired output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org