[jira] [Assigned] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhuqi reassigned YARN-10532:

Assignee: zhuqi

> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
>
> Key: YARN-10532
> URL: https://issues.apache.org/jira/browse/YARN-10532
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: zhuqi
> Priority: Major
>
> It's better if we can delete auto-created queues when they are not in use for a period of time (like 5 mins). It will be helpful when we have a large number of auto-created queues (e.g. from 500 users), but only a small subset of queues are actively used.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249512#comment-17249512 ]

zhuqi commented on YARN-10532:

[~wangda] I would like to take this one if no one else is working on it. Thanks. :)
[jira] [Updated] (YARN-10533) Display managed apps and unmanaged apps on RM UI
[ https://issues.apache.org/jira/browse/YARN-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cyrus Jackson updated YARN-10533:

Summary: Display managed apps and unmanaged apps on RM UI (was: Separate metrics on RM UI for managed apps and unmanaged apps)

> Display managed apps and unmanaged apps on RM UI
>
> Key: YARN-10533
> URL: https://issues.apache.org/jira/browse/YARN-10533
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Cyrus Jackson
> Assignee: Cyrus Jackson
> Priority: Major
[jira] [Created] (YARN-10533) Separate metrics on RM UI for managed apps and unmanaged apps
Cyrus Jackson created YARN-10533:

Summary: Separate metrics on RM UI for managed apps and unmanaged apps
Key: YARN-10533
URL: https://issues.apache.org/jira/browse/YARN-10533
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Cyrus Jackson
Assignee: Cyrus Jackson
[jira] [Commented] (YARN-9833) Race condition when DirectoryCollection.checkDirs() runs during container launch
[ https://issues.apache.org/jira/browse/YARN-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249402#comment-17249402 ]

Eric Badger commented on YARN-9833:

bq. Isn't this how it has been for years? It was returning an unmodifiableList view of the underlying List, so that limits what the caller can do. getGoodDirs() and the others just return a read-only List. They don't have to know about the internals.

Well, yes and no. It was _supposed_ to be like that. But given this bug, we can see that it clearly wasn't. The callee in this case _should_ have been atomic and so the unmodifiable view of the list _should_ have been fine. But when you get into fine-grained locking like this, mistakes are easy to make because the person making the change doesn't necessarily understand the history behind why the code is written the way it is. If we can guarantee that the callee will always perform atomic operations on the lists, then there isn't an issue. Maybe we can guarantee this by adding a comment/warning in the checkDirs() function to make sure that anyone touching this code is super careful about locking.

I agree with the idea of fixing this on the callee side and not having the caller create a new object every time. I just want to make sure that this bug isn't reintroduced by accident down the line because of the added complexity of fine-grained locking. I don't know if this is a performance-sensitive area of the code where the tradeoff would clearly favor performance.

> Race condition when DirectoryCollection.checkDirs() runs during container launch
>
> Key: YARN-9833
> URL: https://issues.apache.org/jira/browse/YARN-9833
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.2.0
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
> Attachments: YARN-9833-001.patch
>
> During endurance testing, we found a race condition that causes an empty {{localDirs}} to be passed to container-executor.
> The problem is that {{DirectoryCollection.checkDirs()}} clears three collections:
> {code:java}
> this.writeLock.lock();
> try {
>   localDirs.clear();
>   errorDirs.clear();
>   fullDirs.clear();
>   ...
> {code}
> This happens in a critical section guarded by a write lock. When we start a container, we retrieve the local dirs by calling {{dirsHandler.getLocalDirs();}} which in turn invokes {{DirectoryCollection.getGoodDirs()}}. The implementation of this method is:
> {code:java}
> List getGoodDirs() {
>   this.readLock.lock();
>   try {
>     return Collections.unmodifiableList(localDirs);
>   } finally {
>     this.readLock.unlock();
>   }
> }
> {code}
> So we're also in a critical section guarded by the lock. But {{Collections.unmodifiableList()}} only returns a _view_ of the collection, not a copy. After we get the view, {{MonitoringTimerTask.run()}} might be scheduled to run and immediately clear {{localDirs}}.
> This caused weird behaviour in container-executor, which exited with error code 35 (COULD_NOT_CREATE_WORK_DIRECTORIES).
> Therefore we can't just return a view, we must return a copy with {{ImmutableList.copyOf()}}.
> Credits to [~snemeth] for analyzing and determining the root cause.
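The view-versus-copy distinction in the description can be reproduced in a few lines. This is only a minimal single-threaded sketch, not the DirectoryCollection code: the class name `ViewVersusCopy`, the method `observedSizeAfterClear`, and the directory paths are made up for illustration, a plain `ArrayList` stands in for the real field, and a JDK-only copy is used in place of Guava's `ImmutableList.copyOf()` mentioned in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ViewVersusCopy {

  /**
   * Hands out either a read-only view or a read-only copy of a backing list,
   * then clears the backing list and reports the size the caller still sees.
   */
  public static int observedSizeAfterClear(boolean handOutCopy) {
    // Hypothetical stand-in for the localDirs field.
    List<String> localDirs = new ArrayList<>(Arrays.asList("/data/1", "/data/2"));

    List<String> handedOut = handOutCopy
        ? Collections.unmodifiableList(new ArrayList<>(localDirs)) // independent copy
        : Collections.unmodifiableList(localDirs);                 // view of the same list

    // Stand-in for the clear() calls that checkDirs() performs.
    localDirs.clear();

    return handedOut.size();
  }

  public static void main(String[] args) {
    System.out.println("view sees " + observedSizeAfterClear(false) + " dirs"); // view sees 0 dirs
    System.out.println("copy sees " + observedSizeAfterClear(true) + " dirs");  // copy sees 2 dirs
  }
}
```

The caller cannot modify either list, but only the copy is insulated from later changes made by the owner.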
[jira] [Commented] (YARN-9833) Race condition when DirectoryCollection.checkDirs() runs during container launch
[ https://issues.apache.org/jira/browse/YARN-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249316#comment-17249316 ]

Jim Brennan commented on YARN-9833:

{quote}
My worry with this is that code changes in the future will incorrectly use getGoodDirs or the other methods that expose the private lists from within DirectoryCollection.
{quote}

Isn't this how it has been for years? It was returning an unmodifiableList view of the underlying List, so that limits what the caller can do. getGoodDirs() and the others just return a read-only List. They don't have to know about the internals.

If we are going to change these to return a copy of the list, we may want to reconsider the use of CopyOnWriteArrayList - I'm not sure it is buying us anything.
[jira] [Commented] (YARN-9833) Race condition when DirectoryCollection.checkDirs() runs during container launch
[ https://issues.apache.org/jira/browse/YARN-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249289#comment-17249289 ]

Eric Badger commented on YARN-9833:

I agree that the CopyOnWriteArrayList is most likely a performance thing. Since {{getGoodDirs()}} is called on every container launch, returning a copy means a lot of copying. Not sure it's that bad in the grand scheme of things though.

bq. My suggestion for fixing this would be to fix the checkDirs() implementation to operate on local copies of these arrays, and then update them with a single assignment only if they have changed.

My worry with this is that code changes in the future will incorrectly use {{getGoodDirs}} or the other methods that expose the private lists from within DirectoryCollection. So in my mind it's a tradeoff between performance and maintainability. I don't know what the performance impact is. We could potentially mitigate some (most?) of the maintainability impact via a comment on the getGoodDirs() method (as well as the getLocalDirs() method in LocalDirsHandlerService). In general, I don't like callers having to be aware of a callee's internals and deal with its locking. That could also be mitigated by fixing the callee method to remove the race condition, but the race could be reintroduced by accident in the future, since whoever changes the code may not understand the full impact of the CopyOnWriteArrayList.

bq. 1. We were not thinking about errorDirs because as we were tracking down the issue, only localDirs seemed to be problematic, although I agree that it is inconsistent this way. Shall we follow-up on this?

Yea, we should definitely follow up to fix errorDirs.
[jira] [Commented] (YARN-10526) RMAppManager CS Placement ignores parent path
[ https://issues.apache.org/jira/browse/YARN-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249106#comment-17249106 ]

Hadoop QA commented on YARN-10526:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 22s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 1s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 33m 56s | | trunk passed |
| +1 | compile | 1m 2s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | compile | 0m 52s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 | checkstyle | 0m 39s | | trunk passed |
| +1 | mvnsite | 0m 54s | | trunk passed |
| +1 | shadedclient | 19m 36s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 48s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | javadoc | 0m 43s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| 0 | spotbugs | 2m 6s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 2m 4s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 56s | | the patch passed |
| +1 | compile | 1m 0s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | javac | 1m 0s | | the patch passed |
| +1 | compile | 0m 51s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 | javac | 0m 51s | | the patch passed |
| +1 | checkstyle | 0m 37s | | the patch passed |
| +1 | mvnsite | 0m 56s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 51s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 44s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | javadoc | 0m 38s | |
[jira] [Commented] (YARN-9833) Race condition when DirectoryCollection.checkDirs() runs during container launch
[ https://issues.apache.org/jira/browse/YARN-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249043#comment-17249043 ]

Jim Brennan commented on YARN-9833:

Thinking about it more over the weekend, I suspect that the reason the CopyOnWriteArrayList was used was more for performance than to allow someone to hang onto the reference for a long time. This ideally is a list that doesn't change very often, so handing out a view of a copy-on-write array is cheaper than making a copy every time we launch a container.

Unfortunately, {{checkDirs()}} as written seems to ruin any advantage we've gained by mutating the lists every time it runs (and multiple times at that, by first clearing and then adding each entry individually). This is also where the race comes in.

My suggestion for fixing this would be to fix the {{checkDirs()}} implementation to operate on local copies of these arrays, and then update them with a single assignment only if they have changed.
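The "build locally, then publish with a single assignment" idea suggested above might look roughly like the following. This is a hedged sketch rather than the actual patch: the class `SnapshotDirs`, the `Predicate` health check, and the assumption of a single writer thread are all illustrative, and the real class also tracks errorDirs and fullDirs under read/write locks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

public class SnapshotDirs {

  // The published list is never mutated in place; readers either see the old
  // snapshot or the new one, never a half-built list.
  private volatile List<String> goodDirs = Collections.emptyList();

  /** Cheap for callers: returns the current immutable snapshot, no copying. */
  public List<String> getGoodDirs() {
    return goodDirs;
  }

  /**
   * Rebuilds the list in a local variable and publishes it with one
   * assignment, and only when the contents actually changed.
   * Returns true if a new snapshot was published.
   * Assumes a single writer (for example one monitoring timer thread).
   */
  public boolean checkDirs(List<String> candidates, Predicate<String> isHealthy) {
    List<String> fresh = new ArrayList<>();
    for (String dir : candidates) {
      if (isHealthy.test(dir)) {
        fresh.add(dir);
      }
    }
    if (fresh.equals(goodDirs)) {
      return false; // nothing changed, keep the existing snapshot
    }
    goodDirs = Collections.unmodifiableList(fresh);
    return true;
  }
}
```

A caller that grabbed the list before a re-check keeps a consistent, if possibly stale, snapshot, which is the property the container launch path needs.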
[jira] [Commented] (YARN-10526) RMAppManager CS Placement ignores parent path
[ https://issues.apache.org/jira/browse/YARN-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249001#comment-17249001 ]

Gergely Pollak commented on YARN-10526:

The latest version is supposed to be final; I added a test and a few comments. Also, please ignore the failure in TestDelegationTokenRenewer, it seems to be flaky and is unrelated.

> RMAppManager CS Placement ignores parent path
>
> Key: YARN-10526
> URL: https://issues.apache.org/jira/browse/YARN-10526
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Gergely Pollak
> Assignee: Gergely Pollak
> Priority: Major
> Attachments: YARN-10526.001.patch, YARN-10526.002.patch, YARN-10526.003.patch, YARN-10526.004.patch
>
> When RMAppManager creates the RMApp object using the placementContext's results, it only uses the getQueue method, which returns only the name of the leaf queue in the case of CapacityScheduler.
> If a queue exists with this name, then the application will be placed into that queue. If the queue does not exist, then CS will take the parent path into consideration during the auto queue creation; however, this only happens if there is no queue with the leaf name.
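To make the reported behaviour concrete, here is a toy model of a leaf-name-only lookup compared with a full-path lookup. It is an illustration under stated assumptions, not RMAppManager or CapacityScheduler code: the class `PlacementLookup`, both `resolve...` methods, and the queue paths are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class PlacementLookup {

  /** Returns the last path component, e.g. "user1" for "root.a.user1". */
  public static String leafOf(String queuePath) {
    return queuePath.substring(queuePath.lastIndexOf('.') + 1);
  }

  /**
   * Leaf-only lookup: any existing queue with the same leaf name matches,
   * whatever its parent is. Returns the matched path or null.
   */
  public static String resolveByLeaf(List<String> existingQueues, String requestedPath) {
    String leaf = leafOf(requestedPath);
    for (String queue : existingQueues) {
      if (leafOf(queue).equals(leaf)) {
        return queue;
      }
    }
    return null;
  }

  /** Full-path lookup: only an exact path match counts. Returns the path or null. */
  public static String resolveByFullPath(List<String> existingQueues, String requestedPath) {
    return existingQueues.contains(requestedPath) ? requestedPath : null;
  }

  public static void main(String[] args) {
    List<String> existing = Arrays.asList("root.b.user1");
    // The placement result asks for root.a.user1, but only root.b.user1 exists.
    System.out.println(resolveByLeaf(existing, "root.a.user1"));     // root.b.user1
    System.out.println(resolveByFullPath(existing, "root.a.user1")); // null
  }
}
```

In this model the leaf-only lookup silently lands on a queue under a different parent, while the full-path lookup reports a miss, which would leave room for auto-creation under the requested parent.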
[jira] [Updated] (YARN-10526) RMAppManager CS Placement ignores parent path
[ https://issues.apache.org/jira/browse/YARN-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gergely Pollak updated YARN-10526:

Attachment: YARN-10526.004.patch
[jira] [Commented] (YARN-10506) Update queue creation logic to use weight mode and allow the flexible static/dynamic creation
[ https://issues.apache.org/jira/browse/YARN-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248989#comment-17248989 ]

zhuqi commented on YARN-10506:

[~wangda]
{quote}Handle convert of dynamic queue to static queue (still see some test failures).
{quote}
The test failed because of some mistaken queue names; it should be changed to this:
{code:java}
// Change the 2nd level queue from dynamic to static
csConf.setQueues("root", new String[] { "a", "b", "c-auto", "e" });
csConf.setNonLabeledQueueWeight("root.e", 6f);
csConf.setQueues("root.e", new String[] { "e1-auto" });
csConf.setNonLabeledQueueWeight("root.e.e1-auto", 6f);
cs.reinitialize(csConf, mockRM.getRMContext());

// Get queue e1
CSQueue e1 = cs.getQueue("root.e.e1-auto");

// e1's abs capacity should be 6/20 * (6/7), (since a/c/e.weight=6, all other 2 peers
// have weight=1, and e1's weight is 6, e2's weight is 1).
Assert.assertEquals((6 / 20f) * (6 / 7f), e1.getAbsoluteCapacity(), 1e-6);
Assert.assertEquals(360 * GB,
    c.getQueueResourceQuotas().getEffectiveMinResource().getMemorySize());
Assert.assertEquals(6f, c.getQueueCapacities().getWeight(), 1e-6);
{code}

> Update queue creation logic to use weight mode and allow the flexible static/dynamic creation
>
> Key: YARN-10506
> URL: https://issues.apache.org/jira/browse/YARN-10506
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Benjamin Teke
> Priority: Major
> Attachments: YARN-10506.001.patch
>
> The queue creation logic should be updated to use weight mode and support the flexible creation.
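The expected value in the assertion above follows from normalizing each queue's weight against its siblings and multiplying down the path. The sketch below only reproduces that arithmetic; `WeightShare` and `share` are illustrative names, and the sibling weights (6, 6, 6, 1, 1 under root and 6, 1 under root.e) are taken from the test comment rather than from the scheduler itself.

```java
public class WeightShare {

  /**
   * Fraction of the parent's capacity that a queue with the given weight
   * receives. levelWeights lists the weights of all queues at that level,
   * including the queue itself.
   */
  public static float share(float weight, float... levelWeights) {
    float sum = 0f;
    for (float w : levelWeights) {
      sum += w;
    }
    return weight / sum;
  }

  /** Absolute capacity of root.e.e1-auto under the weights assumed above. */
  public static float e1AbsoluteCapacity() {
    float e = share(6f, 6f, 6f, 6f, 1f, 1f); // e among the root children: 6/20
    float e1 = share(6f, 6f, 1f);            // e1 among the children of e: 6/7
    return e * e1;
  }

  public static void main(String[] args) {
    System.out.println(e1AbsoluteCapacity()); // roughly 0.257
  }
}
```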
[jira] [Commented] (YARN-10510) TestAppPage.testAppBlockRenderWithNullCurrentAppAttempt will cause NullPointerException
[ https://issues.apache.org/jira/browse/YARN-10510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248866#comment-17248866 ]

Hadoop QA commented on YARN-10510:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 0m 38s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 20m 0s | | trunk passed |
| +1 | compile | 1m 0s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | compile | 0m 52s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 | checkstyle | 0m 39s | | trunk passed |
| +1 | mvnsite | 0m 56s | | trunk passed |
| +1 | shadedclient | 16m 55s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | javadoc | 0m 43s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| 0 | spotbugs | 1m 49s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 47s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 52s | | the patch passed |
| +1 | compile | 0m 53s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | javac | 0m 53s | | the patch passed |
| +1 | compile | 0m 45s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 | javac | 0m 45s | | the patch passed |
| -0 | checkstyle | 0m 31s | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/385/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 7 new + 48 unchanged - 0 fixed = 55 total (was 48) |
| +1 | mvnsite | 0m 48s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 11s | | patch has no errors when building and testing our client artifacts. |
[jira] [Updated] (YARN-10519) Refactor QueueMetricsForCustomResources class to move to yarn-common package
[ https://issues.apache.org/jira/browse/YARN-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Minni Mittal updated YARN-10519:

Attachment: YARN-10519.v4.patch

> Refactor QueueMetricsForCustomResources class to move to yarn-common package
>
> Key: YARN-10519
> URL: https://issues.apache.org/jira/browse/YARN-10519
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Minni Mittal
> Assignee: Minni Mittal
> Priority: Major
> Attachments: YARN-10519.v1.patch, YARN-10519.v2.patch, YARN-10519.v3.patch, YARN-10519.v4.patch
>
> Refactor the code for QueueMetricsForCustomResources to move the base classes to yarn-common package. This helps in reusing the class in adding custom resource types at NM level also.
[jira] [Commented] (YARN-10531) Be able to disable user limit factor for CapacityScheduler Leaf Queue
[ https://issues.apache.org/jira/browse/YARN-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248823#comment-17248823 ]

zhuqi commented on YARN-10531:

[~wangda]

Attached the latest patch to make the unit test clearer, and changed the logic:
{code:java}
// If user-limit-factor is set to -1, the user limit factor is disabled.
if (getUserLimitFactor() != -1) {
  maxUserLimit = Resources.multiplyAndRoundDown(queueCapacity,
      getUserLimitFactor());
} else {
  maxUserLimit = lQueue.getEffectiveMaxCapacityDown(nodePartition,
      lQueue.getMinimumAllocation());
}
{code}
The other changes are related to maxPerUserApplications and so on.

> Be able to disable user limit factor for CapacityScheduler Leaf Queue
>
> Key: YARN-10531
> URL: https://issues.apache.org/jira/browse/YARN-10531
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: zhuqi
> Priority: Major
> Attachments: YARN-10531.001.patch, YARN-10531.002.patch
>
> The user limit factor defines the maximum amount of resources that a single user can consume.
> In the Auto Queue Creation context, it doesn't make much sense to set a user limit factor: initially every queue is created with weight 1.0, and we want a user to be able to consume more resources if possible. It is hard to pre-determine how to set the user limit factor, so it makes more sense to add a new value (like -1) to indicate that the user limit factor is disabled.
> The logic that needs to change is below (inside LeafQueue.java):
> {code}
> Resource maxUserLimit = Resources.none();
> if (schedulingMode == SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY) {
>   maxUserLimit = Resources.multiplyAndRoundDown(queueCapacity,
>       getUserLimitFactor());
> } else if (schedulingMode == SchedulingMode.IGNORE_PARTITION_EXCLUSIVITY) {
>   maxUserLimit = partitionResource;
> }
> {code}
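The effect of the -1 sentinel can be shown with a self-contained sketch. This is not the LeafQueue implementation: resources are reduced to a single memory figure in MB, and `UserLimitSketch`, `maxUserLimit`, and its parameters are hypothetical names standing in for the `Resource` arithmetic in the snippets above.

```java
public class UserLimitSketch {

  /** Sentinel meaning "no user limit factor", as proposed in the issue. */
  public static final float DISABLED = -1f;

  /**
   * Returns the per-user ceiling in MB.
   * With a real factor the ceiling is queueCapacity * factor, rounded down.
   * With the sentinel the ceiling falls back to the queue's effective maximum.
   */
  public static long maxUserLimit(long queueCapacityMb,
                                  long effectiveMaxCapacityMb,
                                  float userLimitFactor) {
    if (userLimitFactor == DISABLED) {
      return effectiveMaxCapacityMb;
    }
    return (long) Math.floor(queueCapacityMb * (double) userLimitFactor);
  }

  public static void main(String[] args) {
    System.out.println(maxUserLimit(1000, 4000, 2f));       // 2000
    System.out.println(maxUserLimit(1000, 4000, DISABLED)); // 4000
  }
}
```

With the sentinel, a single user in an auto-created queue is capped by the queue's own maximum rather than by a multiple of its guaranteed capacity, which is the behaviour the issue asks for.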