[jira] [Updated] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10750: -- Description: HADOOP-17524 removed the metrics: LogFatal LogError LogWarn LogInfo These needs to be reflected in the invariable list of the TestMetricsInvariantChecker as well. was: HADOOP-17524 removed the metrics: > TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524 > - > > Key: YARN-10750 > URL: https://issues.apache.org/jira/browse/YARN-10750 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Priority: Major > Attachments: YARN-10750.001.patch > > > HADOOP-17524 removed the metrics: > LogFatal > LogError > LogWarn > LogInfo > These needs to be reflected in the invariable list of the > TestMetricsInvariantChecker as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10750: -- Description: HADOOP-17524 removed the metrics: > TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524 > - > > Key: YARN-10750 > URL: https://issues.apache.org/jira/browse/YARN-10750 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Priority: Major > Attachments: YARN-10750.001.patch > > > HADOOP-17524 removed the metrics: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10750: -- Attachment: YARN-10750.001.patch > TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524 > - > > Key: YARN-10750 > URL: https://issues.apache.org/jira/browse/YARN-10750 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Priority: Major > Attachments: YARN-10750.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524
Gergely Pollak created YARN-10750: - Summary: TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524 Key: YARN-10750 URL: https://issues.apache.org/jira/browse/YARN-10750 Project: Hadoop YARN Issue Type: Task Reporter: Gergely Pollak -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10746) RmWebApp add default-node-label-expression to the queue info
[ https://issues.apache.org/jira/browse/YARN-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330380#comment-17330380 ] Gergely Pollak commented on YARN-10746: --- Test failure is unrelated, it fails without my patch as well. The remaining 2 checkstyle issues are ignored for consistency with the rest of the fields. > RmWebApp add default-node-label-expression to the queue info > > > Key: YARN-10746 > URL: https://issues.apache.org/jira/browse/YARN-10746 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Priority: Major > Attachments: YARN-10746.001.patch, YARN-10746.002.patch, > YARN-10746.003.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10746) RmWebApp add default-node-label-expression to the queue info
[ https://issues.apache.org/jira/browse/YARN-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10746: -- Attachment: YARN-10746.003.patch > RmWebApp add default-node-label-expression to the queue info > > > Key: YARN-10746 > URL: https://issues.apache.org/jira/browse/YARN-10746 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Priority: Major > Attachments: YARN-10746.001.patch, YARN-10746.002.patch, > YARN-10746.003.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10746) RmWebApp add default-node-label-expression to the queue info
[ https://issues.apache.org/jira/browse/YARN-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10746: -- Attachment: YARN-10746.002.patch > RmWebApp add default-node-label-expression to the queue info > > > Key: YARN-10746 > URL: https://issues.apache.org/jira/browse/YARN-10746 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Priority: Major > Attachments: YARN-10746.001.patch, YARN-10746.002.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10654) Dots '.' in CSMappingRule path variables should be replaced
[ https://issues.apache.org/jira/browse/YARN-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329183#comment-17329183 ] Gergely Pollak commented on YARN-10654: --- [~pbacsko] thank you for the patch, LGTM+1 (Non-binding) You integrated the change really smoothly into the placement engine, I don't see any possible issues with it. I really like how we keep the original user name for all logging purposes, and only change it in the variable context. Perhaps in the future we might need to make the replacement string configurable, but now it works like FS does, which is fine for now I think. > Dots '.' in CSMappingRule path variables should be replaced > --- > > Key: YARN-10654 > URL: https://issues.apache.org/jira/browse/YARN-10654 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10654-001.patch > > > Dots are used as separators, so we should escape them somehow in the > variables when substituting them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10746) RmWebApp add default-node-label-expression to the queue info
[ https://issues.apache.org/jira/browse/YARN-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10746: -- Attachment: YARN-10746.001.patch > RmWebApp add default-node-label-expression to the queue info > > > Key: YARN-10746 > URL: https://issues.apache.org/jira/browse/YARN-10746 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Priority: Major > Attachments: YARN-10746.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10746) RmWebApp add default-node-label-expression to the queue info
Gergely Pollak created YARN-10746: - Summary: RmWebApp add default-node-label-expression to the queue info Key: YARN-10746 URL: https://issues.apache.org/jira/browse/YARN-10746 Project: Hadoop YARN Issue Type: Task Reporter: Gergely Pollak -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10370) [Umbrella] Reduce the feature gap between FS Placement Rules and CS Queue Mapping rules
[ https://issues.apache.org/jira/browse/YARN-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317319#comment-17317319 ] Gergely Pollak commented on YARN-10370: --- Resolved, moved the leftover to a followup umbrella, where I'll create jiras for the further features as well. > [Umbrella] Reduce the feature gap between FS Placement Rules and CS Queue > Mapping rules > --- > > Key: YARN-10370 > URL: https://issues.apache.org/jira/browse/YARN-10370 > Project: Hadoop YARN > Issue Type: New Feature > Components: yarn >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: capacity-scheduler, capacityscheduler > Attachments: MappingRuleEnhancements.pdf, Possible extensions of > mapping rule format in Capacity Scheduler.pdf > > > To continue closing the feature gaps between Fair Scheduler and Capacity > Scheduler to help users migrate between the scheduler more easy, we need to > add some of the Fair Scheduler placement rules to the capacity scheduler's > queue mapping functionality. > With [~snemeth] and [~pbacsko] we've created the following design docs about > the proposed changes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10586) Create a command line tool for converting from legacy CS mapping rule format
[ https://issues.apache.org/jira/browse/YARN-10586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10586: -- Parent Issue: YARN-10729 (was: YARN-10370) > Create a command line tool for converting from legacy CS mapping rule format > > > Key: YARN-10586 > URL: https://issues.apache.org/jira/browse/YARN-10586 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10654) Dots '.' in CSMappingRule path variables should be replaced
[ https://issues.apache.org/jira/browse/YARN-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10654: -- Parent Issue: YARN-10729 (was: YARN-10370) > Dots '.' in CSMappingRule path variables should be replaced > --- > > Key: YARN-10654 > URL: https://issues.apache.org/jira/browse/YARN-10654 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > Dots are used as separators, so we should escape them somehow in the > variables when substituting them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10426) Cleanup CSMappingPlacementRule remaining minor issues
[ https://issues.apache.org/jira/browse/YARN-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10426: -- Parent Issue: YARN-10729 (was: YARN-10370) > Cleanup CSMappingPlacementRule remaining minor issues > - > > Key: YARN-10426 > URL: https://issues.apache.org/jira/browse/YARN-10426 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Minor > > During the reviews of CSMappingPlacementRule changes, there were a few not > urgent changes, which should be done to keep the code as clean as possible, > fix these. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10414) Improve logging in CS placement rules to help debuggability
[ https://issues.apache.org/jira/browse/YARN-10414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10414: -- Parent Issue: YARN-10729 (was: YARN-10370) > Improve logging in CS placement rules to help debuggability > --- > > Key: YARN-10414 > URL: https://issues.apache.org/jira/browse/YARN-10414 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > It should be always clear which MappingRule determined the placement (or > rejection) of the application, in the case of debug logging the whys should > also be more visible. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10729) CapacityScheduler placement rule improvements
Gergely Pollak created YARN-10729: - Summary: CapacityScheduler placement rule improvements Key: YARN-10729 URL: https://issues.apache.org/jira/browse/YARN-10729 Project: Hadoop YARN Issue Type: Task Reporter: Gergely Pollak This is an umbrella for all the leftover JIRAS from YARN-10370, and for the upcoming placement features. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-10370) [Umbrella] Reduce the feature gap between FS Placement Rules and CS Queue Mapping rules
[ https://issues.apache.org/jira/browse/YARN-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak resolved YARN-10370. --- Resolution: Fixed > [Umbrella] Reduce the feature gap between FS Placement Rules and CS Queue > Mapping rules > --- > > Key: YARN-10370 > URL: https://issues.apache.org/jira/browse/YARN-10370 > Project: Hadoop YARN > Issue Type: New Feature > Components: yarn >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: capacity-scheduler, capacityscheduler > Attachments: MappingRuleEnhancements.pdf, Possible extensions of > mapping rule format in Capacity Scheduler.pdf > > > To continue closing the feature gaps between Fair Scheduler and Capacity > Scheduler to help users migrate between the scheduler more easy, we need to > add some of the Fair Scheduler placement rules to the capacity scheduler's > queue mapping functionality. > With [~snemeth] and [~pbacsko] we've created the following design docs about > the proposed changes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups
[ https://issues.apache.org/jira/browse/YARN-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308012#comment-17308012 ] Gergely Pollak commented on YARN-10597: --- I checked and [~ahussein] was right, the configuration argument was missing, in this case it would just create a new configuration object, which might cause inconsistencies, so fixed it in patch 2, no other changes were required, tests pass properly. > CSMappingPlacementRule should not create new instance of Groups > --- > > Key: YARN-10597 > URL: https://issues.apache.org/jira/browse/YARN-10597 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10597.001.patch, YARN-10597.002.patch > > > As [~ahussein] pointed out in YARN-10425, no new Groups instance should be > created. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups
[ https://issues.apache.org/jira/browse/YARN-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10597: -- Attachment: YARN-10597.002.patch > CSMappingPlacementRule should not create new instance of Groups > --- > > Key: YARN-10597 > URL: https://issues.apache.org/jira/browse/YARN-10597 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10597.001.patch, YARN-10597.002.patch > > > As [~ahussein] pointed out in YARN-10425, no new Groups instance should be > created. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17307805#comment-17307805 ] Gergely Pollak commented on YARN-10571: --- [~gandras] thank you for the update, patch 2 LGTM+1 (Non-binding). > Refactor dynamic queue handling logic > - > > Key: YARN-10571 > URL: https://issues.apache.org/jira/browse/YARN-10571 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Minor > Attachments: YARN-10571.001.patch, YARN-10571.002.patch > > > As per YARN-10506 we have introduced an other mode for auto queue creation > and a new class, which handles it. We should move the old, managed queue > related logic to CSAutoQueueHandler as well, and do additional cleanup > regarding queue management. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups
[ https://issues.apache.org/jira/browse/YARN-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306175#comment-17306175 ] Gergely Pollak commented on YARN-10597: --- [~ahussein] thank you for the feedback, I'll take a look into the config argument passing, and double check it if we need to add it or not. [~pbacsko] I think during other patches we might have fixed the failing unit tests, when last time I checked it, there were some failures, but since then we made changes to the tests, and I think we added a test helper method to set the Groups externally. However I wasn't sure it would cover all the failing tests, it was a surprise to me as well. I expected some failures. Anyway I look into [~ahussein]'s suggestion, and get back with the results. > CSMappingPlacementRule should not create new instance of Groups > --- > > Key: YARN-10597 > URL: https://issues.apache.org/jira/browse/YARN-10597 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10597.001.patch > > > As [~ahussein] pointed out in YARN-10425, no new Groups instance should be > created. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10659) Improve CS MappingRule %secondary_group evaluation
[ https://issues.apache.org/jira/browse/YARN-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10659: -- Attachment: YARN-10659.003.patch > Improve CS MappingRule %secondary_group evaluation > -- > > Key: YARN-10659 > URL: https://issues.apache.org/jira/browse/YARN-10659 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10659.001.patch, YARN-10659.002.patch, > YARN-10659.003.patch > > > Since the leaf queue names are not unique, there are a lot of use cases where > %secondary_group evaluation fail, or behave inconsistently. > We should extend it's behavior, when it's under a defined parent, > %secondary_group evaluation should only check for queue existence under that > queue. Egy root.group.%secondary_group, should only evaluate to groups which > exist under root.group, while the legacy %secondary_group.%user should still > look for groups by their leaf name globally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10370) [Umbrella] Reduce the feature gap between FS Placement Rules and CS Queue Mapping rules
[ https://issues.apache.org/jira/browse/YARN-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303767#comment-17303767 ] Gergely Pollak commented on YARN-10370: --- [~pbacsko] thank you for the advice, I agree we can close this jira soon, just let me fnish YARN-10659 and YARN-10597, then we can move the remaining to a follow up umbrella, or create standalone jiras of them. But this 2 JIRAS is quite important to consider this umbrella finished. > [Umbrella] Reduce the feature gap between FS Placement Rules and CS Queue > Mapping rules > --- > > Key: YARN-10370 > URL: https://issues.apache.org/jira/browse/YARN-10370 > Project: Hadoop YARN > Issue Type: New Feature > Components: yarn >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: capacity-scheduler, capacityscheduler > Attachments: MappingRuleEnhancements.pdf, Possible extensions of > mapping rule format in Capacity Scheduler.pdf > > > To continue closing the feature gaps between Fair Scheduler and Capacity > Scheduler to help users migrate between the scheduler more easy, we need to > add some of the Fair Scheduler placement rules to the capacity scheduler's > queue mapping functionality. > With [~snemeth] and [~pbacsko] we've created the following design docs about > the proposed changes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10659) Improve CS MappingRule %secondary_group evaluation
[ https://issues.apache.org/jira/browse/YARN-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10659: -- Attachment: YARN-10659.002.patch > Improve CS MappingRule %secondary_group evaluation > -- > > Key: YARN-10659 > URL: https://issues.apache.org/jira/browse/YARN-10659 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10659.001.patch, YARN-10659.002.patch > > > Since the leaf queue names are not unique, there are a lot of use cases where > %secondary_group evaluation fail, or behave inconsistently. > We should extend it's behavior, when it's under a defined parent, > %secondary_group evaluation should only check for queue existence under that > queue. Egy root.group.%secondary_group, should only evaluate to groups which > exist under root.group, while the legacy %secondary_group.%user should still > look for groups by their leaf name globally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299260#comment-17299260 ] Gergely Pollak edited comment on YARN-10571 at 3/11/21, 3:16 AM: - [~gandras] thak you for the patch, I really like to see these functionalities are being moved where they belong. It's in the original code as well, but we can have a NPE here: CapacitySchedulerQueueManager#removeLegacyDynamicQueue {code:java} CSQueue q = this.getQueue(queueName); if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass( { {code} q can be null if queueName is ambiguous (or queue doesn't exist); CapacitySchedulerQueueManager#determineMissingParents: 1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency). Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit. 2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation. 3) And here is some super weird edge case: In legacy CS queue naming (before the era of the multiple leaf queue with the same name), paths like parent.leaf did exist. Which means you might have a queue like, root.something.parent.leaf, and the parent.leaf reference would still find your leaf. However now we might have problems with ambiguity as the 'parent' portion of the path might become ambiguous, so the following can happen: - CapacitySchedulerQueueManager minds his business - A wild call to createQueue with queue (parent.leaf) appears {code:java} Line 531 CSQueue parentQueue = getQueue(parentQueueName); determines it does not exists (because parent is ambiguous){code} - CreateQueue uses {code:java} Line 537 return createAutoQueue(queue); //It's not very effective{code} - CreateAutoQueue tries to determine the missing parents {code:java} 565 ApplicationPlacementContext queueCandidateContext = parentContext; 566 CSQueue firstExistingQueue = getQueue( 567 queueCandidateContext.getFullQueuePath()); //firstExistingQueue = null, because parent IS ambiguous 559 public List determineMissingParents( 560 ApplicationPlacementContext queue) throws SchedulerDynamicEditException { //queue parent: 'parent' leaf: 'leaf' fullPath: 'parent.leaf' 561 ApplicationPlacementContext parentContext = 562 CSQueueUtils.extractQueuePath(queue.getParentQueue()); //parentContext parent: null leaf: 'parent' fullPath: 'parent' 563 List parentsToCreate = new ArrayList<>(); 569 while (firstExistingQueue == null) { 570 parentsToCreate.add(queueCandidateContext); //parent added to list 571 queueCandidateContext = CSQueueUtils.extractQueuePath( 572 queueCandidateContext.getParentQueue()); //NPE because queueCandidateContext.getParentQueue() == null{code} was (Author: shuzirra): It's in the original code as well, but we can have a NPE here: CapacitySchedulerQueueManager#removeLegacyDynamicQueue {code:java} CSQueue q = this.getQueue(queueName); if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass( { {code} q can be null if queueName is ambiguous (or queue doesn't exist); CapacitySchedulerQueueManager#determineMissingParents: 1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency). Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit. 2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation. 3) And here is some super weird edge case: In legacy CS queue naming (before the era of the multiple leaf queue with t
[jira] [Comment Edited] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299260#comment-17299260 ] Gergely Pollak edited comment on YARN-10571 at 3/11/21, 3:15 AM: - It's in the original code as well, but we can have a NPE here: CapacitySchedulerQueueManager#removeLegacyDynamicQueue {code:java} CSQueue q = this.getQueue(queueName); if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass( { {code} q can be null if queueName is ambiguous (or queue doesn't exist); CapacitySchedulerQueueManager#determineMissingParents: 1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency). Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit. 2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation. 3) And here is some super weird edge case: In legacy CS queue naming (before the era of the multiple leaf queue with the same name), paths like parent.leaf did exist. Which means you might have a queue like, root.something.parent.leaf, and the parent.leaf reference would still find your leaf. However now we might have problems with ambiguity as the 'parent' portion of the path might become ambiguous, so the following can happen: - CapacitySchedulerQueueManager minds his business - A wild call to createQueue with queue (parent.leaf) appears {code:java} Line 531 CSQueue parentQueue = getQueue(parentQueueName); determines it does not exists (because parent is ambiguous){code} - CreateQueue uses {code:java} Line 537 return createAutoQueue(queue); //It's not very effective{code} - CreateAutoQueue tries to determine the missing parents {code:java} 565 ApplicationPlacementContext queueCandidateContext = parentContext; 566 CSQueue firstExistingQueue = getQueue( 567 queueCandidateContext.getFullQueuePath()); //firstExistingQueue = null, because parent IS ambiguous 559 public List determineMissingParents( 560 ApplicationPlacementContext queue) throws SchedulerDynamicEditException { //queue parent: 'parent' leaf: 'leaf' fullPath: 'parent.leaf' 561 ApplicationPlacementContext parentContext = 562 CSQueueUtils.extractQueuePath(queue.getParentQueue()); //parentContext parent: null leaf: 'parent' fullPath: 'parent' 563 List parentsToCreate = new ArrayList<>(); 569 while (firstExistingQueue == null) { 570 parentsToCreate.add(queueCandidateContext); //parent added to list 571 queueCandidateContext = CSQueueUtils.extractQueuePath( 572 queueCandidateContext.getParentQueue()); //NPE because queueCandidateContext.getParentQueue() == null{code} was (Author: shuzirra): It's in the original code as well, but we can have a NPE here: CapacitySchedulerQueueManager#removeLegacyDynamicQueue {code:java} CSQueue q = this.getQueue(queueName); if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass( { {code} q can be null if queueName is ambiguous (or queue doesn't exist); CapacitySchedulerQueueManager#determineMissingParents: 1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency). Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit. 2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation. 3) And here is some super weird edge case: In legacy CS queue naming (before the era of the multiple leaf queue with the same name), paths like parent.leaf did exist. Which means you might have a queue like, root.something
[jira] [Comment Edited] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299260#comment-17299260 ] Gergely Pollak edited comment on YARN-10571 at 3/11/21, 3:15 AM: - It's in the original code as well, but we can have a NPE here: CapacitySchedulerQueueManager#removeLegacyDynamicQueue {code:java} CSQueue q = this.getQueue(queueName); if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass( { {code} q can be null if queueName is ambiguous (or queue doesn't exist); CapacitySchedulerQueueManager#determineMissingParents: 1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency). Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit. 2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation. 3) And here is some super weird edge case: In legacy CS queue naming (before the era of the multiple leaf queue with the same name), paths like parent.leaf did exist. Which means you might have a queue like, root.something.parent.leaf, and the parent.leaf reference would still find your leaf. However now we might have problems with ambiguity as the 'parent' portion of the path might become ambiguous, so the following can happen: - CapacitySchedulerQueueManager minds his business - A wild call to createQueue with queue (parent.leaf) appears {code:java} Line 531 CSQueue parentQueue = getQueue(parentQueueName); determines it does not exists (because parent is ambiguous){code} - CreateQueue uses {code:java} Line 537 return createAutoQueue(queue); //It's not very effective{code} - CreateAutoQueue tries to determine the missing parents {code:java} 565 ApplicationPlacementContext queueCandidateContext = parentContext; 566 CSQueue firstExistingQueue = getQueue( 567 queueCandidateContext.getFullQueuePath()); //firstExistingQueue = null, because parent IS ambiguous 559 public List determineMissingParents( 560 ApplicationPlacementContext queue) throws SchedulerDynamicEditException { //queue parent: 'parent' leaf: 'leaf' fullPath: 'parent.leaf' 561 ApplicationPlacementContext parentContext = 562 CSQueueUtils.extractQueuePath(queue.getParentQueue()); //parentContext parent: null leaf: 'parent' fullPath: 'parent' 563 List parentsToCreate = new ArrayList<>(); 569 while (firstExistingQueue == null) { 570 parentsToCreate.add(queueCandidateContext); //parent added to list 571 queueCandidateContext = CSQueueUtils.extractQueuePath( 572 queueCandidateContext.getParentQueue()); //NPE because queueCandidateContext.getParentQueue() == null{code} was (Author: shuzirra): It's in the original code as well, but we can have a NPE here: CapacitySchedulerQueueManager#removeLegacyDynamicQueue {code:java} CSQueue q = this.getQueue(queueName); if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass( { {code} q can be null if queueName is ambiguous (or queue doesn't exist); CapacitySchedulerQueueManager#determineMissingParents: 1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency). Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit. 2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation. 3) And here is some super weird edge case: In legacy CS queue naming (before the era of the multiple leaf queue with the same name), paths like parent.leaf did exist. Which means you might have a queue like, root.something.par
[jira] [Commented] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299260#comment-17299260 ] Gergely Pollak commented on YARN-10571: --- It's in the original code as well, but we can have a NPE here: CapacitySchedulerQueueManager#removeLegacyDynamicQueue {code:java} CSQueue q = this.getQueue(queueName); if (!(AbstractAutoCreatedLeafQueue.class.isAssignableFrom(q.getClass( { {code} q can be null if queueName is ambiguous (or queue doesn't exist); CapacitySchedulerQueueManager#determineMissingParents: 1) My main concern is we create some unnecessary garbage here, each time when we move up in the path we create a new instance of an ApplicationPlacementContext via the CSQueueUtils.extractQueuePath, just to have a parent path. It would be much more cost (both CPU and memory) efficient if we'd just split the path, and build from the root towards the leaf via eg a StringBuilder (you can even set the capacity for it, since you know the final length of the string, for maximal efficiency). Building from the root towards the leaf also makes it possible to find the first (or in this case the last) static parent as well, and also you can make an early exception based on the depthLimit if the lastStatic is further than the limit. 2) Instead of the Collections.reverse you might want to use a LinkedList and just simply insert to the head, this is not a big deal, but still unnecessary computation. 3) And here is some super weird edge case: In legacy CS queue naming (before the era of the multiple leaf queue with the same name), paths like parent.leaf did exist. Which means you might have a queue like, root.something.parent.leaf, and the parent.leaf reference would still find your leaf. However now we might have problems with ambiguity as the 'parent' portion of the path might become ambiguous, so the following can happen: - CapacitySchedulerQueueManager minds his business - A wild call to createQueue with queue (parent.leaf) appears {code:java} Line 531 CSQueue parentQueue = getQueue(parentQueueName); determines it does not exists (because parent is ambiguous){code} - CreateQueue uses {code:java} Line 537 return createAutoQueue(queue); //It's not very effective{code} - CreateAutoQueue tries to determine the missing parents {code:java} 565 ApplicationPlacementContext queueCandidateContext = parentContext; 566 CSQueue firstExistingQueue = getQueue( 567 queueCandidateContext.getFullQueuePath()); //firstExistingQueue = null, because parent IS ambiguous 559 public List determineMissingParents( 560 ApplicationPlacementContext queue) throws SchedulerDynamicEditException { //queue parent: 'parent' leaf: 'leaf' fullPath: 'parent.leaf' 561 ApplicationPlacementContext parentContext = 562 CSQueueUtils.extractQueuePath(queue.getParentQueue()); //parentContext parent: null leaf: 'parent' fullPath: 'parent' 563 List parentsToCreate = new ArrayList<>(); 569 while (firstExistingQueue == null) { 570 parentsToCreate.add(queueCandidateContext); //parent added to list 571 queueCandidateContext = CSQueueUtils.extractQueuePath( 572 queueCandidateContext.getParentQueue()); //NPE because queueCandidateContext.getParentQueue() == null {code} > Refactor dynamic queue handling logic > - > > Key: YARN-10571 > URL: https://issues.apache.org/jira/browse/YARN-10571 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Minor > Attachments: YARN-10571.001.patch > > > As per YARN-10506 we have introduced an other mode for auto queue creation > and a new class, which handles it. We should move the old, managed queue > related logic to CSAutoQueueHandler as well, and do additional cleanup > regarding queue management. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10659) Improve CS MappingRule %secondary_group evaluation
[ https://issues.apache.org/jira/browse/YARN-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10659: -- Attachment: YARN-10659.001.patch > Improve CS MappingRule %secondary_group evaluation > -- > > Key: YARN-10659 > URL: https://issues.apache.org/jira/browse/YARN-10659 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10659.001.patch > > > Since the leaf queue names are not unique, there are a lot of use cases where > %secondary_group evaluation fail, or behave inconsistently. > We should extend it's behavior, when it's under a defined parent, > %secondary_group evaluation should only check for queue existence under that > queue. Egy root.group.%secondary_group, should only evaluate to groups which > exist under root.group, while the legacy %secondary_group.%user should still > look for groups by their leaf name globally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10679) Better logging of uncaught exceptions throughout SLS
[ https://issues.apache.org/jira/browse/YARN-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298002#comment-17298002 ] Gergely Pollak commented on YARN-10679: --- [~snemeth] thank you for the patch, LGTM+1 (Non-binding) > Better logging of uncaught exceptions throughout SLS > > > Key: YARN-10679 > URL: https://issues.apache.org/jira/browse/YARN-10679 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-10679.001.patch > > > In our internal environment, there was a test failure while running SLS tests > with Jenkins. > It's difficult to align the uncaught exceptions (in this case an NPE) and the > log itself as the exception is logged with {{e.printStackTrace()}}. > This jira is to replace printStackTrace calls in SLS with {{LOG.error("msg", > exception)}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10681) Fix assertion failure message in BaseSLSRunnerTest
[ https://issues.apache.org/jira/browse/YARN-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298000#comment-17298000 ] Gergely Pollak commented on YARN-10681: --- [~snemeth] thank you for the patch LGTM+1(Non-binding) > Fix assertion failure message in BaseSLSRunnerTest > -- > > Key: YARN-10681 > URL: https://issues.apache.org/jira/browse/YARN-10681 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Trivial > Attachments: YARN-10681.001.patch > > > There is this failure message: > https://github.com/apache/hadoop/blob/a89ca56a1b0eb949f56e7c6c5c25fdf87914a02f/hadoop-tools/hadoop-sls/src/test/java/org/apache/hadoop/yarn/sls/BaseSLSRunnerTest.java#L129-L130 > The word "catched" should be replaced with "caught". -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10678) Try blocks without catch blocks in SLS scheduler classes can swallow other exceptions
[ https://issues.apache.org/jira/browse/YARN-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297999#comment-17297999 ] Gergely Pollak commented on YARN-10678: --- [~snemeth] thank you for the patch, LGTM+1 (Non-binding) > Try blocks without catch blocks in SLS scheduler classes can swallow other > exceptions > - > > Key: YARN-10678 > URL: https://issues.apache.org/jira/browse/YARN-10678 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-10678-unchecked-exception-from-FS-allocate.diff, > YARN-10678-unchecked-exception-from-FS-allocate_fixed.diff, > YARN-10678.001.patch, > org.apache.hadoop.yarn.sls.TestReservationSystemInvariants__testSimulatorRunning_modified.log, > > org.apache.hadoop.yarn.sls.TestReservationSystemInvariants__testSimulatorRunning_original.log > > > In SLSFairScheduler, we have this try-finally block (without catch block) in > the allocate method: > https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L109-L123 > Similarly, in SLSCapacityScheduler: > https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java#L116-L131 > In the finally block, the updateQueueWithAllocateRequest is invoked: > https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L118 > In our internal environment, there was a situation when an NPE was logged > from this method: > {code} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.sls.scheduler.SLSFairScheduler.updateQueueWithAllocateRequest(SLSFairScheduler.java:262) > at > org.apache.hadoop.yarn.sls.scheduler.SLSFairScheduler.allocate(SLSFairScheduler.java:118) > at > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:288) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75) > at > org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:436) > at > org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352) > at > org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) > at > org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348) > at > org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212) > at > org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} > This can happen if the following events occur: > 1. A runtime exception is thrown in FairScheduler or CapacityScheduler's > allocate method > 2. In this case, the local variable called 'allocation' remains null: > https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L110 > 3. In updateQueueWithAllocateRequest, this null object will be dereferenced > here: > https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L262 > 4. Then, we have an NPE here: > https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L117-L122 > In this case, we lost the original exception thrown from > FairScheduler#allocate. > In order to fix this, a catch-block should be introduced and the exception > needs to be logged. > The whole thing applies to SLSCapacityScheduler as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (YARN-10676) Improve code quality in TestTimelineAuthenticationFilterForV1
[ https://issues.apache.org/jira/browse/YARN-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296062#comment-17296062 ] Gergely Pollak commented on YARN-10676: --- [~snemeth] thank you for the patch, LGTM+1 (Non-binding) > Improve code quality in TestTimelineAuthenticationFilterForV1 > - > > Key: YARN-10676 > URL: https://issues.apache.org/jira/browse/YARN-10676 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: YARN-10676.001.patch > > > - In testcase "testDelegationTokenOperations", the exception message is > checked but in case it does not match the assertion, the exception is not > printed. This happens 3 times. > - Assertion messages can be added > - Fields called "httpSpnegoKeytabFile" and "httpSpnegoPrincipal" can be > static final. > - There's a typo in comment "avaiable" (happens 2 times) > - There are some Assert.fail() calls, without messages. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10659) Improve CS MappingRule %secondary_group evaluation
[ https://issues.apache.org/jira/browse/YARN-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak reassigned YARN-10659: - Assignee: Gergely Pollak > Improve CS MappingRule %secondary_group evaluation > -- > > Key: YARN-10659 > URL: https://issues.apache.org/jira/browse/YARN-10659 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > Since the leaf queue names are not unique, there are a lot of use cases where > %secondary_group evaluation fail, or behave inconsistently. > We should extend it's behavior, when it's under a defined parent, > %secondary_group evaluation should only check for queue existence under that > queue. Egy root.group.%secondary_group, should only evaluate to groups which > exist under root.group, while the legacy %secondary_group.%user should still > look for groups by their leaf name globally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10659) Improve CS MappingRule %secondary_group evaluation
Gergely Pollak created YARN-10659: - Summary: Improve CS MappingRule %secondary_group evaluation Key: YARN-10659 URL: https://issues.apache.org/jira/browse/YARN-10659 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak Since the leaf queue names are not unique, there are a lot of use cases where %secondary_group evaluation fail, or behave inconsistently. We should extend it's behavior, when it's under a defined parent, %secondary_group evaluation should only check for queue existence under that queue. Egy root.group.%secondary_group, should only evaluate to groups which exist under root.group, while the legacy %secondary_group.%user should still look for groups by their leaf name globally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10654) Dots '.' in CSMappingRule path variables should be replaced
[ https://issues.apache.org/jira/browse/YARN-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak reassigned YARN-10654: - Assignee: Gergely Pollak > Dots '.' in CSMappingRule path variables should be replaced > --- > > Key: YARN-10654 > URL: https://issues.apache.org/jira/browse/YARN-10654 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > Dots are used as separators, so we should escape them somehow in the > variables when substituting them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291614#comment-17291614 ] Gergely Pollak commented on YARN-10652: --- The problem here is the consistency. There are multiple places where user names are used in queue path parts, mapping rules is one example. If we don't handle user names with dots properly, then we cannot say we support user names with dots. We need to handle in each and every case, which means we need a more centralized or at least consistent solution. Simply replacing the dots with underscores for this property won't solve the issue CS wide, and we need a CS wide solution, which might consist of multiple separate smaller solutions like this, but we need to centralize this effort. For example as soon someone introduces a rule like root.user.%user, we already have an issue with user names with dots, and if we handle them differently here, than your solution, we might get inconsistent configuration and behavior. Also we need to check other places where usernames with dots (and group names) can cause issues. Also I prefer the FS solution for substitution with '_dot_' rather than a simple '_' since there is a much smaller chance of user name collision this way. > Capacity Scheduler fails to handle user weights for a user that has a "." > (dot) in it > - > > Key: YARN-10652 > URL: https://issues.apache.org/jira/browse/YARN-10652 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Siddharth Ahuja >Assignee: Siddharth Ahuja >Priority: Major > Attachments: Correct user weight of 0.76 picked up for the user with > a dot after the patch.png, Incorrect default user weight of 1.0 being picked > for the user with a dot before the patch.png, YARN-10652.001.patch > > > AD usernames can have a "." (dot) in them i.e. they can be of the format -> > {{firstname.lastname}}. However, if you specify a username with this format > against the Capacity Scheduler setting -> > {{yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight}}, > it fails to be applied and is instead assigned the default of 1.0f weight. > This renders the user weight feature (being used as a means of setting user > priorities for a queue) unusable for such users. > This limitation comes from [1]. From [1], only word characters (A word > character: [a-zA-Z_0-9]) (see [2]) are permissible at the moment which is no > good for AD names that contain a "." (dot). > Similar discussion has been had in a few HADOOP jiras e.g. HADOOP-7050 and > HADOOP-15395 and the outcome was to use non-whitespace characters i.e. > instead of {{\w+}}, use {{\S+}}. > We could go down similar path and unblock this feature for the AD usernames > with a "." (dot) in them. > [1] > https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1953 > [2] > https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10654) Dots '.' in CSMappingRule path variables should be replaced
Gergely Pollak created YARN-10654: - Summary: Dots '.' in CSMappingRule path variables should be replaced Key: YARN-10654 URL: https://issues.apache.org/jira/browse/YARN-10654 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak Dots are used as separators, so we should escape them somehow in the variables when substituting them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10639) Queueinfo related capacity, should ajusted to weight mode.
[ https://issues.apache.org/jira/browse/YARN-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290022#comment-17290022 ] Gergely Pollak commented on YARN-10639: --- While this fix fills the capacity field in the case of a weight configured queue, I think we should investigate if we could painlessly add a weight field to the queue info and fill it rather. The problem is I've check the usages of the capacity field of the QueueInfo, and while it is mostly used for display purposes, it is considered a percentage value, and parsed as such, so it might cause confusion or even worse it might result in bogus behaviour. > Queueinfo related capacity, should ajusted to weight mode. > -- > > Key: YARN-10639 > URL: https://issues.apache.org/jira/browse/YARN-10639 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10639.001.patch, YARN-10639.002.patch > > > {color:#172b4d}The class QueueInfo capacity field should consider the weight > mode.{color} > {color:#172b4d}Now when client use getQueueInfo to get queue capacity in > weight mode, i always return 0, it is wrong.{color} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10623) Capacity scheduler should support refresh queue automatically by a thread policy.
[ https://issues.apache.org/jira/browse/YARN-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290013#comment-17290013 ] Gergely Pollak commented on YARN-10623: --- [~zhuqi] thank you for the patch! I really like the idea that the solution is a configurable extra component (scheduling edit policy), this way it is cleanly decoupled from the main code base, won't cause any issue for those who don't want to use it. I very much agree with [~gandras] that last modification check was necessary however I think it introduced a bug: {code:java} 126if (lastModified > lastSuccessfulReload && 127time > lastModified + monitoringInterval) { {code} So in the edge case when you update this file MORE frequently than the monitoringInterval (which is configurable, so it might be in the minute or even 10 minutes range), then you won't ever refresh the config, since last modified + monitoringInterval will be ALWAYS greater than the current time. I think you should go with {code:java} 126 if (lastModified > lastSuccessfulReload && 127 time > lastSuccessfulReload + monitoringInterval) { {code} Or even better introduce a lastReloadAttempt, since a reload can fail, and in this case it would result keep trying to reload the invalid configuration, so if you'd introduce a lastReloadAttempt and set it each time before you try to reload the configuration, then you could use {code:java} 126 if (lastModified > lastReloadAttempt && 127 time > lastReloadAttempt + monitoringInterval) { {code} This would guarantee that you don't reload more frequently than the monitoringInterval, you don't reload if the configuration hasn't been modified, but still keep reloading if the file is updated frequently. Otherwise the patch looks good to me (non-binding). > Capacity scheduler should support refresh queue automatically by a thread > policy. > - > > Key: YARN-10623 > URL: https://issues.apache.org/jira/browse/YARN-10623 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10623.001.patch, YARN-10623.002.patch, > YARN-10623.003.patch > > > In fair scheduler, it is supported that refresh queue related conf > automatically by a thread to reload, but in capacity scheduler we only > support to refresh queue related changes by refreshQueues, it is needed for > our cluster to realize queue manage. > cc [~wangda] [~ztang] [~pbacsko] [~snemeth] [~gandras] [~bteke] [~shuzirra] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10632) Make maximum depth allowed configurable.
[ https://issues.apache.org/jira/browse/YARN-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287256#comment-17287256 ] Gergely Pollak commented on YARN-10632: --- Also please note that this has an effect on the placement engine as well, and currently the path validation methods are separated, so even if CS accepts deeper queue creations, placement rules wouldn't. First we need to centralize the validation for the queue paths to be created (YARN-10584), and also we need to make a few adjustments in the placement engine in order to make this work. > Make maximum depth allowed configurable. > > > Key: YARN-10632 > URL: https://issues.apache.org/jira/browse/YARN-10632 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10632.001.patch > > > Now the max depth allowed are fixed to 2. But i think this should be > configurable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10636) CS Auto Queue creation should reject submissions with empty path parts
[ https://issues.apache.org/jira/browse/YARN-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10636: -- Attachment: YARN-10636.002.patch > CS Auto Queue creation should reject submissions with empty path parts > -- > > Key: YARN-10636 > URL: https://issues.apache.org/jira/browse/YARN-10636 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10636.001.patch, YARN-10636.002.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10636) CS Auto Queue creation should reject submissions with empty path parts
[ https://issues.apache.org/jira/browse/YARN-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287056#comment-17287056 ] Gergely Pollak commented on YARN-10636: --- Thank you for the feedbacks [~zhuqi] [~gandras] [~bteke] much appreciated! [~bteke] [~gandras] I'd suggest to create a followup Jira for test refinements. Fixed the checkstyle issue and the redundant null check in patchset 2. Test failures seem to be unrelated, manually ran them, all passed. > CS Auto Queue creation should reject submissions with empty path parts > -- > > Key: YARN-10636 > URL: https://issues.apache.org/jira/browse/YARN-10636 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10636.001.patch, YARN-10636.002.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10414) Improve logging in CS placement rules to help debuggability
[ https://issues.apache.org/jira/browse/YARN-10414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287000#comment-17287000 ] Gergely Pollak commented on YARN-10414: --- A message improvement suggestion was provided in: https://issues.apache.org/jira/browse/YARN-10635 > Improve logging in CS placement rules to help debuggability > --- > > Key: YARN-10414 > URL: https://issues.apache.org/jira/browse/YARN-10414 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > It should be always clear which MappingRule determined the placement (or > rejection) of the application, in the case of debug logging the whys should > also be more visible. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10636) CS Auto Queue creation should reject submissions with empty path parts
[ https://issues.apache.org/jira/browse/YARN-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak reassigned YARN-10636: - Assignee: Gergely Pollak > CS Auto Queue creation should reject submissions with empty path parts > -- > > Key: YARN-10636 > URL: https://issues.apache.org/jira/browse/YARN-10636 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10636) CS Auto Queue creation should reject submissions with empty path parts
Gergely Pollak created YARN-10636: - Summary: CS Auto Queue creation should reject submissions with empty path parts Key: YARN-10636 URL: https://issues.apache.org/jira/browse/YARN-10636 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10635) CSMapping rule can return paths with empty parts
[ https://issues.apache.org/jira/browse/YARN-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286816#comment-17286816 ] Gergely Pollak commented on YARN-10635: --- Reuploading to see if the yetus issue have been resolved. > CSMapping rule can return paths with empty parts > > > Key: YARN-10635 > URL: https://issues.apache.org/jira/browse/YARN-10635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10635.001.patch, YARN-10635.002.patch, > YARN-10635.003.patch > > > When a variable to be substituted evaluates to empty string, we might result > with paths where one of the parts is empty, these paths are obviously > problematic, but sometimes (when the path includes a dynamicParent) we accept > them as valid paths instead of getting the fallback action of the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10635) CSMapping rule can return paths with empty parts
[ https://issues.apache.org/jira/browse/YARN-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10635: -- Attachment: YARN-10635.003.patch > CSMapping rule can return paths with empty parts > > > Key: YARN-10635 > URL: https://issues.apache.org/jira/browse/YARN-10635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10635.001.patch, YARN-10635.002.patch, > YARN-10635.003.patch > > > When a variable to be substituted evaluates to empty string, we might result > with paths where one of the parts is empty, these paths are obviously > problematic, but sometimes (when the path includes a dynamicParent) we accept > them as valid paths instead of getting the fallback action of the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10635) CSMapping rule can return paths with empty parts
[ https://issues.apache.org/jira/browse/YARN-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286671#comment-17286671 ] Gergely Pollak commented on YARN-10635: --- Seems to be a YESTUS issue, reuploading to see if the issue persists. > CSMapping rule can return paths with empty parts > > > Key: YARN-10635 > URL: https://issues.apache.org/jira/browse/YARN-10635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10635.001.patch, YARN-10635.002.patch > > > When a variable to be substituted evaluates to empty string, we might result > with paths where one of the parts is empty, these paths are obviously > problematic, but sometimes (when the path includes a dynamicParent) we accept > them as valid paths instead of getting the fallback action of the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10635) CSMapping rule can return paths with empty parts
[ https://issues.apache.org/jira/browse/YARN-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286671#comment-17286671 ] Gergely Pollak edited comment on YARN-10635 at 2/18/21, 6:58 PM: - Seems to be a YETUS issue, reuploading to see if the issue persists. was (Author: shuzirra): Seems to be a YESTUS issue, reuploading to see if the issue persists. > CSMapping rule can return paths with empty parts > > > Key: YARN-10635 > URL: https://issues.apache.org/jira/browse/YARN-10635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10635.001.patch, YARN-10635.002.patch > > > When a variable to be substituted evaluates to empty string, we might result > with paths where one of the parts is empty, these paths are obviously > problematic, but sometimes (when the path includes a dynamicParent) we accept > them as valid paths instead of getting the fallback action of the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10635) CSMapping rule can return paths with empty parts
[ https://issues.apache.org/jira/browse/YARN-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10635: -- Attachment: YARN-10635.002.patch > CSMapping rule can return paths with empty parts > > > Key: YARN-10635 > URL: https://issues.apache.org/jira/browse/YARN-10635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10635.001.patch, YARN-10635.002.patch > > > When a variable to be substituted evaluates to empty string, we might result > with paths where one of the parts is empty, these paths are obviously > problematic, but sometimes (when the path includes a dynamicParent) we accept > them as valid paths instead of getting the fallback action of the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10635) CSMapping rule can return paths with empty parts
[ https://issues.apache.org/jira/browse/YARN-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10635: -- Attachment: YARN-10635.001.patch > CSMapping rule can return paths with empty parts > > > Key: YARN-10635 > URL: https://issues.apache.org/jira/browse/YARN-10635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10635.001.patch > > > When a variable to be substituted evaluates to empty string, we might result > with paths where one of the parts is empty, these paths are obviously > problematic, but sometimes (when the path includes a dynamicParent) we accept > them as valid paths instead of getting the fallback action of the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10635) CSMapping rule can return paths with empty parts
[ https://issues.apache.org/jira/browse/YARN-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak reassigned YARN-10635: - Assignee: Gergely Pollak > CSMapping rule can return paths with empty parts > > > Key: YARN-10635 > URL: https://issues.apache.org/jira/browse/YARN-10635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > When a variable to be substituted evaluates to empty string, we might result > with paths where one of the parts is empty, these paths are obviously > problematic, but sometimes (when the path includes a dynamicParent) we accept > them as valid paths instead of getting the fallback action of the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10635) CSMapping rule can return paths with empty parts
Gergely Pollak created YARN-10635: - Summary: CSMapping rule can return paths with empty parts Key: YARN-10635 URL: https://issues.apache.org/jira/browse/YARN-10635 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak When a variable to be substituted evaluates to empty string, we might result with paths where one of the parts is empty, these paths are obviously problematic, but sometimes (when the path includes a dynamicParent) we accept them as valid paths instead of getting the fallback action of the rule. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10618) RM UI2 Application page shows the AM preempted containers instead of the nonAM ones
[ https://issues.apache.org/jira/browse/YARN-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282642#comment-17282642 ] Gergely Pollak commented on YARN-10618: --- [~bteke] thank you for the patch it is quite straightforward LGTM+1 (non-binding). > RM UI2 Application page shows the AM preempted containers instead of the > nonAM ones > --- > > Key: YARN-10618 > URL: https://issues.apache.org/jira/browse/YARN-10618 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Minor > Attachments: YARN-10618.001.patch > > > YARN RM UIv2 application page shows the AM preempted containers under both > the _Num Non-AM container preempted_ and _Num AM container preempted_. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10619) CS Mapping Rule %specified rule catches default submissions
Gergely Pollak created YARN-10619: - Summary: CS Mapping Rule %specified rule catches default submissions Key: YARN-10619 URL: https://issues.apache.org/jira/browse/YARN-10619 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak Assignee: Gergely Pollak If we have a mapping rule which places the application to the %specified queue, then application submissions without specified queues (default) will get placed to default. The expected behaviour would be to fail the specified placement when no queue was specified, and move on or reject based on the fallback action of the rule. Also it is impossible to differentiate between explicitly specified 'default' and when the user does not specify any actual queue, so these will be handled the same. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10612) Fix find bugs issue introduced in YARN-10585
[ https://issues.apache.org/jira/browse/YARN-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278453#comment-17278453 ] Gergely Pollak edited comment on YARN-10612 at 2/4/21, 1:25 AM: Trunk compile shows the findbugs error, patch compile doesn't so I think we can consider it fixed. The reason we don't have any tests for this change is the actual findbugs warning was caused by an unnecessary null check, and null inputs already have test cases. Console log relevant part: {code:java} findbugs detection: patch ... Writing /home/jenkins/jenkins-agent/workspace/PreCommit-YARN-Build/out/combined-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.xmlhadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {code} was (Author: shuzirra): Trunk compile shows the findbugs error, patch compile doesn't so I think we can consider it fixed. The reason we don't have any tests for this change is the actual findbugs warning was caused by an unnecessary null check, and null inputs already have test cases. > Fix find bugs issue introduced in YARN-10585 > > > Key: YARN-10612 > URL: https://issues.apache.org/jira/browse/YARN-10612 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10612.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10612) Fix find bugs issue introduced in YARN-10585
[ https://issues.apache.org/jira/browse/YARN-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278453#comment-17278453 ] Gergely Pollak edited comment on YARN-10612 at 2/4/21, 1:18 AM: Trunk compile shows the findbugs error, patch compile doesn't so I think we can consider it fixed. The reason we don't have any tests for this change is the actual findbugs warning was caused by an unnecessary null check, and null inputs already have test cases. was (Author: shuzirra): Trunk compile shows the findbugs error, patch compile doesn't so I think we can consider it fixed. The reason we don't have any tests for this change is the actual findbugs warning was caused by an unnecessary null check, and null inputs already have test cases. > Fix find bugs issue introduced in YARN-10585 > > > Key: YARN-10612 > URL: https://issues.apache.org/jira/browse/YARN-10612 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10612.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10612) Fix find bugs issue introduced in YARN-10585
[ https://issues.apache.org/jira/browse/YARN-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278453#comment-17278453 ] Gergely Pollak commented on YARN-10612: --- Trunk compile shows the findbugs error, patch compile doesn't so I think we can consider it fixed. The reason we don't have any tests for this change is the actual findbugs warning was caused by an unnecessary null check, and null inputs already have test cases. > Fix find bugs issue introduced in YARN-10585 > > > Key: YARN-10612 > URL: https://issues.apache.org/jira/browse/YARN-10612 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10612.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10585) Create a class which can convert from legacy mapping rule format to the new JSON format
[ https://issues.apache.org/jira/browse/YARN-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278360#comment-17278360 ] Gergely Pollak commented on YARN-10585: --- [~ahussein] thank you for your suggestions. {code:java} For future code mergse and commits, please make sure that the patch/PR does not generate Yetus errors before merging. {code} While I'm really sorry for the oversight, mistakes unfortunately do happen, as soon we realized it we opened an other Jira to fix it. Please let's not argue about what are the "proper way to fix things" but rather let's focus on actually fixing the thing. {code:java} It is not scalable to have several Jiras filed just to fix checkstyle, and findbugs. {code} It has nothing to do with scalability, also we are not planning to make it a habit, but I think it's more important to get this resolved than to worry about one extra JIRA. > Create a class which can convert from legacy mapping rule format to the new > JSON format > --- > > Key: YARN-10585 > URL: https://issues.apache.org/jira/browse/YARN-10585 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-10585.001.patch, YARN-10585.002.patch, > YARN-10585.003.patch > > > To make transition easier we need to create tooling to support the migration > effort. The first step is to create a class which can migrate from legacy to > the new JSON format. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10612) Fix find bugs issue introduced in YARN-10585
[ https://issues.apache.org/jira/browse/YARN-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278354#comment-17278354 ] Gergely Pollak commented on YARN-10612: --- [~ahussein] I don't see how it will make anything better if we reopen an other jira. Currently we have the patch which fails the tests in the trunk, so to remove it we either have to do a revert commit, then a recommit. But the commits between the actual commit and the revert will still fail the findbugs warning (which is a false positive I might add). So I don't see why would 2 commits (revert + fix) would be better than just fixing this Jira and solve all in one go. Anyways, until it settles, I'm setting it to patch available to let the jenkins run the findbugs again. > Fix find bugs issue introduced in YARN-10585 > > > Key: YARN-10612 > URL: https://issues.apache.org/jira/browse/YARN-10612 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10612.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10612) Fix find bugs issue introduced in YARN-10585
[ https://issues.apache.org/jira/browse/YARN-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak reassigned YARN-10612: - Assignee: Gergely Pollak > Fix find bugs issue introduced in YARN-10585 > > > Key: YARN-10612 > URL: https://issues.apache.org/jira/browse/YARN-10612 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10612.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10612) Fix find bugs issue introduced in YARN-10585
[ https://issues.apache.org/jira/browse/YARN-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10612: -- Attachment: YARN-10612.001.patch > Fix find bugs issue introduced in YARN-10585 > > > Key: YARN-10612 > URL: https://issues.apache.org/jira/browse/YARN-10612 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Priority: Major > Attachments: YARN-10612.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10612) Fix find bugs issue introduced in YARN-10585
Gergely Pollak created YARN-10612: - Summary: Fix find bugs issue introduced in YARN-10585 Key: YARN-10612 URL: https://issues.apache.org/jira/browse/YARN-10612 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10610) Add queuePath to restful api for CapacityScheduler consistent with FairScheduler queuePath.
[ https://issues.apache.org/jira/browse/YARN-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277968#comment-17277968 ] Gergely Pollak commented on YARN-10610: --- [~zhuqi] thank you for the patch! Nice change, when we introduced the leaf queue / full path change we intentionally did not change any external interfaces to make sure we don't break any tools relying on these interfaces, however this change is very clean, and only extends the response, so there should be no such issue with it! LGTM +1 (Non-binding) > Add queuePath to restful api for CapacityScheduler consistent with > FairScheduler queuePath. > --- > > Key: YARN-10610 > URL: https://issues.apache.org/jira/browse/YARN-10610 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10610.001.patch, image-2021-02-03-13-47-13-516.png > > > The cs only have queueName, but not full queuePath. > !image-2021-02-03-13-47-13-516.png|width=631,height=356! > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10604) Support auto queue creation without mapping rules
[ https://issues.apache.org/jira/browse/YARN-10604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276351#comment-17276351 ] Gergely Pollak commented on YARN-10604: --- [~gandras]thank you for the patch, LGTM+1 (Non-binding) > Support auto queue creation without mapping rules > - > > Key: YARN-10604 > URL: https://issues.apache.org/jira/browse/YARN-10604 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10604.001.patch > > > Currently, the Capacity Scheduler skips auto queue creation entirely, if the > ApplicationPlacementContext is null, which happens, when the mapping rules > are turned off by: > {noformat} > > yarn.scheduler.capacity.queue-mappings-override.enable > false > {noformat} > We should allow the auto queue creation to be taken into consideration > without disrupting the application submission flow. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10605) Add queue-mappings-override.enable property in FS2CS conversions
[ https://issues.apache.org/jira/browse/YARN-10605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276347#comment-17276347 ] Gergely Pollak commented on YARN-10605: --- [~gandras]Thank you for the patch! The only thing I've noticed you import import static org.junit.Assert.assertFalse, but you don't use it. Otherwise LGTM+1 (Non-binding) > Add queue-mappings-override.enable property in FS2CS conversions > > > Key: YARN-10605 > URL: https://issues.apache.org/jira/browse/YARN-10605 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10605.001.patch > > > In Capacity Scheduler the > {noformat} > queue-mappings-override.enable > {noformat} > property is false by default. As this is not set during an FS2CS conversion, > the converted placement rules (aka. mapping rules in CS) are ignored during > application submission. We should enable this property in the conversion > logic if there are placement rules to be converted. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10585) Create a class which can convert from legacy mapping rule format to the new JSON format
[ https://issues.apache.org/jira/browse/YARN-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272072#comment-17272072 ] Gergely Pollak commented on YARN-10585: --- Latest patch fixed [~gandras] "userGroupMappingRules and applicationNameMappingRules might be initialised as empty sets" concern, now the null check is centralised and moved to the setters. Also fixed ASF warnings, and added a testcase for the "u:%user:root.%user.QUEUE" case. > Create a class which can convert from legacy mapping rule format to the new > JSON format > --- > > Key: YARN-10585 > URL: https://issues.apache.org/jira/browse/YARN-10585 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10585.001.patch, YARN-10585.002.patch, > YARN-10585.003.patch > > > To make transition easier we need to create tooling to support the migration > effort. The first step is to create a class which can migrate from legacy to > the new JSON format. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10585) Create a class which can convert from legacy mapping rule format to the new JSON format
[ https://issues.apache.org/jira/browse/YARN-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10585: -- Attachment: YARN-10585.003.patch > Create a class which can convert from legacy mapping rule format to the new > JSON format > --- > > Key: YARN-10585 > URL: https://issues.apache.org/jira/browse/YARN-10585 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10585.001.patch, YARN-10585.002.patch, > YARN-10585.003.patch > > > To make transition easier we need to create tooling to support the migration > effort. The first step is to create a class which can migrate from legacy to > the new JSON format. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10425) Replace the legacy placement engine in CS with the new one
[ https://issues.apache.org/jira/browse/YARN-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271331#comment-17271331 ] Gergely Pollak commented on YARN-10425: --- [~ahussein]Thank you for the feedback, I've created the followup Jira YARN-10597 . You are right, this shouldn't instantiated by the {{CSMappingPlacementRule}} class, because it will break the cache. The oversight was when I checked it, I thought the legacy code worked this way, and that's why I kept it here. However now I've double checked, and I realized, I must had checked an older branch last time, since it was fixed about a year ago. So thank you for pointing this out. We'll need to update a few tests, but I'll take a look at them. > Replace the legacy placement engine in CS with the new one > -- > > Key: YARN-10425 > URL: https://issues.apache.org/jira/browse/YARN-10425 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-10425.001.patch, YARN-10425.002.patch, > YARN-10425.003.patch, YARN-10425.004.patch, YARN-10425.005.patch, > YARN-10425.006.patch, YARN-10425.007.patch > > > Remove the UserGroupMapping and ApplicationName mapping classes, and use the > new CSMappingPlacementRule instead. Also cleanup the orphan classes which are > used by these classes only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups
[ https://issues.apache.org/jira/browse/YARN-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak reassigned YARN-10597: - Assignee: Gergely Pollak > CSMappingPlacementRule should not create new instance of Groups > --- > > Key: YARN-10597 > URL: https://issues.apache.org/jira/browse/YARN-10597 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > As [~ahussein] pointed out in YARN-10425, no new Groups instance should be > created. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups
Gergely Pollak created YARN-10597: - Summary: CSMappingPlacementRule should not create new instance of Groups Key: YARN-10597 URL: https://issues.apache.org/jira/browse/YARN-10597 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak As [~ahussein] pointed out in YARN-10425, no new Groups instance should be created. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10585) Create a class which can convert from legacy mapping rule format to the new JSON format
[ https://issues.apache.org/jira/browse/YARN-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271297#comment-17271297 ] Gergely Pollak edited comment on YARN-10585 at 1/25/21, 1:01 PM: - [~gandras] thank you for the review and your comments, let me address your concerns. * Nit: in splitRule#line:213 the classic for loop could be replaced by the enhanced loop/stream, which is nicer to readI It is VERY subjective what is nice / easy to read, but in this case it's quite hard to contradict this statement. Literally ANY developer who ever wrote a loop, will be able to read a simple for loop, while to understand streams they need some deeper java knowledge, so I think it points out very well, why is the for loop easier (this nicer) to read. But let's get technical a bit, because the overuse of the java streams is a pet peeve of mine. Java streams with lambdas will create an unnecessary (well for streams they ARE necessary) anonymous class in the background, and wrap the actual loop core with multiple method calls, which inherently put extra strain on the stack, gc and on the performance, for a very subjective little gain. {code:java} java.lang.ArithmeticException: / by zero at fiddle.main(fiddle.java:7) java.lang.ArithmeticException: / by zero at fiddle.lambda$main$0(fiddle.java:14) at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110) at java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:581) at fiddle.main(fiddle.java:14){code} We can see the extra method calls and the created class as well. This has an effect on performance as well: {code:java} int sum = 0; for (int i = 0; i < 10; i++) { sum += i; } {code} Runs on my computer about 330ms {code:java} IntStream.range(0, 10).sum();{code} This has a bit of a range between 380-420 ms, but even in best case it is still a 20% performance degradation. So all in all, I prefer to avoid streams, when they are not strictly needed, because they are not to replace regular loops but to add some functional programming support to java. They have their place in the language, for example when one wants to use parallel streams they can really improve the performance, with much less hassle about thread handling and concurrency. * Nit: in creatUserMappingRule, you replace the match argument with its json equivalent (*). Overwriting arguments are generally not a good idea, because it could produce hard-to-find errors, also made the debugging somewhat harder. As a general rule it's true, however I'm converting the argument in place, before using it, it is not a repurpose of the variable, but I'm making sure, the function receives the input it will properly process, in this case I don't really find it problematic, since the goal here is to "hide" the incorrect value from the rest of the method, hence the overwrite. * I am not sure about this, but is the following format accepted in the legacy format: Yeah, it is a complex question. It is NOT supported by the legacy engine, but it works in the new engine with legacy mode... however I've took a look at it, and actually the converter DOES support this conversion, it converts it to a customRule. I don't want to add any validation to the input, besides it's syntax, because the tool will do a best effort conversion for unsupported cases. It should be able to convert EVERY use case supported by the legacy engine, and all cases supported by the new engine in legacy format, however we don't encourage to use the new features in the legacy format. So if someone still uses them we might be able to convert those rules as well (probably they will become custom rules) so I don't want to reject those just because they are not officially supported. * userGroupMappingRules and applicationNameMappingRules might be initialised as empty sets. This would eliminate the null checks in convert loops, but the empty cases could still be checked in the beginning. Since I expose the collection setter method then I'd have to move the null check there, so we don't really save null checks, but at least we don't create unnecessary objects. However I like the idea to have a centralised null check, so I'll see the other feedbacks, and if I'll make a new patch set I'll consider this suggestion. was (Author: shuzirra): [~gandras] thank you for the review and your comments, let me address your concerns. * Nit: in splitRule#line:213 the classic for loop could be replaced by the enhanced loop/stream, which is nicer to readI It is VERY subjective what is nice / easy to read, but in this case it's quite hard to contradict this statement. Literally ANY developer who ever wrote a loop, will be able to read a simple for loop, while to understand streams they need some deeper java knowledg
[jira] [Commented] (YARN-10585) Create a class which can convert from legacy mapping rule format to the new JSON format
[ https://issues.apache.org/jira/browse/YARN-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271297#comment-17271297 ] Gergely Pollak commented on YARN-10585: --- [~gandras] thank you for the review and your comments, let me address your concerns. * Nit: in splitRule#line:213 the classic for loop could be replaced by the enhanced loop/stream, which is nicer to readI It is VERY subjective what is nice / easy to read, but in this case it's quite hard to contradict this statement. Literally ANY developer who ever wrote a loop, will be able to read a simple for loop, while to understand streams they need some deeper java knowledge, so I think it points out very well, why is the for loop easier (this nicer) to read. But let's get technical a bit, because the overuse of the java streams is a pet peeve of mine. Java streams with lambdas will create an unnecessary (well for streams they ARE necessary) anonymous class in the background, and wrap the actual loop core with multiple method calls, which inherently put extra strain on the stack, gc and on the performance, for a very subjective little gain. {code:java} java.lang.ArithmeticException: / by zero at fiddle.main(fiddle.java:7) java.lang.ArithmeticException: / by zero at fiddle.lambda$main$0(fiddle.java:14) at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110) at java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:581) at fiddle.main(fiddle.java:14){code} We can see the extra method calls and the created class as well. This has an effect on performance as well: {code:java} int sum = 0; for (int i = 0; i < 10; i++) { sum += i; } {code} Runs on my computer about 330ms {code:java} IntStream.range(0, 10).sum();{code} This has a bit of a range between 380-420 ms, but even in best case it is still a 20% performance degradation. So all in all, I prefer to avoid streams, when they are not strictly needed, because they are not to replace regular loops but to add some functional programming support to java. They have their place in the language, for example when one wants to use parallel streams they can really improve the performance, with much less hassle about thread handling and concurrency. * Nit: in creatUserMappingRule, you replace the match argument with its json equivalent (*). Overwriting arguments are generally not a good idea, because it could produce hard-to-find errors, also made the debugging somewhat harder. As a general rule it's true, however I'm converting the argument in place, before using it, it is not a repurpose of the variable, but I'm making sure, the function receives the input it will properly process, in this case I don't really find it problematic, since the goal here is to "hide" the incorrect value from the rest of the method, hence the overwrite. * I am not sure about this, but is the following format accepted in the legacy format: Yeah, it is a complex question. It is NOT supported by the legacy engine, but it works in the new engine with legacy mode... however I've took a look at it, and actually the converter DOES support this conversion, it converts it to a customRule. I don't want to add any validation to the input, besides it's syntax, because the tool will do a best effort conversion for unsupported cases. It should be able to convert EVERY use case supported by the legacy engine, and all cases supported by the new engine in legacy format, however we don't encourage to use the new features in the legacy format. So if someone still uses them we might be able to convert those rules as well (probably they will become custom rules) so I don't want to reject those just because they are not officially supported. * userGroupMappingRules and applicationNameMappingRules might be initialised as empty sets. This would eliminate the null checks in convert loops, but the empty cases could still be checked in the beginning. Since I expose the collection setter method then I'd have to move the null check there, so we don't really save null checks, but at least we don't create unnecessary objects. However I like the idea to have a centralised null check, so I'll see the other feedbacks, and if I'll make a new patch set I'll consider this suggestion. > Create a class which can convert from legacy mapping rule format to the new > JSON format > --- > > Key: YARN-10585 > URL: https://issues.apache.org/jira/browse/YARN-10585 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10585.001.patch, YARN-10585.002.patch > > > To make t
[jira] [Updated] (YARN-10585) Create a class which can convert from legacy mapping rule format to the new JSON format
[ https://issues.apache.org/jira/browse/YARN-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10585: -- Attachment: YARN-10585.002.patch > Create a class which can convert from legacy mapping rule format to the new > JSON format > --- > > Key: YARN-10585 > URL: https://issues.apache.org/jira/browse/YARN-10585 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10585.001.patch, YARN-10585.002.patch > > > To make transition easier we need to create tooling to support the migration > effort. The first step is to create a class which can migrate from legacy to > the new JSON format. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10585) Create a class which can convert from legacy mapping rule format to the new JSON format
[ https://issues.apache.org/jira/browse/YARN-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269316#comment-17269316 ] Gergely Pollak commented on YARN-10585: --- Fixed the unit test failures, improved the conversion of rules like root.deep.%primary_group.%secondary_group, which was not supported by the legacy mapping rule engine, but the new mapping engine supports it even with the legacy format, so the converter should too. > Create a class which can convert from legacy mapping rule format to the new > JSON format > --- > > Key: YARN-10585 > URL: https://issues.apache.org/jira/browse/YARN-10585 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10585.001.patch, YARN-10585.002.patch > > > To make transition easier we need to create tooling to support the migration > effort. The first step is to create a class which can migrate from legacy to > the new JSON format. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10586) Create a command line tool for converting from legacy CS mapping rule format
Gergely Pollak created YARN-10586: - Summary: Create a command line tool for converting from legacy CS mapping rule format Key: YARN-10586 URL: https://issues.apache.org/jira/browse/YARN-10586 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak Assignee: Gergely Pollak -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10585) Create a class which can convert from legacy mapping rule format to the new JSON format
Gergely Pollak created YARN-10585: - Summary: Create a class which can convert from legacy mapping rule format to the new JSON format Key: YARN-10585 URL: https://issues.apache.org/jira/browse/YARN-10585 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak Assignee: Gergely Pollak To make transition easier we need to create tooling to support the migration effort. The first step is to create a class which can migrate from legacy to the new JSON format. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10584) Move the functionality of MappingRuleValidationHelper to CSQueueManager
Gergely Pollak created YARN-10584: - Summary: Move the functionality of MappingRuleValidationHelper to CSQueueManager Key: YARN-10584 URL: https://issues.apache.org/jira/browse/YARN-10584 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak Assignee: Gergely Pollak The only reason we created the helper class was to make the features introduced in YARN-10506 accessible as soon as possible. Due to our current test helpers and frameworks, integrating these changes into the CapacitySchedulerQueueManager would have been more time consuming. However we need to remove this standalone class and merge the functionality into either the QueueManager or into CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10583) Increase the test coverage for validation in MappingRuleValidationHelper
[ https://issues.apache.org/jira/browse/YARN-10583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak reassigned YARN-10583: - Assignee: Gergely Pollak > Increase the test coverage for validation in MappingRuleValidationHelper > > > Key: YARN-10583 > URL: https://issues.apache.org/jira/browse/YARN-10583 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > MappingRuleValidationHelper introduced new validation cases due to the > changes in YARN-10506, we need to increase the coverage before we move the > class into QueueManager -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10583) Increase the test coverage for validation in MappingRuleValidationHelper
Gergely Pollak created YARN-10583: - Summary: Increase the test coverage for validation in MappingRuleValidationHelper Key: YARN-10583 URL: https://issues.apache.org/jira/browse/YARN-10583 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak MappingRuleValidationHelper introduced new validation cases due to the changes in YARN-10506, we need to increase the coverage before we move the class into QueueManager -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10582) Increase the test coverage for CSMappingPlacementRule
Gergely Pollak created YARN-10582: - Summary: Increase the test coverage for CSMappingPlacementRule Key: YARN-10582 URL: https://issues.apache.org/jira/browse/YARN-10582 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak Assignee: Gergely Pollak We have somewhat limited coverage for cases introduced via YARN-10506, to make sure we won't have any regression we need to improve the coverage. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10578) Fix Auto Queue Creation parent handling
[ https://issues.apache.org/jira/browse/YARN-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267869#comment-17267869 ] Gergely Pollak commented on YARN-10578: --- [~gandras] thank you for the patch, my only observation is you are using a lock before calling the _autoQueueHandler.autoCreateQueue(placementContext_); however if this operation requires a lock, it would be better to have that lock in the _autoCreateQueue_ method, to make sure no other code path can concurrently modify it. (Even if we don't call it elsewhere currently). Otherwise LGTM+1 (Non-binding) > Fix Auto Queue Creation parent handling > --- > > Key: YARN-10578 > URL: https://issues.apache.org/jira/browse/YARN-10578 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10578.001.patch, YARN-10578.002.patch > > > YARN-10506 introduced the new auto queue creation logic, however a parent == > null check in CapacityScheduler#autoCreateLeafQueue will prevent a two levels > queue to be created. We need to revert it back to the normal logic, also, we > should wrap the auto queue handling with a lock. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10535: -- Attachment: YARN-10535.006.patch > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch, > YARN-10535.003.patch, YARN-10535.004.patch, YARN-10535.005.patch, > YARN-10535.006.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266941#comment-17266941 ] Gergely Pollak commented on YARN-10535: --- [~snemeth] thank you for the review, a good portion of this work needs to be merged into CapacityScheduler or CapacitySchedulerQueueManager, this will be done as a separate effort, since it requires a bit more refactoring, and I wanted this patch out soon, because without this YARN-10506 won't be able to utilize it's new feature set. So for most of your point my answer will be "will be fixed in a followup jira". :) 1) I find it overkill to create a separate method for a simple split. It is technically a split which puts the result into an ArrayList, but later I'll try to find a more centralized place for parsing the queuePath this way. 1.1-1.5) We already have a lot of queue path storing classes (at least 2, but there might be 3), so we should create only one if possible, this might be the good reason to merge whatever possible. I simply didn't want to introduce a new one, but I agree we need to centralize the QueuePath operations in a dedicated class. 1.6) I don't want to add a getGrandparent method, because YARN-10506 is written in a way, it can support more depth than 2, so I would like to make this code flexible during the next iteration. Currently we limit the max depth to 2, but the underling code supports more depth, so should we. But we'll need some getNthParent method, where n=1 is the parent, n=2 grandparent ... etc, so generally I agree with the concept, but we need to increase flexibility, to make the code future proof. 2) The confusing part here is that "dynamic" is overused. Originally I used called static path when a TARGET path was full static eg. "root.someting.leaf" and dynamic when it had variables which are to be substituted, eg. "root.group.%primary_group". But YARN-10506 introduced the dynamic parent concept, so dynamic now can mean 2 different thing in this context. The validateXXXQueuePath (where XXX STRICTLY for dynamic/static) methods are only used for TARGET path validation. And not for validating paths with dynamic parents. I hope it makes sense. 3.1) Yes it could be, but still I prefer it dynamic, since it is used within the context, so there is not much point in calling one VALIDATION helper method without a context. Technically it could be, semantically I don't think it should. Also this is something which should be in the QueueManager in the first place, or something we should query about the queue, so the best would be this to be in the CSQueue or in QueueManager. During refactor I'll remove it from here for sure. 3.2) I had only 1 typo? Cool :) Thx for finding it 3.3) The WHOLE CLASS should go. And it will, as soon I can integrate it to QM, however that requires some test helper refactors as well, since currently QM is mocked in a good portion of the tests, and we need this functionality during the tests, so this is the only reason it is in a separate class for now. 3.3.1) Please see my earlier repsonse regarding queue parsing classes. 3.3.2) I don't think an empty check should be a separate method, This is literally a hasNext / throw check, I don't want to create a method with the only purpose of throwing an exception. 3.3.3) The isPathStatic IS the dedicated method, the staticPartBuffer.toString() is the argument. Again I think it is overkill to move it to an other method. {code:java} THIS: if (!isPathStatic(staticPartBuffer.toString())) { return true; } Would become THIS: if (!checkRoot(staticPartBuffer)) { return true; } {code} Also the stringBuffer is kind of a special representation here, so probably we would still keep the staticPartBuffer.toString() call, so all in all we would have an extra call on our stack 3.3.4) This is a special validation method which we need only once, we don't need to store the static part of the queue, we do some operation on it, the we won't touch it again, I see no reason to extract this, since it's mostly part of the validation, and not really reusable. 3.3.5-3.3.6) The whole static part of a queue is a mapping rule specific validation thing, we could make methods for all of it, but we would only use those methods once, and we'd need a few extra fields. Generally I agree we should keep the size of the functions minimal, but these are all steps in the very same and quite unique validation process, with almost 0 reusability, hence I kept all in one method. 4) Will take a look at it, actually it might result in false positive result, thx! Thank you again for the detailed feedbacks, I'll fix the most urgent issues here, and then keep the rest in mind for the followup tasks. > Make queue placement in CapacityScheduler compliant with auto-queue-placement > --
[jira] [Commented] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266771#comment-17266771 ] Gergely Pollak commented on YARN-10535: --- Patchset#5 fixes the new checkstyle/findbugs/javadoc warnings introduced during patchet#4. > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch, > YARN-10535.003.patch, YARN-10535.004.patch, YARN-10535.005.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266771#comment-17266771 ] Gergely Pollak edited comment on YARN-10535 at 1/17/21, 12:41 PM: -- Patchset#5 fixes the new checkstyle/findbugs/javadoc warnings introduced during patchet#4. Test failure is unrelated, test seems to be flaky. was (Author: shuzirra): Patchset#5 fixes the new checkstyle/findbugs/javadoc warnings introduced during patchet#4. > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch, > YARN-10535.003.patch, YARN-10535.004.patch, YARN-10535.005.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10535: -- Attachment: YARN-10535.005.patch > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch, > YARN-10535.003.patch, YARN-10535.004.patch, YARN-10535.005.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10574) Fix the FindBugs warning introduced in YARN-10506
[ https://issues.apache.org/jira/browse/YARN-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266769#comment-17266769 ] Gergely Pollak commented on YARN-10574: --- Test failure is unrelated, test seems to be flaky. Test have not been added, because there is no functionality change, only fixing a FindBugs recommendation, everything is expected to work exactly the same way. > Fix the FindBugs warning introduced in YARN-10506 > - > > Key: YARN-10574 > URL: https://issues.apache.org/jira/browse/YARN-10574 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10574.001.patch > > > Findbugs started to give warnings about an unnecessary null check, it's > better to get rid of it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10574) Fix the FindBugs warning introduced in YARN-10506
[ https://issues.apache.org/jira/browse/YARN-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266687#comment-17266687 ] Gergely Pollak commented on YARN-10574: --- I've just removed the null check, because the previous line will throw an exception if the variable is null, and it bothered findbugs. > Fix the FindBugs warning introduced in YARN-10506 > - > > Key: YARN-10574 > URL: https://issues.apache.org/jira/browse/YARN-10574 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10574.001.patch > > > Findbugs started to give warnings about an unnecessary null check, it's > better to get rid of it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266686#comment-17266686 ] Gergely Pollak commented on YARN-10535: --- Patchset 4 is supposed to fix all findbugs and checkstyle issues, as well as the relevant unit tests. TestCapacitySchedulerAutoQueueCreation were due to error message changes, while TestFairSchedulerOvercommit is simply flaky. Fixed the trunk findbugs warning in YARN-10574. There are still some more tests to add to increase coverage, but otherwise patch is near completion. > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch, > YARN-10535.003.patch, YARN-10535.004.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10535: -- Attachment: YARN-10535.004.patch > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch, > YARN-10535.003.patch, YARN-10535.004.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10574) Fix the FindBugs warning introduced in YARN-10506
[ https://issues.apache.org/jira/browse/YARN-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak reassigned YARN-10574: - Assignee: Gergely Pollak > Fix the FindBugs warning introduced in YARN-10506 > - > > Key: YARN-10574 > URL: https://issues.apache.org/jira/browse/YARN-10574 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > > Findbugs started to give warnings about an unnecessary null check, it's > better to get rid of it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10574) Fix the FindBugs warning introduced in YARN-10506
Gergely Pollak created YARN-10574: - Summary: Fix the FindBugs warning introduced in YARN-10506 Key: YARN-10574 URL: https://issues.apache.org/jira/browse/YARN-10574 Project: Hadoop YARN Issue Type: Sub-task Reporter: Gergely Pollak Findbugs started to give warnings about an unnecessary null check, it's better to get rid of it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266413#comment-17266413 ] Gergely Pollak commented on YARN-10535: --- Dependency merged patchset 3 is a reupload to retrigger the build. > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch, > YARN-10535.003.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10535: -- Attachment: YARN-10535.003.patch > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch, > YARN-10535.003.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266139#comment-17266139 ] Gergely Pollak commented on YARN-10535: --- [~pbacsko] [~gandras] Thank you for the review, I fixed the most issues (checkstyle might find some more). Also when jenkins can run the full test suite we might get some test failures which needs to be fixed. *7. {{MappingRuleValidationContextImpl}}: {{staticPartParent}} is initially {{null}}. Any change of it remaining {{null}} after the {{while}} loop? If so, a null check is warranted.* There is an explicit check for null state of staticPartParent, but null is totally legit, if there is no parent of the static part. But that case is handled. *I can't really comment on the validation logic because there a lot of string operations and hopefully the unit tests will do their job.* Tests only cover the current cases, they won't provide enough coverage, for validation I've already added a few test, but I still need a few for the CSMappingPlacementRule class. *Talking about unit tests: {{TestMappingRuleValidationContextImpl}} was extended with some new cases, does that provide enough coverage? Have you run it through a coverage tool?* Tests are still being written *EDIT: ok, the patch does not compile yet, so probably the answer is "no".* It won't compile until YARN-10506 is merged. > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10535) Make queue placement in CapacityScheduler compliant with auto-queue-placement
[ https://issues.apache.org/jira/browse/YARN-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10535: -- Attachment: YARN-10535.002.patch > Make queue placement in CapacityScheduler compliant with auto-queue-placement > - > > Key: YARN-10535 > URL: https://issues.apache.org/jira/browse/YARN-10535 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10535.001.patch, YARN-10535.002.patch > > > Once YARN-10506 is done, we need to call the API from the queue placement > policy to create queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10506) Update queue creation logic to use weight mode and allow the flexible static/dynamic creation
[ https://issues.apache.org/jira/browse/YARN-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266081#comment-17266081 ] Gergely Pollak edited comment on YARN-10506 at 1/15/21, 2:46 PM: - Thank you [~gandras] for the fix, here are a few comments, most of them can wait for a followup JIRA: *CapacitySchedulerAutoQueueHandler class*: Why is it a separate class, this is the functionality of the actual queue, I think it would have been better to extend the parentQueue and add this functionality there, this is a class which automagicly changes the ParentQueue class and enhances it with new functionality, but I think it would be better to solve this via inheritance. Also that would make differentiating between ParentQueue and the DynamicParentQueue much easier. Probably we should fix it in a followup JIRA. *CapacitySchedulerAutoQueueHandler#getQueue* This method is unnecessary, since CSQueueStore.get method handles null paths, and returns null. See CapacitySchedulerQueueManager#getQueue and CSQueueStore#get *CapacitySchedulerAutoQueueHandler#autoCreateQueue* I might be wrong, but this is concerning: {code:java} CSQueue existingQueueCandidate = getQueue(queueCandidateContext.getQueue());{code} The getQueue will always return the leaf name of the queue, which might be ambiguous so this might return null even if the queue exists on the path we are checking, and might identify an already existing queue invalid. As a rule of thumb, wherever possible use the full path of the queues please! If we have the following structure: {code:java} root +--alice | +--test | +--dev +--bob | +--test | +--dev {code} Where both _dev_ and _test_ queues are dynamic parents, then when I try to submit something to root.bob.dev.something, this part of the code {code:java} ApplicationPlacementContext queueCandidateContext = parentContext; // (Parent context is root.bob.dev) CSQueue existingQueueCandidate = getQueue(queueCandidateContext.getQueue()); //queueCandidateContext.getQueue() = 'dev' getQueue(queueCandidateContext.getQueue()) == null //because dev is ambiguous {code} Will identify root.bob.dev as a not existing queue. So please use _ApplciationPlacementContext#getFullQueuePath_ *CapacitySchedulerAutoQueueHandler#isDynamicQueue* Is not fool/future proof, while I'm aware currently only the AbstractCSQueue class implements the interface, but still it would be better to check before casting it to AbstractCSQueue. Or move the isDynamic down to the interface, then you don't have to cast was (Author: shuzirra): Thank you [~gandras] for the fix, here are a few comments, most of them can wait for a followup JIRA: *CapacitySchedulerAutoQueueHandler class*: Why is it a separate class, this is the functionality of the actual queue, I think it would have been better to extend the parentQueue and add this functionality there, this is a class which automagicly changes the ParentQueue class and enhances it with new functionality, but I think it would be better to solve this via inheritance. Also that would make differentiating between ParentQueue and the DynamicParentQueue much easier. Probably we should fix it in a followup JIRA. *CapacitySchedulerAutoQueueHandler#getQueue* This method is unnecessary, since CSQueueStore.get method handles null paths, and returns null. See CapacitySchedulerQueueManager#getQueue and CSQueueStore#get *CapacitySchedulerAutoQueueHandler#autoCreateQueue* I might be wrong, but this is concerning: {code:java} CSQueue existingQueueCandidate = getQueue(queueCandidateContext.getQueue());{code} The getQueue will always return the leaf name of the queue, which might be ambiguous so this might return null even if the queue exists on the path we are checking, and might identify an already existing queue invalid. As a rule of thumb, wherever possible use the full path of the queues please! If we have the following structure: {code:java} root +--alice | +--test | +--dev +--bob | +--test | +--dev {code} Where both _dev_ and _test_ queues are dynamic parents, then when I try to submit something to root.bob.dev.something, this part of the code {code:java} ApplicationPlacementContext queueCandidateContext = parentContext; // (Parent context is root.bob.dev) CSQueue existingQueueCandidate = getQueue(queueCandidateContext.getQueue()); //queueCandidateContext.getQueue() = 'dev' getQueue(queueCandidateContext.getQueue()) == null //because dev is ambiguous {code} Will identify root.bob.dev as a not existing queue. So please use _ApplciationPlacementContext#getFullQueuePath_ *CapacitySchedulerAutoQueueHandler#isDynamicQueue* Is not fool/future proof, while I'm aware currently only the AbstractCSQueue class implements the interface, but still it would be bett
[jira] [Commented] (YARN-10506) Update queue creation logic to use weight mode and allow the flexible static/dynamic creation
[ https://issues.apache.org/jira/browse/YARN-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266081#comment-17266081 ] Gergely Pollak commented on YARN-10506: --- Thank you [~gandras] for the fix, here are a few comments, most of them can wait for a followup JIRA: *CapacitySchedulerAutoQueueHandler class*: Why is it a separate class, this is the functionality of the actual queue, I think it would have been better to extend the parentQueue and add this functionality there, this is a class which automagicly changes the ParentQueue class and enhances it with new functionality, but I think it would be better to solve this via inheritance. Also that would make differentiating between ParentQueue and the DynamicParentQueue much easier. Probably we should fix it in a followup JIRA. *CapacitySchedulerAutoQueueHandler#getQueue* This method is unnecessary, since CSQueueStore.get method handles null paths, and returns null. See CapacitySchedulerQueueManager#getQueue and CSQueueStore#get *CapacitySchedulerAutoQueueHandler#autoCreateQueue* I might be wrong, but this is concerning: {code:java} CSQueue existingQueueCandidate = getQueue(queueCandidateContext.getQueue());{code} The getQueue will always return the leaf name of the queue, which might be ambiguous so this might return null even if the queue exists on the path we are checking, and might identify an already existing queue invalid. As a rule of thumb, wherever possible use the full path of the queues please! If we have the following structure: {code:java} root +--alice | +--test | +--dev +--bob | +--test | +--dev {code} Where both _dev_ and _test_ queues are dynamic parents, then when I try to submit something to root.bob.dev.something, this part of the code {code:java} ApplicationPlacementContext queueCandidateContext = parentContext; // (Parent context is root.bob.dev) CSQueue existingQueueCandidate = getQueue(queueCandidateContext.getQueue()); //queueCandidateContext.getQueue() = 'dev' getQueue(queueCandidateContext.getQueue()) == null //because dev is ambiguous {code} Will identify root.bob.dev as a not existing queue. So please use _ApplciationPlacementContext#getFullQueuePath_ *CapacitySchedulerAutoQueueHandler#isDynamicQueue* Is not fool/future proof, while I'm aware currently only the AbstractCSQueue class implements the interface, but still it would be better to check before casting it to AbstractCSQueue. Or move the isDynamic down to the interface, then you don't have to cast > Update queue creation logic to use weight mode and allow the flexible > static/dynamic creation > - > > Key: YARN-10506 > URL: https://issues.apache.org/jira/browse/YARN-10506 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Benjamin Teke >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10506-006-10504-010.patch, > YARN-10506-007-10504-010.patch, YARN-10506-008.patch, YARN-10506-010.patch, > YARN-10506-012.patch, YARN-10506-013.patch, YARN-10506.001.patch, > YARN-10506.002.patch, YARN-10506.003.patch, YARN-10506.004.patch, > YARN-10506.005.patch, YARN-10506.006-combined.patch, YARN-10506.006.patch, > YARN-10506.007.patch, YARN-10506.009.patch, YARN-10506.011.patch, > YARN-10506.014.patch > > > The queue creation logic should be updated to use weight mode and support the > flexible creation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org