[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291628#comment-17291628 ]
Siddharth Ahuja edited comment on YARN-10652 at 2/26/21, 1:01 PM:
------------------------------------------------------------------

Thanks [~pbacsko] for confirming that this issue has no direct relation to placement.

Thanks [~shuzirra] for your insights. I see what you are trying to say; however, I don't believe we need to wait for a CS-wide solution on how usernames are internally stored. My arguments are below.

Regarding:
{quote}
If we don't handle user names with dots properly, then we cannot say we support user names with dots.
{quote}
But we are supporting usernames with dots today. Users with dots in their usernames can submit jobs to a cluster running CS with no issues today (again, I am not talking about queue placement with queues with dots here). No errors are reported when a username with a dot is supplied in the "yarn.scheduler.capacity.<queue-path>.user-settings.<user-name>.weight" setting and, in fact, there should NOT be any errors when this is done. These are real-world usernames and we will have to accept them from any interface, whether it be UI, CLI or anything else. There is no need to wait to open this up on the front-end as they are already being stored as a String in our code, please see https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L3902 & https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L3903.

Regarding mapping rules, users will again specify them with real-world usernames. If the real-world username has a dot in it, then that's how it should be accepted. How we store it internally should not matter to the user when they are specifying these settings.

Opening up the "yarn.scheduler.capacity.<queue-path>.user-settings.<user-name>.weight" setting should have no bearing on how usernames are actually stored internally in YARN CS, whether now or in the future, especially since this setting has only one use: specifying a user's share in a particular queue. It does not interfere with any placement rules or anything else. My solution caters for an issue that only surfaces on the front-end, not the back-end, so I still don't see why this needs to wait for any future refactoring on the implementation side of things.

> Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-10652
>                 URL: https://issues.apache.org/jira/browse/YARN-10652
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.3.0
>            Reporter: Siddharth Ahuja
>            Assignee: Siddharth Ahuja
>            Priority: Major
>         Attachments: Correct user weight of 0.76 picked up for the user with a dot after the patch.png, Incorrect default user weight of 1.0 being picked for the user with a dot before the patch.png, YARN-10652.001.patch
>
>
> AD usernames can have a "." (dot) in them, e.g. they can be of the format {{firstname.lastname}}.
> However, if you specify a username with this format against the Capacity Scheduler setting {{yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight}}, it fails to be applied and the user is instead assigned the default weight of 1.0f.
> This renders the user weight feature (used as a means of setting user priorities for a queue) unusable for such users.
> This limitation comes from [1], which currently permits only word characters ({{[a-zA-Z_0-9]}}, see [2]). That is no good for AD names that contain a "." (dot).
> Similar discussions have been had in a few HADOOP jiras, e.g. HADOOP-7050 and HADOOP-15395, and the outcome was to use non-whitespace characters, i.e. {{\S+}} instead of {{\w+}}.
> We could go down a similar path and unblock this feature for AD usernames with a "." (dot) in them.
> [1] https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1953
> [2] https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html
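
To make the regex limitation from the issue description concrete, here is a minimal standalone sketch of how {{\w+}} and {{\S+}} behave on a user-weight key. It is not the code from CapacitySchedulerConfiguration: the class name, the {{extractUser}} helper, the hard-coded prefix and the use of a full match with a capturing group are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserWeightKeyDemo {
    // Hypothetical, hard-coded prefix for the "root.default" queue used in the issue.
    public static final String PREFIX =
        "yarn.scheduler.capacity.root.default.user-settings.";

    // Hypothetical helper: returns the user name captured from the key,
    // or null when the key does not match the pattern at all.
    public static String extractUser(String key, String userNameRegex) {
        Pattern p = Pattern.compile(
            Pattern.quote(PREFIX) + "(" + userNameRegex + ")" + Pattern.quote(".weight"));
        Matcher m = p.matcher(key);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String key = PREFIX + "firstname.lastname.weight";

        // \w+ cannot consume the dot inside the user name, so the key is not matched.
        System.out.println(extractUser(key, "\\w+")); // prints "null"

        // \S+ accepts any non-whitespace character, including the dot.
        System.out.println(extractUser(key, "\\S+")); // prints "firstname.lastname"
    }
}
```

With {{\w+}} the key for {{firstname.lastname}} is never matched, which is consistent with the default weight of 1.0f being applied as described above, while a user name without a dot is matched by either pattern.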