[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HADOOP-12782:
--------------------------------
    Fix Version/s:     (was: 2.8.0)
                   2.9.0

> Faster LDAP group name resolution with ActiveDirectory
> ------------------------------------------------------
>
>                 Key: HADOOP-12782
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12782
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>             Fix For: 2.9.0, 3.0.0-alpha1
>
>         Attachments: HADOOP-12782.001.patch, HADOOP-12782.002.patch,
> HADOOP-12782.003.patch, HADOOP-12782.004.patch, HADOOP-12782.005.patch,
> HADOOP-12782.006.patch, HADOOP-12782.007.patch, HADOOP-12782.008.patch,
> HADOOP-12782.009.patch, HADOOP-12782.branch-2.010.patch
>
>
> The typical LDAP group name resolution works well under typical scenarios.
> However, we have seen cases where a user is mapped to many groups (in an
> extreme case, more than 100 groups). With the current implementation,
> resolving groups from ActiveDirectory is very slow in this case.
> The current LDAP group resolution implementation sends two queries to an
> ActiveDirectory server. The first query returns a user object, which contains
> the user's DN (distinguished name). The second query looks for groups of
> which the user DN is a member. If a user is mapped to many groups, the second
> query returns all group objects associated with the user, and is thus very slow.
> After studying a user object in ActiveDirectory, I found that a user object
> contains a "memberOf" field, which holds the DNs of all group objects
> to which the user belongs. Assuming that an organization has no recursive
> group relations (for example, user A is a member of group G1, and group G1 is
> a member of group G2), we can use this property to avoid the second query,
> which can potentially run very slowly.
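The one-query idea in the description ends with a list of group DNs taken from the user's "memberOf" field, which still have to be turned into group names. The following is a minimal sketch of that last step only, not the code from the attached patches: the class name `MemberOfSketch`, the method `groupNames`, and the example DNs are hypothetical, and it assumes the group name is the value of the leftmost RDN of each DN. It does not contact an LDAP server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

/** Hypothetical helper for illustration; not part of the attached patches. */
class MemberOfSketch {

  /**
   * Converts the DN values of a user's memberOf attribute into short group
   * names by taking the value of the leftmost RDN of each DN.
   * Assumption: the group name is stored in that leftmost RDN (e.g. CN=G1).
   */
  static List<String> groupNames(List<String> memberOfValues) {
    List<String> groups = new ArrayList<>();
    for (String dn : memberOfValues) {
      try {
        LdapName name = new LdapName(dn);
        if (name.isEmpty()) {
          continue; // nothing to extract from an empty DN
        }
        // LdapName indexes RDNs from right to left, so the leftmost RDN
        // sits at index size() - 1.
        Rdn leftmost = name.getRdn(name.size() - 1);
        groups.add(leftmost.getValue().toString());
      } catch (InvalidNameException e) {
        throw new IllegalArgumentException("Not a valid DN: " + dn, e);
      }
    }
    return groups;
  }

  public static void main(String[] args) {
    // Example values as they might appear in a user's memberOf attribute.
    List<String> memberOf = Arrays.asList(
        "CN=G1,OU=Groups,DC=example,DC=com",
        "CN=G2,OU=Groups,DC=example,DC=com");
    System.out.println(groupNames(memberOf)); // prints [G1, G2]
  }
}
```

Because the names come straight from the user object, no second query is issued. Nested membership (G1 inside G2) is not followed, which matches the no-recursive-groups assumption stated in the description.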
> I propose that we add a configuration option that enables this feature only
> for users who want to reduce group resolution time and who do not have
> recursive groups, so that existing behavior will not be broken.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongjun Zhang updated HADOOP-12782:
-----------------------------------
    Component/s: security
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kai Zheng updated HADOOP-12782:
-------------------------------
    Fix Version/s:     (was: 3.0.0-alpha1)
                   2.8.0
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment: HADOOP-12782.branch-2.010.patch

Thanks [~drankye] for reviewing and committing!
This change introduces no incompatibility with Hadoop 2, so it would be great to add it to branch-2. I did a local cherry-pick and found no conflicts. To be cautious, I am attaching the branch-2 patch for precommit validation.
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kai Zheng updated HADOOP-12782:
-------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 3.0.0-alpha1
           Status: Resolved  (was: Patch Available)

Committed to trunk. If needed, I think it is also suitable for other branches. Thanks [~jojochuang] for the contribution!
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment: HADOOP-12782.009.patch

Resubmitting the v09 patch. The latest precommit failures were probably not related to this patch; my local Yetus precommit run did not fail.
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment:     (was: HADOOP-12782.009.patch)
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment: HADOOP-12782.009.patch

Rev09: fixed the checkstyle warning. The test failure is unrelated.
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment: HADOOP-12782.008.patch

Hi Kai, thanks again for your review. Good catch!
I've refactored the code to incorporate your suggestion. However, not all of the mock code has been moved to the base test class, because some of it is specific to a single test and moving it might cause confusion.
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment: HADOOP-12782.007.patch

HADOOP-12701 added checkstyle precommit checking for test files. I uploaded a new patch to fix the checkstyle issues.
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment: HADOOP-12782.006.patch

Rev06: thanks for the reminder. Here's the new patch after the rebase. Please review again. Thanks!
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment: HADOOP-12782.005.patch

Thanks [~drankye] for the detailed review. Uploaded rev05 to address some of the comments:

bq. We could remove the variable useFastLookup, instead use memberOfAttr != null directly to indicate the case.
What about renaming {{useFastLookup}} to {{useOneQuery}}? Without this boolean, it may not be obvious why {{!memberOfAttr.isEmpty()}} enables the feature.

bq. Ref. the following block, we could embed the logic in fastLookup directly to simplify some bit?
Done.

bq. Please have a line break between functions;
Done.

bq. Please also avoid star imports;
Done.

bq. Could we remove the word experimental in the user doc?
Done.

Thanks again for checking the code style!
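The naming question in the rev05 review exchange above ({{useFastLookup}} versus checking {{memberOfAttr}} directly) can be illustrated with a small sketch. This is an assumption about how such a flag could be derived, not the code in the patch: the class `LookupModeSketch` and its methods are hypothetical, and only the field names {{memberOfAttr}} and {{useOneQuery}} come from the comment.

```java
/** Hypothetical illustration of the flag discussed in the review. */
class LookupModeSketch {
  // Name of the user-object attribute that lists group DNs; empty means
  // the feature is off (assumed convention for this sketch).
  private final String memberOfAttr;

  // Named flag, so call sites read as intent rather than as a string check.
  private final boolean useOneQuery;

  LookupModeSketch(String configuredMemberOfAttr) {
    this.memberOfAttr =
        configuredMemberOfAttr == null ? "" : configuredMemberOfAttr.trim();
    this.useOneQuery = !this.memberOfAttr.isEmpty();
  }

  boolean usesOneQuery() {
    return useOneQuery;
  }

  /** Describes which lookup strategy this configuration selects. */
  String describe() {
    return useOneQuery
        ? "one query: read groups from user attribute '" + memberOfAttr + "'"
        : "two queries: look up the user, then search for groups";
  }

  public static void main(String[] args) {
    System.out.println(new LookupModeSketch("memberOf").describe());
    System.out.println(new LookupModeSketch("").describe());
  }
}
```

Keeping the boolean costs one field, but a condition such as `if (useOneQuery)` states the intent, whereas `if (!memberOfAttr.isEmpty())` leaves the reader to infer it.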
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
     [ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-12782:
-------------------------------------
    Attachment:     (was: HADOOP-12782.004.patch)
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
[ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-12782:
    Status: Open (was: Patch Available)
> Attachments: HADOOP-12782.001.patch, HADOOP-12782.002.patch, HADOOP-12782.003.patch, HADOOP-12782.004.patch
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
[ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-12782:
    Attachment: HADOOP-12782.004.patch
Resubmit to kick off the precommit build.
> Attachments: HADOOP-12782.001.patch, HADOOP-12782.002.patch, HADOOP-12782.003.patch, HADOOP-12782.004.patch
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
[ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-12782:
    Attachment: HADOOP-12782.004.patch
Rev04: fixed checkstyle and javac warnings.
> Attachments: HADOOP-12782.001.patch, HADOOP-12782.002.patch, HADOOP-12782.003.patch, HADOOP-12782.004.patch
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
[ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-12782:
    Attachment: HADOOP-12782.003.patch
Rev03: Described this experimental feature in the documentation.
Fast lookup is enabled by setting {{hadoop.security.group.mapping.ldap.search.attr.memberof}} to {{memberOf}}, or to any other non-empty value. If this property is set, Hadoop looks up this attribute in the returned user object. If the fast lookup fails or is disabled (the default), Hadoop sends two LDAP queries for group name resolution.
I have tested this patch against our internal Active Directory server and it worked as expected.
So far, I have not found any LDAP server other than MS AD that supports this feature. Most other LDAP servers support user/group mapping by following RFC 2307 (An Approach for Using LDAP as a Network Information Service). RFC 2307 defines the {{posixAccount}} and {{posixGroup}} object classes; the former has the attributes {{uidNumber}} and {{gidNumber}}, which are numeric, so it is not possible to get group names by looking up the user object.
> Attachments: HADOOP-12782.001.patch, HADOOP-12782.002.patch, HADOOP-12782.003.patch
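The fallback behaviour described for Rev03 can be sketched roughly as follows. This is only an illustration built on the JDK's JNDI classes, not the code in the patch: the class {{FastLookupSketch}}, the method {{fastLookup}}, the helper {{sampleUser}} and the example DNs are all hypothetical. An empty result stands for "fall back to the usual two-query lookup".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;

// Hypothetical sketch; names are illustrative and not taken from the patch.
public class FastLookupSketch {

    /**
     * Returns the group DNs stored in the configured attribute of the user
     * object, or Optional.empty() when the caller should fall back to the
     * two-query lookup (property unset, attribute missing, or read error).
     */
    static Optional<List<String>> fastLookup(String memberOfAttr, Attributes user) {
        // Fast lookup is disabled unless the property has a non-empty value.
        if (memberOfAttr == null || memberOfAttr.isEmpty()) {
            return Optional.empty();
        }
        Attribute attr = user.get(memberOfAttr);
        if (attr == null) {
            // The server did not return the attribute: fast lookup failed.
            return Optional.empty();
        }
        List<String> groupDns = new ArrayList<>();
        try {
            NamingEnumeration<?> values = attr.getAll();
            while (values.hasMore()) {
                groupDns.add(values.next().toString());
            }
        } catch (NamingException e) {
            return Optional.empty();
        }
        return Optional.of(groupDns);
    }

    /** Builds an in-memory user object with two made-up group DNs. */
    static Attributes sampleUser() {
        BasicAttribute memberOf = new BasicAttribute("memberOf");
        memberOf.add("CN=hadoop-admins,OU=Groups,DC=example,DC=com");
        memberOf.add("CN=etl-users,OU=Groups,DC=example,DC=com");
        Attributes user = new BasicAttributes(true); // case-insensitive attribute IDs
        user.put(memberOf);
        return user;
    }

    public static void main(String[] args) {
        System.out.println(fastLookup("memberOf", sampleUser()));
        System.out.println(fastLookup("", sampleUser()));
    }
}
```

Returning an empty Optional instead of throwing keeps the caller's fallback path simple: it only needs to check whether the fast path produced a result.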
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
[ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-12782:
    Release Note: If the user object returned by the LDAP server contains the DNs of the user's group objects (as supported by Active Directory), Hadoop can reduce LDAP group mapping latency when hadoop.security.group.mapping.ldap.search.attr.memberof is set to memberOf.
> Attachments: HADOOP-12782.001.patch, HADOOP-12782.002.patch, HADOOP-12782.003.patch
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
[ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-12782:
    Attachment: HADOOP-12782.002.patch
Rev02: fixed findbugs, checkstyle and javac warnings.
> Attachments: HADOOP-12782.001.patch, HADOOP-12782.002.patch
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
[ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-12782:
    Status: Patch Available (was: Open)
> Attachments: HADOOP-12782.001.patch
[jira] [Updated] (HADOOP-12782) Faster LDAP group name resolution with ActiveDirectory
[ https://issues.apache.org/jira/browse/HADOOP-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-12782:
    Attachment: HADOOP-12782.001.patch
Rev01: implemented fast LDAP group name lookup, the associated test case, and the associated documentation.
In this implementation, there are three cases:
# General scenario: perform two LDAP queries per group lookup.
# If the server supports POSIX semantics, perform two LDAP queries using the POSIX gid/uid to find the user's groups.
# (New implementation) If fast lookup is enabled, perform one LDAP query per group lookup. The server must be an Active Directory, there must be no recursive group membership, and the CN attribute must identify a group's name.
To enable this feature, set hadoop.security.group.mapping.ldap.search.filter.group=ldapFastLookup.
I also updated the first two scenarios so that more verbose messages are logged in case of exceptions (supportability).
Finally, a new test file, TestLdapGroupsMappingWithFastLookup, is added to test the new feature. This test, as well as TestLdapGroupsMapping and TestLdapGroupsMappingWithPosixGroup, passed locally.
> Attachments: HADOOP-12782.001.patch
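The third case relies on the CN attribute carrying the group's name, so a group name can be read straight from each group DN listed in the user object. A minimal sketch of that step, using the JDK's {{javax.naming.ldap.LdapName}}, might look like the following. The class {{MemberOfSketch}}, its methods and the example DNs are hypothetical and only illustrate the idea; this is not the code in the patch.

```java
import java.util.ArrayList;
import java.util.List;
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

// Hypothetical sketch; names are illustrative and not taken from the patch.
public class MemberOfSketch {

    /**
     * Extracts the group name from a group DN, assuming the leftmost
     * (most specific) RDN is a CN that holds the group's name.
     */
    static String groupNameFromDn(String groupDn) {
        try {
            LdapName name = new LdapName(groupDn);
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Empty DN");
            }
            // LdapName indexes RDNs right to left, so the leftmost RDN is last.
            Rdn leaf = name.getRdn(name.size() - 1);
            if (!"CN".equalsIgnoreCase(leaf.getType())) {
                throw new IllegalArgumentException("Leftmost RDN is not a CN: " + groupDn);
            }
            return leaf.getValue().toString();
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Malformed DN: " + groupDn, e);
        }
    }

    /** Maps every group DN found in the user object to a group name. */
    static List<String> groupNames(List<String> memberOfValues) {
        List<String> names = new ArrayList<>();
        for (String dn : memberOfValues) {
            names.add(groupNameFromDn(dn));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(groupNames(List.of(
            "CN=hadoop-admins,OU=Groups,DC=example,DC=com",
            "CN=etl-users,OU=Groups,DC=example,DC=com")));
    }
}
```

Because only direct memberships are listed in the user object, this approach does not see nested groups, which is why the "no recursive group membership" condition matters.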