[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216295#comment-14216295 ]

Hudson commented on HDFS-7146:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/9/])
HDFS-7146. NFS ID/Group lookup requires SSSD enumeration on the server. Contributed by Yongjun Zhang (brandonli: rev 351c5561c2fd380ab7746ca4e91d7b838e61e03f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java

> NFS ID/Group lookup requires SSSD enumeration on the server
> -----------------------------------------------------------
>
>                 Key: HDFS-7146
>                 URL: https://issues.apache.org/jira/browse/HDFS-7146
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>    Affects Versions: 2.6.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>             Fix For: 2.7.0
>
>         Attachments: HDFS-7146.001.patch, HDFS-7146.002.allIncremental.patch,
> HDFS-7146.003.patch, HDFS-7146.004.patch, HDFS-7146.005.patch
>
>
> The current implementation of the NFS UID and GID lookup works by running
> 'getent passwd' with an assumption that it will return the entire list of
> users available on the OS, local and remote (AD/etc.).
> Administrators in most secure setups are advised to disable this behaviour
> of the command, and they do, to avoid excessive load on the ADs involved:
> the number of users to be listed may be too large, and the repeated
> requests for ALL users not present in the cache would be too much for the
> AD infrastructure to bear.
> The NFS server should likely do lookups based on a specific UID request, via
> 'getent passwd ', if the UID does not match a cached value. This reduces
> load on the LDAP-backed infrastructure.
> Thanks [~qwertymaniac] for reporting the issue.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
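The per-ID lookup proposed in the issue description above can be sketched roughly as follows. This is a minimal illustration and not the code from the committed patch: the class name LazyUidCache, its methods, and the injectable resolver are all hypothetical, and the sketch assumes that 'getent passwd' accepts a numeric UID as its key and prints one passwd-format line (name:password:uid:gid:...).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongFunction;

/** Hypothetical sketch of an on-demand UID lookup; not the ShellBasedIdMapping code. */
final class LazyUidCache {
  private final Map<Long, String> uidToName = new ConcurrentHashMap<>();
  private final LongFunction<Optional<String>> resolver;

  /** The resolver is injectable so the cache can be exercised without a directory service. */
  LazyUidCache(LongFunction<Optional<String>> resolver) {
    this.resolver = resolver;
  }

  /** Extracts the user name from one passwd-format line if its UID field matches. */
  static Optional<String> parseName(String passwdLine, long uid) {
    if (passwdLine == null) {
      return Optional.empty();
    }
    String[] fields = passwdLine.trim().split(":");
    if (fields.length < 3) {
      return Optional.empty();
    }
    try {
      return Long.parseLong(fields[2]) == uid ? Optional.of(fields[0]) : Optional.empty();
    } catch (NumberFormatException e) {
      return Optional.empty();
    }
  }

  /** Asks the OS about a single UID instead of enumerating every user (assumed command line). */
  static Optional<String> getent(long uid) {
    try {
      Process p = new ProcessBuilder("getent", "passwd", Long.toString(uid))
          .redirectErrorStream(true)
          .start();
      try (BufferedReader r = new BufferedReader(
          new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
        String firstLine = r.readLine();
        p.waitFor();
        return parseName(firstLine, uid);
      }
    } catch (IOException e) {
      return Optional.empty();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return Optional.empty();
    }
  }

  /** Returns the cached name, consulting the resolver only on a cache miss. */
  Optional<String> nameOf(long uid) {
    String cached = uidToName.get(uid);
    if (cached != null) {
      return Optional.of(cached);
    }
    Optional<String> found = resolver.apply(uid);
    found.ifPresent(name -> uidToName.put(uid, name));
    return found;
  }
}
```

On a Linux host the cache could be wired up as new LazyUidCache(LazyUidCache::getent). Unknown UIDs are deliberately not cached in this sketch, so a user added to the directory later is still found; a real server would probably also want some refresh or expiry policy.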
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216274#comment-14216274 ]

Hudson commented on HDFS-7146:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1961 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1961/])
HDFS-7146. NFS ID/Group lookup requires SSSD enumeration on the server. Contributed by Yongjun Zhang (brandonli: rev 351c5561c2fd380ab7746ca4e91d7b838e61e03f)
* hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216214#comment-14216214 ]

Hudson commented on HDFS-7146:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/9/])
HDFS-7146. NFS ID/Group lookup requires SSSD enumeration on the server. Contributed by Yongjun Zhang (brandonli: rev 351c5561c2fd380ab7746ca4e91d7b838e61e03f)
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216200#comment-14216200 ]

Hudson commented on HDFS-7146:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #1937 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1937/])
HDFS-7146. NFS ID/Group lookup requires SSSD enumeration on the server. Contributed by Yongjun Zhang (brandonli: rev 351c5561c2fd380ab7746ca4e91d7b838e61e03f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216101#comment-14216101 ]

Hudson commented on HDFS-7146:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #747 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/747/])
HDFS-7146. NFS ID/Group lookup requires SSSD enumeration on the server. Contributed by Yongjun Zhang (brandonli: rev 351c5561c2fd380ab7746ca4e91d7b838e61e03f)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216088#comment-14216088 ]

Hudson commented on HDFS-7146:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/9/])
HDFS-7146. NFS ID/Group lookup requires SSSD enumeration on the server. Contributed by Yongjun Zhang (brandonli: rev 351c5561c2fd380ab7746ca4e91d7b838e61e03f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215251#comment-14215251 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

And thanks a lot, [~aw], for the earlier suggestion to fix the infrastructure; it does look much better now!
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215248#comment-14215248 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Thanks a lot for your help, Brandon!
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215216#comment-14215216 ]

Hudson commented on HDFS-7146:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #6560 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6560/])
HDFS-7146. NFS ID/Group lookup requires SSSD enumeration on the server. Contributed by Yongjun Zhang (brandonli: rev 351c5561c2fd380ab7746ca4e91d7b838e61e03f)
* hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215208#comment-14215208 ]

Brandon Li commented on HDFS-7146:
----------------------------------

I've committed the patch. Thank you, [~yzhangal], for the contribution!
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213254#comment-14213254 ]

Hadoop QA commented on HDFS-7146:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12681667/HDFS-7146.005.patch
  against trunk revision 4fb96db.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-nfs.

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8746//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8746//console

This message is automatically generated.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213188#comment-14213188 ]

Brandon Li commented on HDFS-7146:
----------------------------------

+1. Pending Jenkins.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213165#comment-14213165 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~brandonli],

Nice idea to add some javadoc describing the solution; I just uploaded rev 005 with that.

Thanks for your flexibility. I will create a separate jira for the platform coverage issue, because I think that may involve looking into multiple places for platform differences.

Thanks for taking a further look.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212985#comment-14212985 ]

Brandon Li commented on HDFS-7146:
----------------------------------

{quote}
The defaultStaticIdMappingFile was introduced in HADOOP-11195, and I actually have removed it in rev 004. Would you please indicate the place you were looking at?
{quote}
My bad. I looked at the wrong side of the diff.

{quote}
Relaxing the platform support is a different issue to solve and it seems to deserve a separate jira, what do you think?
{quote}
I am OK with fixing it either here or in a different JIRA.

{quote}
I introduced this for testing purposes.
{quote}
Please add javadoc for it. Also, it would be nice to describe the solution in the class javadoc.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212844#comment-14212844 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~brandonli],

Thanks a lot for the review and comments! I have a few questions to clarify:

{quote}
1. it doesn’t seem to be necessary to introduce defaultStaticIdMappingFile
{quote}
The defaultStaticIdMappingFile was introduced in HADOOP-11195, and I actually have removed it in rev 004. Would you please indicate the place you were looking at?

{quote}
2. Do we need checkSupportedPlatform()? We don’t have to limit the platform to only linux and mac. Some other UNIX flavors might also be able to run the NFS server. We could do the following:
if (Shell.Mac) {
  // mac command
} else {
  // linux command for everything else
}
{quote}
About checkSupportedPlatform, I simply followed the existing implementation ({{ if (!OS.startsWith("Linux") && !OS.startsWith("Mac"))}}), which says only mac and linux are supported. Relaxing the platform support is a different issue to solve and it seems to deserve a separate jira, what do you think?

{quote}
3. do we still need constructFullMapAtInit since it’s always false?
{quote}
I introduced this for testing purposes. If you look at the new test I introduced, the constructor is first called with "true" to create a reference mapping (refIdMapping). That's why I tagged the constructor with @VisibleForTesting. Does this sound OK to you?

Thanks again!
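The testing idea in the reply above (build a full reference mapping up front, then check that the on-demand mapping agrees with it) can be illustrated with a small sketch. Everything here is hypothetical: the class IdMappingSketch, its in-memory source map standing in for the OS user database, and the method names are made up for illustration and are not taken from the patch or from TestShellBasedIdMapping.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an ID mapping with an optional eager load. */
final class IdMappingSketch {
  private final Map<Long, String> source;              // stands in for the OS user database
  private final Map<Long, String> cache = new HashMap<>();

  /** With constructFullMapAtInit set, every entry is loaded up front (test-only mode). */
  IdMappingSketch(Map<Long, String> source, boolean constructFullMapAtInit) {
    this.source = source;
    if (constructFullMapAtInit) {
      cache.putAll(source);
    }
  }

  /** Returns the name for uid, loading it on demand, or the given default if unknown. */
  String getUserName(long uid, String unknown) {
    String name = cache.get(uid);
    if (name == null) {
      name = source.get(uid);
      if (name == null) {
        return unknown;
      }
      cache.put(uid, name);
    }
    return name;
  }

  /** Number of entries currently held in the cache. */
  int cachedEntries() {
    return cache.size();
  }
}
```

A test along these lines would construct one instance with the flag set to true as the reference, a second with false, and assert that both return the same name for every known ID while the second starts out with an empty cache.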
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212800#comment-14212800 ]

Brandon Li commented on HDFS-7146:
----------------------------------

The patch looks nice. Some comments:
1. it doesn’t seem to be necessary to introduce defaultStaticIdMappingFile
2. Do we need checkSupportedPlatform()? We don’t have to limit the platform to only linux and mac. Some other UNIX flavors might also be able to run the NFS server. We could do the following:
if (Shell.Mac) {
  // mac command
} else {
  // linux command for everything else
}
3. do we still need constructFullMapAtInit since it’s always false?
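The platform fallback suggested in comment 2 above might look roughly like the sketch below. It is an illustration only: the class and method names are hypothetical, the OS check uses the os.name string rather than Hadoop's Shell class, and the macOS command line (dscl . -search /Users UniqueID <id>) is an assumption about how a single-user lookup could be phrased there, not something stated in this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that picks a single-UID lookup command per platform. */
final class IdLookupCommands {
  private IdLookupCommands() {
  }

  /**
   * Returns the command line for resolving one UID. macOS gets its own command;
   * every other platform falls through to the Linux-style getent call instead
   * of being rejected as unsupported.
   */
  static List<String> uidLookupCommand(String osName, long uid) {
    String id = Long.toString(uid);
    if (osName != null && osName.startsWith("Mac")) {
      // Assumed macOS form: search the local directory node by UniqueID.
      return Arrays.asList("dscl", ".", "-search", "/Users", "UniqueID", id);
    }
    // Default for Linux and any other UNIX flavor.
    return Arrays.asList("getent", "passwd", id);
  }
}
```

A caller would typically pass System.getProperty("os.name") as the first argument. Returning the command as a list of arguments rather than a single shell string keeps the ID from being interpreted by a shell.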
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211647#comment-14211647 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Many thanks, Brandon! It's understandable, and no worries about the late response at all. I had to delay working on this issue for quite a while earlier myself.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211631#comment-14211631 ]

Brandon Li commented on HDFS-7146:
----------------------------------

Thank you, Yongjun. Sorry for the late response. I will review the patch pretty soon.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211563#comment-14211563 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~brandonli], thanks for your earlier review. I posted a patch (004) several days ago; would you please take another look? Thanks a lot!
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205143#comment-14205143 ]

Hadoop QA commented on HDFS-7146:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12680610/HDFS-7146.004.patch
  against trunk revision ab30d51.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-nfs.

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8705//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8705//console

This message is automatically generated.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205080#comment-14205080 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi guys,

Sorry to get back late. I just uploaded a patch on top of HADOOP-11195. I'd appreciate it if you could help review it when you have time.

A recap of the solution:
# At initialization, the maps are empty.
# Users, groups, and ids are added to the maps on demand (i.e. when requested).
# When a groupId is requested for a given groupName, and the groupName is numerical, the full group map is loaded (this is the lazy full-list load I referred to earlier).
# The cached maps for both user and group are updated periodically. What I do here is actually to clear (reinitialize) the maps. I imagined that some users and groups might be removed (for example, a user changed jobs, so their entries need to be removed).
# Steps 2 and 3 will be repeated.

BTW, because we have now changed to incrementally updating the map, there tend to be a lot of messages like
{quote}
LOG.info("Updated " + mapName + " map size: " + map.size());
{quote}
I took the liberty of changing it to a debug message in the patch.

Thanks.
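The recap above (start empty, fill on demand, clear periodically, fill again) can be illustrated with a small sketch. It is not the patch itself: the class OnDemandIdCache, its constructor parameters, and the injected resolver and clock are all hypothetical names introduced here, and the special case for numerical group names (step 3) is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration of an on-demand name-to-id cache; not Hadoop code.
class OnDemandIdCache {
  private final Map<String, Integer> nameToId = new HashMap<>(); // empty at initialization
  private final Function<String, Integer> resolver; // assumed single-name lookup; returns null if unknown
  private final LongSupplier clockMs;               // injected clock so the refresh can be tested
  private final long refreshIntervalMs;
  private long lastClearMs;

  OnDemandIdCache(Function<String, Integer> resolver, LongSupplier clockMs, long refreshIntervalMs) {
    this.resolver = resolver;
    this.clockMs = clockMs;
    this.refreshIntervalMs = refreshIntervalMs;
    this.lastClearMs = clockMs.getAsLong();
  }

  /** Returns the id for the name, resolving and caching it on first use; null if unknown. */
  synchronized Integer getId(String name) {
    long now = clockMs.getAsLong();
    if (now - lastClearMs >= refreshIntervalMs) {
      // Periodic update is a plain clear, so entries for removed users disappear too.
      nameToId.clear();
      lastClearMs = now;
    }
    Integer id = nameToId.get(name);
    if (id == null) {
      id = resolver.apply(name); // one targeted lookup instead of a full enumeration
      if (id != null) {
        nameToId.put(name, id);
      }
    }
    return id;
  }

  /** Number of entries currently cached. */
  synchronized int size() {
    return nameToId.size();
  }
}
```

Clearing the whole map rather than refreshing entries one by one keeps the logic simple and drops stale users automatically, at the cost of one extra lookup per active name after each interval. In this sketch unknown names are not cached, so each miss triggers a new lookup.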
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169585#comment-14169585 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~brandonli], thanks for the feedback, your point is well taken.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169561#comment-14169561 ]

Brandon Li commented on HDFS-7146:
----------------------------------

[~yzhangal], thanks for filing HADOOP-11195 to track the effort. It's good to not mix bug fixes and code improvement in the same JIRA.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169445#comment-14169445 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~aw] and [~brandonli],

Thanks for the earlier review and discussion here. I created HADOOP-11195 per Allen's suggestion to merge the two existing mechanisms that cache user/group info into the hadoop-common area.

I certainly agree with the general software engineering principle of code sharing and its benefits. My original thought was that we could do things in a different order. But let's try this route of fixing HADOOP-11195 first and then HDFS-7146. I might incorporate HDFS-7146 in the same fix as HADOOP-11195, though.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167428#comment-14167428 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Correction:

I don't see name-2-id mapping in the common area with current code base, but they are the main thing in the existing nfs code. Maybe the name-2-id mapping existed in the common area before but got removed at some point?
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167424#comment-14167424 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~aw],

Thanks for the good info. I wasn't aware that
{quote}
The existing NFS code is similar to the stuff that hadoop had in common about 4 years ago.
{quote}
I don't see a name<->id mapping in the common area of the current code base, but it is the main thing in the existing nfs code. Maybe the name->id mapping existed in the common area before but got removed at some point?

Thanks again for your input.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167404#comment-14167404 ]

Allen Wittenauer commented on HDFS-7146:
----------------------------------------

bq. I personally think fixing the jira here first would allow us to make better progress.

... until you hit the next issue. and the issue after that. and the issue after that.

One of the big wins of common is that shared code is better debugged code. The existing NFS code is similar to the stuff that hadoop had in common about 4 years ago. It isn't productive to continue going down this path.

As a community, we already have experience with shelling out to get this info. It's the whole reason that libhadoop.so approx doubled in size to avoid having to deal with these issues. I don't particularly care to put in more bandages on a broken implementation.

I'm still -1, regardless of the impact on your customers.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167342#comment-14167342 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~aw],

Thanks for your reply. Frankly speaking, because of the field issue [~brandonli] described,
{quote}
Here is the problem I noticed in the field. A user has more than 70K users on an LDAP server which is configured to return no more than roughly 70K entries for each query. NFS gateway could not load all users since it tried to get the complete user list in one command. Therefore, some users can't access their own files because NFS gateway can't find their mapping in the cache.
{quote}
I personally think fixing the jira here first would allow us to make better progress. Creating a new jira to address the consolidation would allow us more time for discussions and iterations (given that the two mechanisms and their use cases are quite different).

I will look into the consolidation part as you suggested. Since the infrastructure and application of the two mechanisms are quite different, I expect it will take quite some time and follow-up discussions.

Thanks.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167210#comment-14167210 ]

Allen Wittenauer commented on HDFS-7146:
----------------------------------------

bq. Would you please let me know whether you think it's reasonable to limit the scope of this jira to fix the issue reported here, and create a new one for consolidating the two existing mechanisms?

No, I don't think it's reasonable.

bq. or we have to consolidate the two existing mechanisms before we fix the issue here?

Yes, I think you should migrate to the existing caching system first rather than continue re-discovering bugs.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166911#comment-14166911 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~aw],

Would you please let me know whether you think it's reasonable to limit the scope of this jira to fixing the issue reported here, and to create a new one for consolidating the two existing mechanisms? Or do we have to consolidate the two existing mechanisms before we fix the issue here?

Thanks a lot.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164201#comment-14164201 ]

Yongjun Zhang commented on HDFS-7146:
-------------------------------------

Hi [~aw],

Thanks for your input. I did some searching per your comment; I guess you were referring to the Groups.java and JniBasedUnixGroupsMappingWithFallback related code in the hadoop-common area?

Given that these two areas are doing similar things, I agree with you that we should try our best to consolidate them. Since there are already two caching mechanisms in the current code base (nfs, hadoop-common), do you think it's reasonable to limit the scope of this jira to fixing the particular issue in the nfs mechanism, and to open a new jira for unifying the implementation? I looked at the hadoop-common area; its content appears to be quite different from what's needed by the current nfs code (IdUserGroup.java), so I expect the scope of change to consolidate them would be much larger.

What do you think?

Thanks a lot.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164123#comment-14164123 ]

Allen Wittenauer commented on HDFS-7146:
----------------------------------------

The more and more I think about this, the more and more I'm -1 on the current implementation. The primary reason is that this seems to be duplicating a lot of code/concepts that already exist in hadoop-common. It doesn't make a lot of sense to me to have two user and group caching implementations. It would make much more sense to add whatever functionality is missing there (if any), and then use the code in common. This will also help deal with a lot of the platform-isms, since that opens up the JNI code.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163735#comment-14163735 ]

Yongjun Zhang commented on HDFS-7146:

Hi [~aw],

Thanks for your earlier review and discussion. It seems that supporting numerical user names is unavoidable. The goal of this jira is to avoid loading the full name-to-id map, and the "id" command can help achieve that on the supported platforms. I wonder whether there is a better alternative than the "id" command; I searched and have not found one so far. Would you please comment?

Many thanks.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161105#comment-14161105 ]

Yongjun Zhang commented on HDFS-7146:

Hi [~aw],

About the username patterns allowed on different platforms, there was discussion in HDFS-4983 and HDFS-4733:

{quote}
Alejandro Abdelnur added a comment - 04/Dec/13 17:01
Allowed usernames are the OS allowed user names. Different versions of Unix/Linux have different restrictions by default. This was discussed when this was done for httpfs. Refer to HDFS-4733 for details.
{quote}

I agree with you that ideally all allowed usernames would comply with the same convention; that would make our life much easier. However, if users already have numerical usernames, we probably have to support them. Asking them to change their user names is going to be much harder than supporting them ourselves :-) That's what HDFS-4983 and HDFS-4733 are about.

Would you please also address the questions I asked in the "Another thought Allen Wittenauer," comment above?

Thanks a lot.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161006#comment-14161006 ]

Allen Wittenauer commented on HDFS-7146:

bq. See HDFS-4983.

That JIRA is largely irrelevant to this discussion: HDFS (and therefore WebHDFS) has no such restrictions on usernames, since they are published as strings. Unix does have restrictions, and we have to play by its rules since that's the space this code plays in.

bq. Seems the requirement on user name varies.

Not really. Some useradd implementations do not enforce the entire rule set, which is why I said "most/all"; some Linux distributions include such a useradd. If you look at the upstream Linux shadow utilities source, however (https://github.com/shadow-maint/shadow/blob/master/libmisc/chkname.c), you'll find that all-digit usernames are not legal. Other OSes follow similar rules in their utilities (e.g., Illumos: https://hg.openindiana.org/upstream/illumos/illumos-gate/file/68f95e015346/usr/src/cmd/aset/tasks/pwchk.awk ). Just because some distributions allowed users to do incredibly dumb things doesn't mean we need to as well.

FWIW, if you want true portability, you'll need to use the native system calls to follow whatever rules are allowed on that machine. Otherwise, expect to make some compatibility decisions. To me, this is an easy call: all-numeric usernames are super rare since they have unpredictable results (e.g., chown). portability > naive admins who shot themselves in the foot.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160976#comment-14160976 ]

Yongjun Zhang commented on HDFS-7146:

Another thought [~aw],

If you look at the nfs code, only two platforms are currently supported: linux and macos. The commands used are crafted differently for each. For example, getent is used for linux, and dscl is used for mac. Given that we already need different commands for different platforms, I would assume that adding a new platform will likely mean crafting a command for that platform as well.

Based on this, do you think it's ok for us to use the "id" command (for linux and mac)? It has the advantage of avoiding loading the full user map when there is a numerical user name.

Thanks.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160952#comment-14160952 ]

Yongjun Zhang commented on HDFS-7146:

Thanks [~aw].

Seems the requirement on user name varies. For example, I can add a user with a numerical username on my system:

[yzhang@localhost hadoop]$ su adduser 23456
su: user adduser does not exist
[yzhang@localhost hadoop]$ sudo adduser 23456
[sudo] password for yzhang:
[yzhang@localhost hadoop]$ getent passwd | grep 23456
23456:x:504:505::/home/23456:/bin/bash
[yzhang@localhost hadoop]$

We have had cases where numerical user names are used often. See HDFS-4983.

I wish there were a portable command like "id" to address this issue better. Otherwise, we might do the following:
1. do incremental updates to the map
2. do a full load of passwd or group when the name is numerical

I will do some more study; comments are welcome.

Thanks.
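The getent line in the transcript above follows the colon-separated passwd(5) layout (name, password placeholder, UID, GID, and so on). As a rough illustration only, and not code from any of the attached patches, such a line could be split into a name and UID like this; the class and method names are hypothetical:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper for illustration; not part of the HDFS-7146 patches.
final class PasswdLine {
  private PasswdLine() {}

  /**
   * Parses one passwd(5)-style line, e.g.
   * "23456:x:504:505::/home/23456:/bin/bash", into (name, uid).
   * The name stays a String, so an all-digit name is never mistaken for the UID.
   */
  static Map.Entry<String, Long> parse(String line) {
    // limit -1 keeps empty fields (such as an empty GECOS field) in place
    String[] fields = line.split(":", -1);
    if (fields.length < 3) {
      throw new IllegalArgumentException("not a passwd entry: " + line);
    }
    return new SimpleEntry<>(fields[0], Long.parseLong(fields[2]));
  }

  public static void main(String[] args) {
    Map.Entry<String, Long> e = parse("23456:x:504:505::/home/23456:/bin/bash");
    System.out.println("name=" + e.getKey() + " uid=" + e.getValue());
  }
}
```

Keeping the first field as a string is the point of the sketch: the name "23456" and the UID 504 come from different columns, so a parser of enumerated output has no ambiguity, unlike a keyed getent query.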
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160906#comment-14160906 ]

Allen Wittenauer commented on HDFS-7146:

bq. If user name 123

That's not a legal Unix user name and most/all compliant useradd's will kick it back as invalid. FWIW, all sorts of problems happen with all numeric usernames if one tries to use them. For example, if one runs 'chown 123 file' what permissions would be on the file? It's perfectly reasonable for the system to fail in this scenario.

bq. About "id" command

I'm -1 on using id for this, even if it works on Linux and OS X. It limits any future portability to systems on SysV machines where /usr/bin/id is typically the SysV id and not POSIX id. We've been down this road before with id in the pre-security days. It was a problem then and it will be a problem in the future. (Never mind the fact that I suspect the code actually works on other operating systems, but we've artificially limited it for reasons which I'm unclear on.)

tl;dr: So use getent on everything but OS X.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160678#comment-14160678 ]

Yongjun Zhang commented on HDFS-7146:

Hi [~aw],

Thanks for the info you provided. Here is what the man page says (man getent):

{code}
group     When no key is provided, use setgrent(3), getgrent(3), and endgrent(3) to enumerate the group database. When one or more key arguments are provided, pass each numeric key to getgrgid(3) and each nonnumeric key to getgrnam(3) and display the result.

passwd    When no key is provided, use setpwent(3), getpwent(3), and endpwent(3) to enumerate the passwd database. When one or more key arguments are provided, pass each numeric key to getpwuid(3) and each nonnumeric key to getpwnam(3) and display the result.
{code}

If user name 123 has uid 456 and we run "getent passwd 123", getent treats 123 as a uid and searches for the user with uid 123, which may not exist. That is when we get back nothing.

About the "id" command: I tested it on centos and mac (thanks for [~j...@cloudera.com]'s help). Would you please comment on whether it's good enough and what could be missed? The nfs code is said to support linux and mac only.

Thanks.
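The man page text quoted in this message explains the empty result: a numeric key goes to the by-ID call and only a nonnumeric key goes to the by-name call. A minimal Java model of that dispatch rule follows. It is a simplification that treats only all-ASCII-digit keys as numeric, it is not getent's source, and the class and enum names are hypothetical:

```java
// Hypothetical model of the key dispatch described in the getent man page.
// Simplifying assumption: a key counts as numeric only if every character
// is an ASCII digit.
final class GetentKey {
  enum Route { BY_ID, BY_NAME }

  private GetentKey() {}

  /** Returns which lookup a key would be sent to under the rule above. */
  static Route route(String key) {
    if (key.isEmpty()) {
      throw new IllegalArgumentException("empty key");
    }
    for (int i = 0; i < key.length(); i++) {
      char c = key.charAt(i);
      if (c < '0' || c > '9') {
        return Route.BY_NAME; // e.g. getpwnam(3) / getgrnam(3)
      }
    }
    return Route.BY_ID; // e.g. getpwuid(3) / getgrgid(3)
  }

  public static void main(String[] args) {
    // A user *named* "123" is still routed to the by-ID lookup.
    System.out.println("123 -> " + route("123"));
    System.out.println("yzhang -> " + route("yzhang"));
  }
}
```

Under this rule a user whose name is "123" can never be reached by name through a keyed query, which is why the discussion keeps returning to either the "id" command or a full enumeration for such names.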
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160659#comment-14160659 ]

Yongjun Zhang commented on HDFS-7146:

Hi [~brandonli],

Thanks for your comments. I just uploaded rev 03. It works slightly differently from what you described:

1. At initialization, the map is empty.
2. Users, groups and ids are all added to the map on demand (i.e. when requested).
3. When the groupId is requested for a given groupName and the groupName is numerical, the full group map is loaded (this is the lazy full list load I referred to earlier).
4. The cached maps for both user and group are updated periodically. What I actually do here is clear the map. I imagined that some users and groups might have been removed (for example, a user changed jobs), so instead of loading anything I clear the map during this update, essentially reinitializing it. Steps 2 and 3 are then repeated.

I did not change the logic for when to update the map.

Would you please take a look again to see if the change makes sense to you?

Thanks a lot.
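The four steps described for rev 03 in this message can be sketched roughly as follows for the group map. This is an illustration under stated assumptions, not the code in HDFS-7146.003.patch: the class name, the injected lookup functions (standing in for a keyed shell lookup and a full enumeration), and the explicit clock parameter are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch of the rev 03 flow; not the actual patch code.
final class LazyGroupMap {
  private final Map<String, Integer> nameToGid = new HashMap<>();
  // Stand-in for a single keyed lookup; returns null when the group is unknown.
  private final Function<String, Integer> singleLookup;
  // Stand-in for a full enumeration of the group database.
  private final Supplier<Map<String, Integer>> fullLoad;
  private final long timeoutMillis;
  private long lastClearMillis;

  LazyGroupMap(Function<String, Integer> singleLookup,
               Supplier<Map<String, Integer>> fullLoad,
               long timeoutMillis, long nowMillis) {
    this.singleLookup = singleLookup;
    this.fullLoad = fullLoad;
    this.timeoutMillis = timeoutMillis;
    this.lastClearMillis = nowMillis; // step 1: the map starts empty
  }

  private static boolean isAllDigits(String s) {
    return !s.isEmpty() && s.chars().allMatch(c -> c >= '0' && c <= '9');
  }

  synchronized Integer getGid(String groupName, long nowMillis) {
    // Step 4: the periodic "update" simply clears the cache.
    if (nowMillis - lastClearMillis > timeoutMillis) {
      nameToGid.clear();
      lastClearMillis = nowMillis;
    }
    Integer gid = nameToGid.get(groupName);
    if (gid != null) {
      return gid;
    }
    if (isAllDigits(groupName)) {
      // Step 3: a numerical name cannot be resolved by key, so load everything.
      nameToGid.putAll(fullLoad.get());
      return nameToGid.get(groupName);
    }
    // Step 2: add the entry on demand.
    gid = singleLookup.apply(groupName);
    if (gid != null) {
      nameToGid.put(groupName, gid);
    }
    return gid;
  }
}
```

Passing the current time in as a parameter is only a convenience for the sketch; it makes the expiry behaviour easy to exercise without waiting for a real timeout.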
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160410#comment-14160410 ]

Allen Wittenauer commented on HDFS-7146:

bq. For getent command, when it sees a numerical key, it thinks you are doing reverse lookup (from id to name). That's why it returns nothing.

Something sounds broken if it is returning nothing. It should be able to map forward and reverse if things are working correctly. The problem with using id is that output isn't exactly portable.

bq. Unfortunately there is no corresponding one for group.

getent most definitely routes gid<->group mappings as well:

Linux:
{code}
$ getent group 1
bin:x:1:bin,daemon
$ getent group bin
bin:x:1:bin,daemon
{code}

Solaris:
{code}
$ getent group 1
other::1:root
$ getent group other
other::1:root
{code}
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160046#comment-14160046 ]

Brandon Li commented on HDFS-7146:

[~yzhangal], I agree with what you said. To summarize our conclusion here:

1. Load all groups when the IdUserGroup object is initialized (only because we don't have a good way to get a group id from its name).
2. Users are loaded only when they are requested.
3. Periodically (every 15 minutes by default) update the cached groups.
4. When a user-id mapping is requested and the mapping is older than 15 minutes, refresh this mapping.

Does this look good?
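Item 4 of the summary in this message, refreshing a single user mapping once it is older than 15 minutes, could look roughly like the sketch below. It is an assumption-laden illustration rather than the IdUserGroup implementation: the class name, the injected lookup function and the explicit clock parameter are hypothetical, and only the 15-minute default comes from the comment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of per-entry refresh; not the IdUserGroup code.
final class ExpiringUserMap {
  static final long DEFAULT_MAX_AGE_MILLIS = 15 * 60 * 1000L; // 15 minutes

  private static final class Entry {
    final int uid;
    final long loadedAtMillis;

    Entry(int uid, long loadedAtMillis) {
      this.uid = uid;
      this.loadedAtMillis = loadedAtMillis;
    }
  }

  private final Map<String, Entry> cache = new HashMap<>();
  // Stand-in for a single keyed lookup; returns null when the user is unknown.
  private final Function<String, Integer> lookup;
  private final long maxAgeMillis;

  ExpiringUserMap(Function<String, Integer> lookup, long maxAgeMillis) {
    this.lookup = lookup;
    this.maxAgeMillis = maxAgeMillis;
  }

  /** Returns the cached UID, re-querying when the entry is missing or too old. */
  synchronized Integer getUid(String user, long nowMillis) {
    Entry e = cache.get(user);
    if (e == null || nowMillis - e.loadedAtMillis > maxAgeMillis) {
      Integer uid = lookup.apply(user);
      if (uid == null) {
        cache.remove(user); // the user no longer resolves; drop any stale entry
        return null;
      }
      e = new Entry(uid, nowMillis);
      cache.put(user, e);
    }
    return e.uid;
  }
}
```

Compared with clearing the whole map on a timer, a per-entry timestamp only re-queries the users that are actually requested again, which fits the goal of keeping load off the directory servers.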
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158913#comment-14158913 ]

Yongjun Zhang commented on HDFS-7146:

Hi [~brandonli],

Thanks for the good info, it's really helpful! I was aware that a reverse lookup happens for numerical names, but I didn't know about the "id" command. Given that we have this command, I agree with you that we should use it to avoid the full list load.

Unfortunately there is no corresponding one for group. I think we can mitigate this by doing a lazy full list load for groups: only do the full list load when a given numerical group name is missing from the cache. Assuming that numerical names are rare, we can largely avoid loading the full list.

Does this sound good to you? I will try to have a revised version asap.

Thanks.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158718#comment-14158718 ]

Brandon Li commented on HDFS-7146:

For getent command, when it sees a numerical key, it thinks you are doing reverse lookup (from id to name). That's why it returns nothing.

To get the UID for a numerical user, we can just use "id ", so we don't have to try to load all users. However, I don't know any way to get GID from a given group name. But I guess we are OK to load all groups since usually there are not many groups.

Comments?
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158669#comment-14158669 ]

Brandon Li commented on HDFS-7146:

[~yzhangal], thanks for uploading the new patch.

Here is the problem I noticed in the field. A user has more than 70K users on an LDAP server which is configured to return no more than roughly 70K entries for each query. The NFS gateway could not load all users since it tried to get the complete user list in one command. Therefore, some users can't access their own files because the NFS gateway can't find their mapping in the cache.

The problems with loading a complete user/group list are:
1) unused entries waste memory space,
2) to make it worse, the gateway doesn't know whether the list is complete,
3) as mentioned in the JIRA description, it creates unnecessary load on the AD servers.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157682#comment-14157682 ]

Hadoop QA commented on HDFS-7146:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12672732/HDFS-7146.002.allIncremental.patch against trunk revision 2d8e6e2.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8308//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8308//console

This message is automatically generated.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157674#comment-14157674 ]

Yongjun Zhang commented on HDFS-7146:

Hi [~brandonli],

Sorry, I was out for a couple of days so I didn't get a chance to work on a revised version addressing your comments. I just uploaded 002, which initializes the map with only the entries that have numerical names; all the rest are added to the map incrementally. Would you please help take a look again?

Thanks a lot.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154252#comment-14154252 ]
Yongjun Zhang commented on HDFS-7146:
Alternatively, I can modify the initialization to create entries only for numerical names, which minimizes the size of the map, assuming numerical names are the only ones the incremental commands can't find. If that assumption does not hold, there is some risk that the fix would break things that used to work. What do you think, Brandon? Thanks.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154223#comment-14154223 ]
Yongjun Zhang commented on HDFS-7146:
Hi [~brandonli], Thanks a lot for the review and comments! I think you made a very good point. One problem I found while testing is that, for a numerical user name, "getent passwd " returns nothing, whereas the existing initialization code catches it correctly. So it looks like we can't take the initialization out entirely. That said, keeping the initialization helps with the numerical-name problem but may not completely fix it, because of the issue reported by this jira (i.e., the initial list may not be complete). How about we keep the initialization for now and decide later whether we can drop it? This way, the impact of this jira fix is minimized: everything that worked should continue to work, and the issue reported by this jira will be fixed by the change. Thanks again.
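One plausible explanation for the numerical-name behaviour, offered here as an assumption rather than something the thread states, is that a keyed lookup treats an all-digit key as a numeric ID instead of a name. The Python sketch below simulates that with a made-up account table; `ACCOUNTS`, `keyed_lookup`, and `full_scan` are hypothetical names and do not call any real lookup tool.

```python
from typing import Dict, Optional, Tuple

# Made-up account table: user name -> uid.
ACCOUNTS: Dict[str, int] = {"alice": 1001, "12345": 2002}


def full_scan(name: str) -> Optional[Tuple[str, int]]:
    """Enumeration-style match on the name field, whether or not it is digits."""
    uid = ACCOUNTS.get(name)
    return (name, uid) if uid is not None else None


def keyed_lookup(key: str) -> Optional[Tuple[str, int]]:
    """Simulated keyed lookup that interprets all-digit keys as numeric IDs."""
    if key.isdecimal():
        wanted_uid = int(key)
        for name, uid in ACCOUNTS.items():
            if uid == wanted_uid:
                return name, uid
        # No account has that uid, even though one has that *name*.
        return None
    return full_scan(key)
```

Under that assumption, `keyed_lookup("12345")` finds nothing while `full_scan("12345")` does, which matches the case for keeping some initialization for numerical names.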
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153952#comment-14153952 ]
Brandon Li commented on HDFS-7146:
Thank you, [~yzhangal], for uploading the patch. In general the patch looks good. One question: do we still want to load a batch of name-ID mappings when the IdUserGroup object is initialized? The initial load could cache many user/group accounts that are never used.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152479#comment-14152479 ]
Yongjun Zhang commented on HDFS-7146:
BTW, thanks a lot to [~j...@cloudera.com] for testing my change on Mac.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150216#comment-14150216 ]
Yongjun Zhang commented on HDFS-7146:
Hi [~brandonli] and [~aw], I'm amazed by how much you are helping out here, and I really appreciate it! I'm not a Mac user, and all the info you provided is very helpful to me. I will do some further study.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150146#comment-14150146 ]
Allen Wittenauer commented on HDFS-7146:
A quick primer on OS X naming services:

Apple uses a system called Directory Services. [dscl = Directory Services Command Line (utility)] It's based upon NeXTSTEP's NetInfo idea, where objects are organized in a pseudo-directory layout, with certain top-level structures being an amalgamation of all of the services. So, for example, if a system is configured with LDAP and Files, /Users will be /etc/passwd + LDAP ou=people (or whatever). But you can specify /LDAPv3/server/Users to get specifically the LDAP part. This is similar to how nsswitch and sssd work on other OSes, but with more structure.

This used to be a lot easier, but now if you go through System Preferences -> Users & Groups -> Login Options -> Network Account Server -> Join..., you'll get to Directory Utility, which allows you to add multiple sources for authentication and other naming services.

(I've been doing this stuff for way too long. *sigh*)
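To make the node layout concrete, here is a small Python sketch that only builds dscl argument lists for different datasource nodes. It assumes dscl takes the datasource as its first argument (`.` for the local node, `/Search` for the merged search path, or a specific node such as `/LDAPv3/<server>`); the function names and the server name `ldap.example.com` are hypothetical, and nothing here is taken from the Hadoop patch.

```python
from typing import List


def _check_record_name(name: str) -> str:
    """Reject names that would change the record path being read."""
    if not name or "/" in name:
        raise ValueError(f"unsupported record name: {name!r}")
    return name


def dscl_read_user_argv(user: str, node: str = ".") -> List[str]:
    """Argument list for reading a user's UniqueID from the given node."""
    return ["dscl", node, "-read", f"/Users/{_check_record_name(user)}", "UniqueID"]


def dscl_read_group_argv(group: str, node: str = ".") -> List[str]:
    """Argument list for reading a group's PrimaryGroupID from the given node."""
    return ["dscl", node, "-read", f"/Groups/{_check_record_name(group)}",
            "PrimaryGroupID"]


# Example nodes: local only, merged search path, one specific LDAP server.
LOCAL_NODE = "."
SEARCH_NODE = "/Search"
LDAP_NODE = "/LDAPv3/ldap.example.com"  # hypothetical server name
```

Passing the list to a process launcher without a shell avoids quoting problems; which node is appropriate depends on how the host's directory sources are configured.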
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150112#comment-14150112 ]
Brandon Li commented on HDFS-7146:
I have to say I am not a MacOS expert, so I'm not sure whether the above command works with different databases, e.g., LDAP and so on. :-(
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150090#comment-14150090 ]
Yongjun Zhang commented on HDFS-7146:
Thank you so much, [~brandonli]! That's very helpful info. I will be working on this soon.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150059#comment-14150059 ]
Brandon Li commented on HDFS-7146:
[~yzhangal], on MacOS you can do the following:

To get the ID of user "test1":
  $> dscl . -read /Users/test1 UniqueID

To get the ID of group "staff":
  $> dscl . -read /Groups/staff PrimaryGroupID
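The output of those commands still has to be turned into a number. The Python sketch below assumes the command prints a line of the form `UniqueID: 501` (or `PrimaryGroupID: 20`); that output shape and the function name `parse_dscl_id` are assumptions for illustration, not part of the patch.

```python
from typing import Optional


def parse_dscl_id(output: str, attribute: str) -> Optional[int]:
    """Return the integer that follows 'attribute:' in the given output, if any."""
    prefix = attribute + ":"
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        value = line[len(prefix):].strip()
        try:
            return int(value)  # int() also accepts negative IDs such as -2
        except ValueError:
            return None        # attribute present but not a number
    return None                # attribute not found in the output
```

Returning `None` rather than raising lets the caller treat "record not found" and "unexpected output" the same way, for example by leaving the ID unmapped.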
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149714#comment-14149714 ]
Yongjun Zhang commented on HDFS-7146:
Hi [~brandonli], Many thanks for your comment and info. I found that there is no command corresponding to "getent passwd " on Mac, which will probably make the solution less clean. But let me dig some more. I'd be happy to hear any insight you have. Thanks.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149631#comment-14149631 ]
Brandon Li commented on HDFS-7146:
It does not seem trivial to set up nscd to cache all entries from different databases. Even if the user can configure it perfectly, we don't want the NFS gateway to depend on a platform-specific solution in order to work correctly. +1 to the idea of caching requested IDs instead of collecting them all.
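Caching only the requested IDs might look roughly like the Python sketch below. The class name `RequestedIdCache` and the `resolve_uid` callback are hypothetical; the callback stands in for whatever platform-specific per-ID lookup is used, so the cache itself stays platform-neutral.

```python
from typing import Callable, Dict, Optional


class RequestedIdCache:
    """uid -> name cache that is filled only by IDs that are actually requested."""

    def __init__(self, resolve_uid: Callable[[int], Optional[str]]) -> None:
        self._resolve_uid = resolve_uid  # hypothetical per-ID resolver
        self._uid_to_name: Dict[int, str] = {}

    def name_for(self, uid: int) -> Optional[str]:
        cached = self._uid_to_name.get(uid)
        if cached is not None:
            return cached
        name = self._resolve_uid(uid)    # one backend query for one ID
        if name is not None:
            self._uid_to_name[uid] = name
        return name

    def cached_count(self) -> int:
        return len(self._uid_to_name)
```

On a Unix host the resolver could, for instance, wrap Python's `pwd.getpwuid`, which raises `KeyError` for an unknown uid. Note that this sketch does not remember failed lookups, so an unknown ID is queried again on every request.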
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149604#comment-14149604 ]
Yongjun Zhang commented on HDFS-7146:
Hi [~aw], Thanks a lot for your input. Would a properly configured system, as you suggested, ensure that "getent passwd" returns the complete list of user IDs? It seems it would work better if we got the user ID mapping incrementally by calling 'getent passwd ', instead of collecting all of them in one shot. Does this make sense to you? Thanks.
[jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
[ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148243#comment-14148243 ]
Allen Wittenauer commented on HDFS-7146:
nscd on the local host should prevent the load on the back-end server on a properly configured system. (I recognize that a lot of people blankly disable all of nscd when all they really want is to disable hostname caches. This is still a misconfiguration.)