[ https://issues.apache.org/jira/browse/HDFS-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888387#comment-13888387 ]
Yongjun Zhang commented on HDFS-5767:
-------------------------------------

My build of eclipse:eclipse was successful. I can't see the details of why the Apache build failed.

The javadoc "-2" warnings appear because I removed IdUserGroup.DuplicateNameOrIdException.

The "-1 release audit" is a bit bogus because it refers to
 !????? hs_err_pid22678.log
which is irrelevant to the patch itself.

I'm uploading a new version to address the "-2" warnings.

> Nfs implementation assumes userName userId mapping to be unique, which is not true sometimes
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5767
>                 URL: https://issues.apache.org/jira/browse/HDFS-5767
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>    Affects Versions: 2.3.0
>         Environment: With LDAP enabled
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>            Priority: Blocker
>         Attachments: HDFS-5767.001.patch
>
>
> I'm seeing that the nfs implementation assumes a unique <userName, userId> pair
> to be returned by the command "getent passwd". That is, for a given userName,
> there should be a single userId, and for a given userId, there should be a
> single userName. The reason is explained in the following message:
>
>   private static final String DUPLICATE_NAME_ID_DEBUG_INFO =
>       "NFS gateway can't start with duplicate name or id on the host system.\n"
>       + "This is because HDFS (non-kerberos cluster) uses name as the only way to identify a user or group.\n"
>       + "The host system with duplicated user/group name or id might work fine most of the time by itself.\n"
>       + "However when NFS gateway talks to HDFS, HDFS accepts only user and group name.\n"
>       + "Therefore, same name means the same user or same group. To find the duplicated names/ids, one can do:\n"
>       + "<getent passwd | cut -d: -f1,3> and <getent group | cut -d: -f1,3> on Linux systms,\n"
>       + "<dscl . -list /Users UniqueID> and <dscl . -list /Groups PrimaryGroupID> on MacOS.";
>
> This requirement cannot always be met (e.g. because of the use of LDAP).
> Let's do some examination:
>
> What exists in /etc/passwd:
>
>   $ more /etc/passwd | grep ^bin
>   bin:x:2:2:bin:/bin:/bin/sh
>   $ more /etc/passwd | grep ^daemon
>   daemon:x:1:1:daemon:/usr/sbin:/bin/sh
>
> The above result says userName "bin" has userId "2", and "daemon" has userId
> "1".
>
> What we can see with the "getent passwd" command due to LDAP:
>
>   $ getent passwd | grep ^bin
>   bin:x:2:2:bin:/bin:/bin/sh
>   bin:x:1:1:bin:/bin:/sbin/nologin
>   $ getent passwd | grep ^daemon
>   daemon:x:1:1:daemon:/usr/sbin:/bin/sh
>   daemon:x:2:2:daemon:/sbin:/sbin/nologin
>
> We can see that there are multiple entries for the same userName with
> different userIds, and the same userId can be associated with different
> userNames.
> So the assumption stated in the above DEBUG_INFO message cannot be met here.
> The DEBUG_INFO also states that HDFS uses name as the only way to identify a
> user/group. I'm filing this JIRA for a solution.
>
> Hi [~brandonli], since you implemented most of the nfs feature, would you
> please comment?
>
> Thanks.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
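
For illustration only, the uniqueness check described in the issue can be sketched in a few lines of Java. This is not the code in IdUserGroup or in the attached patch; the class DuplicateIdCheck and the method conflicts are hypothetical names, and the input is assumed to be "name:id" lines such as those produced by "getent passwd | cut -d: -f1,3".

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of Hadoop or of the patch.
class DuplicateIdCheck {

  /**
   * Groups colon-separated lines by the field at keyField and keeps only
   * the keys that are paired with more than one distinct valueField value.
   */
  static Map<String, Set<String>> conflicts(List<String> lines, int keyField, int valueField) {
    Map<String, Set<String>> grouped = new LinkedHashMap<>();
    for (String line : lines) {
      String[] fields = line.split(":");
      if (fields.length <= Math.max(keyField, valueField)) {
        continue; // skip malformed or empty lines
      }
      grouped.computeIfAbsent(fields[keyField], k -> new LinkedHashSet<>())
          .add(fields[valueField]);
    }
    // Keys with exactly one value are consistent; drop them.
    grouped.values().removeIf(values -> values.size() < 2);
    return grouped;
  }

  public static void main(String[] args) {
    // "name:id" pairs taken from the getent output quoted in the issue.
    List<String> lines = Arrays.asList("bin:2", "bin:1", "daemon:1", "daemon:2");
    System.out.println("names with several ids: " + conflicts(lines, 0, 1));
    System.out.println("ids with several names: " + conflicts(lines, 1, 0));
  }
}
```

Calling conflicts(lines, 0, 1) reports names that map to more than one id, and conflicts(lines, 1, 0) reports ids shared by more than one name. With the LDAP-backed getent output from the issue, both calls return non-empty maps, which is the situation the gateway currently refuses to start with.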