[ https://issues.apache.org/jira/browse/HDFS-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981505#comment-13981505 ]
Yongjun Zhang commented on HDFS-6165:
-------------------------------------

Hi guys,

Thanks a lot for the review/comments, and sorry again for getting back late. Instead of posting another patch to address all comments, I'd like to summarize the proposed solution here first. Sorry for another long posting. Would you please review the proposal below and see if you agree? I will address the other comments when putting together a new revision of the patch. Many thanks.

There are two commands that have a problem here:
1. hdfs dfs -rmdir (referred to as rmdir in the discussion below)
2. hdfs dfs -rm -r (referred to as rmr in the discussion below)

Both commands eventually call the FSNamesystem#deleteInternal method:
{code}
private boolean deleteInternal(String src, boolean recursive,
    boolean enforcePermission, boolean logRetryCache)
    throws AccessControlException, SafeModeException,
    UnresolvedLinkException, IOException {
{code}
deleteInternal throws an exception if recursive is false and the src to be deleted is not empty; otherwise, it checks the "necessary" permissions, collects all blocks/inodes to be deleted, and deletes them recursively. The deletion process excludes snapshottable dirs that have at least one snapshot. Right now it requires FULL permission on the subdirs and files under the target dir being deleted, and this permission check is itself recursive (every child must have FULL permission). This is the place we try to fix for the different scenarios. rmr calls deleteInternal with the "recursive" parameter set to true, and rmdir calls it with "recursive" set to false.

Solution summary:

1. rmdir
The recursion issue raised in your comments is only relevant to rmr, so the solution for rmdir is simple:
- For a non-empty directory, deleteInternal simply throws a non-empty-directory exception, and the FsShell side catches it.
- For an empty directory, only check the parent/prefix permissions and ignore the target dir's own permission (POSIX compliant); delete if the permissions are satisfied, throw an exception otherwise.

2. rmr
The last patch I posted (version 004) only checks whether the target dir to be deleted has READ permission (the earlier versions ignored the target dir's permission when it is empty), and I didn't change the behaviour of the subdir check for a non-empty target dir. For a non-empty target dir, the current implementation requires FULL permission on a subdir in order to delete it, even if the subdir is empty. This is not quite right, as [~andrew.wang] pointed out. I'd like to implement what Andrew suggested with an additional parameter alongside subAccess in FSPermissionChecker#checkSubAccess, e.g. add emptyDirSubAccess:
{code}
void checkPermission(String path, INodeDirectory root, boolean doCheckOwner,
    FsAction ancestorAccess, FsAction parentAccess, FsAction access,
    FsAction subAccess, FsAction emptyDirSubAccess, boolean resolveLink)
{code}
subAccess would be passed FsAction.ALL (as it is currently) and emptyDirSubAccess would be passed FsAction.NONE. If a subdir is not empty, it is checked against subAccess; if it is empty, it is checked against emptyDirSubAccess (see the illustrative sketch below).

About using FsAction.ALL for the subAccess parameter: it is a bit over-stringent for intermediate path components. Say we want to delete <targetDir>/a/b/c: we don't need WRITE permission on <targetDir>/a, but we do need WRITE permission on <targetDir>/a/b. We could address this issue in a separate JIRA if you agree.
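To make the intended branching concrete, here is a minimal, self-contained Java sketch (illustration only, not code from the patch: the Dir and Action types and the satisfies helper are simplified stand-ins for INodeDirectory/FsAction, invented just for this example). Non-empty subdirs are checked against subAccess, empty ones against emptyDirSubAccess:
{code}
import java.util.ArrayList;
import java.util.List;

public class SubAccessSketch {
  enum Action { NONE, READ, WRITE, EXECUTE, ALL }

  // Simplified stand-in for INodeDirectory: name, the action the calling
  // user is permitted on this dir, and its child directories.
  static class Dir {
    final String name;
    final Action permittedForCaller;
    final List<Dir> children = new ArrayList<>();
    Dir(String name, Action permittedForCaller) {
      this.name = name;
      this.permittedForCaller = permittedForCaller;
    }
  }

  /** Returns true if the caller may recursively delete every entry under root. */
  static boolean checkSubAccess(Dir root, Action subAccess, Action emptyDirSubAccess) {
    for (Dir child : root.children) {
      // Empty subdirs only need emptyDirSubAccess (NONE => never blocks);
      // non-empty subdirs still need the stricter subAccess (ALL today).
      Action required = child.children.isEmpty() ? emptyDirSubAccess : subAccess;
      if (!satisfies(child.permittedForCaller, required)) {
        return false;
      }
      if (!checkSubAccess(child, subAccess, emptyDirSubAccess)) {
        return false;
      }
    }
    return true;
  }

  // Simplified permission test for this example only.
  static boolean satisfies(Action have, Action need) {
    return need == Action.NONE || have == Action.ALL || have == need;
  }

  public static void main(String[] args) {
    // /user/userabc/foo is empty and owned by another user (caller has no perms on it);
    // with emptyDirSubAccess = NONE the recursive delete of /user/userabc succeeds.
    Dir target = new Dir("userabc", Action.ALL);
    target.children.add(new Dir("foo", Action.NONE)); // empty, no perms for caller
    System.out.println(checkSubAccess(target, Action.ALL, Action.NONE)); // prints true
  }
}
{code}
With subAccess = ALL and emptyDirSubAccess = NONE, an empty no-perms dir (like /user/userabc/foo in the description below) no longer blocks the recursive delete, while any non-empty subdir still requires FULL permission as today.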
Hi [~daryn], the following does not actually hold, as far as I can tell:
{quote}
I bet for the described problem, you could delete the no-perms dir if you created a new directory, moved the no-perms dir into it, and then recursively deleted the new directory.
{quote}
This is because the current implementation recursively requires FULL permission on subdirs/files in order to delete them. And if we suddenly changed the implementation to allow deleting a non-empty dir without checking the subdir/file permissions, I'd be worried about bad user impact.

Thanks again for your time.

> "hdfs dfs -rm -r" and "hdfs -rmdir" commands can't remove empty directory
> --------------------------------------------------------------------------
>
>                 Key: HDFS-6165
>                 URL: https://issues.apache.org/jira/browse/HDFS-6165
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>            Priority: Minor
>         Attachments: HDFS-6165.001.patch, HDFS-6165.002.patch, HDFS-6165.003.patch, HDFS-6165.004.patch, HDFS-6165.004.patch
>
>
> Given a directory owned by user A with WRITE permission containing an empty directory owned by user B, it is not possible to delete user B's empty directory with either "hdfs dfs -rm -r" or "hdfs dfs -rmdir", because the current implementation requires FULL permission on the empty directory and throws an exception.
> On the other hand, on Linux the "rm -r" and "rmdir" commands can remove an empty directory as long as the parent directory has WRITE permission (and the prefix components of the path have EXECUTE permission). Of the tested OSes, some prompt the user for confirmation and some don't.
> Here's a reproduction:
> {code}
> [root@vm01 ~]# hdfs dfs -ls /user/
> Found 4 items
> drwxr-xr-x - userabc users 0 2013-05-03 01:55 /user/userabc
> drwxr-xr-x - hdfs supergroup 0 2013-05-03 00:28 /user/hdfs
> drwxrwxrwx - mapred hadoop 0 2013-05-03 00:13 /user/history
> drwxr-xr-x - hdfs supergroup 0 2013-04-14 16:46 /user/hive
> [root@vm01 ~]# hdfs dfs -ls /user/userabc
> Found 8 items
> drwx------ - userabc users 0 2013-05-02 17:00 /user/userabc/.Trash
> drwxr-xr-x - userabc users 0 2013-05-03 01:34 /user/userabc/.cm
> drwx------ - userabc users 0 2013-05-03 01:06 /user/userabc/.staging
> drwxr-xr-x - userabc users 0 2013-04-14 18:31 /user/userabc/apps
> drwxr-xr-x - userabc users 0 2013-04-30 18:05 /user/userabc/ds
> drwxr-xr-x - hdfs users 0 2013-05-03 01:54 /user/userabc/foo
> drwxr-xr-x - userabc users 0 2013-04-30 16:18 /user/userabc/maven_source
> drwxr-xr-x - hdfs users 0 2013-05-03 01:40 /user/userabc/test-restore
> [root@vm01 ~]# hdfs dfs -ls /user/userabc/foo/
> [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -r -skipTrash /user/userabc/foo
> rm: Permission denied: user=userabc, access=ALL, inode="/user/userabc/foo":hdfs:users:drwxr-xr-x
> {code}
> The super user can delete the directory.
> {code}
> [root@vm01 ~]# sudo -u hdfs hdfs dfs -rm -r -skipTrash /user/userabc/foo
> Deleted /user/userabc/foo
> {code}
> The same is not true for files, however. They have the correct behavior.
> {code}
> [root@vm01 ~]# sudo -u hdfs hdfs dfs -touchz /user/userabc/foo-file
> [root@vm01 ~]# hdfs dfs -ls /user/userabc/
> Found 8 items
> drwx------ - userabc users 0 2013-05-02 17:00 /user/userabc/.Trash
> drwxr-xr-x - userabc users 0 2013-05-03 01:34 /user/userabc/.cm
> drwx------ - userabc users 0 2013-05-03 01:06 /user/userabc/.staging
> drwxr-xr-x - userabc users 0 2013-04-14 18:31 /user/userabc/apps
> drwxr-xr-x - userabc users 0 2013-04-30 18:05 /user/userabc/ds
> -rw-r--r-- 1 hdfs users 0 2013-05-03 02:11 /user/userabc/foo-file
> drwxr-xr-x - userabc users 0 2013-04-30 16:18 /user/userabc/maven_source
> drwxr-xr-x - hdfs users 0 2013-05-03 01:40 /user/userabc/test-restore
> [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -skipTrash /user/userabc/foo-file
> Deleted /user/userabc/foo-file
> {code}
> Using the "hdfs dfs -rmdir" command:
> {code}
> bash-4.1$ hadoop fs -lsr /
> lsr: DEPRECATED: Please use 'ls -R' instead.
> drwxr-xr-x - hdfs supergroup 0 2014-03-25 16:29 /user
> drwxr-xr-x - hdfs supergroup 0 2014-03-25 16:28 /user/hdfs
> drwxr-xr-x - usrabc users 0 2014-03-28 23:39 /user/usrabc
> drwxr-xr-x - abc abc 0 2014-03-28 23:39 /user/usrabc/foo-empty1
> [root@vm01 usrabc]# su usrabc
> [usrabc@vm01 ~]$ hdfs dfs -rmdir /user/usrabc/foo-empty1
> rmdir: Permission denied: user=usrabc, access=ALL, inode="/user/usrabc/foo-empty1":abc:abc:drwxr-xr-x
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)