[ https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127412#comment-17127412 ]
hemanthboyina commented on HDFS-15372: -------------------------------------- thanks for good analysis [~sodonnell] {quote}With the 001 patch in place, if you try to list /data/.snapshot/snapshot_1, the path seen by the attribute provider is: /user/snapshot_1 Before, it was: /user/.snapshot/snapshot1 When checking a path like /data/.snapshot/snap1 the provider will see /data/snap1, but on the branch-2, it would have seen /data/.snapshot/snap1. {quote} is the path seen by the attribute provider for branch and trunk was same ? it was bit confusing , can you add all in one comment with an example for a snapshot path If we try list for a path , the path will be resolved as Inodes from InodeInPath , and the same inodes components will be used by the provider right ? and INodesInPath handles .snapshot part of a path While creating a snapshot we add the inode directory as the root to snapshot {code:java} DirectorySnapshottableFeature#createSnaphot public Snapshot addSnapshot(INodeDirectory snapshotRoot, int id, String name, final Snapshot s = new Snapshot(id, name, snapshotRoot); {code} While getting inodesInPath for a file in snapshot we use the root of snapshot to get the file , IMO that means the if the file has an acl the file under snapshot root should have acl {code:java} if (isDotSnapshotDir(childName) && dir.isSnapshottable()) { final Snapshot s = dir.getSnapshot(components[count + 1]); else { curNode = s.getRoot(); snapshotId = s.getId(); } {code} please correct me if am missing some thing here > Files in snapshots no longer see attribute provider permissions > --------------------------------------------------------------- > > Key: HDFS-15372 > URL: https://issues.apache.org/jira/browse/HDFS-15372 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Stephen O'Donnell > Assignee: Stephen O'Donnell > Priority: Major > Attachments: HDFS-15372.001.patch > > > Given a cluster with an authorization provider configured (eg Sentry) and the > paths covered by the provider are snapshotable, there was a change in > behaviour in how the provider permissions and ACLs are applied to files in > snapshots between the 2.x branch and Hadoop 3.0. > Eg, if we have the snapshotable path /data, which is Sentry managed. The ACLs > below are provided by Sentry: > {code} > hadoop fs -getfacl -R /data > # file: /data > # owner: hive > # group: hive > user::rwx > group::rwx > other::--x > # file: /data/tab1 > # owner: hive > # group: hive > user::rwx > group::--- > group:flume:rwx > user:hive:rwx > group:hive:rwx > group:testgroup:rwx > mask::rwx > other::--x > /data/tab1 > {code} > After taking a snapshot, the files in the snapshot do not see the provider > permissions: > {code} > hadoop fs -getfacl -R /data/.snapshot > # file: /data/.snapshot > # owner: > # group: > user::rwx > group::rwx > other::rwx > # file: /data/.snapshot/snap1 > # owner: hive > # group: hive > user::rwx > group::rwx > other::--x > # file: /data/.snapshot/snap1/tab1 > # owner: hive > # group: hive > user::rwx > group::rwx > other::--x > {code} > However pre-Hadoop 3.0 (when the attribute provider etc was extensively > refactored) snapshots did get the provider permissions. > The reason is this code in FSDirectory.java which ultimately calls the > attribute provider and passes the path we want permissions for: > {code} > INodeAttributes getAttributes(INodesInPath iip) > throws IOException { > INode node = FSDirectory.resolveLastINode(iip); > int snapshot = iip.getPathSnapshotId(); > INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot); > UserGroupInformation ugi = NameNode.getRemoteUser(); > INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi); > if (ap != null) { > // permission checking sends the full components array including the > // first empty component for the root. however file status > // related calls are expected to strip out the root component according > // to TestINodeAttributeProvider. > byte[][] components = iip.getPathComponents(); > components = Arrays.copyOfRange(components, 1, components.length); > nodeAttrs = ap.getAttributes(components, nodeAttrs); > } > return nodeAttrs; > } > {code} > The line: > {code} > INode node = FSDirectory.resolveLastINode(iip); > {code} > Picks the last resolved Inode and if you then call node.getPathComponents, > for a path like '/data/.snapshot/snap1/tab1' it will return /data/tab1. It > resolves the snapshot path to its original location, but its still the > snapshot inode. > However the logic passes 'iip.getPathComponents' which returns > "/user/.snapshot/snap1/tab" to the provider. > The pre Hadoop 3.0 code passes the inode directly to the provider, and hence > it only ever sees the path as "/user/data/tab1". > It is debatable which path should be passed to the provider - > /user/.snapshot/snap1/tab or /data/tab1 in the case of snapshots. However as > the behaviour has changed I feel we should ensure the old behaviour is > retained. > It would also be fairly easy to provide a config switch so the provider gets > the full snapshot path or the resolved path. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org