[ 
https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127412#comment-17127412
 ] 

hemanthboyina commented on HDFS-15372:
--------------------------------------

Thanks for the good analysis, [~sodonnell].
{quote}With the 001 patch in place, if you try to list 
/data/.snapshot/snapshot_1, the path seen by the attribute provider is:

/user/snapshot_1

Before, it was:

/user/.snapshot/snapshot1

When checking a path like /data/.snapshot/snap1 the provider will see 
/data/snap1, but on the branch-2, it would have seen /data/.snapshot/snap1.
{quote}
Was the path seen by the attribute provider the same on branch-2 and trunk? 
This was a bit confusing. Could you put it all in one comment, with an example 
for a snapshot path?

If we list a path, the path is resolved into inodes by INodesInPath, and the 
provider then uses the components of those same inodes, right? INodesInPath 
also handles the .snapshot part of a path.
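
To make the two candidate paths concrete, here is a minimal sketch. It is plain string handling, not HDFS code: the class and method names are hypothetical, and it treats any component named .snapshot as a snapshot directory, whereas the real resolution also checks that the directory is snapshottable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HDFS: shows two views of one snapshot path.
final class SnapshotPathSketch {
  static final String DOT_SNAPSHOT = ".snapshot";

  // Components as requested by the client, without the empty root component.
  static List<String> rawComponents(String path) {
    List<String> out = new ArrayList<>();
    for (String c : path.split("/")) {
      if (!c.isEmpty()) {
        out.add(c);
      }
    }
    return out;
  }

  // Drops ".snapshot" and the snapshot name that follows it, which leaves the
  // path of the original (live) location.
  static List<String> resolvedComponents(String path) {
    List<String> raw = rawComponents(path);
    List<String> out = new ArrayList<>();
    for (int i = 0; i < raw.size(); i++) {
      if (DOT_SNAPSHOT.equals(raw.get(i))) {
        i++; // also skip the snapshot name, if present
        continue;
      }
      out.add(raw.get(i));
    }
    return out;
  }

  public static void main(String[] args) {
    String p = "/data/.snapshot/snap1/tab1";
    System.out.println(rawComponents(p));      // [data, .snapshot, snap1, tab1]
    System.out.println(resolvedComponents(p)); // [data, tab1]
  }
}
```

The open question is which of these two views the attribute provider should receive for a file under a snapshot.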

While creating a snapshot, we set the snapshottable directory inode as the 
root of the snapshot:
{code:java}
// Excerpt from DirectorySnapshottableFeature#addSnapshot
public Snapshot addSnapshot(INodeDirectory snapshotRoot, int id, String name,
    /* remaining parameters elided */
  // ...
  final Snapshot s = new Snapshot(id, name, snapshotRoot);
{code}
While getting the INodesInPath for a file in a snapshot, we use the snapshot 
root to look up the file. IMO that means that if the file has an ACL, the file 
under the snapshot root should have that ACL too:
{code:java}
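if (isDotSnapshotDir(childName) && dir.isSnapshottable()) {
  // ...
  final Snapshot s = dir.getSnapshot(components[count + 1]);
  if (s == null) {
    // ... (snapshot not found; branch elided in this excerpt)
  } else {
    curNode = s.getRoot();   // continue resolving from the snapshot root
    snapshotId = s.getId();
  }
  // ...
}
{code}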
Please correct me if I am missing something here.

> Files in snapshots no longer see attribute provider permissions
> ---------------------------------------------------------------
>
>                 Key: HDFS-15372
>                 URL: https://issues.apache.org/jira/browse/HDFS-15372
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-15372.001.patch
>
>
> Given a cluster with an authorization provider configured (e.g. Sentry), 
> where the paths covered by the provider are snapshottable, the way provider 
> permissions and ACLs are applied to files in snapshots changed between the 
> 2.x branch and Hadoop 3.0.
> For example, take the snapshottable path /data, which is managed by Sentry. 
> The ACLs below are provided by Sentry:
> {code}
> hadoop fs -getfacl -R /data
> # file: /data
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::---
> group:flume:rwx
> user:hive:rwx
> group:hive:rwx
> group:testgroup:rwx
> mask::rwx
> other::--x
> /data/tab1
> {code}
> After taking a snapshot, the files in the snapshot do not see the provider 
> permissions:
> {code}
> hadoop fs -getfacl -R /data/.snapshot
> # file: /data/.snapshot
> # owner: 
> # group: 
> user::rwx
> group::rwx
> other::rwx
> # file: /data/.snapshot/snap1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/.snapshot/snap1/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> {code}
> However, before Hadoop 3.0 (in which the attribute provider and related code 
> was extensively refactored), snapshots did get the provider permissions.
> The reason is this code in FSDirectory.java which ultimately calls the 
> attribute provider and passes the path we want permissions for:
> {code}
>   INodeAttributes getAttributes(INodesInPath iip)
>       throws IOException {
>     INode node = FSDirectory.resolveLastINode(iip);
>     int snapshot = iip.getPathSnapshotId();
>     INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
>     UserGroupInformation ugi = NameNode.getRemoteUser();
>     INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi);
>     if (ap != null) {
>       // permission checking sends the full components array including the
>       // first empty component for the root.  however file status
>       // related calls are expected to strip out the root component according
>       // to TestINodeAttributeProvider.
>       byte[][] components = iip.getPathComponents();
>       components = Arrays.copyOfRange(components, 1, components.length);
>       nodeAttrs = ap.getAttributes(components, nodeAttrs);
>     }
>     return nodeAttrs;
>   }
> {code}
> The line:
> {code}
> INode node = FSDirectory.resolveLastINode(iip);
> {code}
> Picks the last resolved inode. If you then call node.getPathComponents 
> for a path like '/data/.snapshot/snap1/tab1', it returns /data/tab1. It 
> resolves the snapshot path to its original location, but it is still the 
> snapshot inode.
> However, the logic passes 'iip.getPathComponents', which returns 
> "/data/.snapshot/snap1/tab1", to the provider.
> The pre-Hadoop 3.0 code passes the inode directly to the provider, and hence 
> the provider only ever sees the path as "/data/tab1".
> It is debatable which path should be passed to the provider for snapshots: 
> /data/.snapshot/snap1/tab1 or /data/tab1. However, as the behaviour has 
> changed, I feel we should ensure the old behaviour is retained.
> It would also be fairly easy to provide a config switch so the provider gets 
> either the full snapshot path or the resolved path.
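
The config switch suggested at the end of the description could look roughly like the sketch below. This is only an illustration under stated assumptions, not HDFS code: the class, the method names, and the boolean flag are all hypothetical, and no real configuration key is implied. It reuses the same Arrays.copyOfRange step as getAttributes to strip the empty root component, and it assumes a non-root path.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, not HDFS code: choose which components reach the provider.
final class ProviderPathSwitchSketch {

  // Splits a path into byte[] components; index 0 is the empty root component.
  static byte[][] toComponents(String path) {
    String[] parts = path.split("/");
    byte[][] out = new byte[parts.length][];
    for (int i = 0; i < parts.length; i++) {
      out[i] = parts[i].getBytes(StandardCharsets.UTF_8);
    }
    return out;
  }

  // requested: components of the path as asked for (may contain .snapshot).
  // resolved: components of the resolved inode's original location.
  // passSnapshotPath: stands in for an assumed boolean config switch.
  static byte[][] componentsForProvider(byte[][] requested, byte[][] resolved,
      boolean passSnapshotPath) {
    byte[][] chosen = passSnapshotPath ? requested : resolved;
    // Strip the empty root component before handing the array to the provider.
    return Arrays.copyOfRange(chosen, 1, chosen.length);
  }

  // Renders components back into a path string, for display only.
  static String render(byte[][] components) {
    StringBuilder sb = new StringBuilder();
    for (byte[] c : components) {
      sb.append('/').append(new String(c, StandardCharsets.UTF_8));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    byte[][] requested = toComponents("/data/.snapshot/snap1/tab1");
    byte[][] resolved = toComponents("/data/tab1");
    // Switch on: the provider sees the full snapshot path.
    System.out.println(render(componentsForProvider(requested, resolved, true)));
    // Switch off: the provider sees the resolved path (the pre-3.0 behaviour).
    System.out.println(render(componentsForProvider(requested, resolved, false)));
  }
}
```

Defaulting such a flag to the resolved path would match the old behaviour that the description argues should be retained.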



