[ https://issues.apache.org/jira/browse/HDFS-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-4968:
------------------------------

    Description: 
With FileSystem symlink support incoming in HADOOP-8040, some clients will wish 
to not transparently resolve symlinks. This is somewhat similar to O_NOFOLLOW 
in open(2).
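
For readers unfamiliar with O_NOFOLLOW, the following is a minimal sketch of its behavior on a local POSIX filesystem, written in Python. It is only an analogy: it does not use any Hadoop API, and the helper names (read_without_following, demo_nofollow) are hypothetical and chosen for illustration.

```python
import errno
import os
import tempfile


def read_without_following(path: str) -> bytes:
    """Read up to 4 KiB, refusing to open a path whose last component is a symlink."""
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)


def demo_nofollow():
    """Return (contents of a regular file, whether the symlink open was refused)."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "data.txt")
        link = os.path.join(d, "link.txt")
        with open(target, "wb") as f:
            f.write(b"payload")
        os.symlink(target, link)

        regular = read_without_following(target)
        try:
            read_without_following(link)
            link_refused = False
        except OSError as e:
            # Linux and macOS report ELOOP; some BSDs report EMLINK.
            link_refused = e.errno in (errno.ELOOP, errno.EMLINK)
        return regular, link_refused
```

The regular file opens normally, while the open on the symlink fails instead of being silently redirected to the target. A client-side option for FileSystem would presumably give callers a comparable choice.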

The rationale is a security model in which a user invokes a third-party 
service, running as a service user, to operate on the user's data. For instance, 
users might want to use Hive to query data in their homedirs, where Hive runs 
as the Hive user and the data is readable by the Hive user. This leads to a 
security issue with symlinks:

# User Mallory invokes Hive to process data files in {{/user/mallory/hive/}}
# Hive checks permissions on the files in {{/user/mallory/hive/}} and allows 
the query to proceed.
# RACE: Mallory replaces the files in {{/user/mallory/hive}} with symlinks that 
point to user Ann's Hive files in {{/user/ann/hive}}. These files aren't 
readable by Mallory, but she can create whatever symlinks she wants in her own 
scratch directory.
# Hive's MR jobs happily resolve the symlinks and access Ann's private data.
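
The steps above can be sketched as a deterministic check-then-use simulation on a local POSIX filesystem. This is an illustration under stated assumptions, not Hive or HDFS code: the directory layout, file contents, and the names service_read and simulate_race are hypothetical, and the attacker's swap is run inline rather than as a true concurrent race.

```python
import os
import tempfile


def service_read(path: str, follow_symlinks: bool) -> bytes:
    """Read as the 'service user', optionally refusing a symlink at the final component."""
    flags = os.O_RDONLY
    if not follow_symlinks:
        flags |= os.O_NOFOLLOW
    fd = os.open(path, flags)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)


def simulate_race(follow_symlinks: bool):
    """Return the bytes the service ends up reading, or None if the open is refused."""
    with tempfile.TemporaryDirectory() as root:
        ann_dir = os.path.join(root, "ann")
        mallory_dir = os.path.join(root, "mallory")
        os.makedirs(ann_dir)
        os.makedirs(mallory_dir)

        secret = os.path.join(ann_dir, "data")
        submitted = os.path.join(mallory_dir, "data")
        with open(secret, "wb") as f:
            f.write(b"ann-private")
        with open(submitted, "wb") as f:
            f.write(b"mallory-public")

        # Step 2: the service checks the submitted path and sees a regular file.
        if os.path.islink(submitted):
            raise RuntimeError("unexpected symlink before the swap")

        # Step 3: the attacker replaces the file with a symlink to Ann's data.
        os.remove(submitted)
        os.symlink(secret, submitted)

        # Step 4: the service opens the path it checked earlier.
        try:
            return service_read(submitted, follow_symlinks)
        except OSError:
            return None
```

With follow_symlinks=True the service reads Ann's data even though the earlier check passed; with follow_symlinks=False the open is refused. Note that O_NOFOLLOW only guards the final path component, so a symlinked parent directory would need separate handling.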

  was:
With FileSystem symlink support incoming in HADOOP-8040, some clients will wish 
to not transparently resolve symlinks. This is somewhat similar to O_NOFOLLOW 
in open(2).

The rationale is that in the new Hiveserver2 security model, all Hive queries 
run as the Hive user. Users can also use Hive to query data in their homedirs, 
provided it's readable by the Hive user. This leads to a security issue with 
symlinks:

# User Mallory invokes Hive to process data files in {{/user/mallory/hive/}}
# Hiveserver2 checks permissions on the files in {{/user/mallory/hive/}} and 
allows the query to proceed.
# RACE: Mallory replaces the files in {{/user/mallory/hive}} with symlinks that 
point to user Ann's Hive files in {{/user/ann/hive}}. These files aren't 
readable by Mallory, but she can create whatever symlinks she wants in her own 
scratch directory.
# Hive's MR jobs happily resolve the symlinks and access Ann's private data.

    
> Provide configuration option for FileSystem symlink resolution
> --------------------------------------------------------------
>
>                 Key: HDFS-4968
>                 URL: https://issues.apache.org/jira/browse/HDFS-4968
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>
> With FileSystem symlink support incoming in HADOOP-8040, some clients will 
> wish to not transparently resolve symlinks. This is somewhat similar to 
> O_NOFOLLOW in open(2).
>
> The rationale is a security model in which a user invokes a third-party 
> service, running as a service user, to operate on the user's data. For 
> instance, users might want to use Hive to query data in their homedirs, where 
> Hive runs as the Hive user and the data is readable by the Hive user. This 
> leads to a security issue with symlinks:
>
> # User Mallory invokes Hive to process data files in {{/user/mallory/hive/}}
> # Hive checks permissions on the files in {{/user/mallory/hive/}} and allows 
> the query to proceed.
> # RACE: Mallory replaces the files in {{/user/mallory/hive}} with symlinks 
> that point to user Ann's Hive files in {{/user/ann/hive}}. These files aren't 
> readable by Mallory, but she can create whatever symlinks she wants in her 
> own scratch directory.
> # Hive's MR jobs happily resolve the symlinks and access Ann's private data.

