[ 
https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13989:
------------------------------------
    Attachment: HIVE-13989-branch-2.3.patch

Attaching patch for branch 2.3. 

> Extended ACLs are not handled according to specification
> --------------------------------------------------------
>
>                 Key: HIVE-13989
>                 URL: https://issues.apache.org/jira/browse/HIVE-13989
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Chris Drome
>            Assignee: Vaibhav Gumashta
>         Attachments: HIVE-13989.1-branch-1.patch, HIVE-13989.1.patch, 
> HIVE-13989-branch-1.patch, HIVE-13989-branch-2.3.patch
>
>
> Hive takes two approaches to working with extended ACLs depending on whether 
> data is being produced via a Hive query or HCatalog APIs. A Hive query will 
> run an FsShell command to recursively set the extended ACLs for a directory 
> sub-tree. HCatalog APIs will attempt to build up the directory sub-tree 
> programmatically and runs some code to set the ACLs to match the parent 
> directory.
> Some incorrect assumptions were made when implementing the extended ACLs 
> support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the 
> design documents of extended ACLs in HDFS. These documents model the 
> implementation after the POSIX implementation on Linux, which can be found at 
> http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html.
> The code for setting extended ACLs via HCatalog APIs is found in 
> HdfsUtils.java:
> {code}
>     if (aclEnabled) {
>       aclStatus =  sourceStatus.getAclStatus();
>       if (aclStatus != null) {
>         LOG.trace(aclStatus.toString());
>         aclEntries = aclStatus.getEntries();
>         removeBaseAclEntries(aclEntries);
>         //the ACL api's also expect the tradition user/group/other permission 
> in the form of ACL
>         aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER, 
> sourcePerm.getUserAction()));
>         aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP, 
> sourcePerm.getGroupAction()));
>         aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER, 
> sourcePerm.getOtherAction()));
>       }
>     }
> {code}
> We found that DEFAULT extended ACL rules were not being inherited properly by 
> the directory sub-tree, so the above code is incomplete because it 
> effectively drops the DEFAULT rules. The second problem is with the call to 
> {{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended 
> ACLs. When extended ACLs are used the GROUP permission is replaced with the 
> extended ACL mask. So the above code will apply the wrong permissions to the 
> GROUP. Instead the correct GROUP permissions now need to be pulled from the 
> AclEntry as returned by {{getAclStatus().getEntries()}}. See the 
> implementation of the new method {{getDefaultAclEntries}} for details.
> Similar issues exist with the HCatalog API. None of the API accounts for 
> setting extended ACLs on the directory sub-tree. The changes to the HCatalog 
> API allow the extended ACLs to be passed into the required methods similar to 
> how basic permissions are passed in. When building the directory sub-tree the 
> extended ACLs of the table directory are inherited by all sub-directories, 
> including the DEFAULT rules.
> Replicating the problem:
> Create a table to write data into (I will use acl_test as the destination and 
> words_text as the source) and set the ACLs as follows:
> {noformat}
> $ hdfs dfs -setfacl -m 
> default:user::rwx,default:group::r-x,default:mask::rwx,default:user:hdfs:rwx,group::r-x,user:hdfs:rwx
>  /user/cdrome/hive/acl_test
> $ hdfs dfs -ls -d /user/cdrome/hive/acl_test
> drwxrwx---+  - cdrome hdfs          0 2016-07-13 20:36 
> /user/cdrome/hive/acl_test
> $ hdfs dfs -getfacl -R /user/cdrome/hive/acl_test
> # file: /user/cdrome/hive/acl_test
> # owner: cdrome
> # group: hdfs
> user::rwx
> user:hdfs:rwx
> group::r-x
> mask::rwx
> other::---
> default:user::rwx
> default:user:hdfs:rwx
> default:group::r-x
> default:mask::rwx
> default:other::---
> {noformat}
> Note that the basic GROUP permission is set to {{rwx}} after setting the 
> ACLs. The ACLs explicitly set the DEFAULT rules and a rule specifically for 
> the {{hdfs}} user.
> Run the following query to populate the table:
> {noformat}
> insert into acl_test partition (dt='a', ds='b') select a, b from words_text 
> where dt = 'c';
> {noformat}
> Note that words_text only has a single partition key.
> Now examine the ACLs for the resulting directories:
> {noformat}
> $ hdfs dfs -getfacl -R /user/cdrome/hive/acl_test
> # file: /user/cdrome/hive/acl_test
> # owner: cdrome
> # group: hdfs
> user::rwx
> user:hdfs:rwx
> group::r-x
> mask::rwx
> other::---
> default:user::rwx
> default:user:hdfs:rwx
> default:group::r-x
> default:mask::rwx
> default:other::---
> # file: /user/cdrome/hive/acl_test/dt=a
> # owner: cdrome
> # group: hdfs
> user::rwx
> user:hdfs:rwx
> group::rwx
> mask::rwx
> other::---
> default:user::rwx
> default:user:hdfs:rwx
> default:group::rwx
> default:mask::rwx
> default:other::---
> # file: /user/cdrome/hive/acl_test/dt=a/ds=b
> # owner: cdrome
> # group: hdfs
> user::rwx
> user:hdfs:rwx
> group::rwx
> mask::rwx
> other::---
> default:user::rwx
> default:user:hdfs:rwx
> default:group::rwx
> default:mask::rwx
> default:other::---
> # file: /user/cdrome/hive/acl_test/dt=a/ds=b/000000_0.deflate
> # owner: cdrome
> # group: hdfs
> user::rwx
> user:hdfs:rwx
> group::rwx
> mask::rwx
> other::---
> {noformat}
> Note that the GROUP permission is now erroneously set to {{rwx}} because of 
> the code mentioned above; it is set to the same value as the ACL mask.
> The code changes for the HCatalog APIs is synonymous to the 
> {{applyGroupAndPerms}} method which ensures that all new directories are 
> created with the same permissions as the table. This patch will ensure that 
> changes to intermediate directories will not be propagated, instead the table 
> ACLs will be applied to all new directories created.
> I would also like to call out that the older versions of HDFS which support 
> ACLs had a number issues in addition to those mentioned here which appear to 
> have been addressed in later versions of Hadoop. This patch was originally 
> written to work with a version of Hadoop-2.6, we are now using Hadoop-2.7 
> which appears to have fixed some of them. However, I think that this patch is 
> still required for correct behavior of ACLs with Hive/HCatalog.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to