[jira] [Updated] (SPARK-8427) Incorrect ACL checking for partitioned table in Spark SQL-1.4

2015-06-18 Thread Patrick Wendell (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-8427:
---
Priority: Critical  (was: Blocker)

> Incorrect ACL checking for partitioned table in Spark SQL-1.4
> -------------------------------------------------------------
>
> Key: SPARK-8427
> URL: https://issues.apache.org/jira/browse/SPARK-8427
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.0
> Environment: CentOS 6 & OS X 10.9.5, Hive-0.13.1, Spark-1.4, Hadoop 
> 2.6.0
>Reporter: Karthik Subramanian
>Priority: Critical
>  Labels: security
>
> Problem Statement:
> When querying a partitioned table with Spark SQL (version 1.4.0), an access
> denied exception is thrown for partitions the user does not belong to (user
> permissions are controlled using HDFS ACLs). The same query works correctly
> in Hive.
> Use case: addressing multitenancy
> Consider a table containing multiple customers, each customer with multiple
> facilities. The table is partitioned by customer and facility. A user
> belonging to one facility must not have access to other facilities. This is
> enforced with HDFS ACLs on the corresponding directories. When querying the
> table as 'user1', who belongs to 'customer1' and 'facility1', with a 'where'
> clause selecting that particular partition, only access to the corresponding
> directory should be verified, not access to the entire table.
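> As an illustration, a minimal sketch of how such per-partition ACLs could be
> granted through the Hadoop FileSystem API; the path and group name mirror the
> directory layout visible in the stack trace below and are otherwise
> hypothetical:
>
>   import org.apache.hadoop.conf.Configuration
>   import org.apache.hadoop.fs.{FileSystem, Path}
>   import org.apache.hadoop.fs.permission.{AclEntry, AclEntryScope, AclEntryType, FsAction}
>   import scala.collection.JavaConverters._
>
>   val fs = FileSystem.get(new Configuration())
>   // Grant group 'facility1' read+execute on its own partition directory only;
>   // sibling partitions (e.g. facility=facility2) carry no entry for this group.
>   val entry = new AclEntry.Builder()
>     .setScope(AclEntryScope.ACCESS)
>     .setType(AclEntryType.GROUP)
>     .setName("facility1")
>     .setPermission(FsAction.READ_EXECUTE)
>     .build()
>   fs.modifyAclEntries(new Path("/data/customertable/customer=customer1/facility=facility1"), Seq(entry).asJava)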
> The above use case works as expected with the Hive client, versions 0.13.1
> and 1.1.0.
> The query used: select count(*) from customertable where customer='customer1'
> and facility='facility1'
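> A minimal spark-shell reproduction, assuming the table is registered in the
> Hive metastore (sqlContext is the HiveContext that spark-shell provides in
> Spark 1.4):
>
>   // Run as 'user1'. The filter selects only customer1/facility1, so only
>   // that directory's permissions should need to be checked.
>   val counts = sqlContext.sql(
>     "select count(*) from customertable where customer='customer1' and facility='facility1'")
>   counts.show()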
> Below is the exception received in spark-shell; note that the denied path is
> a partition (customer=customer2/facility=facility2) that the query does not
> select:
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=user1, access=READ_EXECUTE, 
> inode="/data/customertable/customer=customer2/facility=facility2":root:supergroup:drwxrwx---:group::r-x,group:facility2:rwx
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkAccessAcl(FSPermissionChecker.java:351)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:253)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6512)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6494)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6419)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4954)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4915)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:826)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:612)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1971)
>   at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1952)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:693)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
>   at 
> org.apache.hadoop.hdfs.DistributedFileS

[jira] [Updated] (SPARK-8427) Incorrect ACL checking for partitioned table in Spark SQL-1.4

2015-06-17 Thread Karthik Subramanian (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Subramanian updated SPARK-8427:
---
Environment: CentOS 6 & OS X 10.9.5, Hive-0.13.1, Spark-1.4, Hadoop 2.6.0  
(was: CentOS 6, Hive-0.13.1, Spark-1.4, Hadoop 2.6.0)
