[jira] [Updated] (SPARK-8427) Incorrect ACL checking for partitioned table in Spark SQL-1.4
[ https://issues.apache.org/jira/browse/SPARK-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-8427:
-----------------------------------
    Priority: Critical  (was: Blocker)

> Incorrect ACL checking for partitioned table in Spark SQL-1.4
> --------------------------------------------------------------
>
>                 Key: SPARK-8427
>                 URL: https://issues.apache.org/jira/browse/SPARK-8427
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0
>         Environment: CentOS 6 & OS X 10.9.5, Hive-0.13.1, Spark-1.4, Hadoop 2.6.0
>            Reporter: Karthik Subramanian
>            Priority: Critical
>              Labels: security
>
> Problem Statement:
> When querying a partitioned table with Spark SQL (version 1.4.0), an access-denied exception is raised for partitions the user does not belong to (user permissions are controlled with HDFS ACLs). The same query works correctly in Hive.
> Use case: multitenancy.
> Consider a table containing multiple customers, each customer with multiple facilities. The table is partitioned by customer and facility. A user belonging to one facility must not have access to other facilities; this is enforced with HDFS ACLs on the corresponding directories. When 'user1', who belongs to 'customer1' and 'facility1', queries only that partition (via the WHERE clause), only the corresponding directory's access should be checked, not the entire table.
> The above use case works as expected with the Hive client, versions 0.13.1 and 1.1.0.
> The query used: select count(*) from customertable where customer='customer1' and facility='facility1'
> Below is the exception received in spark-shell:
> org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=READ_EXECUTE, inode="/data/customertable/customer=customer2/facility=facility2":root:supergroup:drwxrwx---:group::r-x,group:facility2:rwx
>   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkAccessAcl(FSPermissionChecker.java:351)
>   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:253)
>   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:185)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6512)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6494)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6419)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4954)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4915)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:826)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:612)
>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1971)
>   at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1952)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:693)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
>   at org.apache.hadoop.hdfs.DistributedFileS
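For context, a minimal spark-shell (Scala) sketch of the reported setup and query. The table name, partition columns and values, and the /data/customertable path are taken from the report above; the ACL command in the comment and the exact session details are assumptions about how the environment was configured, not something stated in the ticket.

    // spark-shell started as user1 (a member of facility1 only), Spark 1.4.x built with Hive support.
    // Assumed per-partition ACLs applied beforehand, e.g.:
    //   hdfs dfs -setfacl -m group:facility1:rwx /data/customertable/customer=customer1/facility=facility1
    // so that user1 can read only its own partition directory.
    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)

    // The WHERE clause restricts the query to customer1/facility1, so only that
    // partition directory should need to be readable by user1. The reported bug
    // is that Spark SQL also lists the other partition directories (e.g.
    // customer=customer2/facility=facility2) and fails with AccessControlException.
    val result = hiveContext.sql(
      "SELECT count(*) FROM customertable " +
      "WHERE customer = 'customer1' AND facility = 'facility1'")

    result.show()

The behaviour described as expected in the report (and observed with the Hive client) is that only the customer1/facility1 directory is accessed, so the count returns without touching partitions user1 cannot read.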
[jira] [Updated] (SPARK-8427) Incorrect ACL checking for partitioned table in Spark SQL-1.4
[ https://issues.apache.org/jira/browse/SPARK-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Subramanian updated SPARK-8427:
---------------------------------------
    Environment: CentOS 6 & OS X 10.9.5, Hive-0.13.1, Spark-1.4, Hadoop 2.6.0  (was: CentOS 6, Hive-0.13.1, Spark-1.4, Hadoop 2.6.0)