[ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714245#comment-16714245 ]

ASF GitHub Bot commented on SPARK-11248:
----------------------------------------

GitHub user VicoWu opened a pull request:

    https://github.com/apache/hive/pull/504

    fix the UGI problem when reading ORC files

    As mentioned in SPARK-11248, the Spark Thrift server has a security bug:
    user A sometimes ends up with the permissions of user B, and user B
    sometimes ends up with the permissions of user A.

    I debugged it and found that it is caused by the Hive 1.2.1 library, in
    OrcInputFormat.java, which creates a thread pool to communicate with
    remote HDFS. Because the threads in the pool are reused and shared, when
    thread-1-pool-1 was previously used by user A and is later assigned to
    user B, user B gets the security context of user A.

    I fixed this bug by adding UserGroupInformation to this pool, so that
    when a thread is assigned to a user, the security context is switched to
    that user at the same time.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/VicoWu/hive hotfix-ugi-problem-for-thrift

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/504.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #504

----
commit e086a98393ab2d68d4750dff4aa07a991a030e6d
Author: Chang Wu <chang.wu@...>
Date:   2018-12-10T03:20:26Z

    fix the UGI problem when reading ORC files

----


> Spark hivethriftserver is using the wrong user to while getting HDFS permissions
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-11248
>                 URL: https://issues.apache.org/jira/browse/SPARK-11248
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0, 1.5.1, 2.1.1, 2.2.0
>            Reporter: Trystan Leftwich
>            Priority: Major
>
> When running Spark as a hivethrift-server via YARN, Spark uses the user
> running the Hivethrift server, rather than the user connecting via JDBC, to
> check HDFS perms.
> i.e.
> In HDFS the perms are
> rwx------   3 testuser testuser /user/testuser/table/testtable
> And I connect via beeline as user testuser
> beeline -u 'jdbc:hive2://localhost:10511' -n 'testuser' -p ''
> If I try to hit that table
> select count(*) from test_table;
> I get the following error
> Error: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table test_table. java.security.AccessControlException: Permission denied: user=hive, access=READ, inode="/user/testuser/table/testtable":testuser:testuser:drwxr-x--x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:185)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6795)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6777)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6702)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:9529)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:1516)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1433)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) (state=,code=0)
> I have the following set in hive-site.xml, so it should be using the
> correct user.
> <property>
>   <name>hive.server2.enable.doAs</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.metastore.execute.setugi</name>
>   <value>true</value>
> </property>
>
> This works correctly in hive.
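
For readers who want to see the thread-pool problem from the pull request description in isolation, here is a minimal sketch. It is not the code from the patch and it does not use Hadoop: an InheritableThreadLocal stands in for the security context, on the assumption that a pooled worker keeps whatever identity it picked up when it was created. The names UgiPoolSketch, CURRENT_USER, asSubmitter and observedUser are hypothetical. With Hadoop on the classpath, the analogous step would presumably be to capture UserGroupInformation.getCurrentUser() on the submitting thread and run the task inside that object's doAs(...).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class UgiPoolSketch {
    // Stand-in for the security context (assumption: NOT Hadoop's UGI).
    // A new thread inherits the value held by the thread that creates it.
    static final InheritableThreadLocal<String> CURRENT_USER =
            new InheritableThreadLocal<>();

    // Wraps a task so that it runs as the user who submitted it,
    // regardless of which identity the pooled worker thread carries.
    static <T> Callable<T> asSubmitter(Callable<T> task) {
        final String submitter = CURRENT_USER.get(); // captured on the caller's thread
        return () -> {
            String previous = CURRENT_USER.get();
            CURRENT_USER.set(submitter);             // switch identity on the worker
            try {
                return task.call();
            } finally {
                CURRENT_USER.set(previous);          // restore the worker's old identity
            }
        };
    }

    // Submits a task on behalf of `caller` and returns the user the task
    // actually saw while running on the pool thread.
    static String observedUser(ExecutorService pool, String caller, boolean wrap)
            throws Exception {
        CURRENT_USER.set(caller);
        Callable<String> task = CURRENT_USER::get;
        Callable<String> toRun = wrap ? asSubmitter(task) : task;
        return pool.submit(toRun).get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // User A's request creates the single worker thread.
            System.out.println("A, unwrapped: " + observedUser(pool, "userA", false));
            // User B reuses that worker and sees the stale identity.
            System.out.println("B, unwrapped: " + observedUser(pool, "userB", false));
            // With the wrapper, user B's task runs under user B's identity.
            System.out.println("B, wrapped:   " + observedUser(pool, "userB", true));
        } finally {
            pool.shutdown();
        }
    }
}
```

The wrapper captures the identity at submission time rather than inside the task, because by the time the task runs it is already on the shared worker thread and can no longer tell who submitted it.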