[ https://issues.apache.org/jira/browse/SPARK-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061453#comment-15061453 ]
Vijay Singh commented on SPARK-9042:
------------------------------------

Hi Charmee,

You can invoke spark-shell or spark-submit in the following fashion to gain access to HiveContext functionality. Here is an example for spark-shell:

{code}
HADOOP_CONF_DIR=/etc/hive/conf spark-shell \
  --master yarn-client \
  --driver-class-path '/opt/cloudera/parcels/CDH/lib/hive/lib/*' \
  --driver-java-options '-Dspark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/*'
{code}

Additionally, if metastore access is restricted, the group of the service or user account can be granted access to the metastore as follows:

# Go to Cloudera Manager > Hive > Configuration > Service-Wide > Proxy > Hive Metastore Access Control and Proxy User Groups Override
# Add the group name for {color:red}all service accounts and users that require Hive metastore access{color}, in addition to the hive and hue users.
# Restart the Hive Metastore Server for the changes to take effect.

> Spark SQL incompatibility if security is enforced on the Hive warehouse
> -----------------------------------------------------------------------
>
>                 Key: SPARK-9042
>                 URL: https://issues.apache.org/jira/browse/SPARK-9042
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>            Reporter: Nitin Kak
>
> Hive queries executed from Spark using HiveContext use CLI to create the
> query plan and then access the Hive table directories (under
> /user/hive/warehouse/) directly. This gives an AccessControlException if
> Apache Sentry is installed:
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=kakn, access=READ_EXECUTE,
> inode="/user/hive/warehouse/mastering.db/sample_table":hive:hive:drwxrwx--t
> With Apache Sentry, only the "hive" user (created only for Sentry) has
> permission to access the Hive warehouse directory.
> After Sentry installation, all queries are directed to HiveServer2, which
> changes the invoking user to "hive" and then accesses the Hive warehouse
> directory. However, HiveContext does not execute the query through
> HiveServer2, which leads to the issue. Here is an example of executing a
> Hive query through HiveContext:
> val hqlContext = new HiveContext(sc) // Create context to run Hive queries
> val pairRDD = hqlContext.sql(hql) // where hql is the string with hive query
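
The comment above gives a spark-shell example only. A spark-submit counterpart might look roughly like the sketch below. It is not taken from the original comment: the application class and jar names are hypothetical placeholders, and it passes the executor classpath with `--conf` instead of the `--driver-java-options` system property used above, which should have a similar effect but has not been verified on a Sentry-secured cluster. By default the script only prints the command; set RUN=1 to launch it.

```shell
#!/bin/sh
# Sketch: spark-submit counterpart of the spark-shell invocation in the comment.
# HIVE_LIB is the Hive library path from the comment; APP_CLASS and APP_JAR
# are hypothetical placeholders for your own application.
HIVE_LIB='/opt/cloudera/parcels/CDH/lib/hive/lib/*'
APP_CLASS='com.example.HiveQueryJob'   # hypothetical
APP_JAR='hive-query-job.jar'           # hypothetical

# Collect the arguments as positional parameters so the '*' stays quoted
# and is handed to Spark unexpanded.
set -- \
  --master yarn-client \
  --driver-class-path "$HIVE_LIB" \
  --conf "spark.executor.extraClassPath=$HIVE_LIB" \
  --class "$APP_CLASS" \
  "$APP_JAR"

if [ "${RUN:-0}" = "1" ]; then
  # Launch for real, pointing Spark at the Hive client configuration.
  HADOOP_CONF_DIR=/etc/hive/conf spark-submit "$@"
else
  # Dry run: print an unquoted preview of the command for review.
  echo "HADOOP_CONF_DIR=/etc/hive/conf spark-submit $*"
fi
```

Keeping the classpath in one variable means the driver and executor settings cannot drift apart if the parcel location differs on your cluster.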