Using Spark, SparkR and Ranger, please help.

2016-01-20 Thread Julien Carme
Hello,

I have been able to use Spark with Apache Ranger. I added the right
configuration files to the Spark conf directory, added the Ranger jars to the
classpath, and it works: Spark complies with the Ranger rules when I access
Hive tables.
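
For reference, my launch command looks roughly like the following (the jar and
config paths are placeholders standing in for my actual install locations):

```shell
# Hypothetical launch command: the Ranger Hive plugin jars go on the
# driver/executor classpath, and hive-site.xml plus the ranger-*.xml
# configuration files sit in $SPARK_HOME/conf.
spark-shell \
  --jars /opt/ranger/lib/ranger-hive-plugin.jar,/opt/ranger/lib/ranger-plugins-common.jar \
  --driver-class-path "/opt/ranger/lib/*" \
  --files /etc/hive/conf/hive-site.xml
```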

However, with SparkR it does not work, which is rather surprising
considering SparkR is supposed to be just a layer over Spark. I don't
understand why SparkR seems to behave differently; maybe I am just missing
something.

So in Spark, when I do:

sqlContext.sql("show databases").collect()

it works, I get all my Hive databases.

But SparkR does not behave the same way. When I do:

sql(sqlContext,"show databases")

...
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
  java.lang.RuntimeException: [1.1] failure: ``with'' expected but
identifier show found
...

From the documentation it seems that I need to instantiate a HiveContext:

hiveContext <- sparkRHive.init(sc)
sql(hiveContext, "show databases")

...
16/01/20 18:37:20 ERROR RBackendHandler: sql on 2 failed
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
  java.lang.AssertionError: Authorization plugins not initialized!
at org.apache.hadoop.hive.ql.session.SessionState.getAuthorizationMode(SessionState.java:1511)
at org.apache.hadoop.hive.ql.session.SessionState.isAuthorizationModeV2(SessionState.java:1515)
at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:566)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:484)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:473)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.appl
...
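
In case it matters, I launch SparkR with what I believe is the same
classpath setup as my working spark-shell session (the paths below are
placeholders for my actual install locations):

```shell
# Hypothetical SparkR launch: same Ranger plugin jars on the classpath,
# same Hive/Ranger configuration files in $SPARK_HOME/conf.
sparkR \
  --jars /opt/ranger/lib/ranger-hive-plugin.jar,/opt/ranger/lib/ranger-plugins-common.jar \
  --driver-class-path "/opt/ranger/lib/*"
```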

Any help would be appreciated.

Regards,

Julien


Re: Using Spark, SparkR and Ranger, please help.

2016-01-20 Thread Ted Yu
The tail of the stack trace seems to be chopped off.

Can you include the whole trace?

Which versions of Spark / Hive / Ranger are you using?

Cheers

On Wed, Jan 20, 2016 at 9:42 AM, Julien Carme wrote:
