Re: SparkSQL errors in 1.4 rc when using with Hive 0.12 metastore

2015-05-24 Thread Mark Hamstra
This discussion belongs on the dev list.  Please post any replies there.

On Sat, May 23, 2015 at 10:19 PM, Cheolsoo Park piaozhe...@gmail.com
wrote:

 Hi,

 I've been testing SparkSQL in the 1.4 rc and have found two issues. I wanted to
 confirm whether these are bugs before opening a JIRA.

 *1)* I can no longer compile SparkSQL with -Phive-0.12.0. I noticed that 1.4
 introduces IsolatedClientLoader, so different versions of the Hive metastore
 jars can be loaded at runtime; however, SparkSQL itself no longer compiles
 against Hive 0.12.0.

 My question is, is this intended? If so, shouldn't the hive-0.12.0 profile
 in the POM be removed?
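
 For reference, a rough sketch of the two builds being compared (only the Hive
 profile flag differs; the -Phive and -Phive-thriftserver profiles and the rest
 of the command are the usual Spark Maven invocation and may vary by environment):

 # fails to compile in the 1.4 rc
 mvn -Phive -Phive-0.12.0 -Phive-thriftserver -DskipTests clean package

 # compiles, building SparkSQL against Hive 0.13.1
 mvn -Phive -Phive-0.13.1 -Phive-thriftserver -DskipTests clean package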

 *2)* After compiling SparkSQL with -Phive-0.13.1, I ran into my second
 problem. Since I have a Hive 0.12 metastore in production, I have to use it
 for now. But even when I set spark.sql.hive.metastore.version and
 spark.sql.hive.metastore.jars, the SparkSQL CLI throws the following error-

 15/05/24 05:03:29 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
 org.apache.thrift.TApplicationException: Invalid method name: 'get_functions'
     at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_functions(ThriftHiveMetastore.java:2886)
     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_functions(ThriftHiveMetastore.java:2872)
     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunctions(HiveMetaStoreClient.java:1727)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
     at com.sun.proxy.$Proxy12.getFunctions(Unknown Source)
     at org.apache.hadoop.hive.ql.metadata.Hive.getFunctions(Hive.java:2670)
     at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:674)
     at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:662)
     at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(CliDriver.java:540)
     at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:175)
     at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)

 What's happening is that when the SparkSQL CLI starts up, it tries to fetch
 permanent UDFs from the Hive metastore (due to HIVE-6330
 https://issues.apache.org/jira/browse/HIVE-6330, which was introduced
 in Hive 0.13). It ends up invoking a Thrift method, get_functions, that
 doesn't exist in Hive 0.12. To work around this error, I have to comment out
 the following line of code for now-
 https://goo.gl/wcfnH1

 My question is, is SparkSQL compiled against Hive 0.13 supposed to work with
 a Hive 0.12 metastore (by setting spark.sql.hive.metastore.version and
 spark.sql.hive.metastore.jars)? It only works if I comment out the above line
 of code.
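
 (For context, a sketch of roughly how I'm passing those settings to the CLI;
 the classpath entries are placeholders for the actual Hive 0.12 and Hadoop jars:)

 bin/spark-sql \
   --conf spark.sql.hive.metastore.version=0.12.0 \
   --conf spark.sql.hive.metastore.jars=/path/to/hive-0.12.0/lib/*:/path/to/hadoop/lib/*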

 Thanks,
 Cheolsoo



RE: SparkSQL errors in 1.4 rc when using with Hive 0.12 metastore

2015-05-24 Thread Cheng, Hao
Thanks for reporting this.

We intend to support multiple metastore versions in a single build (hive-0.13.1)
by introducing IsolatedClientLoader, but you're probably hitting a bug. Please
file a JIRA issue for this.

I will keep investigating this as well.

Hao



Re: SparkSQL errors in 1.4 rc when using with Hive 0.12 metastore

2015-05-24 Thread Cheolsoo Park
Thank you, Hao, for the confirmation!

I filed two JIRAs as follows-
https://issues.apache.org/jira/browse/SPARK-7850 (remove the hive-0.12.0
profile from the POM)
https://issues.apache.org/jira/browse/SPARK-7851 (Thrift error with Hive 0.12
metastore)

