[jira] [Commented] (SPARK-18112) Spark2.x does not support read data from Hive 2.x metastore

Hyukjin Kwon (JIRA) Thu, 27 Sep 2018 07:47:11 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16630542#comment-16630542
 ]


Hyukjin Kwon commented on SPARK-18112:
--------------------------------------

To be more specific, the code you pointed out is executed by Spark's Hive 
fork (1.2.1), which contains that configuration 
(https://github.com/apache/hive/blob/branch-1.2/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L1290).

That code is meant to run against Spark's Hive fork, so you should leave the 
built-in jar as is. The newer Hive jars used to create the Hive client should 
instead be provided via {{spark.sql.hive.metastore.jars}}, with 
{{spark.sql.hive.metastore.version}} set accordingly.
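
As a rough illustration of the two settings above, here is a minimal sketch 
that only builds the {{--conf}} arguments for {{spark-submit}}. The helper 
names ({{metastore_conf}}, {{to_submit_args}}) are hypothetical, and the 
values {{2.3.0}} and {{maven}} are assumptions for the example; 
{{spark.sql.hive.metastore.jars}} can also be a classpath pointing at the 
Hive jars you want the client to use.

```python
# Sketch only: helper names are hypothetical, not part of Spark.
# The two configuration keys are the ones named in the comment above.

def metastore_conf(version, jars="maven"):
    """Return the Spark conf entries that select the Hive metastore client.

    version: Hive metastore version to talk to (e.g. "2.3.0", an assumed value).
    jars:    "builtin", "maven", or a classpath containing the Hive client jars.
    """
    return {
        "spark.sql.hive.metastore.version": version,
        "spark.sql.hive.metastore.jars": jars,
    }


def to_submit_args(conf):
    """Turn a conf dict into a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Spark's own built-in Hive jars stay untouched; only the client is redirected.
    print(" ".join(["spark-submit"] + to_submit_args(metastore_conf("2.3.0"))))
```

The point of the sketch is that the metastore client is selected through 
configuration, while the jars shipped with Spark are left in place.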

The problem here seems to be that you completely replaced Spark's Hive jars 
with newer Hive jars, which presumably no longer define the field that this 
code expects. That is why it throws {{NoSuchFieldError}}.

A few months ago I manually tested 1.2.1, 2.3.0 and 3.0.0 (against 
https://github.com/apache/spark/pull/21404) with Apache Spark. I am pretty 
sure that it works for now.

If I am mistaken or have misunderstood something, please provide steps to 
reproduce, or at least explain why it fails, and I will take a look.

> Spark2.x does not support read data from Hive 2.x metastore
> -----------------------------------------------------------
>
>                 Key: SPARK-18112
>                 URL: https://issues.apache.org/jira/browse/SPARK-18112
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1
>            Reporter: KaiXu
>            Assignee: Xiao Li
>            Priority: Critical
>             Fix For: 2.2.0
>
>
> Hive 2.0 was released in February 2016, and Hive 2.0.1 and Hive 2.1.0 have 
> been available for a long time since then, but Spark still only supports 
> reading Hive metastore data from Hive 1.2.1 and older versions. Since 
> Hive 2.x has many bug fixes and performance improvements, it would be better, 
> and is urgent, to upgrade Spark to support Hive 2.x.
> Loading data from a Hive 2.x metastore fails:
> Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
>         at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:197)
>         at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
>         at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
>         at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
>         at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:4
>         at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
>         at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
>         at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
>         at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:31)
>         at org.apache.spark.sql.SparkSession.table(SparkSession.scala:568)
>         at org.apache.spark.sql.SparkSession.table(SparkSession.scala:564)



