[jira] [Comment Edited] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

2018-05-03 Thread thua...@fsoft.com.vn (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463003#comment-16463003
 ] 

thua...@fsoft.com.vn edited comment on SPARK-13446 at 5/3/18 11:10 PM:
---

Hi Tavis and Spark Team,

Tavis's explanation is very clear. As I read in the other thread, this bug is
fixed and fully tested by the QA team. Could someone please help look into
merging the code?

I am running into the same problem in pyspark. Below is my pyspark-submit
stack trace:

{noformat}
py4j.protocol.Py4JJavaError: An error occurred while calling o24.sessionState.
: java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
	at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200)
	at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265)
	at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
	at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194)
	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
	at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193)
	at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
	at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
	at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
	at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)
	at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
	at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
	at scala.Option.getOrElse(Option.scala:121)
{noformat}
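The `NoSuchFieldError` above typically appears when Spark's built-in Hive 1.2.1 client meets Hive 2.x jars or a Hive 2.x metastore. A minimal sketch of the workaround, assuming Spark 2.2+ (where this fix landed): point Spark at the matching metastore client via `spark.sql.hive.metastore.version` and `spark.sql.hive.metastore.jars`. The helper below only renders `--conf` flags for spark-submit; the exact version string and jar source must match your cluster.

```python
# Sketch: configuration that tells Spark 2.2+ to load a Hive 2.x metastore
# client instead of the built-in 1.2.1 one that still references
# HIVE_STATS_JDBC_TIMEOUT. Version/"maven" values are assumptions to adapt.
metastore_conf = {
    "spark.sql.hive.metastore.version": "2.0.0",
    # "maven" downloads matching Hive jars; a local classpath string works too.
    "spark.sql.hive.metastore.jars": "maven",
}

def submit_args(conf):
    """Render the settings as spark-submit --conf flags."""
    return " ".join(f"--conf {k}={v}" for k, v in sorted(conf.items()))

print(submit_args(metastore_conf))
```

The same keys can equally be set in spark-defaults.conf or on `SparkSession.builder.config(...)` before the session is created.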


was (Author: thua...@fsoft.com.vn):
I am running into the same problem in pyspark. Could anyone help resolve this
issue, please? Below is my pyspark-submit stack trace:

{noformat}
py4j.protocol.Py4JJavaError: An error occurred while calling o24.sessionState.
: java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
	at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200)
	at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265)
	at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
	at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194)
	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
	at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193)
	at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
	at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
	at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
	at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
	at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)
	at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
	at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
	at scala.Option.getOrElse(Option.scala:121)
{noformat}

> Spark need to support reading data from Hive 2.0.0 metastore
> 
>
> Key: SPARK-13446
> URL: https://issues.apache.org/jira/browse/SPARK-13446
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Lifeng Wang
>Assignee: Xiao Li
>Priority: Major
> Fix For: 2.2.0
>
>
> Spark provides the HiveContext class to read data from the Hive metastore 
> directly, but it only supports Hive 1.2.1 and older. Since Hive 2.0.0 has 
> been released, it would be better to upgrade and support Hive 2.0.0 as well.
> {noformat}
> 16/02/23 02:35:02 INFO metastore: Trying to connect to metastore with URI 
> thrift://hsw-node13:9083
> 16/02/23 02:35:02 INFO metastore: Opened a connection to metastore, current 
> connections: 1
> 16/02/23 02:35:02 INFO metastore: Connected to metastore.
> Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
> at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:473)
> at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:192)
> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
> at org.apache.spark.sql.hive.HiveContext$$anon$1.<init>(HiveContext.scala:422)
> at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:422)
> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:421)
> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:72)
> at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739)
> at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735)
> {noformat}
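For readers wondering why the failure is `NoSuchFieldError` rather than a compile error: the field `HIVE_STATS_JDBC_TIMEOUT` exists in Hive 1.x's `HiveConf.ConfVars` but was removed in Hive 2.0, and the JVM only resolves the field reference at run time, when the mismatched jars are on the classpath. A small illustration of that mechanism (not Spark code; the class names below are made up, and Python's `AttributeError` stands in for the JVM's `NoSuchFieldError`):

```python
# Illustration: code written against one version of a constants class fails
# only at field-access time when a later version, with the field removed,
# is what's actually loaded. Class names here are hypothetical stand-ins.
class HiveConfVarsV1:
    """Stand-in for Hive 1.x ConfVars, which still defines the field."""
    HIVE_STATS_JDBC_TIMEOUT = 30

class HiveConfVarsV2:
    """Stand-in for Hive 2.x ConfVars, where the field was removed."""

def read_timeout(conf_vars):
    # Spark's HiveUtils performs the JVM equivalent of this lookup.
    return conf_vars.HIVE_STATS_JDBC_TIMEOUT

print(read_timeout(HiveConfVarsV1))   # old jars: the lookup succeeds
try:
    read_timeout(HiveConfVarsV2)      # new jars: fails at access time
except AttributeError as exc:
    print("lookup failed:", exc)
```

This is why the fix is a version-matched metastore client rather than a code change at the call site.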



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org