[jira] [Comment Edited] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463003#comment-16463003 ] thua...@fsoft.com.vn edited comment on SPARK-13446 at 5/3/18 11:10 PM: --- Hi Tavis and Spark Team, Tavis explanation is very clear. As I read in the other thread, this bug is fixed and fully tested by QA team. Could someone please help to look into the merging the code. I am running into the same problem in pyspark. Bellow my pyspark submit stack trace: py4j.protocol.Py4JJavaError: An error occurred while calling o24.sessionState. : java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265) at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66) at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193) at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105) at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93) at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35) at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289) at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050) at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130) at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130) at scala.Option.getOrElse(Option.scala:121) was (Author: thua...@fsoft.com.vn): I am running into the same problem in pyspark. Could anyone help resolve this issue, please? Bellow my pyspark submit stack trace: py4j.protocol.Py4JJavaError: An error occurred while calling o24.sessionState. : java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265) at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66) at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193) at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105) at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93) at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35) at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289) at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050) at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130) at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130) at scala.Option.getOrElse(Option.scala:121) > Spark need to support reading data from Hive 2.0.0 metastore >
[jira] [Commented] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463003#comment-16463003 ] thua...@fsoft.com.vn commented on SPARK-13446: -- I am running into the same problem in pyspark. Could anyone help resolve this issue, please? Bellow my pyspark submit stack trace: py4j.protocol.Py4JJavaError: An error occurred while calling o24.sessionState. : java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265) at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66) at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193) at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105) at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93) at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52) at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35) at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289) at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050) at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130) at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130) at scala.Option.getOrElse(Option.scala:121) > Spark need to support reading data from Hive 2.0.0 metastore > > > Key: SPARK-13446 > URL: https://issues.apache.org/jira/browse/SPARK-13446 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 1.6.0 >Reporter: Lifeng Wang >Assignee: Xiao Li >Priority: Major > Fix For: 2.2.0 > > > Spark provided HIveContext class to read data from hive metastore directly. > While it only supports hive 1.2.1 version and older. Since hive 2.0.0 has > released, it's better to upgrade to support Hive 2.0.0. > {noformat} > 16/02/23 02:35:02 INFO metastore: Trying to connect to metastore with URI > thrift://hsw-node13:9083 > 16/02/23 02:35:02 INFO metastore: Opened a connection to metastore, current > connections: 1 > 16/02/23 02:35:02 INFO metastore: Connected to metastore. > Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT > at > org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:473) > at > org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:192) > at > org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185) > at > org.apache.spark.sql.hive.HiveContext$$anon$1.(HiveContext.scala:422) > at > org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:422) > at > org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:421) > at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:72) > at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739) > at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org