[ https://issues.apache.org/jira/browse/SPARK-30643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024062#comment-17024062 ]

Dongjoon Hyun commented on SPARK-30643:
---------------------------------------

It sounds like a misunderstanding of the role of the embedded Hive. It's only 
used to talk to the Hive metastore.
> But if I chose to run Hive 3 and Spark with embedded Hive 2.3, then SparkSQL 
> and Hive queries behavior could differ in some cases.

Everything (SQL parser/analyzer/optimizer and the execution engine) is Spark's 
own code, so in general the embedded Hive 1.2/2.3 doesn't make a difference. 
The exceptional cases might be Hive bugs. For example, Spark 3.0.0 will ship 
with both Hive 1.2 and Hive 2.3 (the default), and all UTs pass in both 
environments with the same results.
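For reference, this separation is visible in Spark's own configuration: the metastore client version can be set independently of the embedded execution Hive. A minimal sketch (the version number and jar path are illustrative, not a recommendation):

```properties
# spark-defaults.conf (illustrative): point the metastore client at a
# Hive 3.x deployment while Spark keeps using its embedded Hive classes
# for execution.
spark.sql.hive.metastore.version   3.1.2
spark.sql.hive.metastore.jars      /path/to/hive-3.1.2/lib/*
```

With this setup, only the metastore handshake uses the external Hive jars; parsing, analysis, optimization, and execution stay entirely in Spark.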

I don't think Apache Spark needs to support Hive 1.2, Hive 2.3, and Hive 3.1 
in the Apache Spark 3.x era. Adding 2.3 took so much effort from the Apache 
Spark community that it couldn't happen in Apache Spark 2.x. Maybe we can 
consider that for Apache Spark 4.0 if many users are running Hive 3.x stably 
in production (not beta).
> I think that majority of reasons that went into support of embedding Hive 2.3 
> will apply to support of embedding Hive 3.


> Add support for embedding Hive 3
> --------------------------------
>
>                 Key: SPARK-30643
>                 URL: https://issues.apache.org/jira/browse/SPARK-30643
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Igor Dvorzhak
>            Priority: Major
>
> Currently Spark can be compiled only against Hive 1.2.1 and Hive 2.3; 
> compilation fails against Hive 3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
