[ https://issues.apache.org/jira/browse/SPARK-26422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16726526#comment-16726526 ]
ASF GitHub Bot commented on SPARK-26422:
----------------------------------------

asfgit closed pull request #23356: [SPARK-26422][R] Support to disable Hive support in SparkR even for Hadoop versions unsupported by Hive fork
URL: https://github.com/apache/spark/pull/23356

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is reproduced below for the sake of provenance:

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala b/sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala
index becb05cf72aba..e98cab8b56d13 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala
@@ -49,9 +49,17 @@ private[sql] object SQLUtils extends Logging {
       sparkConfigMap: JMap[Object, Object],
       enableHiveSupport: Boolean): SparkSession = {
     val spark =
-      if (SparkSession.hiveClassesArePresent && enableHiveSupport &&
+      if (enableHiveSupport &&
         jsc.sc.conf.get(CATALOG_IMPLEMENTATION.key, "hive").toLowerCase(Locale.ROOT) ==
-          "hive") {
+          "hive" &&
+          // Note that the order of conditions here is intentional.
+          // `SparkSession.hiveClassesArePresent` checks if Hive's `HiveConf` is loadable;
+          // however, `HiveConf` itself has static logic that checks whether the Hadoop
+          // version is supported, throwing an `IllegalArgumentException` if it is not.
+          // If that were checked first, there would be no way to disable Hive support in
+          // the case above. So we check whether Hive classes are loadable only when Hive
+          // support is explicitly enabled, relying on short-circuiting. See SPARK-26422.
+          SparkSession.hiveClassesArePresent) {
         SparkSession.builder().sparkContext(withHiveExternalCatalog(jsc.sc)).getOrCreate()
       } else {
         if (enableHiveSupport) {
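A note on the patch, for readers skimming the diff: the mechanism is ordinary `&&` short-circuit evaluation. The side-effect-free conditions (`enableHiveSupport` and the catalog configuration) run first, and the class probe, which can throw, runs last. Below is a minimal, Spark-independent sketch of the same pattern; `hiveClassesLoadable` and `chooseCatalog` are illustrative names, not Spark APIs:

{code}
import java.util.Locale

object ShortCircuitSketch {

  // Mirrors what SparkSession.hiveClassesArePresent does: probing for
  // HiveConf can itself throw, because Class.forName runs the class's
  // static initializer, and HiveConf's initializer rejects unrecognized
  // Hadoop versions with an IllegalArgumentException.
  def hiveClassesLoadable(): Boolean =
    try {
      Class.forName("org.apache.hadoop.hive.conf.HiveConf")
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  // Mirrors the reordered condition in SQLUtils.getOrCreateSparkSession:
  // `&&` evaluates left to right and stops at the first false operand, so
  // the throwing probe is reached only when Hive support is requested.
  def chooseCatalog(enableHiveSupport: Boolean, catalogImpl: String): String =
    if (enableHiveSupport &&
        catalogImpl.toLowerCase(Locale.ROOT) == "hive" &&
        hiveClassesLoadable()) {
      "hive"
    } else {
      "in-memory"
    }

  def main(args: Array[String]): Unit = {
    // Hive disabled: prints "in-memory" without ever touching HiveConf,
    // which is exactly the behavior the reporter needed.
    println(chooseCatalog(enableHiveSupport = false, catalogImpl = "hive"))
  }
}
{code}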
> Unable to disable Hive support in SparkR when Hadoop version is unsupported
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-26422
>                 URL: https://issues.apache.org/jira/browse/SPARK-26422
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 3.0.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Major
>             Fix For: 2.3.3, 2.4.1, 3.0.0
>
> When we create a Spark session as below:
> {code}
> sparkSession <- sparkR.session("local[4]", "SparkR", Sys.getenv("SPARK_HOME"),
>                                list(spark.driver.extraClassPath = jarpaths,
>                                     spark.executor.extraClassPath = jarpaths),
>                                enableHiveSupport = FALSE)
> {code}
> I found that Hive support could not be disabled explicitly; the call failed with the error below:
> {code}
> java.lang.reflect.InvocationTargetException
> ...
> Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.1.3.1.0.0-78
>     at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
>     at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
>     at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
>     at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
>     ... 43 more
> Error in handleErrors(returnStatus, conn) :
>   java.lang.ExceptionInInitializerError
>     at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:348)
>     at org.apache.spark.util.Utils$.classForName(Utils.scala:193)
>     at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:1116)
>     at org.apache.spark.sql.api.r.SQLUtils$.getOrCreateSparkSession(SQLUtils.scala:52)
>     at org.apache.spark.sql.api.r.SQLUtils.getOrCreateSparkSession(SQLUtils.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:167)
>     at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:108)
>     ...
> {code}
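One detail of the stack trace is worth spelling out: the `IllegalArgumentException` is raised inside `HiveConf`'s static initializer, so the JVM surfaces it as `java.lang.ExceptionInInitializerError` at the first attempt to load the class, before any Hive API is called, and the trace shows it propagating straight out of `hiveClassesArePresent`; hence the condition reordering in the patch above. Below is a minimal, Hive-independent sketch of that JVM behavior; `BadInit` is a hypothetical object standing in for `HiveConf` on an unsupported Hadoop version:

{code}
object InitFailureSketch {

  def main(args: Array[String]): Unit = {
    try {
      // The first access to BadInit runs its initializer, which throws.
      BadInit.toString
    } catch {
      case e: ExceptionInInitializerError =>
        // e.getCause is the original IllegalArgumentException, mirroring
        // the "Unrecognized Hadoop major version number" failure above.
        println(s"initializer failed: ${e.getCause}")
    }
  }
}

// Stand-in for HiveConf on an unsupported Hadoop version: its initializer
// always throws, so the object can never be loaded successfully.
object BadInit {
  throw new IllegalArgumentException(
    "Unrecognized Hadoop major version number: 9.9.9")
}
{code}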