[ https://issues.apache.org/jira/browse/SPARK-26937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772650#comment-16772650 ]
Dongjoon Hyun commented on SPARK-26937:
---------------------------------------

Thank you for reporting, [~xujiang]. BTW, please see the PR description of SPARK-23807 for the workaround:
- https://github.com/apache/spark/pull/20923

There have been many efforts toward Hadoop 3 support, but the key blocker is SPARK-23534, so I am closing this as a duplicate of SPARK-23534.

> Building Spark 2.4 with Hadoop 3.1 support fails
> ------------------------------------------------
>
>                 Key: SPARK-26937
>                 URL: https://issues.apache.org/jira/browse/SPARK-26937
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 2.4.0
>         Environment: My environment information is as follows:
> Operating System:
> {code:java}
> CentOS Linux release 7.4.1708 (Core)
> Linux e26cf748c48f 4.9.87-linuxkit-aufs #1 SMP Fri Mar 16 18:16:33 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> {code}
> Maven:
> {code:java}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z)
> Maven home: /usr/local/maven
> Java version: 1.8.0_151, vendor: Oracle Corporation, runtime: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-1.b12.el7_4.x86_64/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "4.9.87-linuxkit-aufs", arch: "amd64", family: "unix"
> {code}
> R version:
> {code:java}
> R version 3.5.1 (2018-07-02) -- "Feather Spray"
> Copyright (C) 2018 The R Foundation for Statistical Computing
> Platform: x86_64-redhat-linux-gnu (64-bit)
> {code}
> Allocation of resources:
> {code:java}
> Mem: 5957M
> Swap: 2047M
> Cpus: 3
> {code}
>            Reporter: Xu Jiang
>            Priority: Major
>              Labels: build
>
> The build command I am running is:
> {code:java}
> ./dev/make-distribution.sh --name jdp-spark --tgz --pip --r -Psparkr -Phadoop-3.1 -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes -DskipTests
> {code}
> The main reason is that the bundled hive-exec-*.jar does not support Hadoop 3.1.0:
> {code:java}
> org.apache.spark.sql.api.r.SQLUtils.getOrCreateSparkSession(SQLUtils.scala)
> ... 36 more
> Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.0
>   at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
> {code}
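> For context, the check that throws this error lives in org.apache.hadoop.hive.shims.ShimLoader inside the bundled hive-exec jar: the Hive 1.2.x line only recognizes Hadoop major versions 1 and 2, so any 3.x version string falls through to the exception above. A paraphrased sketch of that check (reconstructed from memory of the Hive 1.2.x source, not verbatim):
> {code:java}
> // Paraphrased sketch of ShimLoader#getMajorVersion in the Hive 1.2.x line;
> // the returned shim names and exact structure are approximations.
> public static String getMajorVersion() {
>   // The version string reported by whichever hadoop-common jar is on the
>   // classpath, e.g. "3.1.0" for a -Phadoop-3.1 build.
>   String vers = org.apache.hadoop.util.VersionInfo.getVersion();
>   String[] parts = vers.split("\\.");
>   switch (Integer.parseInt(parts[0])) {
>     case 1:
>       return "0.20S"; // Hadoop 1.x shims
>     case 2:
>       return "0.23";  // Hadoop 2.x shims
>     default:
>       // Hadoop 3.x lands here, producing the error in this report.
>       throw new IllegalArgumentException(
>           "Unrecognized Hadoop major version number: " + vers);
>   }
> }
> {code}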
> Detailed error:
> {code:java}
> + SPARK_JARS_DIR=/ws/jdp-package/dl/spark2-2.4.0-src/assembly/target/scala-2.11/jars
> + '[' -d /ws/jdp-package/dl/spark2-2.4.0-src/assembly/target/scala-2.11/jars ']'
> + SPARK_HOME=/ws/jdp-package/dl/spark2-2.4.0-src
> + /usr/bin/R CMD build /ws/jdp-package/dl/spark2-2.4.0-src/R/pkg
> checking for file ‘/ws/jdp-package/dl/spark2-2.4.0-src/R/pkg/DESCRIPTION’ ... OK
> preparing ‘SparkR’:
> checking DESCRIPTION meta-information ... OK
> installing the package to build vignettes
> creating vignettes ... ERROR
> Attaching package: 'SparkR'
> The following objects are masked from 'package:stats':
>     cov, filter, lag, na.omit, predict, sd, var, window
> The following objects are masked from 'package:base':
>     as.data.frame, colnames, colnames<-, drop, endsWith, intersect, rank, rbind, sample, startsWith, subset, summary, transform, union
> Picked up _JAVA_OPTIONS: -XX:-UsePerfData
> Picked up _JAVA_OPTIONS: -XX:-UsePerfData
> 2019-02-20 01:00:07 WARN NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
> 2019-02-20 01:00:14 ERROR RBackendHandler:91 - getOrCreateSparkSession on org.apache.spark.sql.api.r.SQLUtils failed
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:167)
>   at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:108)
>   at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:40)
>   at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
>   at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
>   at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
>   at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
>   at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
>   at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
>   at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
>   at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>   at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
>   at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ExceptionInInitializerError
>   at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
>   at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:1117)
>   at org.apache.spark.sql.api.r.SQLUtils$.getOrCreateSparkSession(SQLUtils.scala:52)
>   at org.apache.spark.sql.api.r.SQLUtils.getOrCreateSparkSession(SQLUtils.scala)
>   ... 36 more
> Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.0
>   at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
>   at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
>   at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
>   at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
>   ... 43 more
> Quitting from lines 65-67 (sparkr-vignettes.Rmd)
> Error: processing vignette 'sparkr-vignettes.Rmd' failed with diagnostics:
> java.lang.ExceptionInInitializerError
>   at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
>   at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:1117)
>   at org.apache.spark.sql.api.r.SQLUtils$.getOrCreateSparkSession(SQLUtils.scala:52)
>   at org.apache.spark.sql.api.r.SQLUtils.getOrCreateSparkSession(SQLUtils.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:167)
>   at org.apache.spark.api.r.RBackendHandl
> Execution halted
> {code}
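As a side note, if it is unclear which Hadoop version a given build actually puts on the classpath, printing what hadoop-common reports is a quick check, since that is the exact string the shim check above parses. A minimal, hypothetical diagnostic (PrintHadoopVersion is not part of Spark; the jar paths are illustrative):
{code:java}
// Hypothetical diagnostic, not part of Spark: prints the Hadoop version
// string that ShimLoader.getMajorVersion() parses. VersionInfo reports the
// version of whichever hadoop-common jar is actually on the classpath.
//
// Example usage against a built distribution (paths are illustrative):
//   javac -cp "/path/to/spark/jars/*" PrintHadoopVersion.java
//   java  -cp ".:/path/to/spark/jars/*" PrintHadoopVersion
import org.apache.hadoop.util.VersionInfo;

public class PrintHadoopVersion {
  public static void main(String[] args) {
    System.out.println("Hadoop version: " + VersionInfo.getVersion());
  }
}
{code}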