Hi,

We had a Zeppelin 0.8.0 instance (binary package) running smoothly on
Spark 2.1.0 and Hadoop 2.6.4.

We recently upgraded Hadoop from 2.6.4 to 2.9.2, and since then I have been
getting the error below in Zeppelin when reading from HDFS (using Scala 2.11.8).
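
For context, the failing call is nothing more exotic than a plain Parquet read
from HDFS in a %spark paragraph; a minimal reproduction looks roughly like this
(the path and variable name are just placeholders, not our real ones):

    %spark
    // Any Parquet read from HDFS fails at schema-inference time
    val df = spark.read.parquet("hdfs:///tmp/some/table")
    df.printSchema()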

> java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
>   at org.apache.spark.SparkContext.withScope(SparkContext.scala:701)
>   at org.apache.spark.SparkContext.parallelize(SparkContext.scala:715)
>   at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.mergeSchemasInParallel(ParquetFileFormat.scala:594)
>   at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.inferSchema(ParquetFileFormat.scala:235)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$7.apply(DataSource.scala:184)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$7.apply(DataSource.scala:184)
>   at scala.Option.orElse(Option.scala:289)
>   at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$getOrInferFileFormatSchema(DataSource.scala:183)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:387)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
>   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:441)
>   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:425)
>   ... 52 elided


I think it is related to the *com.fasterxml.jackson.core* dependency. The
version currently on the classpath is 2.8.10. I have already tried replacing
2.8.10 with 2.7.8 and with 2.8.8, but the issue persists. With 2.8.8, instead
of the error above I get the following (a quick way to check which Jackson
jars the interpreter is actually loading is sketched after this trace):

> com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.8.8
>   at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
>   at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
>   at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:745)
>   at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
>   at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
>   at org.apache.spark.SparkContext.withScope(SparkContext.scala:701)
>   at org.apache.spark.SparkContext.parallelize(SparkContext.scala:715)
>   at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.mergeSchemasInParallel(ParquetFileFormat.scala:594)
>   at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.inferSchema(ParquetFileFormat.scala:235)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$7.apply(DataSource.scala:184)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$7.apply(DataSource.scala:184)
>   at scala.Option.orElse(Option.scala:289)
>   at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$getOrInferFileFormatSchema(DataSource.scala:183)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:387)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
>   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:441)
>   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:425)
>   ... 52 elided
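
In case it helps with diagnosis, something like the snippet below can be run from
a %spark paragraph to check which jar each of the Jackson classes from the trace
is loaded from, and which version jackson-databind itself reports. This is only a
sketch; the class names are the real ones from the trace, everything else is
illustrative.

    %spark
    // Print the jar each relevant Jackson class is loaded from,
    // plus the version jackson-databind reports about itself
    import com.fasterxml.jackson.databind.ObjectMapper
    import com.fasterxml.jackson.databind.cfg.PackageVersion
    import com.fasterxml.jackson.module.scala.DefaultScalaModule

    Seq[Class[_]](classOf[ObjectMapper], classOf[DefaultScalaModule]).foreach { c =>
      println(s"${c.getName} -> ${c.getProtectionDomain.getCodeSource.getLocation}")
    }
    println(s"jackson-databind reports version ${PackageVersion.VERSION}")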

Does anyone have an idea how to resolve this? We can't change our Spark and
Hadoop versions, but we can change the Zeppelin version if needed.

Thanks,
Truong
