Maziyar PANAHI created ZEPPELIN-3939:
----------------------------------------
Summary: Spark 2.4 incompatibility with commons-lang3 in Zeppelin
Key: ZEPPELIN-3939
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3939
Project: Zeppelin
Issue Type: Bug
Components: Interpreters, zeppelin-interpreter
Affects Versions: 0.8.1
Environment: Cloudera 6.1
Spark 2.4
Hadoop 3.0
Reporter: Maziyar PANAHI
Hi,
I built Zeppelin on my Cloudera 6.1 cluster for Spark 2.4 (Hadoop 3.0), and the
build with Spark 2.4 support went well.
However, I can't read JSON or CSV files; every attempt fails with the following error:
{noformat}
java.io.InvalidClassException: org.apache.commons.lang3.time.FastDateParser; local class incompatible
{noformat}
{code:java}
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID
117, hadoop-16, executor 3): java.io.InvalidClassException:
org.apache.commons.lang3.time.FastDateParser; local class incompatible: stream
classdesc serialVersionUID = 2, local class serialVersionUID = 3 at
java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699) at
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885) at
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at
scala.collection.immutable.List$SerializationProxy.readObject(List.scala:490)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) at
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at
scala.collection.immutable.List$SerializationProxy.readObject(List.scala:490)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) at
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at
scala.collection.immutable.List$SerializationProxy.readObject(List.scala:490)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) at
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:83) at
org.apache.spark.scheduler.Task.run(Task.scala:121) at
org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748) Driver stacktrace: at
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1890)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1878)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877)
at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1877) at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:929)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:929)
at scala.Option.foreach(Option.scala:257) at
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:929)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2111)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2060)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2049)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) at
org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:740) at
org.apache.spark.SparkContext.runJob(SparkContext.scala:2073) at
org.apache.spark.SparkContext.runJob(SparkContext.scala:2094) at
org.apache.spark.SparkContext.runJob(SparkContext.scala:2113) at
org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:365) at
org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3383)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2544) at
org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2544) at
org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364) at
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
at
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363) at
org.apache.spark.sql.Dataset.head(Dataset.scala:2544) at
org.apache.spark.sql.Dataset.take(Dataset.scala:2758) at
org.apache.spark.sql.Dataset.getRows(Dataset.scala:254) at
org.apache.spark.sql.Dataset.showString(Dataset.scala:291) at
org.apache.spark.sql.Dataset.show(Dataset.scala:745) at
org.apache.spark.sql.Dataset.show(Dataset.scala:704) at
org.apache.spark.sql.Dataset.show(Dataset.scala:713) ... 47 elided Caused by:
java.io.InvalidClassException: org.apache.commons.lang3.time.FastDateParser;
local class incompatible: stream classdesc serialVersionUID = 2, local class
serialVersionUID = 3 at
java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699) at
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885) at
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at
scala.collection.immutable.List$SerializationProxy.readObject(List.scala:490)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) at
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at
scala.collection.immutable.List$SerializationProxy.readObject(List.scala:490)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) at
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at
scala.collection.immutable.List$SerializationProxy.readObject(List.scala:490)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) at
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1170) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2178) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at
java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:83) at
org.apache.spark.scheduler.Task.run(Task.scala:121) at
org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413) ... 3 more
{code}
I have seen this error before; it was caused by Zeppelin using an older version
of commons-lang3 than Spark:
https://issues.apache.org/jira/browse/ZEPPELIN-1977
Spark 2.4 uses commons-lang3 3.5, but Zeppelin uses 3.4 in branch-0.8:
*Spark*:
https://github.com/apache/spark/blob/3ece0aa479bd32732742d1d8e607de25520a9f5a/pom.xml#L170
*Zeppelin*:
https://github.com/apache/zeppelin/blob/2c2deb1891b8700d3d9d68695891e09e3139a48d/zeppelin-interpreter/pom.xml#L88
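For anyone trying to reproduce this, here is a minimal sketch of how the mismatch could be checked. It is not part of Zeppelin or Spark, and the class name SerialVersionCheck and its helper methods are made up for illustration. It prints the serialVersionUID that the local JVM sees for a class and the jar the class was loaded from. The assumption is that running it once with the Zeppelin interpreter classpath and once with the Spark executor classpath would show the 2 vs 3 difference from the stack trace above.

```java
import java.io.ObjectStreamClass;
import java.security.CodeSource;

public class SerialVersionCheck {

    // Returns the serialVersionUID the local JVM uses for the named class,
    // or null if the class is not Serializable.
    static Long serialVersionUidOf(String className) throws ClassNotFoundException {
        Class<?> cls = Class.forName(className);
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        return desc == null ? null : desc.getSerialVersionUID();
    }

    // Returns the jar or directory the class was loaded from, so two JVMs
    // (for example driver and executor) can be compared.
    static String locationOf(String className) throws ClassNotFoundException {
        Class<?> cls = Class.forName(className);
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(no code source, likely a JDK class)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the class named in the exception; requires commons-lang3
        // on the classpath. Pass another class name as the first argument to override.
        String name = args.length > 0
                ? args[0]
                : "org.apache.commons.lang3.time.FastDateParser";
        System.out.println(name + " serialVersionUID = " + serialVersionUidOf(name));
        System.out.println("loaded from " + locationOf(name));
    }
}
```

If the two classpaths report different jar locations for FastDateParser, that would be consistent with the driver serializing with one commons-lang3 version and the executors deserializing with another.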
I have seen that Spark 2.4 is supported as of Zeppelin 0.8.1, so I assumed it
would be fully compatible. As it stands, it is impossible to read any CSV or
JSON file (and possibly other formats) because of this serialization incompatibility.
Many thanks.