It looks like there are some circular references in SQL that make immutable
List serialization fail in 2.11.

In 2.11, Scala's immutable List is serialized via writeReplace()/readResolve(),
which doesn't play nicely with circular references: while the List is being
read, a back-reference into it is restored as the serialization proxy rather
than the resolved List, and assigning that proxy to a Seq-typed field throws
the ClassCastException seen below. Here is an example that reproduces the
issue in 2.11.6:

  import java.io._

  // A serializable class whose field points back into the List,
  // forming a cycle: n contains m, and m.l refers back to n.
  class Foo extends Serializable {
    var l: Seq[Any] = null
  }

  val o = new ByteArrayOutputStream()
  val o1 = new ObjectOutputStream(o)
  val m = new Foo
  val n = List(1, m)
  m.l = n                 // close the cycle
  o1.writeObject(n)
  o1.close()

  // Reading it back fails: the back-reference in m.l is restored as
  // List$SerializationProxy before readResolve() can replace it with
  // the actual List, so the field assignment fails.
  val i = new ByteArrayInputStream(o.toByteArray)
  val i1 = new ObjectInputStream(i)
  i1.readObject()         // java.lang.ClassCastException in 2.11.6
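
If the cycle were in your own classes (not the case here, since
Project.projectList is inside Spark itself), one possible workaround, just a
sketch under the assumption that Vector in 2.11.6 serializes its fields
directly without a proxy, is to keep the back-reference out of a List:

  // Continuing the session above. Assumption: Vector has no
  // writeReplace()/readResolve() proxy in 2.11.6, so the same cycle
  // round-trips. This does not fix the Spark-internal field above.
  val o2 = new ByteArrayOutputStream()
  val s2 = new ObjectOutputStream(o2)
  val m2 = new Foo
  val n2 = Vector(1, m2)  // Vector instead of List
  m2.l = n2
  s2.writeObject(n2)
  s2.close()
  val back = new ObjectInputStream(
    new ByteArrayInputStream(o2.toByteArray)).readObject()
  // back is the deserialized Vector with the cycle intact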

Could you provide the "explain" output of the failing query? It would help
locate the circular references in the plan.
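
For example, assuming your DataFrame variable is named df, calling this right
before the collect() prints the plans:

  // extended = true prints the logical and physical plans
  df.explain(true)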
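
Also, the task is failing while it is deserialized on an executor, so
-Dsun.io.serialization.extendedDebugInfo=true needs to reach the executor
JVMs rather than only the driver. A sketch, assuming you build your own
SparkConf (the variable name conf is illustrative):

  // Pass the JVM flag to every executor
  conf.set("spark.executor.extraJavaOptions",
    "-Dsun.io.serialization.extendedDebugInfo=true")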

Best Regards,
Shixiong Zhu

2015-09-05 0:26 GMT+08:00 Jeff Jones <jjo...@adaptivebiotech.com>:

> We are using Scala 2.11 for a driver program that is running Spark SQL
> queries in a standalone cluster. I’ve rebuilt Spark for Scala 2.11 using
> the instructions at
> http://spark.apache.org/docs/latest/building-spark.html. I’ve had to
> work through a few dependency conflicts, but all in all it seems to work
> for some simple Spark examples. I integrated the Spark SQL code into my
> application and I’m able to run using a local client, but when I switch
> over to the standalone cluster I get the following error. Any help
> tracking this down would be appreciated.
>
> This exception occurs during a DataFrame.collect() call. I’ve tried to use
> -Dsun.io.serialization.extendedDebugInfo=true to get more information, but
> it didn’t provide anything more.
>
> [error] o.a.s.s.TaskSetManager - Task 0 in stage 1.0 failed 4 times; aborting job
>
> [error] c.a.i.c.Analyzer - Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 4, 10.248.0.242): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.sql.execution.Project.projectList of type scala.collection.Seq in instance of org.apache.spark.sql.execution.Project
>   at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(Unknown Source)
>   at java.io.ObjectStreamClass.setObjFieldValues(Unknown Source)
>   at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>   at java.io.ObjectInputStream.readSerialData(Unknown Source)
>   at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>   at java.io.ObjectInputStream.readObject0(Unknown Source)
>   at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>   at java.io.ObjectInputStream.readSerialData(Unknown Source)
>   at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>   at java.io.ObjectInputStream.readObject0(Unknown Source)
>   at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>   at java.io.ObjectInputStream.readSerialData(Unknown Source)
>   at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>   at java.io.ObjectInputStream.readObject0(Unknown Source)
>   at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>   at java.io.ObjectInputStream.readSerialData(Unknown Source)
>   at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>   at java.io.ObjectInputStream.readObject0(Unknown Source)
>   at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>   at java.io.ObjectInputStream.readSerialData(Unknown Source)
>   at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>   at java.io.ObjectInputStream.readObject0(Unknown Source)
>   at java.io.ObjectInputStream.readObject(Unknown Source)
>   at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:477)
>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>   at java.lang.reflect.Method.invoke(Unknown Source)
>   at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
>   at java.io.ObjectInputStream.readSerialData(Unknown Source)
>   at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>   at java.io.ObjectInputStream.readObject0(Unknown Source)
>   at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>   at java.io.ObjectInputStream.readSerialData(Unknown Source)
>   at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>   at java.io.ObjectInputStream.readObject0(Unknown Source)
>   at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>   at java.io.ObjectInputStream.readSerialData(Unknown Source)
>   at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>   at java.io.ObjectInputStream.readObject0(Unknown Source)
>   at java.io.ObjectInputStream.readObject(Unknown Source)
>   at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
>   at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
>   at org.apache.spark.scheduler.Task.run(Task.scala:70)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
>
> Thanks,
> Jeff
