[ https://issues.apache.org/jira/browse/SPARK-11751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-11751:
------------------------------------

    Assignee: Apache Spark

> Doc describe error in the "Spark Streaming Programming Guide" page
> ------------------------------------------------------------------
>
>                 Key: SPARK-11751
>                 URL: https://issues.apache.org/jira/browse/SPARK-11751
>             Project: Spark
>          Issue Type: Documentation
>          Components: Documentation
>    Affects Versions: 1.4.1, 1.5.0, 1.5.1, 1.5.2
>            Reporter: yangping wu
>            Assignee: Apache Spark
>            Priority: Trivial
>
> In the *Task Launching Overheads* section,
> {quote}*Task Serialization*: Using Kryo serialization for serializing tasks can reduce the task sizes, and therefore reduce the time taken to send them to the slaves.{quote}
> As we know, *Task Serialization* is configured by the *spark.closure.serializer* parameter, but currently only the Java serializer is supported. If we set *spark.closure.serializer* to *org.apache.spark.serializer.KryoSerializer*, the job fails with an exception as follows:
> {code}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 516 in stage 0.0 failed 4 times, most recent failure: Lost task 516.3 in stage 0.0 (TID 21, spark-cluster.data.com): java.io.EOFException
>     at java.io.DataInputStream.readInt(DataInputStream.java:392)
>     at org.apache.spark.scheduler.Task$.deserializeWithDependencies(Task.scala:188)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:192)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Driver stacktrace:
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
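
The distinction the report draws can be illustrated with a minimal configuration sketch, assuming one of the affected Spark 1.x versions. In those versions `spark.serializer` (data serialization) accepts Kryo, while `spark.closure.serializer` (task and closure serialization) supports only the Java serializer and is best left unset. The object name and application name below are hypothetical, and this is not taken from the issue itself.

```scala
import org.apache.spark.SparkConf

// Hypothetical example object; only the property names and serializer
// class names come from Spark 1.x.
object SerializerConfSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("serializer-conf-sketch") // hypothetical app name
      // Kryo is supported for data (RDD/shuffle) serialization.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // spark.closure.serializer is deliberately NOT set: per the report,
      // pointing it at KryoSerializer makes tasks fail on deserialization.

    println("spark.serializer = " + conf.get("spark.serializer"))
    println("spark.closure.serializer explicitly set: " +
      conf.contains("spark.closure.serializer"))
  }
}
```

Building the `SparkConf` alone does not start a cluster connection, so the sketch only shows which property is safe to change; the failure in the stack trace would appear once tasks are actually launched with the closure serializer overridden.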