Hi all,

I have a question regarding Power Iteration Clustering (PIC). I read in an input file (a tab-separated edge list) and map it to the RDD[(Long, Long, Double)] format that PIC requires, then run the clustering. So far so good: the implementation works fine as long as the input is small (up to 50 MB), but it crashes when I apply it to a 650 MB file.

My setup is a compute cluster with one master and two workers; the executor memory is set to 50 GB and 24 cores are available in total. Is it normal for the program to crash at this file size? I have attached my program code (outlined in the sketch below) as well as the error output.

I hope someone can help me!

Best regards,
Lydia
Attachment: PIC.scala (binary data)
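Since the attachment may not come through for everyone, here is a simplified sketch of what PIC.scala does. The input path, number of clusters, and iteration count below are placeholders, not my exact values:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.PowerIterationClustering

object PIC {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PIC"))

    // Read the tab-separated edge list and map every line to the
    // (srcId, dstId, similarity) triple that PIC expects.
    // "hdfs:///path/to/edges.tsv" is a placeholder for the real input path.
    val similarities = sc.textFile("hdfs:///path/to/edges.tsv").map { line =>
      val fields = line.split("\t")
      (fields(0).toLong, fields(1).toLong, fields(2).toDouble)
    }

    // Run PIC; k = 10 and maxIterations = 20 are placeholder values.
    val model = new PowerIterationClustering()
      .setK(10)
      .setMaxIterations(20)
      .run(similarities)

    // Print a few assignments to sanity-check the result.
    model.assignments.take(10).foreach { a =>
      println(s"${a.id} -> ${a.cluster}")
    }

    sc.stop()
  }
}

And this is roughly how I submit it (the class name, master URL, and jar path are as they appear in the log below; memory and core settings as described above):

spark-submit \
  --class org.apache.spark.examples.mllib.PIC \
  --master spark://medlab04:7077 \
  --executor-memory 50G \
  --total-executor-cores 24 \
  /home/icklerly/spark-master/examples/target/scala-2.11/jars/spark-examples_2.11-2.1.0-SNAPSHOT.jar

The error output follows: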
16/11/23 13:34:19 INFO spark.SparkContext: Running Spark version 2.1.0-SNAPSHOT
16/11/23 13:34:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/23 13:34:20 INFO spark.SecurityManager: Changing view acls to: icklerly
16/11/23 13:34:20 INFO spark.SecurityManager: Changing modify acls to: icklerly
16/11/23 13:34:20 INFO spark.SecurityManager: Changing view acls groups to:
16/11/23 13:34:20 INFO spark.SecurityManager: Changing modify acls groups to:
16/11/23 13:34:20 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(icklerly); groups with view permissions: Set(); users with modify permissions: Set(icklerly); groups with modify permissions: Set()
16/11/23 13:34:20 INFO util.Utils: Successfully started service 'sparkDriver' on port 36371.
16/11/23 13:34:20 INFO spark.SparkEnv: Registering MapOutputTracker
16/11/23 13:34:20 INFO spark.SparkEnv: Registering BlockManagerMaster
16/11/23 13:34:20 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
16/11/23 13:34:20 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
16/11/23 13:34:20 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-80b089a7-be21-4d14-ab6f-7e0ef1f14396
16/11/23 13:34:20 INFO memory.MemoryStore: MemoryStore started with capacity 396.3 MB
16/11/23 13:34:20 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/11/23 13:34:20 INFO util.log: Logging initialized @1120ms
16/11/23 13:34:20 INFO server.Server: jetty-9.2.z-SNAPSHOT
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3543df7d{/jobs,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7c541c15{/jobs/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3542162a{/jobs/job,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@698122b2{/jobs/job/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4212a0c8{/stages,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1e7aa82b{/stages/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3b2c0e88{/stages/stage,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5bd82fed{/stages/stage/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@c1bd0be{/stages/pool,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@476b0ae6{/stages/pool/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1c6804cd{/storage,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@655f7ea{/storage/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@549949be{/storage/rdd,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4b3a45f1{/storage/rdd/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@17a87e37{/environment,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3eeb318f{/environment/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@20a14b55{/executors,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@39ad977d{/executors/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6da00fb9{/executors/threadDump,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@a202ccb{/executors/threadDump/json,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@20f12539{/static,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@75b25825{/,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@18025ced{/api,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@13cf7d52{/jobs/job/kill,null,AVAILABLE}
16/11/23 13:34:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3a3e4aff{/stages/stage/kill,null,AVAILABLE}
16/11/23 13:34:20 INFO server.ServerConnector: Started ServerConnector@2cae1042{HTTP/1.1}{0.0.0.0:4040}
16/11/23 13:34:20 INFO server.Server: Started @1207ms
16/11/23 13:34:20 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/11/23 13:34:20 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://130.73.20.224:4040
16/11/23 13:34:20 INFO spark.SparkContext: Added JAR file:/home/icklerly/spark-master/examples/target/scala-2.11/jars/spark-examples_2.11-2.1.0-SNAPSHOT.jar at spark://130.73.20.224:36371/jars/spark-examples_2.11-2.1.0-SNAPSHOT.jar with timestamp 1479904460674
16/11/23 13:34:20 INFO client.StandaloneAppClient$ClientEndpoint: Connecting to master spark://medlab04:7077...
16/11/23 13:34:20 INFO client.TransportClientFactory: Successfully created connection to medlab04/130.73.20.224:7077 after 25 ms (0 ms spent in bootstraps)
16/11/23 13:34:20 INFO cluster.StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20161123133420-0006
16/11/23 13:34:20 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20161123133420-0006/0 on worker-20161123131030-130.73.21.134-38384 (130.73.21.134:38384) with 12 cores
16/11/23 13:34:20 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20161123133420-0006/0 on hostPort 130.73.21.134:38384 with 12 cores, 50.0 GB RAM
16/11/23 13:34:20 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20161123133420-0006/1 on worker-20161123131042-130.73.20.224-35492 (130.73.20.224:35492) with 12 cores
16/11/23 13:34:20 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20161123133420-0006/1 on hostPort 130.73.20.224:35492 with 12 cores, 50.0 GB RAM
16/11/23 13:34:20 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20161123133420-0006/1 is now RUNNING
16/11/23 13:34:20 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20161123133420-0006/0 is now RUNNING
16/11/23 13:34:20 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 36463.
16/11/23 13:34:20 INFO netty.NettyBlockTransferService: Server created on 130.73.20.224:36463
16/11/23 13:34:20 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
16/11/23 13:34:20 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 130.73.20.224, 36463, None)
16/11/23 13:34:20 INFO storage.BlockManagerMasterEndpoint: Registering block manager 130.73.20.224:36463 with 396.3 MB RAM, BlockManagerId(driver, 130.73.20.224, 36463, None)
16/11/23 13:34:20 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 130.73.20.224, 36463, None)
16/11/23 13:34:20 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, 130.73.20.224, 36463, None)
16/11/23 13:34:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@486be205{/metrics/json,null,AVAILABLE}
16/11/23 13:34:21 INFO cluster.StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
16/11/23 13:34:21 WARN mllib.PIC$: Start:I
16/11/23 13:34:21 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 128.0 KB, free 396.2 MB)
16/11/23 13:34:21 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 14.4 KB, free 396.2 MB)
16/11/23 13:34:21 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 130.73.20.224:36463 (size: 14.4 KB, free: 396.3 MB)
16/11/23 13:34:21 INFO spark.SparkContext: Created broadcast 0 from textFile at PIC.scala:28
16/11/23 13:34:21 WARN mllib.PIC$: End:I
16/11/23 13:34:22 WARN scheduler.TaskSetManager: Lost task 3.0 in stage 2.0 (TID 13, 130.73.21.134, executor 0): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2024)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/11/23 13:34:22 ERROR scheduler.TaskSetManager: Task 3 in stage 2.0 failed 4 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 2.0 failed 4 times, most recent failure: Lost task 3.3 in stage 2.0 (TID 23, 130.73.21.134, executor 0): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2024)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1436)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1424)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1423)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1651)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1606)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1595)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1914)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1977)
    at org.apache.spark.rdd.RDD$$anonfun$fold$1.apply(RDD.scala:1078)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
    at org.apache.spark.rdd.RDD.fold(RDD.scala:1072)
    at org.apache.spark.rdd.DoubleRDDFunctions$$anonfun$sum$1.apply$mcD$sp(DoubleRDDFunctions.scala:35)
    at org.apache.spark.rdd.DoubleRDDFunctions$$anonfun$sum$1.apply(DoubleRDDFunctions.scala:35)
    at org.apache.spark.rdd.DoubleRDDFunctions$$anonfun$sum$1.apply(DoubleRDDFunctions.scala:35)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
    at org.apache.spark.rdd.DoubleRDDFunctions.sum(DoubleRDDFunctions.scala:34)
    at org.apache.spark.mllib.clustering.PowerIterationClustering$.initDegreeVector(PowerIterationClustering.scala:447)
    at org.apache.spark.mllib.clustering.PowerIterationClustering.run(PowerIterationClustering.scala:209)
    at org.apache.spark.examples.mllib.PIC$.main(PIC.scala:42)
    at org.apache.spark.examples.mllib.PIC.main(PIC.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2024)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)