Setting the following while creating the sparkContext will sort it out.
.set("spark.core.connection.ack.wait.timeout", "600")
.set("spark.akka.frameSize", "50")
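
For context, here is a minimal sketch of where those two calls could go, assuming a Spark 1.x application that builds its own SparkConf. The object name, app name and master URL are placeholders, not taken from the original job. The values are passed as strings because SparkConf.set takes a string key and a string value.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TimeoutConfigSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("TimeoutConfigSketch")      // placeholder app name (assumption)
      .setMaster("spark://master-host:7077")  // placeholder standalone master URL (assumption)
      // Ack wait timeout, in seconds, raised from the default
      .set("spark.core.connection.ack.wait.timeout", "600")
      // Maximum Akka message size, in MB
      .set("spark.akka.frameSize", "50")

    val sc = new SparkContext(conf)
    try {
      // job code goes here
    } finally {
      sc.stop()
    }
  }
}
```

The same keys can also be passed on the command line with spark-submit --conf key=value instead of being hard-coded.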
On 27 Oct 2014 21:15, shahab shahab.mok...@gmail.com wrote:
Hi,
I have a stand alone Spark Cluster, where worker and master reside on the
same machine. I submit a job to the cluster, the job is executed for a
while and suddenly I get this exception with no additional trace.
ConnectionManager: key already cancelled ?
sun.nio.ch.SelectionKeyImpl@2490dce9
java.nio.channels.CancelledKeyException at
org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
at
org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
Any idea where I should look for the cause?
best,
/shahab
The following is part of the printout from the driver application logs:
14/10/27 15:21:15 INFO BlockManagerInfo: Removed broadcast_1_piece0 on
ip-10-89-32-179.eu-west-1.compute.internal:40479 in memory (size: 3.4 KB,
free: 1565.6 MB)
14/10/27 15:21:15 INFO ContextCleaner: Cleaned broadcast 1
14/10/27 15:21:15 INFO ShuffleBlockManager: Could not find files for
shuffle 1 for deleting
14/10/27 15:21:15 INFO ContextCleaner: Cleaned shuffle 1
14/10/27 15:21:15 INFO ShuffleBlockManager: Could not find files for
shuffle 0 for deleting
14/10/27 15:21:15 INFO ContextCleaner: Cleaned shuffle 0
14/10/27 15:21:15 INFO BlockManagerInfo: Removed taskresult_9 on
ip-10-zz.xx-yy:40479 in memory (size: 24.1 MB, free: 1589.8 MB)
14/10/27 15:21:16 INFO DAGScheduler: Stage 7 (collect at
TimeBenchmarking_SimpleModel.scala:55) finished in 3.209 s
14/10/27 15:21:16 INFO TaskSetManager: Finished task 0.0 in stage 7.0 (TID
9) in 2640 ms on ip-10-zz.xx-yy (1/1)
14/10/27 15:21:16 INFO SparkContext: Job finished: collect at
TimeBenchmarking_SimpleModel.scala:55, took 102.661420511 s
14/10/27 15:21:16 INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks
have all completed, from pool
14/10/27 15:21:16 INFO SparkUI: Stopped Spark web UI at
http://ip-10-zz.xx-yy:4040
14/10/27 15:21:16 INFO DAGScheduler: Stopping DAGScheduler
14/10/27 15:21:16 INFO SparkDeploySchedulerBackend: Shutting down all
executors
14/10/27 15:21:16 INFO SparkDeploySchedulerBackend: Asking each executor
to shut down
14/10/27 15:21:16 INFO ConnectionManager: Removing ReceivingConnection to
ConnectionManagerId(ip-10-zz.xx-yy, 40479)
14/10/27 15:21:16 INFO ConnectionManager: Removing SendingConnection to
ConnectionManagerId(ip-10-zz.xx-yy,40479)
14/10/27 15:21:16 INFO ConnectionManager: Removing SendingConnection to
ConnectionManagerId(ip-10-zz.xx-yy,40479)
14/10/27 15:21:16 INFO ConnectionManager: Key not valid ?
sun.nio.ch.SelectionKeyImpl@2490dce9
14/10/27 15:21:16 INFO ConnectionManager: key already cancelled ?
sun.nio.ch.SelectionKeyImpl@2490dce9
java.nio.channels.CancelledKeyException
at
org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
at
org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
14/10/27 15:21:17 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor
stopped!
14/10/27 15:21:17 INFO ConnectionManager: Selector thread was interrupted!