When I ran a Spark Streaming application for a longer period, I noticed that the size of the local directory kept increasing.
I set "spark.cleaner.ttl" to 1800 seconds in order clean the metadata. The spark streaming batch duration is 10 seconds and checkpoint duration is 10 minutes. The setting took effect but after that, below exception happened. Do you have any idea about this error? Thank you! ==================================================== 15/06/09 12:57:30 WARN TaskSetManager: Lost task 3.0 in stage 5038.0 (TID 27045, host2): java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_82_piece0 of broadcast_82 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1155) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBr oadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBro adcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scal a:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.sc ala:87) at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70) at org.apache.spark.streaming.dstream.HashmapEnrichDStream$$anonfun$compute $3.apply(HashmapEnrichDStream.scala:39) at org.apache.spark.streaming.dstream.HashmapEnrichDStream$$anonfun$compute $3.apply(HashmapEnrichDStream.scala:39) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter .scala:202) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter. 
====================================================
15/06/09 12:57:30 WARN TaskSetManager: Lost task 3.0 in stage 5038.0 (TID 27045, host2): java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_82_piece0 of broadcast_82
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1155)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
        at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
        at org.apache.spark.streaming.dstream.HashmapEnrichDStream$$anonfun$compute$3.apply(HashmapEnrichDStream.scala:39)
        at org.apache.spark.streaming.dstream.HashmapEnrichDStream$$anonfun$compute$3.apply(HashmapEnrichDStream.scala:39)
        at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to get broadcast_82_piece0 of broadcast_82
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:136)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:119)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:174)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1152)
        ... 25 more
15/06/09 12:57:30 ERROR TaskSetManager: Task 2 in stage 5038.0 failed 4 times; aborting job
15/06/09 12:57:30 ERROR JobScheduler: Error running job streaming job 1433825850000 ms.0