Hi, can you attach more logs so we can see whether there are any entries from ContextCleaner?
I ran into a very similar issue before, but it was never resolved.

Best,

--
Nan Zhu

On Thursday, September 11, 2014 at 10:13 AM, Dibyendu Bhattacharya wrote:
> Dear All,
>
> Not sure if this is a false alarm, but I wanted to raise it to understand
> what is happening.
>
> I am testing the Kafka receiver I have written
> (https://github.com/dibbhatt/kafka-spark-consumer), which is basically a
> low-level Kafka consumer implemented as custom Receivers, one for each
> Kafka topic partition, pulling data in parallel. The individual streams
> from all topic partitions are then merged into a union stream, which is
> used for further processing.
>
> The custom receiver works fine under normal load, with no issues. But when
> I tested it against a huge backlog of messages in Kafka (50 million+
> messages), I see a couple of major issues in Spark Streaming, and I wanted
> to get some opinions on them.
>
> I am using the latest Spark 1.1 built from source, running in standalone
> mode on a 3-node m1.xlarge Spark cluster on Amazon EMR.
>
> Below are my two main questions.
>
> 1. When I run Spark Streaming with my Kafka consumer against a huge
> backlog in Kafka (around 50 million messages), Spark is completely busy
> with the receiving tasks and hardly schedules any processing tasks. Can
> you tell me whether this is expected? If there is a large backlog, Spark
> will take a long time pulling it, but why is Spark not doing any
> processing? Is it because of a resource limitation (say, all cores are
> busy pulling), or is it by design? I am setting executor-memory to 10G
> and driver-memory to 4G.
>
> 2. This issue seems more serious. I have attached the driver trace to this
> email. I can see, very frequently, that blocks are selected to be removed;
> these entries are all over the place. But when a block is removed, the
> problem below happens. This may also be the cause of issue 1, i.e. why no
> jobs are getting processed.
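On question 1: in Spark Streaming, each Receiver occupies one executor core for its entire lifetime, so running one receiver per topic partition on a small cluster can leave few or no cores for processing tasks. A minimal sketch of the arithmetic, assuming the standard Spark Streaming 1.1 API; `numPartitions` and `KafkaReceiver` are hypothetical placeholders, not the actual consumer's API:

```scala
// Sketch only: each ssc.receiverStream(...) pins one core for the lifetime
// of that receiver, so the total core count must exceed the receiver count
// or no cores remain for processing. `KafkaReceiver` is a hypothetical
// stand-in for the custom consumer's receiver class.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("kafka-backlog-test")
  // 3 x m1.xlarge = 12 cores in total; some must stay free for processing
  .set("spark.cores.max", "12")

val ssc = new StreamingContext(conf, Seconds(10))

val numPartitions = 8 // hypothetical: one receiver per Kafka topic partition
val streams = (0 until numPartitions).map { p =>
  ssc.receiverStream(new KafkaReceiver(p)) // hypothetical receiver class
}
// With 8 receivers pinned, only 12 - 8 = 4 cores are left for processing,
// which matches the symptom of receiving tasks crowding out job execution.
val unioned = ssc.union(streams)
```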
>
> INFO : org.apache.spark.storage.MemoryStore - 1 blocks selected for dropping
> INFO : org.apache.spark.storage.BlockManager - Dropping block
> input-0-1410443074600 from memory
> INFO : org.apache.spark.storage.MemoryStore - Block input-0-1410443074600 of
> size 12651900 dropped from memory (free 21220667)
> INFO : org.apache.spark.storage.BlockManagerInfo - Removed
> input-0-1410443074600 on ip-10-252-5-113.asskickery.us:53752 in memory
> (size: 12.1 MB, free: 100.6 MB)
>
> ...........
>
> INFO : org.apache.spark.storage.BlockManagerInfo - Removed
> input-0-1410443074600 on ip-10-252-5-62.asskickery.us:37033 in memory
> (size: 12.1 MB, free: 154.6 MB)
>
> ...........
>
> WARN : org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 7.0
> (TID 118, ip-10-252-5-62.asskickery.us): java.lang.Exception: Could not
> compute split, block input-0-1410443074600 not found
>
> ...........
>
> INFO : org.apache.spark.scheduler.TaskSetManager - Lost task 0.1 in stage 7.0
> (TID 126) on executor ip-10-252-5-62.asskickery.us: java.lang.Exception
> (Could not compute split, block input-0-1410443074600 not found) [duplicate 1]
>
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
> stage 7.0 failed 4 times, most recent failure: Lost task 0.3 in stage 7.0
> (TID 139, ip-10-252-5-62.asskickery.us): java.lang.Exception: Could not
> compute split, block input-0-1410443074600 not found
>         org.apache.spark.rdd.BlockRDD.compute(BlockRDD.scala:51)
>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:61)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
>         org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>         org.apache.spark.scheduler.Task.run(Task.scala:54)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         java.lang.Thread.run(Thread.java:744)
>
> Regards,
> Dibyendu
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
> Attachments:
> - driver-trace.txt
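On issue 2: blocks received with a memory-only storage level are dropped when the MemoryStore fills up, and a dropped input block cannot be recomputed (a BlockRDD has no lineage back to Kafka), which is exactly the "Could not compute split, block ... not found" failure in the trace. One common mitigation is to construct the receiver with a storage level that spills to disk instead of discarding. A sketch, assuming Spark 1.1's Receiver API; `KafkaReceiver` is a hypothetical stand-in for the custom consumer's receiver:

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Sketch: constructing the receiver with MEMORY_AND_DISK_SER_2 lets the
// BlockManager spill received blocks to disk (serialized, replicated to
// 2 nodes) instead of dropping them outright when memory fills up.
class KafkaReceiver(partition: Int) // hypothetical custom-consumer receiver
  extends Receiver[Array[Byte]](StorageLevel.MEMORY_AND_DISK_SER_2) {

  def onStart(): Unit = {
    // start a background thread that pulls from this Kafka partition
    // and hands each message to store(...)
  }

  def onStop(): Unit = {
    // signal the pulling thread to shut down
  }
}
```

This trades some throughput (serialization and disk I/O) for not losing blocks under backlog pressure, which should also stop the cascading stage failures.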