Tim, I asked a similar question twice, here http://apache-spark-user-list.1001560.n3.nabble.com/Streaming-Cannot-get-executors-to-stay-alive-tt12940.html and here http://apache-spark-user-list.1001560.n3.nabble.com/Streaming-Executor-OOM-tt12383.html, and have not yet received any responses.

I noticed that the heap dump is dominated by a single very large byte array consuming about 66% of the heap (the second link contains a picture of my heap -- I ran with a small heap so I could reproduce the failure quickly). I don't have solutions, but wanted to affirm that I've observed a similar situation...
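In case it is useful to anyone else hitting this, here is a minimal, untested sketch (Scala, against the 1.0.x-era API) of how the properties discussed further down the thread could be set on a streaming job. The app name and values are placeholders to show the knobs, not recommendations:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Placeholder values -- tune for your own workload.
    val conf = new SparkConf()
      .setAppName("kafka-streaming-job")
      // Time-based cleanup (in seconds) of old RDDs and Spark metadata.
      .set("spark.cleaner.ttl", "3600")
      // Unpersist input data once it falls out of the slide duration.
      .set("spark.streaming.unpersist", "true")
      // Cap records/sec per receiver so a startup backlog cannot flood memory.
      .set("spark.streaming.receiver.maxRate", "10000")

    val ssc = new StreamingContext(conf, Seconds(10))

Note that the maxRate cap applies per receiver, so the effective limit scales with the number of receivers.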
On Wed, Sep 10, 2014 at 2:24 PM, Tim Smith <secs...@gmail.com> wrote:
> I am using Spark 1.0.0 (on CDH 5.1) and have a similar issue. In my case,
> the receivers die within an hour because Yarn kills the containers for high
> memory usage. I set spark.cleaner.ttl to 30 seconds but that didn't help, so
> I don't think stale RDDs are the issue here. I did a "jmap -histo" on a
> couple of running receiver processes, and in a heap of 30G, roughly ~16G is
> taken by "[B", which is byte arrays.
>
> Still investigating and would appreciate pointers for troubleshooting. I
> have dumped the heap of a receiver and will try to go over it.
>
> On Wed, Sep 10, 2014 at 1:43 AM, Luis Ángel Vicente Sánchez <
> langel.gro...@gmail.com> wrote:
>
>> I somehow missed that parameter when I was reviewing the documentation;
>> that should do the trick! Thank you!
>>
>> 2014-09-10 2:10 GMT+01:00 Shao, Saisai <saisai.s...@intel.com>:
>>
>>> Hi Luis,
>>>
>>> The parameters “spark.cleaner.ttl” and “spark.streaming.unpersist” can
>>> be used to remove useless timed-out streaming data. The difference is
>>> that “spark.cleaner.ttl” is a time-based cleaner that removes not only
>>> streaming input data but also Spark’s stale metadata, while
>>> “spark.streaming.unpersist” is a reference-based cleaning mechanism:
>>> streaming data is removed once it falls out of the slide duration.
>>>
>>> Both of these parameters can reduce the memory footprint of Spark
>>> Streaming. But if data floods into Spark Streaming at startup, as in your
>>> situation with Kafka, these two parameters cannot fully mitigate the
>>> problem. You really need to control the input rate so that data is not
>>> injected so fast; you can try “spark.streaming.receiver.maxRate” to limit
>>> the ingestion rate.
>>>
>>> Thanks
>>>
>>> Jerry
>>>
>>> *From:* Luis Ángel Vicente Sánchez [mailto:langel.gro...@gmail.com]
>>> *Sent:* Wednesday, September 10, 2014 5:21 AM
>>> *To:* user@spark.apache.org
>>> *Subject:* spark.cleaner.ttl and spark.streaming.unpersist
>>>
>>> The executors of my Spark Streaming application are being killed due to
>>> memory issues. Memory consumption is quite high on startup because it is
>>> the first run and there are quite a few events on the Kafka queues, which
>>> are consumed at a rate of 100K events per second.
>>>
>>> I wonder if it is recommended to use spark.cleaner.ttl and
>>> spark.streaming.unpersist together to mitigate that problem. I also
>>> wonder whether new RDDs are being batched while an RDD is being
>>> processed.
>>>
>>> Regards,
>>>
>>> Luis