Hi all,

Sorry, but this was totally my mistake. In my persistence logic, I was creating an async HTTP client instance in an RDD foreach but never closing it, which led to the memory leak.
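In case it helps anyone else, the fix was roughly the following (a sketch, not my exact code): create the client once per partition inside foreachPartition and close it in a finally block. The endpoint URL is made up, and I'm assuming the Ning AsyncHttpClient we had on the classpath.

import com.ning.http.client.AsyncHttpClient
import org.apache.spark.streaming.dstream.DStream

def persistToStore(stream: DStream[String], endpoint: String): Unit = {
  stream.foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      // Previously I did `new AsyncHttpClient()` inside foreach and never
      // closed it, so its threads and buffers piled up with every batch.
      val client = new AsyncHttpClient()
      try {
        records.foreach { record =>
          // Block on the future so no request outlives the client.
          client.preparePost(endpoint).setBody(record).execute().get()
        }
      } finally {
        client.close() // releases the underlying connections and threads
      }
    }
  }
}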
Apologies for wasting everyone's time.

Thanks,
Aniket

On 12 September 2014 02:20, Tathagata Das <tathagata.das1...@gmail.com> wrote:

> Which version of spark are you running?
>
> If you are running the latest one, then could you try running not a window
> but a simple event count on every 2 second batch, and see if you are still
> running out of memory?
>
> TD
>
> On Thu, Sep 11, 2014 at 10:34 AM, Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:
>
>> I did change it to 1 GB. It still ran out of memory, just a little later.
>>
>> The streaming job isn't handling a lot of data. It doesn't get more than
>> 50 records every 2 seconds, and each record is no more than 500 bytes.
>>
>> On Sep 11, 2014 10:54 PM, "Bharat Venkat" <bvenkat.sp...@gmail.com> wrote:
>>
>>> You could set "spark.executor.memory" to something bigger than the
>>> default (512mb).
>>>
>>> On Thu, Sep 11, 2014 at 8:31 AM, Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:
>>>
>>>> I am running a simple Spark Streaming program that pulls in data from
>>>> Kinesis at a batch interval of 10 seconds, windows it for 10 seconds,
>>>> maps the data and persists it to a store.
>>>>
>>>> The program is running in local mode right now and runs out of memory
>>>> after a while. I have yet to investigate heap dumps, but I think Spark
>>>> isn't releasing memory after processing is complete. I have even tried
>>>> changing the storage level to disk only.
>>>>
>>>> Help!
>>>>
>>>> Thanks,
>>>> Aniket
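PS, for completeness: the sanity check TD suggested (a plain per-batch count, no window, no persistence) would look roughly like the snippet below. I've substituted a socket source for the Kinesis receiver just to keep it self-contained, and the memory setting is the one Bharat mentioned.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CountSanityCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("streaming-count-sanity-check")
      .setMaster("local[*]")
      .set("spark.executor.memory", "1g") // up from the 512m default

    val ssc = new StreamingContext(conf, Seconds(2))

    // Stand-in source; in my job this is a Kinesis receiver instead.
    val stream = ssc.socketTextStream("localhost", 9999)

    // Count each 2-second batch -- no window, no external persistence.
    stream.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}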