Hi all

Sorry, but this was totally my mistake. In my persistence logic, I was
creating an async HTTP client instance inside an RDD foreach but never
closing it, which led to the memory leak.
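
For anyone hitting the same issue, this is roughly what the corrected
persistence logic looks like. It is only a sketch: AsyncHttpClient.create(),
client.post() and storeUrl are stand-ins for the actual client and endpoint
I am using, and "stream" is the DStream coming out of the Kinesis receiver.

  stream.foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      // Create one client per partition instead of one per record,
      // and always close it so its threads and buffers can be reclaimed.
      val client = AsyncHttpClient.create()
      try {
        records.foreach(record => client.post(storeUrl, record))
      } finally {
        client.close()
      }
    }
  }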

Apologies for wasting everyone's time.

Thanks,
Aniket

On 12 September 2014 02:20, Tathagata Das <tathagata.das1...@gmail.com>
wrote:

> Which version of Spark are you running?
>
> If you are running the latest one, could you try running not a windowed
> count but just a simple event count on every 2-second batch, and see if you
> are still running out of memory?
>
> TD
>
>
> On Thu, Sep 11, 2014 at 10:34 AM, Aniket Bhatnagar <
> aniket.bhatna...@gmail.com> wrote:
>
>> I did change it to 1 GB. It still ran out of memory, just a little later.
>>
>> The streaming job isn't handling a lot of data. In each 2-second batch, it
>> doesn't get more than 50 records, and each record is no more than 500
>> bytes.
>>  On Sep 11, 2014 10:54 PM, "Bharat Venkat" <bvenkat.sp...@gmail.com>
>> wrote:
>>
>>> You could set "spark.executor.memory" to something bigger than the
>>> default (512 MB).
>>>
>>>
>>> On Thu, Sep 11, 2014 at 8:31 AM, Aniket Bhatnagar <
>>> aniket.bhatna...@gmail.com> wrote:
>>>
>>>> I am running a simple Spark Streaming program that pulls in data from
>>>> Kinesis at a batch interval of 10 seconds, windows it for 10 seconds, maps
>>>> the data, and persists it to a store.
>>>>
>>>> The program is running in local mode right now and runs out of memory
>>>> after a while. I have yet to investigate heap dumps, but I think Spark
>>>> isn't releasing memory after processing is complete. I have even tried
>>>> changing the storage level to disk only.
>>>>
>>>> Help!
>>>>
>>>> Thanks,
>>>> Aniket
>>>>
>>>
>>>
>
