Use StorageLevel.MEMORY_ONLY. Also have a look at the createDirectStream
API. Most likely your batch duration is shorter than your processing time,
and the accumulating scheduling delay is what blows up the memory.
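
For what it's worth, a minimal sketch of the direct approach (assuming the
Spark 1.x Kafka 0.8 integration; the broker address, topic name, and batch
duration below are placeholders):

  import kafka.serializer.StringDecoder
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka.KafkaUtils

  val conf = new SparkConf().setAppName("DirectKafkaApp") // placeholder app name
  val ssc = new StreamingContext(conf, Seconds(10))       // batch duration: tune to your load

  // Direct stream: no long-running receiver and no received blocks pinned
  // in executor memory; each batch reads its own offset range from Kafka.
  val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
    ssc,
    Map("metadata.broker.list" -> "broker1:9092"), // placeholder broker list
    Set("your-topic"))                             // placeholder topic

Since there is no receiver, the input storage level question goes away; you
only pick a StorageLevel if you explicitly persist downstream RDDs.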
On Mar 31, 2016 6:13 PM, "Mayur Mohite" <mayur.moh...@applift.com> wrote:

> We are using the KafkaUtils.createStream API to read data from Kafka
> topics, and we are using the StorageLevel.MEMORY_AND_DISK_SER option while
> configuring the streams.
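>
> Roughly, our call looks like this (the ZooKeeper quorum, group id, and
> topic map below are placeholders, not our real values):
>
>   import org.apache.spark.storage.StorageLevel
>   import org.apache.spark.streaming.kafka.KafkaUtils
>
>   // ssc is our existing StreamingContext. Receiver-based stream: received
>   // blocks are stored on executors at the given StorageLevel, so they
>   // spill to disk once memory fills up.
>   val stream = KafkaUtils.createStream(
>     ssc,
>     "zk1:2181",                  // placeholder ZooKeeper quorum
>     "consumer-group",            // placeholder consumer group id
>     Map("your-topic" -> 1),      // topic -> number of receiver threads
>     StorageLevel.MEMORY_AND_DISK_SER)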
>
> On Wed, Mar 30, 2016 at 12:58 PM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> Can you elaborate on where you are streaming the data from and what type
>> of consumer you are using?
>>
>> Thanks
>> Best Regards
>>
>> On Tue, Mar 29, 2016 at 6:10 PM, Mayur Mohite <mayur.moh...@applift.com>
>> wrote:
>>
>>> Hi,
>>>
>>> We are running a Spark Streaming app on a single machine, and we have
>>> configured the Spark executor memory to 30G.
>>> We noticed that after running the app for 12 hours, Spark Streaming
>>> started spilling ALL the data to disk even though we have configured
>>> sufficient memory for Spark to use for storage.
>>>
>>> -Mayur
>>>
>>
>>
>
>
> --
> *Mayur Mohite*
> Senior Software Engineer
>
>
