Hi,

Thanks for the explanation, but it does not prove that Spark will OOM at some point. You assume there is enough data to buffer, but there could be none.
Jacek

On 6 Aug 2016 4:23 a.m., "Mohammed Guller" <moham...@glassbeam.com> wrote:

> Assume the batch interval is 10 seconds and batch processing time is 30
> seconds. So while Spark Streaming is processing the first batch, the
> receiver will have a backlog of 20 seconds worth of data. By the time Spark
> Streaming finishes batch #2, the receiver will have 40 seconds worth of
> data in its memory buffer. This backlog will keep growing as time passes,
> assuming data streams in consistently at the same rate.
>
> Also keep in mind that windowing operations on a DStream implicitly
> persist every RDD in a DStream in memory.
>
> Mohammed
>
> -----Original Message-----
> From: Jacek Laskowski [mailto:ja...@japila.pl]
> Sent: Thursday, August 4, 2016 4:25 PM
> To: Mohammed Guller
> Cc: Saurav Sinha; user
> Subject: Re: Explanation regarding Spark Streaming
>
> On Fri, Aug 5, 2016 at 12:48 AM, Mohammed Guller <moham...@glassbeam.com>
> wrote:
> > and eventually you will run out of memory.
>
> Why? Mind elaborating?
>
> Jacek
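[For readers following the thread: the backlog arithmetic Mohammed describes can be sketched with a few lines of Python. This is only an illustration of the argument in the quoted mail, under the simplifying assumption of a constant ingest rate and constant per-batch processing time; the function name and constants are made up for the example.]

```python
# Backlog growth when batch processing time exceeds the batch interval.
# Figures taken from the thread: 10 s batch interval, 30 s processing time.
BATCH_INTERVAL = 10   # seconds of data that go into each batch
PROCESSING_TIME = 30  # seconds needed to process one batch

def backlog_after(batches_processed: int) -> int:
    """Seconds of unprocessed data buffered in the receiver after
    `batches_processed` batches have finished, assuming a constant
    ingest rate equal to the batch interval's worth of data."""
    # While one batch is being processed, PROCESSING_TIME seconds of new
    # data arrive, but only BATCH_INTERVAL seconds of it are drained
    # into the next batch, so the buffer grows by the difference.
    return batches_processed * (PROCESSING_TIME - BATCH_INTERVAL)

print(backlog_after(1))  # 20 s backlog after batch #1, as in the mail
print(backlog_after(2))  # 40 s backlog after batch #2
```

Under these assumptions the buffer grows without bound, which is the OOM scenario Mohammed describes; Jacek's counterpoint is that the assumption of a sustained ingest rate may not hold in practice.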