Yes, check out mapWithState:
https://databricks.com/blog/2016/02/01/faster-stateful-stream-processing-in-apache-spark-streaming.html
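
For anyone skimming the thread, here is a rough plain-Python sketch of the keyed-state pattern that mapWithState is built around: each micro-batch only touches the keys that actually appear in it, instead of walking the whole state. This is not Spark code and not the Spark API; `update_state`, the page keys, and the counts are made up purely for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def update_state(state: Dict[str, int],
                 batch: Iterable[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Apply one micro-batch to the keyed state (hypothetical helper, not a Spark call).

    Only keys present in the batch are read and written, so the cost of a
    batch depends on the batch size rather than on the total number of keys.
    """
    emitted = []
    for key, value in batch:
        state[key] = state.get(key, 0) + value  # running total per key
        emitted.append((key, state[key]))       # emit the updated total
    return emitted


# Two illustrative micro-batches of (page, visit_count) pairs.
state: Dict[str, int] = {}
first = update_state(state, [("page_x", 1), ("page_y", 2)])
second = update_state(state, [("page_x", 3)])
```

In the second batch only `page_x` is updated; `page_y` keeps its total without being visited, which is the behaviour the blog post contrasts with updateStateByKey.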
_
From: Nikhil Goyal
Sent: Monday, May 23, 2016 23:28
Subject: Timed aggregation in Spark
To:
Hi Iain,
Did you manage to solve this issue?
It looks like we have a similar issue: processing time increases with every
micro-batch, but only after 30 batches.
Thanks.
On Thu, Mar 3, 2016 at 4:45 PM Iain Cundy wrote:
> Hi All
>
>
>
> I’m aggregating data using
Cody Koeninger <c...@koeninger.org> wrote:
> Solution 2 sounds better to me. You aren't always going to have graceful
> shutdowns.
>
> On Mon, Sep 14, 2015 at 1:49 PM, Ofir Kerker <ofir.ker...@gmail.com>
> wrote:
>
>> Hi,
>> My Spark Streaming application c
Hi,
My Spark Streaming application consumes messages (events) from Kafka every
10 seconds using the direct stream approach, aggregates these messages
into hourly aggregations (to answer analytics questions like: "How many
users from Paris visited page X between 8PM and 9PM?"), and saves the data to