Hi Binh,
It stores the state as well as the unprocessed data. The state is a subset
of the records that you have aggregated so far.
This provides a good reference for checkpointing.
http://spark.apache.org/docs/1.2.1/streaming-programming-guide.html#checkpointing
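To make the point about state concrete, here is a minimal sketch in plain Python. It is not Spark code: the names update_state and process_batch and the sample batches are made up for illustration. The update function only mirrors the (new values, previous state) shape that updateStateByKey expects, and the loop stands in for the micro-batches. Under those assumptions it shows that the state kept between batches is one aggregated entry per key, not every record seen so far.

```python
# Plain-Python model of per-key streaming state; this is not Spark code.
# All names and sample data here are hypothetical.

def update_state(new_values, running):
    """Fold one batch's values for a key into that key's (sum, count) state."""
    total, count = running if running is not None else (0, 0)
    return (total + sum(new_values), count + len(new_values))


def process_batch(state, batch):
    """Return the state after applying one micro-batch of (key, value) records."""
    grouped = {}
    for key, value in batch:
        grouped.setdefault(key, []).append(value)
    new_state = dict(state)  # keys absent from this batch keep their old state
    for key, values in grouped.items():
        new_state[key] = update_state(values, state.get(key))
    return new_state


batches = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("b", 5)],
]

state = {}
for batch in batches:
    state = process_batch(state, batch)

# Five records were consumed, but the state holds only one entry per key.
print(state)  # {'a': (8, 3), 'b': (7, 2)}
```

In a real job the framework, not a loop like this, would carry the state from batch to batch, and checkpointing is what lets that state survive a restart.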
On Wed, Mar 18, 2015 at 12:52 PM, Binh
Hi Arush,
Thank you for answering!
When you say checkpoints hold metadata and Data, what is the Data? Is it
the Data that is pulled from the input source or is it the state?
If it is the state, then is it the same number of records that I have
aggregated since the beginning, or only a subset of it? How can I limit
Hi
Yes, Spark Streaming is capable of stateful stream processing. With or
without state is a way of classifying stream processing.
Checkpoints hold metadata and Data.
Thanks
On Wed, Mar 18, 2015 at 4:00 AM, Binh Nguyen Van binhn...@gmail.com wrote:
Hi all,
I am new to Spark so please forgive me if my question is stupid.
I am trying to use Spark-Streaming in an application that reads data
from a queue (Kafka), does some aggregation (sum, count, ...) and
then persists the result to an external storage system (MySQL, VoltDB, ...).
From my understanding