Re: Serial batching with Spark Streaming

2015-06-18 Thread Binh Nguyen Van
I haven’t tried with 1.4, but I tried with 1.3 a while ago and could not get serialized behavior from the default scheduler when there is a failure and retry, so I created a customized stream like this: class EachSeqRDD[T: ClassTag] ( parent: DStream[T], eachSeqFunc: (RDD[T], Time) => Unit

Re: Idempotent count

2015-03-18 Thread Binh Nguyen Van
and Data. Thanks. On Wed, Mar 18, 2015 at 4:00 AM, Binh Nguyen Van binhn...@gmail.com wrote: Hi all, I am new to Spark, so please forgive me if my question is stupid. I am trying to use Spark Streaming in an application that reads data from a queue (Kafka) and does some aggregation (sum

Idempotent count

2015-03-17 Thread Binh Nguyen Van
Hi all, I am new to Spark, so please forgive me if my question is stupid. I am trying to use Spark Streaming in an application that reads data from a queue (Kafka), does some aggregation (sum, count, ...), and then persists the result to an external storage system (MySQL, VoltDB, ...). From my understanding
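
One common way to keep a count idempotent when a batch may be retried is to store one row per (batch time, key) and overwrite it on every write, rather than incrementing a running total. The sketch below illustrates only that idea in plain Scala, with no Spark dependency. `IdempotentCountStore`, `saveBatch`, `total`, and `countBatch` are hypothetical names invented for this example, and the in-memory map stands in for an external store such as MySQL. It is not taken from the thread. In Spark Streaming, the batch time could come from the `Time` argument that `foreachRDD` passes to its function.

```scala
import scala.collection.mutable

// Hypothetical in-memory stand-in for an external store (e.g. a MySQL table).
// Rows are keyed by (batchTimeMs, key), so writing the same batch twice
// overwrites the earlier row instead of adding to it.
final class IdempotentCountStore {
  private val rows = mutable.Map.empty[(Long, String), Long]

  // Upsert the per-batch counts; replaying a batch leaves the store unchanged.
  def saveBatch(batchTimeMs: Long, counts: Map[String, Long]): Unit =
    counts.foreach { case (key, n) => rows((batchTimeMs, key)) = n }

  // The total for a key is the sum of its counts over all distinct batches.
  def total(key: String): Long =
    rows.iterator.collect { case ((_, k), n) if k == key => n }.sum
}

object IdempotentCountDemo {
  // Count occurrences of each record within a single batch.
  def countBatch(records: Seq[String]): Map[String, Long] =
    records.groupBy(identity).map { case (k, v) => k -> v.size.toLong }

  def main(args: Array[String]): Unit = {
    val store = new IdempotentCountStore
    val batch1 = countBatch(Seq("a", "b", "a"))
    store.saveBatch(1000L, batch1)
    store.saveBatch(1000L, batch1) // simulated retry of the same batch
    store.saveBatch(2000L, countBatch(Seq("a")))
    println(store.total("a"))
  }
}
```

With a relational store, the same effect would typically come from a unique key on the batch time and aggregation key combined with an upsert statement, so that the retry does not double count.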