kafka pipeline exactly once semantics

2014-11-30 Thread Josh J
Hi, in the Spark docs (http://spark.apache.org/docs/latest/streaming-programming-guide.html#failure-of-a-worker-node) it mentions: "However, output operations (like foreachRDD) have *at-least once* semantics, that is, the transformed data may get written to an external entity more than once in the
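The standard answer to at-least-once output semantics is to make the external write idempotent, so a replayed record overwrites itself rather than producing a duplicate. A minimal sketch of the idea (a toy in-memory store standing in for the external system; no Spark involved, and all names here are hypothetical):

```python
# Sketch: turning at-least-once delivery into effectively-exactly-once
# output by making the write idempotent. In a real foreachRDD sink, the
# write would be an upsert into an external store keyed by a stable record ID.

class IdempotentSink:
    """Toy key-value store: writing the same (key, value) twice is a no-op."""
    def __init__(self):
        self.store = {}

    def write(self, key, value):
        # Upsert semantics: a replay after a worker failure simply
        # overwrites the row with identical data.
        self.store[key] = value

# At-least-once delivery may replay records after a failure:
delivered = [("id-1", "a"), ("id-2", "b"), ("id-1", "a")]  # "id-1" redelivered

sink = IdempotentSink()
for key, value in delivered:
    sink.write(key, value)

print(sorted(sink.store.items()))  # duplicates collapse to one row per key
```

The key requirement is that each logical record carries a stable, deterministic key; without one, the sink cannot tell a replay from a new record.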

Re: kafka pipeline exactly once semantics

2014-11-30 Thread Tobias Pfeiffer
Josh, On Sun, Nov 30, 2014 at 10:17 PM, Josh J joshjd...@gmail.com wrote: I would like to set up a Kafka pipeline whereby I write my data to a single topic1, then I continue to process using Spark Streaming and write the transformed results to topic2, and finally I read the results from topic
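For a topic1 → transform → topic2 pipeline like the one described, at-least-once writes to topic2 can still give exactly-once results if the final consumer deduplicates by record ID. A minimal sketch, with plain Python lists standing in for the Kafka topics and all names hypothetical:

```python
# Sketch: topic1 -> streaming transform -> topic2, where a failure causes
# one record to be written to topic2 twice; the final consumer tracks seen
# record IDs so each logical record is processed exactly once downstream.

topic1 = [("id-1", 1), ("id-2", 2), ("id-3", 3)]

# Streaming stage: transform each record and publish to topic2. A retry
# after a simulated worker failure re-emits "id-2", so topic2 holds a duplicate.
topic2 = [(rid, value * 10) for rid, value in topic1]
topic2.insert(2, ("id-2", 20))  # simulated replay of the transformed record

# Final consumer: skip any record ID already processed.
seen, results = set(), []
for rid, value in topic2:
    if rid in seen:
        continue  # duplicate from an at-least-once replay
    seen.add(rid)
    results.append((rid, value))

print(results)  # one output per logical input record
```

In a real deployment the `seen` set would live in durable storage (or the sink would upsert by ID), since an in-memory set is lost when the consumer restarts.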