How about writing to a buffer? Then you would flush the buffer to Kafka if and
only if the output operation reports successful completion. In the event of a
worker failure, the flush would never happen.
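A minimal sketch of that idea, assuming the new Kafka producer API
(org.apache.kafka.clients.producer) and placeholder broker/topic names: buffer
each partition's records locally, and only send them to Kafka after the
upstream iteration has completed without throwing. Note this is still
at-least-once under retries (a failure after the flush but before the batch is
checkpointed can cause a re-send), so true exactly-once would additionally need
idempotent or deduplicated writes on the consumer side.

```scala
import java.util.Properties
import scala.collection.mutable.ArrayBuffer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// `dstream` is assumed to be a DStream[String] produced by earlier
// transformations; "broker1:9092" and "out-topic" are placeholders.
dstream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // 1. Buffer the partition's output locally first, so any failure
    //    while iterating happens before anything reaches Kafka.
    val buffer = ArrayBuffer[String]()
    records.foreach(buffer += _)

    // 2. Create the producer on the worker (it is not serializable,
    //    so it cannot be closed over from the driver).
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")
    props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)

    // 3. Flush the buffer only now that the iteration succeeded.
    try {
      buffer.foreach(rec => producer.send(new ProducerRecord("out-topic", rec)))
      producer.flush()
    } finally {
      producer.close()
    }
  }
}
```

Using foreachPartition rather than foreach amortizes the producer setup cost
over a whole partition instead of paying it per record.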
—
FG

On Sun, Nov 30, 2014 at 2:28 PM, Josh J <joshjd...@gmail.com> wrote:

> Is there a way to do this that preserves exactly once semantics for the
> write to Kafka?
> On Tue, Sep 2, 2014 at 12:30 PM, Tim Smith <secs...@gmail.com> wrote:
>> I'd be interested in finding the answer too. Right now, I do:
>>
>> val kafkaOutMsgs = kafkaInMessages.map(x => myFunc(x._2, someParam))
>> kafkaOutMsgs.foreachRDD((rdd, time) => { rdd.foreach(rec => {
>> writer.output(rec) }) }) // where writer.output is a method that takes a
>> string and writer is an instance of a producer class.
>>
>>
>>
>>
>>
>> On Tue, Sep 2, 2014 at 10:12 AM, Massimiliano Tomassi <
>> max.toma...@gmail.com> wrote:
>>
>>> Hello all,
>>> after having applied several transformations to a DStream I'd like to
>>> publish all the elements in all the resulting RDDs to Kafka. What would be
>>> the best way to do that? Just using DStream.foreach and then RDD.foreach?
>>> Is there any other built-in utility for this use case?
>>>
>>> Thanks a lot,
>>> Max
>>>
>>> --
>>> ------------------------------------------------
>>> Massimiliano Tomassi
>>> ------------------------------------------------
>>> e-mail: max.toma...@gmail.com
>>> ------------------------------------------------
>>>
>>
>>
