Ok, thanks for your answers.
On 3/22/17, 1:34 PM, "Cody Koeninger" wrote:
If you're talking about reading the same message multiple times in a
failure situation, see
https://github.com/koeninger/kafka-exactly-once
If you're talking about producing the same message multiple times in a
failure situation, keep an eye on
You have to handle de-duplication upstream or downstream. It might
technically be possible to handle this in Spark, but you'll probably have a
better time handling duplicates in the service that reads from Kafka.
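To illustrate the downstream option, here is a minimal, hypothetical sketch (the names are illustrative, not from this thread): the consuming service keeps track of message keys it has already processed, so a replayed message becomes a no-op. In practice the "seen" set would live in a durable store keyed by something stable like (topic, partition, offset) or an application-level message id.

```python
# Hypothetical downstream de-duplication sketch (illustrative only).
# Each message carries a stable id; replays of the same id are skipped,
# making the consumer effectively idempotent.

def dedupe(messages, seen=None):
    """Yield each (msg_id, payload) pair only the first time msg_id is seen."""
    seen = set() if seen is None else seen
    for msg_id, payload in messages:
        if msg_id in seen:
            continue  # duplicate delivered by a replay; drop it
        seen.add(msg_id)
        yield msg_id, payload

# Simulate a failure/replay: message "b" is delivered twice.
stream = [("a", 1), ("b", 2), ("b", 2), ("c", 3)]
print(list(dedupe(stream)))  # the second ("b", 2) is filtered out
```

The same idea works upstream: if the producer attaches a deterministic key per logical event, any consumer can apply this filter regardless of how many times Spark retried the write.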
On Wed, Mar 22, 2017 at 1:49 PM, Maurin Lenglart
wrote:
>
Hi,
we are trying to build a Spark Streaming solution that subscribes to and
pushes to Kafka.
But we are running into the problem of duplicate events.
Right now, I am doing a "foreachRDD", looping over the messages of each
partition, and sending those messages to Kafka.
Is there any good way of solving this?