Very enlightening presentation, thanks for sharing!

On Thu, Apr 13, 2017 at 9:07 AM, Thakrar, Jayesh <
jthak...@conversantmedia.com> wrote:

> Hi Dmitri,
>
> This presentation might help you understand and take appropriate actions
> to deal with data duplication (and data loss):
>
> https://www.slideshare.net/JayeshThakrar/kafka-68540012
>
> Regards,
> Jayesh
>
> On 4/13/17, 10:05 AM, "Vincent Dautremont"
> <vincent.dautrem...@olamobile.com.INVALID> wrote:
>
>     One case where you would get a message more than once is when you
>     get disconnected, kicked out of the consumer group, etc. before
>     committing offsets for messages you have already read.
>
>     What I do is insert each message into an in-memory Redis cache.
>     If the insert fails because the key already exists, that means
>     I've already received that message in the past.
>
>     You could even insert the topic+partition+offset of the message
>     (instead of the full message payload) if you know for sure that
>     your message payload would not be duplicated in the Kafka topic.
>
>     Vincent.
>
>     On Thu, Apr 13, 2017 at 4:52 PM, Dmitry Goldenberg
>     <dgoldenb...@hexastax.com> wrote:
>
>     > Hi all,
>     >
>     > I was wondering if someone could list some of the causes which may
>     > lead to Kafka delivering the same messages more than once.
>     >
>     > We've looked around and see no obvious errors, yet intermittently
>     > we see messages being delivered more than once.
>     >
>     > Kafka documentation talks about the below delivery modes:
>     >
>     >    - *At most once*: Messages may be lost but are never redelivered.
>     >    - *At least once*: Messages are never lost but may be redelivered.
>     >    - *Exactly once*: this is what people actually want, each message
>     >    is delivered once and only once.
>     >
>     > So the default is 'at least once' and that is what we're running
>     > with (we don't want to do "at most once" as that appears to yield
>     > some potential for message loss).
>     >
>     > We had not seen duplicated deliveries for a while previously but just
>     > started seeing them quite frequently in our test cluster.
>     >
>     > What are some of the possible causes for this?  What are some of the
>     > available tools for troubleshooting this issue? What are some of the
>     > possible fixes folks have developed or instrumented for this issue?
>     >
>     > Also, is there an effort underway on the Kafka side to provide
>     > support for "exactly once" semantics?  That is exactly what we want
>     > and we're wondering how that may be achieved.
>     >
>     > Thanks,
>     > - Dmitry
>     >
>

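For readers following the thread, here is a rough sketch of the two ideas discussed above: redelivery after a missed offset commit, and deduplication keyed on topic+partition+offset. It is a simulation only and does not talk to Kafka or Redis. `Message`, `SeenStore`, `dedup_key` and `process_batch` are hypothetical names made up for this illustration; `SeenStore` is an in-memory stand-in for the Redis cache Vincent mentions (a real setup might use something like Redis `SET` with the `NX` option for the same "insert only if absent" check).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a consumed Kafka record."""
    topic: str
    partition: int
    offset: int
    payload: str


@dataclass
class SeenStore:
    """In-memory stand-in for the Redis cache (assumption, not a Redis client)."""
    _keys: set = field(default_factory=set)

    def add_if_absent(self, key: str) -> bool:
        """Return True if the key was newly added, False if it already existed."""
        if key in self._keys:
            return False
        self._keys.add(key)
        return True


def dedup_key(msg: Message) -> str:
    # topic+partition+offset identifies a record's position in the log.
    return f"{msg.topic}:{msg.partition}:{msg.offset}"


def process_batch(batch, seen, handled):
    """Handle each message at most once, skipping keys already seen."""
    for msg in batch:
        if seen.add_if_absent(dedup_key(msg)):
            handled.append(msg.payload)


# A tiny simulated partition log with offsets 0..4.
log = [Message("events", 0, i, f"payload-{i}") for i in range(5)]
seen = SeenStore()
handled = []

# First poll: offsets 0-2 are read and processed, but the consumer is
# kicked out of the group before it commits, so the committed offset stays 0.
committed_offset = 0
process_batch(log[committed_offset:3], seen, handled)

# After rejoining, consumption resumes from the last committed offset,
# so offsets 0-2 are delivered a second time, followed by 3-4.
process_batch(log[committed_offset:], seen, handled)
committed_offset = len(log)

print(handled)
```

Note that this only filters redeliveries of the same log position. If a producer writes the same payload twice, the two copies get different offsets, which is why the payload-based key is the safer choice in that case. The sketch also marks a key as seen before handling it, so a crash between those two steps would need separate care.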