Very enlightening presentation, thanks for sharing!

On Thu, Apr 13, 2017 at 9:07 AM, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote:
> Hi Dmitry,
>
> This presentation might help you understand and take appropriate
> actions to deal with data duplication (and data loss):
>
> https://www.slideshare.net/JayeshThakrar/kafka-68540012
>
> Regards,
> Jayesh
>
> On 4/13/17, 10:05 AM, "Vincent Dautremont"
> <vincent.dautrem...@olamobile.com.INVALID> wrote:
>
> One case where you would get a message more than once is when you get
> disconnected, kicked out of the consumer group, etc., and so fail to
> commit the offset for messages you have already read.
>
> What I do is insert each message into an in-memory Redis cache. If the
> insert fails because of a primary key duplication, that means I've
> already received that message in the past.
>
> You could even insert the topic+partition+offset of the message
> (instead of the full message payload) if you know for sure that your
> message payload would not be duplicated in the Kafka topic.
>
> Vincent.
>
> On Thu, Apr 13, 2017 at 4:52 PM, Dmitry Goldenberg
> <dgoldenb...@hexastax.com> wrote:
>
> > Hi all,
> >
> > I was wondering if someone could list some of the causes which may
> > lead to Kafka delivering the same messages more than once.
> >
> > We've looked around and we see no noticeable errors, yet
> > intermittently we see messages being delivered more than once.
> >
> > Kafka documentation talks about the below delivery modes:
> >
> > - *At most once*: Messages may be lost but are never redelivered.
> > - *At least once*: Messages are never lost but may be redelivered.
> > - *Exactly once*: this is what people actually want, each message
> >   is delivered once and only once.
> >
> > So the default is 'at least once' and that is what we're running
> > with (we don't want to do "at most once" as that appears to yield
> > some potential for message loss).
> >
> > We had not seen duplicated deliveries for a while previously but
> > just started seeing them quite frequently in our test cluster.
> >
> > What are some of the possible causes for this? What are some of the
> > available tools for troubleshooting this issue? What are some of the
> > possible fixes folks have developed or instrumented for this issue?
> >
> > Also, is there an effort underway on the Kafka side to provide
> > support for "exactly once" semantics? That is exactly the semantics
> > we want and we're wondering how that may be achieved.
> >
> > Thanks,
> > - Dmitry
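
For anyone finding this thread later: Vincent's suggestion in the quoted
text above (remember what you have already handled and skip anything seen
before) could be sketched roughly as follows. This is only an illustration
under stated assumptions, not code from anyone's actual setup. The names
Record, SeenStore, add_if_absent and process_once are hypothetical, and a
plain in-memory set stands in for the Redis cache so the example runs
without a server. With a real Redis, a SET with the NX option would
presumably play the role of add_if_absent.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Record(NamedTuple):
    """Hypothetical stand-in for a consumed Kafka record."""
    topic: str
    partition: int
    offset: int
    value: str


class SeenStore:
    """In-memory stand-in for the cache described in the thread (assumption)."""

    def __init__(self) -> None:
        self._keys: Set[Tuple[str, int, int]] = set()

    def add_if_absent(self, key: Tuple[str, int, int]) -> bool:
        """Return True if the key was new, False if it was already stored."""
        if key in self._keys:
            return False
        self._keys.add(key)
        return True


def process_once(records: Iterable[Record], seen: SeenStore) -> List[str]:
    """Return the values of records that have not been handled before."""
    handled = []
    for rec in records:
        # topic+partition+offset identifies a record within the log.
        key = (rec.topic, rec.partition, rec.offset)
        if seen.add_if_absent(key):
            handled.append(rec.value)
    return handled


# Simulated redelivery: offsets 0-2 were read, only offset 0 was committed,
# so after rejoining the group the consumer is handed offsets 1-3.
first_poll = [Record("events", 0, i, f"msg-{i}") for i in range(3)]
second_poll = [Record("events", 0, i, f"msg-{i}") for i in range(1, 4)]

seen = SeenStore()
first_result = process_once(first_poll, seen)
second_result = process_once(second_poll, seen)
print(first_result)
print(second_result)
```

Note that keying on topic+partition+offset only catches redelivery on the
consumer side. If a producer retry writes the same payload twice, the two
copies get different offsets, which is the caveat Vincent mentions; in that
case the key would have to come from the message itself.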