Hi, Joong,

Please check the following two links:

- https://cwiki.apache.org/confluence/display/KAFKA/KIP-3+-+Mirror+Maker+Enhancement
- https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+method+to+the+producer+API

They might help you understand the problem.

Cheers,

Xiao Li

2015-05-01 6:28 GMT-07:00 Joe Stein <joe.st...@stealth.ly>:

> If you want 0 data loss you should also look into the min.insync.replicas
> setting in 0.8.2.1 as it guarantees data in multiple racks.
>
> If you don't have that set, then the following scenario is possible.
>
> Let's say 1 topic, 1 partition, replication factor 3. You are producing with
> ACK=-1.
>
> b1, b2, b3 (where b=broker and b1 is leader, b2, b3 replicas).
>
> b1 and b2 die, and b3 becomes leader. So far all is well.
>
> 10 minutes go by and b3 dies.
>
> 1 minute later b1 comes back online. It will essentially truncate the 10
> minutes of data that upstream thought was saved.
>
> But now you can have ACK=-1 return a failure if you don't have enough
> replicas to meet your data loss guarantees: min.isr=2 or min.isr=3, depending
> on the data.
>
> Also take a look at
> https://github.com/stealthly/go_kafka_client/tree/master/mirrormaker it
> might be helpful for what you are looking for.
>
> ~ Joe Stein
> - - - - - - - - - - - - - - - - -
>
>   http://www.stealth.ly
> - - - - - - - - - - - - - - - - -
>
> On Fri, May 1, 2015 at 7:43 AM, Joong Lee <jo...@me.com> wrote:
>
> > It is based on our understanding from reading the documents.
> >
> > We aren't concerned about data duplication, as that is going to be handled
> > by Elasticsearch.
> >
> > > On May 1, 2015, at 12:15 AM, Daniel Compton <daniel.compton.li...@gmail.com> wrote:
> > >
> > > When we evaluated MirrorMaker last year we didn't find any risk of data
> > > loss, only duplicate messages in the case of a network partition.
> > >
> > > Did you discover data loss in your tests, or were you just looking at
> > > the docs?
> > > On Fri, 1 May 2015 at 4:31 pm Jiangjie Qin <j...@linkedin.com.invalid>
> > > wrote:
> > >
> > >> Which mirror maker version did you look at? The MirrorMaker in trunk
> > >> should not have data loss if you just use the default setting.
> > >>
> > >>> On 4/30/15, 7:53 PM, "Joong Lee" <jo...@me.com> wrote:
> > >>>
> > >>> Hi,
> > >>> We are exploring Kafka to keep two data centers (primary and DR)
> > >>> running hosts of Elasticsearch nodes in sync. One key requirement is
> > >>> that we can't lose any data. We POC'd use of MirrorMaker and felt it
> > >>> may not meet our data loss requirement.
> > >>>
> > >>> I would like to ask the community whether we should look for another
> > >>> solution, or whether Kafka would be the right solution given our zero
> > >>> data loss requirement.
> > >>>
> > >>> Thanks
> > >>
> > >>
> >
>
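
The failure scenario Joe describes above (one partition, replication factor 3,
ACK=-1, brokers dying one after another) can be made concrete with a small toy
model. This is only a sketch and not Kafka code: the Partition class, its
produce() and fail() methods, and the message names are all hypothetical, and
it assumes the returning broker b1 is allowed to become leader with its old
log. It shows that a write acknowledged by a single in-sync replica can be
lost, while requiring at least two in-sync replicas turns that silent loss into
a producer-visible failure.

```python
class NotEnoughReplicas(Exception):
    """Raised when fewer replicas are in sync than the configured minimum."""


class Partition:
    """Toy model of one partition (hypothetical, not the Kafka API)."""

    def __init__(self, replicas, min_insync_replicas):
        self.min_insync_replicas = min_insync_replicas
        self.isr = set(replicas)                # brokers currently in sync
        self.logs = {b: [] for b in replicas}   # per-broker message log

    def fail(self, broker):
        # A dead broker drops out of the in-sync replica set.
        self.isr.discard(broker)

    def produce(self, message):
        # Models ACK=-1: the write succeeds only if enough replicas are in
        # sync, and it is then stored on every in-sync replica.
        if len(self.isr) < self.min_insync_replicas:
            raise NotEnoughReplicas(
                f"ISR size {len(self.isr)} < {self.min_insync_replicas}")
        for broker in self.isr:
            self.logs[broker].append(message)


def acknowledged_but_lost(min_insync_replicas):
    """Replay the scenario; return messages that were acked and then lost."""
    p = Partition(["b1", "b2", "b3"], min_insync_replicas)
    acked = []

    p.produce("m1")            # all three brokers are in sync
    acked.append("m1")

    p.fail("b1")
    p.fail("b2")               # b3 is now leader and the only in-sync replica

    try:
        p.produce("m2")        # stands in for the 10 minutes of traffic
        acked.append("m2")
    except NotEnoughReplicas:
        pass                   # the producer sees the failure and can retry

    p.fail("b3")               # b3 dies, then b1 returns with its old log
    new_leader_log = p.logs["b1"]
    return [m for m in acked if m not in new_leader_log]
```

With a minimum of 1, acknowledged_but_lost(1) reports that m2 was acknowledged
and then lost. With a minimum of 2, the write of m2 is rejected up front, so
nothing that was acknowledged goes missing.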
