Jan, these are two separate issues.

1) consumer coordination should not, ideally, involve unreliable or slow
connections. Naively, a KafkaSourceConnector would coordinate via the
source cluster. We can do better than this, but I'm deferring this
optimization for now.

2) exactly-once between two clusters is mind-bending. But keep in mind that
transactions are managed by the producer, not the consumer. In fact, it's
the producer that requests that offsets be committed for the current
transaction. Obviously, these offsets are committed in whatever cluster the
producer is sending to.
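
To make point 2 concrete, here is a minimal toy sketch of that model. It is not MM2 code and does not use the Kafka client. Cluster, TransactionalProducer, and mirror_batch are hypothetical names invented for this illustration, and the "clusters" are just in-memory lists and dicts. The only claim it illustrates is the one above: the producer commits records and consumed offsets together, and both land in the cluster the producer writes to. (In the real Java client, the corresponding call is KafkaProducer.sendOffsetsToTransaction.)

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a Kafka cluster: one record log plus committed offsets."""
    name: str
    log: list = field(default_factory=list)
    # (group, partition) -> next source offset to read
    offsets: dict = field(default_factory=dict)


class TransactionalProducer:
    """Toy producer: buffers records and offsets, applies both only on commit."""

    def __init__(self, cluster):
        self.cluster = cluster
        self._records = None
        self._offsets = None

    def begin(self):
        self._records, self._offsets = [], {}

    def send(self, record):
        self._records.append(record)

    def send_offsets(self, group, offsets):
        # Offsets ride along with the producer's transaction.
        for partition, offset in offsets.items():
            self._offsets[(group, partition)] = offset

    def commit(self):
        # Records and offsets become visible together, in the SAME cluster.
        self.cluster.log.extend(self._records)
        self.cluster.offsets.update(self._offsets)
        self._records = self._offsets = None

    def abort(self):
        # Nothing buffered in this transaction reaches the cluster.
        self._records = self._offsets = None


def mirror_batch(source, target, group, partition=0, batch_size=2):
    """Copy one batch from source to target; progress is stored in target only."""
    start = target.offsets.get((group, partition), 0)
    batch = source.log[start:start + batch_size]
    if not batch:
        return 0
    producer = TransactionalProducer(target)
    producer.begin()
    for record in batch:
        producer.send(record)
    producer.send_offsets(group, {partition: start + len(batch)})
    producer.commit()
    return len(batch)
```

Calling mirror_batch in a loop until it returns 0 copies the whole source log, and the source cluster's offsets stay empty throughout: nothing is ever committed there.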

These two issues are closely related. They are both resolved by not
coordinating or committing via the source cluster. And in fact, this is the
general model of SourceConnectors anyway, since most SourceConnectors
_only_ have a destination cluster.

If there is a lot of interest here, I can expound further on this aspect of
MM2, but again I think this is premature until this first KIP is approved.
I intend to address each of these in separate KIPs following this one.

Ryanne

On Wed, Oct 17, 2018 at 7:09 AM Jan Filipiak <jan.filip...@trivago.com>
wrote:

> This is not a performance optimisation. It's a fundamental design choice.
>
>
> I never really looked at how Streams does exactly-once (it's a trap
> anyway, and you can usually deal with at-least-once downstream pretty
> easily). But I am very certain it's not going to get anywhere if the
> offset-commit cluster and the record-produce cluster are not the same.
>
> Pretty sure that without this _design choice_ you can already skip
> exactly-once.
>
> Best Jan
>
> On 16.10.2018 18:16, Ryanne Dolan wrote:
> >  >  But one big obstacle in this was
> > always that group coordination happened on the source cluster.
> >
> > Jan, thank you for bringing up this issue with legacy MirrorMaker. I
> > totally agree with you. This is one of several problems with MirrorMaker
> > I intend to solve in MM2, and I already have a design and prototype that
> > solves this and related issues. But as you pointed out, this KIP is
> > already rather complex, and I want to focus on the core feature set
> > rather than performance optimizations for now. If we can agree on what
> > MM2 looks like, it will be very easy to agree to improve its performance
> > and reliability.
> >
> > That said, I look forward to your support on a subsequent KIP that
> > addresses consumer coordination and rebalance issues. Stay tuned!
> >
> > Ryanne
> >
> > On Tue, Oct 16, 2018 at 6:58 AM Jan Filipiak <jan.filip...@trivago.com
> > <mailto:jan.filip...@trivago.com>> wrote:
> >
> >     Hi,
> >
> >     Currently MirrorMaker is usually run collocated with the target
> >     cluster.
> >     This is all nice and good. But one big obstacle in this was
> >     always that group coordination happened on the source cluster. So
> >     when the network was congested, you sometimes lose group membership
> >     and have to rebalance and all this.
> >
> >     So one big request from me would be support for having the
> >     coordination cluster != source cluster.
> >
> >     I would generally say a LAN is better than a WAN for doing group
> >     coordination, and there is no reason we couldn't have a group consuming
> >     topics from one cluster and committing offsets to another
> >     one, right?
> >
> >     Other than that, it feels like the KIP has too many features, many
> >     of which are not really wanted and are counterproductive, but I will
> >     just wait and see how the discussion goes.
> >
> >     Best Jan
> >
> >
> >     On 15.10.2018 18:16, Ryanne Dolan wrote:
> >      > Hey y'all!
> >      >
> >      > Please take a look at KIP-382:
> >      >
> >      > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0
> >      >
> >      > Thanks for your feedback and support.
> >      >
> >      > Ryanne
> >      >
> >
>
