Hi Radai

> 1. how do you handle possible duplications caused by the "special"
> producer timing-out/retrying? are you explicitly relying on the
> "exactly once" sequencing?

A duplicate ProduceRequest would be rejected with an 
INVALID_PRODUCE_OFFSET error.
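
To illustrate the behaviour on a retry, here is a toy Python model (not the
broker code from the PR). The class and method names are made up for the
example, and the rule that a batch is rejected when its base offset is below
the log end offset is an assumed simplification of the actual check:

```python
from dataclasses import dataclass, field

# Error name taken from this thread; "NONE" stands for success.
INVALID_PRODUCE_OFFSET = "INVALID_PRODUCE_OFFSET"
NONE = "NONE"


@dataclass
class PartitionLog:
    """Hypothetical stand-in for a partition log, tracking offsets only."""

    offsets: list = field(default_factory=list)

    @property
    def log_end_offset(self):
        # Next offset that would be assigned on a regular append.
        return self.offsets[-1] + 1 if self.offsets else 0

    def append_with_offsets(self, batch_offsets):
        # Assumption: a batch carrying offsets may only land at or beyond
        # the current log end offset; anything earlier is rejected.
        if batch_offsets[0] < self.log_end_offset:
            return INVALID_PRODUCE_OFFSET
        self.offsets.extend(batch_offsets)
        return NONE


log = PartitionLog()
first = log.append_with_offsets([0, 1, 2])  # original request succeeds
retry = log.append_with_offsets([0, 1, 2])  # duplicate after a timeout
print(first, retry)
```

In this model the retried batch is rejected and the log is left unchanged, so
the duplicate never reaches consumers.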

We envision the replicator using an idempotent producer, but we do not
want to require it.


> 2. what about the combination of log compacted topics + replicator
> downtime? by the time the replicator comes back up there might be
> "holes" in the source offsets (some msgs might have been compacted
> out)? how is that recoverable?
> 3. similarly, what if you try and fire up replication on a non-empty
> source topic? does the kip allow for offsets starting at some
> arbitrary X > 0 ? or would this have to be designed from the start.

Neither of these cases poses a problem.
As mentioned in the KIP, each producer batch must not contain offset gaps,
but gaps can exist between batches.
The companion PR has an implementation with tests that cover these cases.
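
As a rough sketch of that rule (a simplified Python model, not the code in
the PR; the function names and the "base offset must be at or beyond the log
end offset" condition are assumptions made for the example):

```python
def is_contiguous(batch_offsets):
    """True if the offsets inside one batch increase by exactly 1."""
    return all(b - a == 1 for a, b in zip(batch_offsets, batch_offsets[1:]))


def accepts(log_end_offset, batch_offsets):
    """Hypothetical acceptance check for a batch that carries offsets."""
    return (
        bool(batch_offsets)
        and is_contiguous(batch_offsets)
        and batch_offsets[0] >= log_end_offset
    )


# Replication started on a non-empty source topic: first batch begins at X > 0.
start_at_x = accepts(0, [100, 101, 102])

# Records compacted away on the source: a gap *between* batches is fine.
gap_between = accepts(103, [250, 251])

# A gap *inside* a batch is not allowed; the replicator would have to split it.
gap_inside = accepts(103, [250, 252])

print(start_at_x, gap_between, gap_inside)
```

Under these assumptions both the compaction case and the non-empty source
topic case reduce to the replicator starting a new batch at each hole.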

> and lastly, since this KIP seems to be designed for active-passive
> failover (there can be no produce traffic except the replicator)
> wouldn't a solution based on seeking to a time offset be more generic?
> your producers could checkpoint the last (say log append) timestamp of
> records they've seen, and when restoring in the remote site seek to
> those timestamps (which will be metadata in their committed offsets) -
> assuming replication takes > 0 time you'd need to handle some dups,
> but every kafka consumer setup needs to know how to handle those
> anyway.

Can you please clarify?
We do not expect any cooperation from users' applications.

thanks!
E&M

> On Fri, Nov 23, 2018 at 2:27 AM Edoardo Comar <eco...@uk.ibm.com> wrote:
> >
> > Hi Stanislav
> >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of 0?
> >
> > a regular batch (for which offsets are generated by the broker on write)
> > is sent with a base offset of 0.
> > How could you distinguish it from a batch where you *want* the first
> > record to be written at offset 0 (i.e. be the first in the partition and
> > be rejected if there are records on the log already) ?
> > We wanted to avoid a "deep" inspection (and potentially decompression) of
> > the records.
> >
> > For the replicator use case, a single produce request in which either
> > all the data carries offsets or none of it does seems to suffice,
> > so we added only a top-level flag, not a per-topic-partition one.
> >
> > Thanks for your interest !
> > cheers
> > Edo
> > --------------------------------------------------
> >
> > Edoardo Comar
> >
> > IBM Event Streams
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> > Stanislav Kozlovski <stanis...@confluent.io> wrote on 22/11/2018 22:32:42:
> >
> > > From: Stanislav Kozlovski <stanis...@confluent.io>
> > > To: dev@kafka.apache.org
> > > Date: 22/11/2018 22:33
> > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > Cluster Replication
> > >
> > > Hey Edo & Mickael,
> > >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of 0?
> > >
> > > > (I am reading your post thinking about
> > > > partitions rather than topics).
> > > Yes, I meant partitions. Sorry about that.
> > >
> > > Thanks for answering my questions :)
> > >
> > > Best,
> > > Stanislav
> > >
> > > On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar <eco...@uk.ibm.com> wrote:
> > >
> > > > Hi Stanislav,
> > > >
> > > > you're right we envision the replicator use case to have a single producer
> > > > with offsets per partition (I am reading your post thinking about
> > > > partitions rather than topics).
> > > >
> > > > If a regular producer was to send its own records at the same time, it's
> > > > very likely that the one sending with an offset will fail because of
> > > > invalid offsets.
> > > > Same if two producers were sending with offsets, likely both would then
> > > > fail.
> > > >
> > > > > Does it make sense to *lock* the topic from other producers while there is
> > > > > one that uses offsets?
> > > >
> > > > You could do that with ACL permissions if you wanted, I don't think it
> > > > needs to be mandated by changing the broker logic.
> > > >
> > > >
> > > > > Since we are tying the produce-with-offset request to the ACL, do we need
> > > > > the `use_offset` field in the produce request? Maybe we make it mandatory
> > > > > for produce requests with that ACL to have offsets.
> > > >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > > I would not restrict a principal to only send-with-offsets (by making that
> > > > mandatory via the ACL).
> > > >
> > > > Thanks
> > > > Edo & Mickael
> > > >
> > > > --------------------------------------------------
> > > >
> > > > Edoardo Comar
> > > >
> > > > IBM Event Streams
> > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > > >
> > > >
> > > > Stanislav Kozlovski <stanis...@confluent.io> wrote on 22/11/2018 16:17:11:
> > > >
> > > > > From: Stanislav Kozlovski <stanis...@confluent.io>
> > > > > To: dev@kafka.apache.org
> > > > > Date: 22/11/2018 16:17
> > > > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > > > Cluster Replication
> > > > >
> > > > > Hey Edoardo, thanks for the KIP!
> > > > >
> > > > > I have some questions, apologies if they are naive:
> > > > > Is this intended to work for a single producer use case only?
> > > > > How would it work if two producers were producing to the same topic with
> > > > > offsets?
> > > > > How would it work if two producers, one with offsets and one without, were
> > > > > producing to a topic?
> > > > > Does it make sense to *lock* the topic from other producers while there is
> > > > > one that uses offsets?
> > > > >
> > > > > Since we are tying the produce-with-offset request to the ACL, do we need
> > > > > the `use_offset` field in the produce request? Maybe we make it mandatory
> > > > > for produce requests with that ACL to have offsets.
> > > > >
> > > > > Best,
> > > > > Stanislav
> > > > >
> > > > > On Wed, Nov 21, 2018 at 5:14 PM Edoardo Comar <eco...@uk.ibm.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > > > we've opened a KIP to improve data replication between Kafka clusters:
> > > > > >
> > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication
> > > > > >
> > > > > > > We'd like to start a discussion, please post your feedback in this thread.
> > > > > >
> > > > > > Thank you
> > > > > > Edo and Mickael
> > > > > >
> > > > > >
> > > > > > --------------------------------------------------
> > > > > >
> > > > > > Edoardo Comar
> > > > > >
> > > > > > IBM Event Streams
> > > > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > > > > >
> > > > > > > Unless stated otherwise above:
> > > > > > > IBM United Kingdom Limited - Registered in England and Wales with number
> > > > > > > 741598.
> > > > > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best,
> > > > > Stanislav
> > > >
> > > >
> > >
> > >
> > > --
> > > Best,
> > > Stanislav
> >
> 

