Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2019-02-04 Thread Edoardo Comar
Hi Radai,
thanks for the observation on the KIP-320 conflict.

I would not have the destination broker treat the __consumer_offsets topic
as a special case (if this is what you suggested).

Rather, in the replicator the __consumer_offsets topic could be treated as a
special case: instead of just replicating the value as-is, the replicator
would edit it by stripping the leader epoch.

As previously mentioned, the __consumer_offsets topic does not need to be
replicated by producing-with-offsets to it.
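
To make the idea above concrete, here is a minimal sketch of a replicator
editing a committed-offset value before forwarding it. The CommittedOffset
type and both helper functions are hypothetical and greatly simplified; they
are not the real __consumer_offsets record schema nor any existing replicator
API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CommittedOffset:
    """Hypothetical, simplified stand-in for a __consumer_offsets record value."""
    offset: int
    leader_epoch: Optional[int]  # None means "no epoch recorded"
    metadata: str = ""


def strip_leader_epoch(value: CommittedOffset) -> CommittedOffset:
    """Return a copy with the leader epoch removed; all other fields are kept."""
    return replace(value, leader_epoch=None)


def replicate_value(topic: str, value):
    """Edit only __consumer_offsets values; other topics pass through untouched."""
    if topic == "__consumer_offsets" and isinstance(value, CommittedOffset):
        return strip_leader_epoch(value)
    return value


source = CommittedOffset(offset=42, leader_epoch=7, metadata="app-1")
copied = replicate_value("__consumer_offsets", source)
print(copied)
```

The special case lives entirely in the replicator, so in this sketch the
destination broker needs no knowledge of the topic being replicated.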

--
Edoardo Comar
IBM Event Streams

On Wed, 23 Jan 2019 at 03:18, radai  wrote:

> the kip-320 conflict can be resolved by saying that the leader broker
> on the destination "stamps" its own local leader epoch on the incoming
> msgs - meaning the offsets "transfer" but leader epochs do not.

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2019-01-22 Thread radai
the kip-320 conflict can be resolved by saying that the leader broker
on the destination "stamps" its own local leader epoch on the incoming
msgs - meaning the offsets "transfer" but leader epochs do not.
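
A rough sketch of that suggestion, assuming a toy model of the destination
partition leader. The Batch and DestinationPartition names are hypothetical
and carry only the fields relevant to the point; this is not broker code.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Batch:
    """Hypothetical, simplified record batch."""
    base_offset: int
    record_count: int
    leader_epoch: int


class DestinationPartition:
    """Toy model of a partition leader on the destination cluster."""

    def __init__(self, local_leader_epoch: int):
        self.local_leader_epoch = local_leader_epoch
        self.log: List[Batch] = []

    def append_with_offsets(self, incoming: Batch) -> Batch:
        # Offsets are kept exactly as sent by the replicator;
        # the epoch is overwritten with the destination leader's own.
        stamped = replace(incoming, leader_epoch=self.local_leader_epoch)
        self.log.append(stamped)
        return stamped


dest = DestinationPartition(local_leader_epoch=3)
stored = dest.append_with_offsets(
    Batch(base_offset=100, record_count=5, leader_epoch=12)
)
print(stored)
```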

On Mon, Jan 7, 2019 at 1:38 PM Edoardo Comar  wrote:
>
> Hi,
> I delayed starting the voting thread due to the festive period. I would
> like to start it this week.
> Has anyone any more feedback ?
>
> --
>
> Edoardo Comar
>
> IBM Event Streams

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2019-01-07 Thread Edoardo Comar
Hi,
I delayed starting the voting thread due to the festive period. I would 
like to start it this week.
Does anyone have any more feedback?

--

Edoardo Comar

IBM Event Streams


Edoardo Comar  wrote on 13/12/2018 17:50:30:

> From: Edoardo Comar 
> To: dev@kafka.apache.org
> Date: 13/12/2018 17:50
> Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for 
> Cluster Replication
> 
> Hi,
> as we haven't got any more feedback, we'd like to start a vote on KIP-391
> on Monday
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication
> 
> --
> 
> Edoardo Comar
> 
> IBM Event Streams
> IBM UK Ltd, Hursley Park, SO21 2JN

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-12-13 Thread Edoardo Comar
Hi,
as we haven't got any more feedback, we'd like to start a vote on KIP-391 
on Monday

https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication

--

Edoardo Comar

IBM Event Streams
IBM UK Ltd, Hursley Park, SO21 2JN


Edoardo Comar/UK/IBM wrote on 10/12/2018 10:20:06:

> From: Edoardo Comar/UK/IBM
> To: dev@kafka.apache.org
> Date: 10/12/2018 10:20
> Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for 
> Cluster Replication
> 
> (shameless bump) any additional feedback is welcome ... thanks!

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-12-10 Thread Edoardo Comar
(shameless bump) any additional feedback is welcome ... thanks!


Edoardo Comar  wrote on 27/11/2018 15:35:09:

> From: Edoardo Comar 
> To: dev@kafka.apache.org
> Date: 27/11/2018 15:35
> Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for 
> Cluster Replication
> 
> Hi Jason
>
> we envisioned the replicator to replicate the __consumer_offsets topic too
> (although without producing-with-offsets to it!).
>
> As there is no client-side implementation yet using the leader epoch,
> we could not yet see the impact of writing to the destination cluster
> __consumer_offsets records with an invalid leader epoch.
>
> Also, applications might still use external storage mechanism for consumer
> offsets where the leader_epoch is missing.
>
> Perhaps the replicator could - for the __consumer_offsets topic - just
> omit the leader_epoch field in the data sent to destination.
>
> What do you think ?

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-27 Thread Edoardo Comar
Hi Mayuresh

1. We were envisioning the 1:1 case. However, as long as topic names do not
clash, you could replicate multiple clusters into a single replica,
or use topic prefixes on the destination.

2. Using an idempotent producer in the replicator would be recommended.

3. Why would you force max.in.flight.requests.per.connection to 1?
The idempotent producer can work with values <= 5.

4. If truncation occurred in a source topic-partition,
the replicator could encounter INVALID_PRODUCE_OFFSET, and at the moment it
could only delete the topic and restart replicating to it.
There is no mechanism for truncating the newest records in the destination.
Note that unclean leader election is now disabled by default.

5. Can you please clarify the question?

6. Consumers can use their saved offsets - either stored in Kafka's
__consumer_offsets or in an external store -
on the records of the replicated cluster, without any translation and
without relying on timestamps.

This allows the replicator to replicate the committed offsets without
translation too.
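
As an illustration of points 2 and 3, the settings below show what an
idempotent producer in a replicator might look like. The keys are standard
Kafka producer configuration names; the chosen values and the
idempotence_problems helper are assumptions made for this sketch, and the
helper expects all three keys to be set explicitly.

```python
# Illustrative replicator producer settings (values are assumptions).
replicator_producer_config = {
    "enable.idempotence": True,
    "acks": "all",
    "max.in.flight.requests.per.connection": 5,
}


def idempotence_problems(config: dict) -> list:
    """Return the reasons (if any) this config cannot run as an idempotent producer."""
    problems = []
    if config.get("enable.idempotence") is not True:
        problems.append("enable.idempotence must be true")
    if config.get("acks") not in ("all", "-1", -1):
        problems.append("acks must be 'all'")
    in_flight = config.get("max.in.flight.requests.per.connection")
    if in_flight is None or int(in_flight) > 5:
        problems.append("max.in.flight.requests.per.connection must be <= 5")
    return problems


print(idempotence_problems(replicator_producer_config))
too_many_in_flight = {
    **replicator_producer_config,
    "max.in.flight.requests.per.connection": 6,
}
print(idempotence_problems(too_many_in_flight))
```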

HTH
Edo & Mickael

Mayuresh Gharat  wrote on 26/11/2018 21:16:25:

> Hi Edoardo,
> 
> Thanks a lot for the KIP.
> I have a few questions/suggestions in addition to what Radai has mentioned
> above:
>
>    1. Is this meant only for 1:1 replication, for example one Kafka cluster
>    replicating to another, instead of having multiple Kafka clusters mirroring
>    into one Kafka cluster?
>    2. Are we relying on exactly-once produce in the replicator? If not, how
>    are retries handled in the replicator?
>    3. What is the recommended value for in-flight requests here? Is it
>    supposed to be strictly 1? If yes, it would be great to mention that in the
>    KIP.
>    4. How is unclean leader election between source cluster and destination
>    cluster handled?
>    5. How are offset resets in the replicator's consumer handled?
>    6. It would be good to explain the workflow in the KIP, with an
>    example, regarding how this KIP will change the replication scenario and
>    how it will benefit the consumer apps.
> 
> Thanks,
> 
> Mayuresh

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-27 Thread Edoardo Comar
> > > > We wanted to avoid a "deep" inspection (and potentially decompression) of
> > > > the records.
> > > >
> > > > For the replicator use case, a single produce request where all the data
> > > > is to be assumed with offset,
> > > > or all without offsets, seems to suffice,
> > > > So we added only a toplevel flag, not a per-topic-partition one.
> > > >
> > > > Thanks for your interest !
> > > > cheers
> > > > Edo
> > > > --
> > > >
> > > > Edoardo Comar
> > > >
> > > > IBM Event Streams
> > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > > >
> > > >
> > > > Stanislav Kozlovski wrote on 22/11/2018 22:32:42:
> > > >
> > > > > From: Stanislav Kozlovski
> > > > > To: dev@kafka.apache.org
> > > > > Date: 22/11/2018 22:33
> > > > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > > > Cluster Replication
> > > > >
> > > > > Hey Edo & Mickael,
> > > > >
> > > > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > > > from a regular batch for which offsets need to be generated.
> > > > > If the producer can provide offsets, why not provide a base offset of 0?
> > > > >
> > > > > > (I am reading your post thinking about
> > > > > > partitions rather than topics).
> > > > > Yes, I meant partitions. Sorry about that.
> > > > >
> > > > > Thanks for answering my questions :)
> > > > >
> > > > > Best,
> > > > > Stanislav
> > > > >
> > > > > > On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar wrote:
> > > > > >
> > > > > > > Hi Stanislav,
> > > > > > >
> > > > > > > you're right we envision the replicator use case to have a single producer
> > > > > > > with offsets per partition (I am reading your post thinking about
> > > > > > > partitions rather than topics).
> > > > > > >
> > > > > > > If a regular producer was to send its own records at the same time, it's
> > > > > > > very likely that the one sending with an offset will fail because of
> > > > > > > invalid offsets.
> > > > > > > Same if two producers were sending with offsets, likely both would then
> > > > > > > fail.
> > > > > > >
> > > > > > > > Does it make sense to *lock* the topic from other producers while there is
> > > > > > > > one that uses offsets?
> > > > > > >
> > > > > > > You could do that with ACL permissions if you wanted, I don't think it
> > > > > > > needs to be mandated by changing the broker logic.
> > > > > > >
> > > > > > >
> > > > > > > > Since we are tying the produce-with-offset request to the ACL, do we need
> > > > > > > > the `use_offset` field in the produce request? Maybe we make it mandatory
> > > > > > > > for produce requests with that ACL to have offsets.
> > > > > > >
> > > > > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > > > > from a regular batch for which offsets need to be generated.
> > > > > > > I would not restrict a principal to only send-with-offsets (by making that
> > > > > > > mandatory via the ACL).
> > > > > > >
> > > > > > > Thanks
> > > > > > > Edo & Mickael
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Edoardo Comar
> > > > > >
> > > > > > IBM Event Streams
> > > > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > > > > >
> > > > > >

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-27 Thread Edoardo Comar
Hi Radai

> 1. how do you handle possible duplications caused by the "special"
> producer timing-out/retrying? are you explicitly relying on the
> "exactly once" sequencing?

A duplicate ProduceRequest would be rejected with an
INVALID_PRODUCE_OFFSET error.

We envision using an idempotent producer for cluster replication, but we
would not require it.


> 2. what about the combination of log compacted topics + replicator
> downtime? by the time the replicator comes back up there might be
> "holes" in the source offsets (some msgs might have been compacted
> out)? how is that recoverable?
> 3. similarly, what if you try and fire up replication on a non-empty
> source topic? does the kip allow for offsets starting at some
> arbitrary X > 0 ? or would this have to be designed from the start.

Neither of these cases poses a problem.
As mentioned in the KIP, each producer batch must not contain offset gaps,
but gaps can exist between batches.
The companion PR has an implementation with tests that cover these cases.
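
The rules described in the answers above can be sketched with a toy
partition log. This is only an illustration of the behaviour as stated in
this thread, not the code from the companion PR: the PartitionLog class is
hypothetical, and reusing a single InvalidProduceOffset error for the
"gap inside a batch" case is an assumption of the sketch.

```python
class InvalidProduceOffset(Exception):
    """Stand-in for the INVALID_PRODUCE_OFFSET error mentioned in the thread."""


class PartitionLog:
    """Toy partition log that accepts batches carrying their own offsets."""

    def __init__(self):
        self.offsets = []  # offsets written so far, in order

    @property
    def log_end_offset(self) -> int:
        return self.offsets[-1] + 1 if self.offsets else 0

    def append_with_offsets(self, batch_offsets):
        if not batch_offsets:
            raise ValueError("empty batch")
        base = batch_offsets[0]
        # Rule 1: offsets inside one batch must be contiguous.
        if list(batch_offsets) != list(range(base, base + len(batch_offsets))):
            raise InvalidProduceOffset("gap inside batch")
        # Rule 2: a batch may start after the log end (a gap between
        # batches is fine) but never before it (duplicate or stale retry).
        if base < self.log_end_offset:
            raise InvalidProduceOffset("base offset below log end offset")
        self.offsets.extend(batch_offsets)


log = PartitionLog()
results = []
batches = (
    [10, 11, 12],  # replication starting at some X > 0
    [20, 21],      # hole between batches, e.g. after compaction
    [10, 11, 12],  # retried duplicate of the first batch
    [30, 32],      # gap inside a single batch
)
for batch in batches:
    try:
        log.append_with_offsets(batch)
        results.append("ok")
    except InvalidProduceOffset as err:
        results.append(f"rejected: {err}")
print(results)
```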

> and lastly, since this KIP seems to be designed for active-passive
> failover (there can be no produce traffic except the replicator)
> wouldn't a solution based on seeking to a time offset be more generic?
> your producers could checkpoint the last (say log append) timestamp of
> records they've seen, and when restoring in the remote site seek to
> those timestamps (which will be metadata in their committed offsets) -
> assuming replication takes > 0 time you'd need to handle some dups,
> but every kafka consumer setup needs to know how to handle those
> anyway.

Can you please clarify?
We do not expect any cooperation from users' applications.

thanks!
E

> On Fri, Nov 23, 2018 at 2:27 AM Edoardo Comar  wrote:
> >
> > Hi Stanislav
> >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of 0?
> >
> > a regular batch (for which offsets are generated by the broker on write)
> > is sent with a base offset of 0.
> > How could you distinguish it from a batch where you *want* the first
> > record to be written at offset 0 (i.e. be the first in the partition and
> > be rejected if there are records on the log already) ?
> > We wanted to avoid a "deep" inspection (and potentially decompression) of
> > the records.
> >
> > For the replicator use case, a single produce request where all the data
> > is to be assumed with offset,
> > or all without offsets, seems to suffice,
> > So we added only a toplevel flag, not a per-topic-partition one.
> >
> > Thanks for your interest !
> > cheers
> > Edo
> > ------
> >
> > Edoardo Comar
> >
> > IBM Event Streams
> > IBM UK Ltd, Hursley Park, SO21 2JN

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-26 Thread Jason Gustafson
Another wrinkle to consider is KIP-320. If you are planning to replicate
__consumer_offsets directly, then you will have to account for leader epoch
information which is stored with the committed offsets. But I cannot think
how it would be possible to replicate the leader epoch information in
messages even if you can preserve offsets.
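
To make the concern concrete, here is a minimal sketch of a replicator that
copies a committed offset but drops the leader epoch, which is the kind of
special-casing discussed elsewhere in this thread. The `CommittedOffset` class
and the `strip_epoch` helper are hypothetical names used only for illustration;
the real __consumer_offsets value is a versioned binary record, not this
dataclass.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CommittedOffset:
    """Hypothetical, simplified view of a committed-offset value."""
    offset: int
    leader_epoch: Optional[int]  # KIP-320: epoch of the leader on the source cluster
    metadata: str


def strip_epoch(value: CommittedOffset) -> CommittedOffset:
    # The source cluster's epoch has no meaning on the destination cluster,
    # so this sketch drops it and keeps the offset and metadata untouched.
    return replace(value, leader_epoch=None)


source_value = CommittedOffset(offset=42, leader_epoch=7, metadata="")
replicated_value = strip_epoch(source_value)
```

The offset survives the copy only because offsets are assumed to be preserved
by the replicator; the epoch cannot be carried over in the same way.
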

-Jason

On Mon, Nov 26, 2018 at 1:16 PM Mayuresh Gharat wrote:

> Hi Edoardo,
>
> Thanks a lot for the KIP.
>  I have a few questions/suggestions in addition to what Radai has mentioned
> above :
>
>    1. Is this meant only for 1:1 replication, for example one Kafka cluster
>    replicating to another, instead of having multiple Kafka clusters mirroring
>    into one Kafka cluster?
>    2. Are we relying on exactly-once produce in the replicator? If not, how
>    are retries handled in the replicator?
>    3. What is the recommended value for in-flight requests here? Is it
>    supposed to be strictly 1? If yes, it would be great to mention that in the
>    KIP.
>    4. How is unclean leader election between source cluster and destination
>    cluster handled?
>    5. How are offset resets in the replicator's consumer handled?
>    6. It would be good to explain the workflow in the KIP, with an
>    example, regarding how this KIP will change the replication scenario and
>    how it will benefit the consumer apps.
>
> Thanks,
>
> Mayuresh
>
> On Mon, Nov 26, 2018 at 8:08 AM radai  wrote:
>
> > a few questions:
> >
> > 1. how do you handle possible duplications caused by the "special"
> > producer timing-out/retrying? are you explicitly relying on the
> > "exactly once" sequencing?
> > 2. what about the combination of log compacted topics + replicator
> > downtime? by the time the replicator comes back up there might be
> > "holes" in the source offsets (some msgs might have been compacted
> > out)? how is that recoverable?
> > 3. similarly, what if you try and fire up replication on a non-empty
> > source topic? does the kip allow for offsets starting at some
> > arbitrary X > 0 ? or would this have to be designed from the start?
> >
> > and lastly, since this KIP seems to be designed for active-passive
> > failover (there can be no produce traffic except the replicator)
> > wouldn't a solution based on seeking to a time offset be more generic?
> > your producers could checkpoint the last (say log append) timestamp of
> > records they've seen, and when restoring in the remote site seek to
> > those timestamps (which will be metadata in their committed offsets) -
> > assuming replication takes > 0 time you'd need to handle some dups,
> > but every kafka consumer setup needs to know how to handle those
> > anyway.
> > On Fri, Nov 23, 2018 at 2:27 AM Edoardo Comar  wrote:
> > >
> > > Hi Stanislav
> > >
> > > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > > from a regular batch for which offsets need to be generated.
> > > > If the producer can provide offsets, why not provide a base offset of 0?
> > >
> > > a regular batch (for which offsets are generated by the broker on write)
> > > is sent with a base offset of 0.
> > > How could you distinguish it from a batch where you *want* the first
> > > record to be written at offset 0 (i.e. be the first in the partition and
> > > be rejected if there are records on the log already) ?
> > > We wanted to avoid a "deep" inspection (and potentially decompression) of
> > > the records.
> > >
> > > For the replicator use case, a single produce request where all the data
> > > is to be assumed with offset,
> > > or all without offsets, seems to suffice,
> > > So we added only a toplevel flag, not a per-topic-partition one.
> > >
> > > Thanks for your interest !
> > > cheers
> > > Edo
> > > --
> > >
> > > Edoardo Comar
> > >
> > > IBM Event Streams
> > > IBM UK Ltd, Hursley Park, SO21 2JN
> > >
> > >
> > > Stanislav Kozlovski wrote on 22/11/2018 22:32:42:
> > >
> > > > From: Stanislav Kozlovski 
> > > > To: dev@kafka.apache.org
> > > > Date: 22/11/2018 22:33
> > > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > > Cluster Replication
> > > >
> > > > Hey Edo & 

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-26 Thread Mayuresh Gharat
Hi Edoardo,

Thanks a lot for the KIP.
 I have a few questions/suggestions in addition to what Radai has mentioned
above :

   1. Is this meant only for 1:1 replication, for example one Kafka cluster
   replicating to another, instead of having multiple Kafka clusters mirroring
   into one Kafka cluster?
   2. Are we relying on exactly-once produce in the replicator? If not, how
   are retries handled in the replicator?
   3. What is the recommended value for in-flight requests here? Is it
   supposed to be strictly 1? If yes, it would be great to mention that in the
   KIP.
   4. How is unclean leader election between source cluster and destination
   cluster handled?
   5. How are offset resets in the replicator's consumer handled?
   6. It would be good to explain the workflow in the KIP, with an
   example, regarding how this KIP will change the replication scenario and
   how it will benefit the consumer apps.

Thanks,

Mayuresh

On Mon, Nov 26, 2018 at 8:08 AM radai  wrote:

> a few questions:
>
> 1. how do you handle possible duplications caused by the "special"
> producer timing-out/retrying? are you explicitly relying on the
> "exactly once" sequencing?
> 2. what about the combination of log compacted topics + replicator
> downtime? by the time the replicator comes back up there might be
> "holes" in the source offsets (some msgs might have been compacted
> out)? how is that recoverable?
> 3. similarly, what if you try and fire up replication on a non-empty
> source topic? does the kip allow for offsets starting at some
> arbitrary X > 0 ? or would this have to be designed from the start?
>
> and lastly, since this KIP seems to be designed for active-passive
> failover (there can be no produce traffic except the replicator)
> wouldn't a solution based on seeking to a time offset be more generic?
> your producers could checkpoint the last (say log append) timestamp of
> records they've seen, and when restoring in the remote site seek to
> those timestamps (which will be metadata in their committed offsets) -
> assuming replication takes > 0 time you'd need to handle some dups,
> but every kafka consumer setup needs to know how to handle those
> anyway.
> On Fri, Nov 23, 2018 at 2:27 AM Edoardo Comar  wrote:
> >
> > Hi Stanislav
> >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of 0?
> >
> > a regular batch (for which offsets are generated by the broker on write)
> > is sent with a base offset of 0.
> > How could you distinguish it from a batch where you *want* the first
> > record to be written at offset 0 (i.e. be the first in the partition and
> > be rejected if there are records on the log already) ?
> > We wanted to avoid a "deep" inspection (and potentially decompression) of
> > the records.
> >
> > For the replicator use case, a single produce request where all the data
> > is to be assumed with offset,
> > or all without offsets, seems to suffice,
> > So we added only a toplevel flag, not a per-topic-partition one.
> >
> > Thanks for your interest !
> > cheers
> > Edo
> > ------------------
> >
> > Edoardo Comar
> >
> > IBM Event Streams
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> > Stanislav Kozlovski wrote on 22/11/2018 22:32:42:
> >
> > > From: Stanislav Kozlovski 
> > > To: dev@kafka.apache.org
> > > Date: 22/11/2018 22:33
> > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > Cluster Replication
> > >
> > > Hey Edo & Mickael,
> > >
> > > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of 0?
> > >
> > > > (I am reading your post thinking about
> > > > partitions rather than topics).
> > > Yes, I meant partitions. Sorry about that.
> > >
> > > Thanks for answering my questions :)
> > >
> > > Best,
> > > Stanislav
> > >
> > > On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar wrote:
> > >
> > > > Hi Stanislav,
> > > >
> > > > > you're right we envision the replicator use case to have a single producer
> > > > with offsets per

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-26 Thread radai
a few questions:

1. how do you handle possible duplications caused by the "special"
producer timing-out/retrying? are you explicitly relying on the
"exactly once" sequencing?
2. what about the combination of log compacted topics + replicator
downtime? by the time the replicator comes back up there might be
"holes" in the source offsets (some msgs might have been compacted
out)? how is that recoverable?
3. similarly, what if you try and fire up replication on a non-empty
source topic? does the kip allow for offsets starting at some
arbitrary X > 0 ? or would this have to be designed from the start?

and lastly, since this KIP seems to be designed for active-passive
failover (there can be no produce traffic except the replicator)
wouldn't a solution based on seeking to a time offset be more generic?
your producers could checkpoint the last (say log append) timestamp of
records they've seen, and when restoring in the remote site seek to
those timestamps (which will be metadata in their committed offsets) -
assuming replication takes > 0 time you'd need to handle some dups,
but every kafka consumer setup needs to know how to handle those
anyway.
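
A rough sketch of the timestamp-based alternative, assuming each partition is
modelled as a list of (offset, timestamp) pairs with non-decreasing log-append
timestamps. The helper name `offset_for_timestamp` and the sample numbers are
made up for illustration; the Java consumer's `offsetsForTimes` performs a
comparable lookup against a real broker.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def offset_for_timestamp(
    records: List[Tuple[int, int]], checkpoint_ts: int
) -> Optional[int]:
    """Return the earliest offset whose timestamp is >= checkpoint_ts.

    `records` holds (offset, timestamp) pairs sorted by timestamp.
    Returns None when every record is older than the checkpoint.
    """
    timestamps = [ts for _, ts in records]
    index = bisect_left(timestamps, checkpoint_ts)
    return records[index][0] if index < len(records) else None


# Source partition: offsets 0..2, appended at t=100, 110, 120.
source = [(0, 100), (1, 110), (2, 120)]
# Destination partition: same records, different offsets, appended 5 units later.
destination = [(50, 105), (51, 115), (52, 125)]

# The consumer last saw source offset 1 and checkpointed its timestamp.
checkpoint = source[1][1]
resume_at = offset_for_timestamp(destination, checkpoint)
```

Here `resume_at` is 51, the destination copy of the record the consumer had
already seen at source offset 1, so the consumer re-reads one record. That is
the duplicate handling mentioned above.
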
On Fri, Nov 23, 2018 at 2:27 AM Edoardo Comar  wrote:
>
> Hi Stanislav
>
> > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > from a regular batch for which offsets need to be generated.
> > If the producer can provide offsets, why not provide a base offset of 0?
>
> a regular batch (for which offsets are generated by the broker on write)
> is sent with a base offset of 0.
> How could you distinguish it from a batch where you *want* the first
> record to be written at offset 0 (i.e. be the first in the partition and
> be rejected if there are records on the log already) ?
> We wanted to avoid a "deep" inspection (and potentially decompression) of
> the records.
>
> For the replicator use case, a single produce request where all the data
> is to be assumed with offset,
> or all without offsets, seems to suffice,
> So we added only a toplevel flag, not a per-topic-partition one.
>
> Thanks for your interest !
> cheers
> Edo
> --
>
> Edoardo Comar
>
> IBM Event Streams
> IBM UK Ltd, Hursley Park, SO21 2JN
>
>
> Stanislav Kozlovski  wrote on 22/11/2018 22:32:42:
>
> > From: Stanislav Kozlovski 
> > To: dev@kafka.apache.org
> > Date: 22/11/2018 22:33
> > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > Cluster Replication
> >
> > Hey Edo & Mickael,
> >
> > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > from a regular batch for which offsets need to be generated.
> > If the producer can provide offsets, why not provide a base offset of 0?
> >
> > > (I am reading your post thinking about
> > > partitions rather than topics).
> > Yes, I meant partitions. Sorry about that.
> >
> > Thanks for answering my questions :)
> >
> > Best,
> > Stanislav
> >
> > On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar  wrote:
> >
> > > Hi Stanislav,
> > >
> > > you're right we envision the replicator use case to have a single producer
> > > with offsets per partition (I am reading your post thinking about
> > > partitions rather than topics).
> > >
> > > If a regular producer was to send its own records at the same time, it's
> > > very likely that the one sending with an offset will fail because of
> > > invalid offsets.
> > > Same if two producers were sending with offsets, likely both would then
> > > fail.
> > >
> > > > Does it make sense to *lock* the topic from other producers while there is
> > > > one that uses offsets?
> > >
> > > You could do that with ACL permissions if you wanted, I don't think it
> > > needs to be mandated by changing the broker logic.
> > >
> > >
> > > > Since we are tying the produce-with-offset request to the ACL, do we need
> > > > the `use_offset` field in the produce request? Maybe we make it mandatory
> > > > for produce requests with that ACL to have offsets.
> > >
> > > The flag is needed to distinguish a batch with a desired base offset of 0,
> > > from a regular batch for which offsets need to be generated.
> > > I would not restrict a principal to only send-with-offsets (by making that
> > > mandatory via the ACL).
> > >
> > > Thanks
> > > E

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-23 Thread Edoardo Comar
Hi Stanislav

> > The flag is needed to distinguish a batch with a desired base offset of 0,
> > from a regular batch for which offsets need to be generated.
> If the producer can provide offsets, why not provide a base offset of 0?

A regular batch (for which offsets are generated by the broker on write)
is sent with a base offset of 0.
How could you distinguish it from a batch where you *want* the first
record to be written at offset 0 (i.e. be the first in the partition, and
be rejected if there are records on the log already)?
We wanted to avoid a "deep" inspection (and potentially decompression) of
the records.

For the replicator use case, a single produce request where all the data
is assumed to carry offsets, or all of it to be without offsets, seems to
suffice, so we added only a top-level flag, not a per-topic-partition one.
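
The ambiguity can be sketched with a toy partition log. This is only a model
under stated assumptions: `PartitionLog`, `Batch` and `InvalidOffsetError` are
hypothetical names, and the rule "a supplied base offset must not be below the
log end offset" is an assumption made for the sketch, not the KIP's wording.

```python
from dataclasses import dataclass, field
from typing import List


class InvalidOffsetError(Exception):
    """A batch with client-supplied offsets collides with existing records."""


@dataclass
class Batch:
    base_offset: int   # always present; a regular producer leaves it at 0
    record_count: int


@dataclass
class PartitionLog:
    offsets: List[int] = field(default_factory=list)

    @property
    def log_end_offset(self) -> int:
        # Next free offset: one past the last record, or 0 for an empty log.
        return self.offsets[-1] + 1 if self.offsets else 0

    def append(self, batch: Batch, use_offset: bool) -> int:
        """Append the batch and return the base offset it was written at."""
        if use_offset:
            # Flag set: the base offset is a request, so it must still be free.
            if batch.base_offset < self.log_end_offset:
                raise InvalidOffsetError(
                    f"base offset {batch.base_offset} is below log end "
                    f"{self.log_end_offset}"
                )
            base = batch.base_offset
        else:
            # Flag unset: the base offset of 0 is a placeholder and is ignored.
            base = self.log_end_offset
        self.offsets.extend(range(base, base + batch.record_count))
        return base


log = PartitionLog()
first = log.append(Batch(base_offset=0, record_count=3), use_offset=False)
second = log.append(Batch(base_offset=0, record_count=2), use_offset=False)
try:
    log.append(Batch(base_offset=0, record_count=1), use_offset=True)
    rejected = False
except InvalidOffsetError:
    rejected = True
```

All three batches carry a base offset of 0 on the wire. Only the request-level
flag tells the log whether that 0 is a placeholder or a demand, and no record
inside the batch has to be inspected to decide.
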

Thanks for your interest !
cheers
Edo
--

Edoardo Comar

IBM Event Streams
IBM UK Ltd, Hursley Park, SO21 2JN


Stanislav Kozlovski  wrote on 22/11/2018 22:32:42:

> From: Stanislav Kozlovski 
> To: dev@kafka.apache.org
> Date: 22/11/2018 22:33
> Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for 
> Cluster Replication
> 
> Hey Edo & Mickael,
> 
> > The flag is needed to distinguish a batch with a desired base offset of 0,
> > from a regular batch for which offsets need to be generated.
> If the producer can provide offsets, why not provide a base offset of 0?
>
> > (I am reading your post thinking about
> > partitions rather than topics).
> Yes, I meant partitions. Sorry about that.
> 
> Thanks for answering my questions :)
> 
> Best,
> Stanislav
> 
> On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar  wrote:
> 
> > Hi Stanislav,
> >
> > you're right we envision the replicator use case to have a single producer
> > with offsets per partition (I am reading your post thinking about
> > partitions rather than topics).
> >
> > If a regular producer was to send its own records at the same time, it's
> > very likely that the one sending with an offset will fail because of
> > invalid offsets.
> > Same if two producers were sending with offsets, likely both would then
> > fail.
> >
> > > Does it make sense to *lock* the topic from other producers while there is
> > > one that uses offsets?
> >
> > You could do that with ACL permissions if you wanted, I don't think it
> > needs to be mandated by changing the broker logic.
> >
> >
> > > Since we are tying the produce-with-offset request to the ACL, do we need
> > > the `use_offset` field in the produce request? Maybe we make it mandatory
> > > for produce requests with that ACL to have offsets.
> >
> > The flag is needed to distinguish a batch with a desired base offset of 0,
> > from a regular batch for which offsets need to be generated.
> > I would not restrict a principal to only send-with-offsets (by making that
> > mandatory via the ACL).
> >
> > Thanks
> > Edo & Mickael
> >
> > ------------------
> >
> > Edoardo Comar
> >
> > IBM Event Streams
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> > Stanislav Kozlovski wrote on 22/11/2018 16:17:11:
> >
> > > From: Stanislav Kozlovski 
> > > To: dev@kafka.apache.org
> > > Date: 22/11/2018 16:17
> > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > Cluster Replication
> > >
> > > Hey Edoardo, thanks for the KIP!
> > >
> > > I have some questions, apologies if they are naive:
> > > Is this intended to work for a single producer use case only?
> > > How would it work if two producers were producing to the same topic with
> > > offsets?
> > > How would it work if two producers, one with offsets and one without were
> > > producing to a topic?
> > > Does it make sense to *lock* the topic from other producers while there is
> > > one that uses offsets?
> > >
> > > Since we are tying the produce-with-offset request to the ACL, do we need
> > > the `use_offset` field in the produce request? Maybe we make it mandatory
> > > for produce requests with that ACL to have offsets.
> > >
> > > Best,
> > > Stanislav
> > >
> > > On Wed, Nov 21, 2018 at 5:14 PM Edoardo Comar wrote:
> > >
> > > > Hi,
> > > > we've opened a KIP to improve data replication between Kafka clusters :
> 

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-22 Thread Stanislav Kozlovski
Hey Edo & Mickael,

> The flag is needed to distinguish a batch with a desired base offset of 0,
> from a regular batch for which offsets need to be generated.

If the producer can provide offsets, why not provide a base offset of 0?

> (I am reading your post thinking about
> partitions rather than topics).

Yes, I meant partitions. Sorry about that.

Thanks for answering my questions :)

Best,
Stanislav

On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar  wrote:

> Hi Stanislav,
>
> you're right we envision the replicator use case to have a single producer
> with offsets per partition (I am reading your post thinking about
> partitions rather than topics).
>
> If a regular producer was to send its own records at the same time, it's
> very likely that the one sending with an offset will fail because of
> invalid offsets.
> Same if two producers were sending with offsets, likely both would then
> fail.
>
> > Does it make sense to *lock* the topic from other producers while there is
> > one that uses offsets?
>
> You could do that with ACL permissions if you wanted, I don't think it
> needs to be mandated by changing the broker logic.
>
>
> > Since we are tying the produce-with-offset request to the ACL, do we need
> > the `use_offset` field in the produce request? Maybe we make it mandatory
> > for produce requests with that ACL to have offsets.
>
> The flag is needed to distinguish a batch with a desired base offset of 0,
> from a regular batch for which offsets need to be generated.
> I would not restrict a principal to only send-with-offsets (by making that
> mandatory via the ACL).
>
> Thanks
> Edo & Mickael
>
> --
>
> Edoardo Comar
>
> IBM Event Streams
> IBM UK Ltd, Hursley Park, SO21 2JN
>
>
> Stanislav Kozlovski  wrote on 22/11/2018 16:17:11:
>
> > From: Stanislav Kozlovski 
> > To: dev@kafka.apache.org
> > Date: 22/11/2018 16:17
> > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > Cluster Replication
> >
> > Hey Edoardo, thanks for the KIP!
> >
> > I have some questions, apologies if they are naive:
> > Is this intended to work for a single producer use case only?
> > How would it work if two producers were producing to the same topic with
> > offsets?
> > How would it work if two producers, one with offsets and one without were
> > producing to a topic?
> > Does it make sense to *lock* the topic from other producers while there is
> > one that uses offsets?
> >
> > Since we are tying the produce-with-offset request to the ACL, do we need
> > the `use_offset` field in the produce request? Maybe we make it mandatory
> > for produce requests with that ACL to have offsets.
> >
> > Best,
> > Stanislav
> >
> > On Wed, Nov 21, 2018 at 5:14 PM Edoardo Comar  wrote:
> >
> > > Hi,
> > > we've opened a KIP to improve data replication between Kafka clusters :
> > >
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication
> > >
> > > We'd like to start a discussion, please post your feedback in this thread.
> > >
> > > Thank you
> > > Edo and Mickael
> > >
> > >
> > > --
> > >
> > > Edoardo Comar
> > >
> > > IBM Event Streams
> > > IBM UK Ltd, Hursley Park, SO21 2JN
> > >
> > > Unless stated otherwise above:
> > > IBM United Kingdom Limited - Registered in England and Wales with number
> > > 741598.
> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> > >
> >
> >
> > --
> > Best,
> > Stanislav
>


-- 
Best,
Stanislav


Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-22 Thread Edoardo Comar
Hi Stanislav,

You're right, we envision the replicator use case to have a single producer
with offsets per partition (I am reading your post thinking about
partitions rather than topics).

If a regular producer were to send its own records at the same time, it's
very likely that the one sending with offsets would fail because of
invalid offsets.
The same applies if two producers were sending with offsets: likely both
would then fail.
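
The failure mode can be walked through with a small simulation. It is a sketch
only: the `append` helper is hypothetical, and it assumes that a supplied base
offset is rejected when it is below the current log end offset.

```python
from typing import List, Optional, Tuple


def append(
    log: List[Tuple[int, str]],
    producer: str,
    count: int,
    base_offset: Optional[int] = None,
) -> bool:
    """Append `count` records; return False if the supplied offset is rejected.

    base_offset=None models a regular producer, for which the broker
    assigns the next free offset.
    """
    log_end = log[-1][0] + 1 if log else 0
    if base_offset is None:
        base_offset = log_end
    elif base_offset < log_end:
        return False  # the requested offset is already occupied
    log.extend((base_offset + i, producer) for i in range(count))
    return True


log: List[Tuple[int, str]] = []
results = [
    append(log, "replicator", 2, base_offset=0),  # copies source offsets 0-1
    append(log, "regular", 1),                    # broker assigns offset 2
    append(log, "replicator", 2, base_offset=2),  # offset 2 is now taken
]
```

`results` ends up as `[True, True, False]`: once the regular producer has
claimed offset 2, the replicator can no longer place the source record that
belongs there.
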

> Does it make sense to *lock* the topic from other producers while there is
> one that uses offsets?

You could do that with ACL permissions if you wanted; I don't think it
needs to be mandated by changing the broker logic.


> Since we are tying the produce-with-offset request to the ACL, do we need
> the `use_offset` field in the produce request? Maybe we make it mandatory
> for produce requests with that ACL to have offsets.

The flag is needed to distinguish a batch with a desired base offset of 0, 
from a regular batch for which offsets need to be generated.
I would not restrict a principal to only send-with-offsets (by making that 
mandatory via the ACL).

Thanks
Edo & Mickael

--

Edoardo Comar

IBM Event Streams
IBM UK Ltd, Hursley Park, SO21 2JN


Stanislav Kozlovski  wrote on 22/11/2018 16:17:11:

> From: Stanislav Kozlovski 
> To: dev@kafka.apache.org
> Date: 22/11/2018 16:17
> Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for 
> Cluster Replication
> 
> Hey Edoardo, thanks for the KIP!
>
> I have some questions, apologies if they are naive:
> Is this intended to work for a single producer use case only?
> How would it work if two producers were producing to the same topic with
> offsets?
> How would it work if two producers, one with offsets and one without were
> producing to a topic?
> Does it make sense to *lock* the topic from other producers while there is
> one that uses offsets?
>
> Since we are tying the produce-with-offset request to the ACL, do we need
> the `use_offset` field in the produce request? Maybe we make it mandatory
> for produce requests with that ACL to have offsets.
> 
> Best,
> Stanislav
> 
> On Wed, Nov 21, 2018 at 5:14 PM Edoardo Comar  wrote:
> 
> > Hi,
> > we've opened a KIP to improve data replication between Kafka clusters :
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication
> >
> > We'd like to start a discussion, please post your feedback in this thread.
> >
> > Thank you
> > Edo and Mickael
> >
> >
> > --
> >
> > Edoardo Comar
> >
> > IBM Event Streams
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> 
> 
> -- 
> Best,
> Stanislav



Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-22 Thread Stanislav Kozlovski
Hey Edoardo, thanks for the KIP!

I have some questions, apologies if they are naive:
Is this intended to work for a single producer use case only?
How would it work if two producers were producing to the same topic with
offsets?
How would it work if two producers, one with offsets and one without were
producing to a topic?
Does it make sense to *lock* the topic from other producers while there is
one that uses offsets?

Since we are tying the produce-with-offset request to the ACL, do we need
the `use_offset` field in the produce request? Maybe we make it mandatory
for produce requests with that ACL to have offsets.

Best,
Stanislav

On Wed, Nov 21, 2018 at 5:14 PM Edoardo Comar  wrote:

> Hi,
> we've opened a KIP to improve data replication between Kafka clusters :
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication
>
> We'd like to start a discussion, please post your feedback in this thread.
>
> Thank you
> Edo and Mickael
>
>
> --
>
> Edoardo Comar
>
> IBM Event Streams
> IBM UK Ltd, Hursley Park, SO21 2JN
>


-- 
Best,
Stanislav


[DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-21 Thread Edoardo Comar
Hi,
we've opened a KIP to improve data replication between Kafka clusters :

https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication

We'd like to start a discussion, please post your feedback in this thread.

Thank you
Edo and Mickael


--

Edoardo Comar

IBM Event Streams
IBM UK Ltd, Hursley Park, SO21 2JN

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU