Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-22 Thread Michael Pearce
Hi Mayuresh,

LGTM. I've just made one small adjustment, updating the wire protocol to show the
magic byte bump.

Do we think we're good to put this to a vote? Are there any other bits needing
discussion?

Cheers
Mike

On 21/11/2016, 18:26, "Mayuresh Gharat" <gharatmayures...@gmail.com> wrote:

Hi Michael,

I have updated the migration section of the KIP. Can you please take a look?

Thanks,

Mayuresh

On Fri, Nov 18, 2016 at 9:07 AM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> Hi Michael,
>
> That whilst sending tombstone and non null value, the consumer can expect
> only to receive the non-null message only in step (3) is this correct?
> ---> I do agree with you here.
>
> Becket, Ismael : can you guys review the migration plan listed above using
> magic byte?
>
> Thanks,
>
> Mayuresh
>
> On Fri, Nov 18, 2016 at 8:58 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
>> Many thanks for this Mayuresh. I don't have any objections.
>>
>> I assume we should state:
>>
>> That whilst sending tombstone and non null value, the consumer can expect
>> only to receive the non-null message only in step (3) is this correct?
>>
>> Cheers
>> Mike
>>
>>
>>
>> 
>> From: Mayuresh Gharat <gharatmayures...@gmail.com>
>> Sent: Thursday, November 17, 2016 5:18:41 PM
>> To: dev@kafka.apache.org
>> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>>
>> Hi Ismael,
>>
>> Thanks for the explanation.
>> Specially I like this part where in you mentioned we can get rid of the
>> older null value support for log compaction later on, here :
>> We can't change semantics of the message format without having a long
>> transition period. And we can't rely
>> on people reading documentation or acting on a warning for something so
>> fundamental. As such, my take is that we need to bump the magic byte. The
>> good news is
>> that we don't have to support all versions forever. We have said that we
>> will support direct upgrades for 2 years. That means that message format
>> version n could, in theory, be removed 2 years after it's introduced.
>>
>> Just a heads up, I would like to mention that even without bumping magic
>> byte, we will *NOT* lose zero copy as in the client(x+1) in my
>> explanation
>> above will convert internally a null value to have a tombstone bit set and
>> a tombstone bit set to have a null value automatically internally and by
>> the time we move to version (x+2), the clients would have upgraded.
>> Obviously if we support a request from consumer(x), we will lose zero
>> copy
>> but that is the same case with magic byte.
>>
>> But if magic byte bump makes life easier for transition for the above
>> reasons that you explained, I am OK with it since we are going to meet the
>> end goal down the road :)
>>
>> On a side note can we update the doc here on magic byte to say that "*it
>> should be bumped whenever the message format is changed or the
>> interpretation of message format (usage of the reserved bits as well) is
>> changed*".
>>
>>
>> Hi Michael,
>>
>> Here is the update plan that we discussed offline yesterday :
>>
>> Currently the magic-byte which corresponds to the "message.format.version"
>> is set to 1.
>>
>> 1) On broker it will be set to 1 initially.
>>
>> 2) When a producer client sends a message with magic-byte = 2, since the
>> broker is on magic-byte = 1, we will down convert it, which means if the
>> tombstone bit is set, the value will be set to null. A consumer
>> understanding magic-byte = 1, will still work with this. A consumer
>> working
>> with magic-byte =2 will also be able to understand this, since it
>> understands the tombstone.
>> Now there is still the question of supporting a non-tombstone and null
>> value from producer client with magic-byte = 2.* (I am not sure if we
>> should support this. Ismael/Becket can comment here)*
>>
>> 3) When almost all the clients have upgraded, the message.format.version
>> on
>> the broker can be changed to 2, where 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-21 Thread Mayuresh Gharat
Hi Michael,

I have updated the migration section of the KIP. Can you please take a look?

Thanks,

Mayuresh

On Fri, Nov 18, 2016 at 9:07 AM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> Hi Michael,
>
> That whilst sending tombstone and non null value, the consumer can expect
> only to receive the non-null message only in step (3) is this correct?
> ---> I do agree with you here.
>
> Becket, Ismael : can you guys review the migration plan listed above using
> magic byte?
>
> Thanks,
>
> Mayuresh
>
> On Fri, Nov 18, 2016 at 8:58 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
>> Many thanks for this Mayuresh. I don't have any objections.
>>
>> I assume we should state:
>>
>> That whilst sending tombstone and non null value, the consumer can expect
>> only to receive the non-null message only in step (3) is this correct?
>>
>> Cheers
>> Mike
>>
>>
>>
>> 
>> From: Mayuresh Gharat <gharatmayures...@gmail.com>
>> Sent: Thursday, November 17, 2016 5:18:41 PM
>> To: dev@kafka.apache.org
>> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>>
>> Hi Ismael,
>>
>> Thanks for the explanation.
>> Specially I like this part where in you mentioned we can get rid of the
>> older null value support for log compaction later on, here :
>> We can't change semantics of the message format without having a long
>> transition period. And we can't rely
>> on people reading documentation or acting on a warning for something so
>> fundamental. As such, my take is that we need to bump the magic byte. The
>> good news is
>> that we don't have to support all versions forever. We have said that we
>> will support direct upgrades for 2 years. That means that message format
>> version n could, in theory, be removed 2 years after it's introduced.
>>
>> Just a heads up, I would like to mention that even without bumping magic
>> byte, we will *NOT* lose zero copy as in the client(x+1) in my
>> explanation
>> above will convert internally a null value to have a tombstone bit set and
>> a tombstone bit set to have a null value automatically internally and by
>> the time we move to version (x+2), the clients would have upgraded.
>> Obviously if we support a request from consumer(x), we will lose zero
>> copy
>> but that is the same case with magic byte.
>>
>> But if magic byte bump makes life easier for transition for the above
>> reasons that you explained, I am OK with it since we are going to meet the
>> end goal down the road :)
>>
>> On a side note can we update the doc here on magic byte to say that "*it
>> should be bumped whenever the message format is changed or the
>> interpretation of message format (usage of the reserved bits as well) is
>> changed*".
>>
>>
>> Hi Michael,
>>
>> Here is the update plan that we discussed offline yesterday :
>>
>> Currently the magic-byte which corresponds to the "message.format.version"
>> is set to 1.
>>
>> 1) On broker it will be set to 1 initially.
>>
>> 2) When a producer client sends a message with magic-byte = 2, since the
>> broker is on magic-byte = 1, we will down convert it, which means if the
>> tombstone bit is set, the value will be set to null. A consumer
>> understanding magic-byte = 1, will still work with this. A consumer
>> working
>> with magic-byte =2 will also be able to understand this, since it
>> understands the tombstone.
>> Now there is still the question of supporting a non-tombstone and null
>> value from producer client with magic-byte = 2.* (I am not sure if we
>> should support this. Ismael/Becket can comment here)*
>>
>> 3) When almost all the clients have upgraded, the message.format.version
>> on
>> the broker can be changed to 2, where in the down conversion in the above
>> step will not happen. If at this point we get a consumer request from an
>> older consumer, we might have to down convert where in we lose zero copy,
>> but these cases should be rare.
>>
>> Becket can you review this plan and add more details if I have
>> missed/wronged something, before we put it on KIP.
>>
>> Thanks,
>>
>> Mayuresh
>>
>> On Wed, Nov 16, 2016 at 11:07 PM, Michael Pearce <michael.pea...@ig.com>
>> wrote:
>>
>> > Thanks guys, for discussing this offline and getting some consensus.
>> >
>> > So its clear for myself 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-18 Thread Mayuresh Gharat
Hi Michael,

That whilst sending tombstone and non null value, the consumer can expect
only to receive the non-null message only in step (3) is this correct?
---> I do agree with you here.

Becket, Ismael: can you review the migration plan listed above, which uses
the magic byte?

Thanks,

Mayuresh

On Fri, Nov 18, 2016 at 8:58 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Many thanks for this Mayuresh. I don't have any objections.
>
> I assume we should state:
>
> That whilst sending tombstone and non null value, the consumer can expect
> only to receive the non-null message only in step (3) is this correct?
>
> Cheers
> Mike
>
>
>
> 
> From: Mayuresh Gharat <gharatmayures...@gmail.com>
> Sent: Thursday, November 17, 2016 5:18:41 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> Hi Ismael,
>
> Thanks for the explanation.
> Specially I like this part where in you mentioned we can get rid of the
> older null value support for log compaction later on, here :
> We can't change semantics of the message format without having a long
> transition period. And we can't rely
> on people reading documentation or acting on a warning for something so
> fundamental. As such, my take is that we need to bump the magic byte. The
> good news is
> that we don't have to support all versions forever. We have said that we
> will support direct upgrades for 2 years. That means that message format
> version n could, in theory, be removed 2 years after it's introduced.
>
> Just a heads up, I would like to mention that even without bumping magic
> byte, we will *NOT* lose zero copy as in the client(x+1) in my explanation
> above will convert internally a null value to have a tombstone bit set and
> a tombstone bit set to have a null value automatically internally and by
> the time we move to version (x+2), the clients would have upgraded.
> Obviously if we support a request from consumer(x), we will lose zero copy
> but that is the same case with magic byte.
>
> But if magic byte bump makes life easier for transition for the above
> reasons that you explained, I am OK with it since we are going to meet the
> end goal down the road :)
>
> On a side note can we update the doc here on magic byte to say that "*it
> should be bumped whenever the message format is changed or the
> interpretation of message format (usage of the reserved bits as well) is
> changed*".
>
>
> Hi Michael,
>
> Here is the update plan that we discussed offline yesterday :
>
> Currently the magic-byte which corresponds to the "message.format.version"
> is set to 1.
>
> 1) On broker it will be set to 1 initially.
>
> 2) When a producer client sends a message with magic-byte = 2, since the
> broker is on magic-byte = 1, we will down convert it, which means if the
> tombstone bit is set, the value will be set to null. A consumer
> understanding magic-byte = 1, will still work with this. A consumer working
> with magic-byte =2 will also be able to understand this, since it
> understands the tombstone.
> Now there is still the question of supporting a non-tombstone and null
> value from producer client with magic-byte = 2.* (I am not sure if we
> should support this. Ismael/Becket can comment here)*
>
> 3) When almost all the clients have upgraded, the message.format.version on
> the broker can be changed to 2, where in the down conversion in the above
> step will not happen. If at this point we get a consumer request from an
> older consumer, we might have to down convert where in we lose zero copy,
> but these cases should be rare.
>
> Becket can you review this plan and add more details if I have
> missed/wronged something, before we put it on KIP.
>
> Thanks,
>
> Mayuresh
>
> On Wed, Nov 16, 2016 at 11:07 PM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > Thanks guys, for discussing this offline and getting some consensus.
> >
> > So its clear for myself and others what is proposed now (i think i
> > understand, but want to make sure)
> >
> > Could i ask either directly update the kip to detail the migration
> > strategy, or (re-)state your offline discussed and agreed migration
> > strategy based on a magic byte is in this thread.
> >
> >
> > The main original driver for the KIP was to support compaction where value
> > isn't null, based off the discussions on KIP-82 thread.
> >
> > We should be able to support non-tombstone + null value by the completion
> > of the KIP, as we noted when discussing this kip, having logic based on a
> > nu

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-18 Thread Michael Pearce
Many thanks for this, Mayuresh. I don't have any objections.

I assume we should state that, when a producer sends a tombstone with a
non-null value, the consumer can expect to receive the non-null value only
in step (3). Is this correct?

Cheers
Mike




From: Mayuresh Gharat <gharatmayures...@gmail.com>
Sent: Thursday, November 17, 2016 5:18:41 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

Hi Ismael,

Thanks for the explanation.
Specially I like this part where in you mentioned we can get rid of the
older null value support for log compaction later on, here :
We can't change semantics of the message format without having a long
transition period. And we can't rely
on people reading documentation or acting on a warning for something so
fundamental. As such, my take is that we need to bump the magic byte. The
good news is
that we don't have to support all versions forever. We have said that we
will support direct upgrades for 2 years. That means that message format
version n could, in theory, be removed 2 years after it's introduced.

Just a heads up, I would like to mention that even without bumping magic
byte, we will *NOT* lose zero copy as in the client(x+1) in my explanation
above will convert internally a null value to have a tombstone bit set and
a tombstone bit set to have a null value automatically internally and by
the time we move to version (x+2), the clients would have upgraded.
Obviously if we support a request from consumer(x), we will lose zero copy
but that is the same case with magic byte.

But if magic byte bump makes life easier for transition for the above
reasons that you explained, I am OK with it since we are going to meet the
end goal down the road :)

On a side note can we update the doc here on magic byte to say that "*it
should be bumped whenever the message format is changed or the
interpretation of message format (usage of the reserved bits as well) is
changed*".


Hi Michael,

Here is the update plan that we discussed offline yesterday :

Currently the magic-byte which corresponds to the "message.format.version"
is set to 1.

1) On broker it will be set to 1 initially.

2) When a producer client sends a message with magic-byte = 2, since the
broker is on magic-byte = 1, we will down convert it, which means if the
tombstone bit is set, the value will be set to null. A consumer
understanding magic-byte = 1, will still work with this. A consumer working
with magic-byte =2 will also be able to understand this, since it
understands the tombstone.
Now there is still the question of supporting a non-tombstone and null
value from producer client with magic-byte = 2.* (I am not sure if we
should support this. Ismael/Becket can comment here)*

3) When almost all the clients have upgraded, the message.format.version on
the broker can be changed to 2, where in the down conversion in the above
step will not happen. If at this point we get a consumer request from an
older consumer, we might have to down convert where in we lose zero copy,
but these cases should be rare.

Becket can you review this plan and add more details if I have
missed/wronged something, before we put it on KIP.

Thanks,

Mayuresh

On Wed, Nov 16, 2016 at 11:07 PM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Thanks guys, for discussing this offline and getting some consensus.
>
> So its clear for myself and others what is proposed now (i think i
> understand, but want to make sure)
>
> Could i ask either directly update the kip to detail the migration
> strategy, or (re-)state your offline discussed and agreed migration
> strategy based on a magic byte is in this thread.
>
>
> The main original driver for the KIP was to support compaction where value
> isn't null, based off the discussions on KIP-82 thread.
>
> We should be able to support non-tombstone + null value by the completion
> of the KIP, as we noted when discussing this kip, having logic based on a
> null value isn't very clean and also separates the concerns.
>
> As discussed already though we can split this into KIP-87a and KIP-87b
>
> Where we look to deliver KIP-87a on a compacted topic (to address the
> immediate issues)
> * tombstone + null value
> * tombstone + non-null value
> * non-tombstone + non-null value
>
> Then we can discuss once KIP-87a is completed options later and how we
> support the second part KIP-87b to deliver:
> * non-tombstone + null value
>
> Cheers
> Mike
>
>
>
> ____
> From: Becket Qin <becket....@gmail.com>
> Sent: Thursday, November 17, 2016 1:43 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> Renu, Mayuresh and I had an offline discussion, and following is a brief
>

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-17 Thread Mayuresh Gharat
Hi Ismael,

Thanks for the explanation.
I especially like the part where you mentioned that we can get rid of the
older null value support for log compaction later on:
"We can't change semantics of the message format without having a long
transition period. And we can't rely on people reading documentation or
acting on a warning for something so fundamental. As such, my take is that
we need to bump the magic byte. The good news is that we don't have to
support all versions forever. We have said that we will support direct
upgrades for 2 years. That means that message format version n could, in
theory, be removed 2 years after it's introduced."

Just a heads up: even without bumping the magic byte, we will *NOT* lose
zero copy, because client(x+1) in my explanation above will internally
convert a null value to a set tombstone bit, and a set tombstone bit to a
null value, and by the time we move to version (x+2) the clients will have
upgraded. Obviously, if we support a request from consumer(x), we will lose
zero copy, but that is the same case with the magic byte bump.

But if the magic byte bump makes the transition easier, for the reasons
you explained, I am OK with it, since we are going to meet the end goal
down the road :)

On a side note, can we update the doc here on the magic byte to say that "*it
should be bumped whenever the message format is changed or the
interpretation of message format (usage of the reserved bits as well) is
changed*"?


Hi Michael,

Here is the update plan that we discussed offline yesterday:

Currently the magic-byte, which corresponds to "message.format.version",
is set to 1.

1) On the broker it will be set to 1 initially.

2) When a producer client sends a message with magic-byte = 2, since the
broker is on magic-byte = 1, we will down convert it, which means that if
the tombstone bit is set, the value will be set to null. A consumer
understanding magic-byte = 1 will still work with this. A consumer working
with magic-byte = 2 will also be able to understand this, since it
understands the tombstone.
Now there is still the question of supporting a non-tombstone and null
value from a producer client with magic-byte = 2. *(I am not sure if we
should support this. Ismael/Becket can comment here.)*

3) When almost all the clients have upgraded, the message.format.version on
the broker can be changed to 2, at which point the down conversion in the
above step will no longer happen. If at this point we get a request from an
older consumer, we might have to down convert, in which case we lose zero
copy, but these cases should be rare.
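
As a rough illustration of the down conversion in steps 2 and 3, here is a
minimal Python sketch. The Message class, the TOMBSTONE_MAGIC constant and the
down_convert function are hypothetical names made up for this example; they are
not Kafka's actual classes or wire format.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Assumption for this sketch: magic byte 2 is the first format with a tombstone bit.
TOMBSTONE_MAGIC = 2


@dataclass(frozen=True)
class Message:
    magic: int
    key: bytes
    value: Optional[bytes]
    tombstone: bool = False  # only meaningful when magic >= TOMBSTONE_MAGIC


def down_convert(msg: Message, target_magic: int) -> Message:
    """Return msg in the older target format, or msg itself if no conversion is needed."""
    if msg.magic <= target_magic:
        return msg  # nothing to do, the message can be passed through as-is
    if target_magic < TOMBSTONE_MAGIC:
        # The old format has no tombstone bit, so a delete is expressed as a null value.
        value = None if msg.tombstone else msg.value
        return Message(target_magic, msg.key, value, tombstone=False)
    return replace(msg, magic=target_magic)


# A tombstone that carries a non-null value, as produced by a magic-byte = 2 client.
new_delete = Message(magic=2, key=b"k", value=b"audit-info", tombstone=True)
old_view = down_convert(new_delete, target_magic=1)
```

In this sketch, a tombstone with a non-null value loses its value whenever it
is down converted, which matches the point raised earlier in the thread that
consumers can only expect to see such messages once step 3 is reached.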

Becket, can you review this plan and add more details if I have missed
something or got something wrong, before we put it in the KIP?

Thanks,

Mayuresh

On Wed, Nov 16, 2016 at 11:07 PM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Thanks guys, for discussing this offline and getting some consensus.
>
> So its clear for myself and others what is proposed now (i think i
> understand, but want to make sure)
>
> Could i ask either directly update the kip to detail the migration
> strategy, or (re-)state your offline discussed and agreed migration
> strategy based on a magic byte is in this thread.
>
>
> The main original driver for the KIP was to support compaction where value
> isn't null, based off the discussions on KIP-82 thread.
>
> We should be able to support non-tombstone + null value by the completion
> of the KIP, as we noted when discussing this kip, having logic based on a
> null value isn't very clean and also separates the concerns.
>
> As discussed already though we can split this into KIP-87a and KIP-87b
>
> Where we look to deliver KIP-87a on a compacted topic (to address the
> immediate issues)
> * tombstone + null value
> * tombstone + non-null value
> * non-tombstone + non-null value
>
> Then we can discuss once KIP-87a is completed options later and how we
> support the second part KIP-87b to deliver:
> * non-tombstone + null value
>
> Cheers
> Mike
>
>
>
> ____
> From: Becket Qin <becket@gmail.com>
> Sent: Thursday, November 17, 2016 1:43 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> Renu, Mayuresh and I had an offline discussion, and following is a brief
> summary.
>
> 1. We agreed that not bumping up magic value may result in losing zero copy
> during migration.
> 2. Given that bumping up magic value is almost free and has benefit of
> avoiding potential performance issue. It is probably worth doing.
>
> One issue we still need to think about is whether we want to support a
> non-tombstone message with null value.
> Currently it is not supported by Kafka. If we allow a non-tombstone null
> value message to exist after KI

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-16 Thread Michael Pearce
Thanks guys, for discussing this offline and getting some consensus.

So that it's clear for myself and others what is proposed now (I think I
understand, but want to make sure), could I ask that you either update the
KIP directly to detail the migration strategy, or (re-)state in this thread
the magic-byte-based migration strategy that you discussed and agreed
offline?


The main original driver for the KIP was to support compaction where the
value isn't null, based on the discussions on the KIP-82 thread.

We should be able to support non-tombstone + null value by the completion of
the KIP. As we noted when discussing this KIP, having logic based on a null
value isn't very clean, and a dedicated flag also separates the concerns.

As discussed already, though, we can split this into KIP-87a and KIP-87b.

In KIP-87a we look to deliver the following on a compacted topic (to address
the immediate issues):
* tombstone + null value
* tombstone + non-null value
* non-tombstone + non-null value

Then, once KIP-87a is completed, we can discuss the options and how we
support the second part, KIP-87b, to deliver:
* non-tombstone + null value
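
To make the four combinations concrete, here is a simplified sketch of
compaction keyed on the tombstone flag rather than on a null value. The
compact function and the (key, value, tombstone) tuple layout are assumptions
for illustration only. Real Kafka compaction also retains tombstones for a
while (delete.retention.ms), which this sketch ignores.

```python
from typing import List, Optional, Tuple

Record = Tuple[str, Optional[bytes], bool]  # (key, value, tombstone) - hypothetical layout


def compact(log: List[Record]) -> List[Tuple[str, Optional[bytes]]]:
    """Keep the latest record per key and drop keys whose latest record is a tombstone."""
    latest = {}
    for key, value, tombstone in log:
        latest[key] = (value, tombstone)  # later records overwrite earlier ones
    return [(key, value) for key, (value, tombstone) in latest.items() if not tombstone]


log = [
    ("a", b"1", False),       # non-tombstone + non-null value (KIP-87a)
    ("b", b"2", False),
    ("a", None, True),        # tombstone + null value (KIP-87a)
    ("b", b"reason", True),   # tombstone + non-null value (KIP-87a)
    ("c", None, False),       # non-tombstone + null value (KIP-87b)
]
compacted = compact(log)
```

Keys "a" and "b" are removed because their latest records are tombstones,
regardless of the value; key "c" survives with a null value, which is the
KIP-87b case that is still open.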

Cheers
Mike




From: Becket Qin <becket@gmail.com>
Sent: Thursday, November 17, 2016 1:43 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

Renu, Mayuresh and I had an offline discussion, and following is a brief
summary.

1. We agreed that not bumping up magic value may result in losing zero copy
during migration.
2. Given that bumping up magic value is almost free and has the benefit of
avoiding a potential performance issue, it is probably worth doing.

One issue we still need to think about is whether we want to support a
non-tombstone message with null value.
Currently it is not supported by Kafka. If we allow a non-tombstone null
value message to exist after KIP-87, the problem is that such a message will
not be supported by the consumers prior to KIP-87, because they will always
interpret a null value as a tombstone.

One option is that we keep the current way, i.e. do not support such
message. It would be good to know if there is a concrete use case for such
message. If there is not, we can probably just not support it.

Thanks,

Jiangjie (Becket) Qin



On Wed, Nov 16, 2016 at 1:28 PM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> Hi Ismael,
>
> This is something I can think of for migration plan:
> So the migration plan can look something like this, with up conversion :
>
> 1) Currently lets say we have Broker at version x.
> 2) Currently we have clients at version x.
> 3) a) We move the version to Broker(x+1) : supports both tombstone and null
> for log compaction.
> b) We upgrade the client to version client(x+1) : if in the producer
> client(x+1) the value is set to null, we will automatically set the
> Tombstone bit internally. If the producer client(x+1) sets the tombstone
> itself, well and good. For producer client(x), the broker will up convert
> to have the tombstone bit. Broker(x+1) is supporting both. Consumer
> client(x+1) will be aware of this and should be able to handle this. For
> consumer client(x) we will down convert the message on the broker side.
> c) At this point we will have to specify a warning or clearly specify
> in docs that this behavior is about to be changed for log compaction.
> 4) a) In next release of the Broker(x+2), we say that only Tombstone is
> used for log compaction on the Broker side. Clients(x+1) still is
> supported.
> b) We upgrade the client to version client(x+2) : if value is set to
> null, tombstone will not be set automatically. The client will have to call
> setTombstone() to actually set the tombstone.
>
> We should compare this migration plan with the migration plan for magic
> byte bump and do whatever looks good.
> I am just worried that if we go down the magic byte route, unless I am
> missing something, it sounds like Kafka will be stuck with supporting both
> null value and tombstone bit for log compaction indefinitely, which does
> not look like a good end state.
>
> Thanks,
>
> Mayuresh
>
>
>
>
> On Wed, Nov 16, 2016 at 9:32 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Ismael,
> >
> > That's a very good point which I might have not considered earlier.
> >
> > Here is a plan that I can think of:
> >
> > Stage 1) The broker from now on, up converts the message to have the
> > tombstone marker. The log compaction thread does log compaction based on
> > both null and tombstone marker. This is our transition period.
> > Stage 2) The next release we only say that log compaction is based on
> > tombstone marker. (Open source kafka makes this as a policy). By this time,
> > the organization which is moving to 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-16 Thread Becket Qin
Renu, Mayuresh and I had an offline discussion, and the following is a brief
summary.

1. We agreed that not bumping up the magic value may result in losing zero
copy during migration.
2. Given that bumping up the magic value is almost free and has the benefit
of avoiding a potential performance issue, it is probably worth doing.

One issue we still need to think about is whether we want to support a
non-tombstone message with a null value.
Currently it is not supported by Kafka. If we allow a non-tombstone null
value message to exist after KIP-87, the problem is that such a message will
not be supported by consumers prior to KIP-87, because they will always
interpret a null value as a tombstone.
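
The mismatch can be shown with a small sketch. Both function names below are
hypothetical and only model the two interpretation rules discussed here; they
are not taken from the Kafka code base.

```python
from typing import Optional


def is_delete_pre_kip87(value: Optional[bytes]) -> bool:
    """Pre-KIP-87 rule: a null value always means delete."""
    return value is None


def is_delete_kip87(tombstone: bool, value: Optional[bytes]) -> bool:
    """Rule under discussion: only the tombstone flag means delete; value is ignored."""
    return tombstone


# A non-tombstone message with a null value is read differently by the two rules.
new_reading = is_delete_kip87(tombstone=False, value=None)
old_reading = is_delete_pre_kip87(None)
```

Here new_reading is False while old_reading is True, so an older consumer would
treat as a delete a message that the producer did not mark as one.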

One option is that we keep the current behavior, i.e. do not support such
messages. It would be good to know if there is a concrete use case for them.
If there is not, we can probably just not support them.

Thanks,

Jiangjie (Becket) Qin



On Wed, Nov 16, 2016 at 1:28 PM, Mayuresh Gharat  wrote:

> Hi Ismael,
>
> This is something I can think of for migration plan:
> So the migration plan can look something like this, with up conversion :
>
> 1) Currently lets say we have Broker at version x.
> 2) Currently we have clients at version x.
> 3) a) We move the version to Broker(x+1) : supports both tombstone and null
> for log compaction.
> b) We upgrade the client to version client(x+1) : if in the producer
> client(x+1) the value is set to null, we will automatically set the
> Tombstone bit internally. If the producer client(x+1) sets the tombstone
> itself, well and good. For producer client(x), the broker will up convert
> to have the tombstone bit. Broker(x+1) is supporting both. Consumer
> client(x+1) will be aware of this and should be able to handle this. For
> consumer client(x) we will down convert the message on the broker side.
> c) At this point we will have to specify a warning or clearly specify
> in docs that this behavior is about to be changed for log compaction.
> 4) a) In next release of the Broker(x+2), we say that only Tombstone is
> used for log compaction on the Broker side. Clients(x+1) still is
> supported.
> b) We upgrade the client to version client(x+2) : if value is set to
> null, tombstone will not be set automatically. The client will have to call
> setTombstone() to actually set the tombstone.
>
> We should compare this migration plan with the migration plan for magic
> byte bump and do whatever looks good.
> I am just worried that if we go down the magic byte route, unless I am
> missing something, it sounds like Kafka will be stuck with supporting both
> null value and tombstone bit for log compaction indefinitely, which does
> not look like a good end state.
>
> Thanks,
>
> Mayuresh
>
>
>
>
> On Wed, Nov 16, 2016 at 9:32 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Ismael,
> >
> > That's a very good point which I might have not considered earlier.
> >
> > Here is a plan that I can think of:
> >
> > Stage 1) The broker from now on, up converts the message to have the
> > tombstone marker. The log compaction thread does log compaction based on
> > both null and tombstone marker. This is our transition period.
> > Stage 2) The next release we only say that log compaction is based on
> > tombstone marker. (Open source kafka makes this as a policy). By this time,
> > the organization which is moving to this release will be sure that they
> > have gone through the entire transition period.
> >
> > My only goal of doing this is that Kafka clearly specifies the end state
> > about what log compaction means (is it null value or a tombstone marker,
> > but not both).
> >
> > What do you think?
> >
> > Thanks,
> >
> > Mayuresh
> > .
> >
> > On Wed, Nov 16, 2016 at 9:17 AM, Ismael Juma  wrote:
> >
> >> One comment below.
> >>
> >> On Wed, Nov 16, 2016 at 5:08 PM, Mayuresh Gharat <
> >> gharatmayures...@gmail.com
> >> > wrote:
> >>
> > >> >- If we don't bump up the magic byte, on the broker side, the broker
> > >> >will always have to look at both tombstone bit and the value when do
> > >> >the compaction. Assuming we do not bump up the magic byte,
> > >> >imagine the broker sees a message which does not have a tombstone bit
> > >> >set. The broker does not know when the message was produced (i.e.
> > >> >whether the message has been up converted or not), it has to take a
> > >> >further look at the value to see if it is null or not in order to
> > >> >determine if it is a tombstone. The same logic has to be put on the
> > >> >consumer as well because the consumer does not know if the message has
> > >> >been up converted or not.
> > >> >   - If we upconvert while appending, this is not the case, right?
> >>
> >>
> >> If I understand you correctly, this is not sufficient because the log
> may
> >> have messages appended before it was upgraded to include KIP-87.
> >>
> >> 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-16 Thread Ismael Juma
Hi Mayuresh,

Thanks for describing your plan in detail. I understand the concern about
having to support both the old and the new way for a long time. In my opinion, we
don't have a choice in that matter. We can't change semantics of the
message format without having a long transition period. And we can't rely
on people reading documentation or acting on a warning for something so
fundamental.

As such, my take is that we need to bump the magic byte. The good news is
that we don't have to support all versions forever. We have said that we
will support direct upgrades for 2 years. That means that message format
version n could, in theory, be removed 2 years after it's introduced.
So far, we only have 2 message format versions and we have never removed
any. So, we'll see what will actually happen, in practice.

Ismael

On Wed, Nov 16, 2016 at 9:28 PM, Mayuresh Gharat  wrote:

> Hi Ismael,
>
> This is something I can think of for migration plan:
> So the migration plan can look something like this, with up conversion :
>
> 1) Currently lets say we have Broker at version x.
> 2) Currently we have clients at version x.
> 3) a) We move the version to Broker(x+1) : supports both tombstone and null
> for log compaction.
> b) We upgrade the client to version client(x+1) : if in the producer
> client(x+1) the value is set to null, we will automatically set the
> Tombstone bit internally. If the producer client(x+1) sets the tombstone
> itself, well and good. For producer client(x), the broker will up convert
> to have the tombstone bit. Broker(x+1) is supporting both. Consumer
> client(x+1) will be aware of this and should be able to handle this. For
> consumer client(x) we will down convert the message on the broker side.
> c) At this point we will have to specify a warning or clearly specify
> in docs that this behavior is about to be changed for log compaction.
> 4) a) In next release of the Broker(x+2), we say that only Tombstone is
> used for log compaction on the Broker side. Clients(x+1) still is
> supported.
> b) We upgrade the client to version client(x+2) : if value is set to
> null, tombstone will not be set automatically. The client will have to call
> setTombstone() to actually set the tombstone.
>
> We should compare this migration plan with the migration plan for magic
> byte bump and do whatever looks good.
> I am just worried that if we go down magic byte route, unless I am missing
> something, it sounds like kafka will be stuck with supporting both null
> value and tombstone bit for log compaction for life long, which does not
> look like a good end state.
>
> Thanks,
>
> Mayuresh
>
>
>
>
> On Wed, Nov 16, 2016 at 9:32 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Ismael,
> >
> > That's a very good point which I might have not considered earlier.
> >
> > Here is a plan that I can think of:
> >
> > Stage 1) The broker from now on, up converts the message to have the
> > tombstone marker. The log compaction thread does log compaction based on
> > both null and tombstone marker. This is our transition period.
> > Stage 2) The next release we only say that log compaction is based on
> > tombstone marker. (Open source kafka makes this as a policy). By this
> time,
> > the organization which is moving to this release will be sure that they
> > have gone through the entire transition period.
> >
> > My only goal of doing this is that Kafka clearly specifies the end state
> > about what log compaction means (is it null value or a tombstone marker,
> > but not both).
> >
> > What do you think?
> >
> > Thanks,
> >
> > Mayuresh
> > .
> >
> > On Wed, Nov 16, 2016 at 9:17 AM, Ismael Juma  wrote:
> >
> >> One comment below.
> >>
> >> On Wed, Nov 16, 2016 at 5:08 PM, Mayuresh Gharat <
> >> gharatmayures...@gmail.com
> >> > wrote:
> >>
> >> >- If we don't bump up the magic byte, on the broker side, the
> broker
> >> >will always have to look at both tombstone bit and the value when
> do
> >> the
> >> >compaction. Assuming we do not bump up the magic byte,
> >> >imagine the broker sees a message which does not have a tombstone
> bit
> >> >set. The broker does not know when the message was produced (i.e.
> >> > whether
> >> >the message has been up converted or not), it has to take a further
> >> > look at
> >> >the value to see if it is null or not in order to determine if it
> is
> >> a
> >> >tombstone. The same logic has to be put on the consumer as well
> >> because
> >> > the
> >> >consumer does not know if the message has been up converted or not.
> >> >   - If we upconvert while appending, this is not the case, right?
> >>
> >>
> >> If I understand you correctly, this is not sufficient because the log
> may
> >> have messages appended before it was upgraded to include KIP-87.
> >>
> >> Ismael
> >>
> >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>
>
>
> --
> 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-16 Thread Mayuresh Gharat
Hi Ismael,

This is a migration plan I can think of, using up conversion:

1) Currently, let's say we have the broker at version x.
2) Currently, we have clients at version x.
3) a) We move the broker to Broker(x+1), which supports both tombstone and
   null for log compaction.
   b) We upgrade the client to client(x+1): if the value is set to null in
   the producer client(x+1), we automatically set the tombstone bit
   internally. If the producer client(x+1) sets the tombstone itself, well
   and good. For producer client(x), the broker will up convert the message
   to have the tombstone bit. Broker(x+1) supports both. Consumer
   client(x+1) will be aware of this and should be able to handle it. For
   consumer client(x), we will down convert the message on the broker side.
   c) At this point we will have to issue a warning or clearly specify in
   the docs that this behavior is about to change for log compaction.
4) a) In the next release, Broker(x+2), we say that only the tombstone is
   used for log compaction on the broker side. Clients(x+1) are still
   supported.
   b) We upgrade the client to client(x+2): if the value is set to null,
   the tombstone will not be set automatically. The client will have to call
   setTombstone() to actually set the tombstone.
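
The plan above can be read as a small decision table. As a minimal sketch
in Java (the class, enum, and method names are hypothetical and are not
Kafka APIs; the release labels x, x+1, x+2 come from the plan), producer
and broker behavior per release might look like this:

```java
// Hypothetical model of the migration plan; none of these types exist in Kafka.
final class MigrationPlanSketch {
    enum Release { X, X_PLUS_1, X_PLUS_2 }

    /** Tombstone bit that a producer client of the given release would send. */
    static boolean producerTombstoneBit(Release client,
                                        boolean valueIsNull,
                                        boolean setTombstoneCalled) {
        switch (client) {
            case X:
                return false; // old clients do not know about the bit
            case X_PLUS_1:
                return setTombstoneCalled || valueIsNull; // null implies tombstone
            default:
                return setTombstoneCalled; // x+2: only an explicit setTombstone()
        }
    }

    /** Whether a broker of the given release treats the message as a delete. */
    static boolean brokerTreatsAsDelete(Release broker,
                                        boolean tombstoneBit,
                                        boolean valueIsNull) {
        switch (broker) {
            case X:
                return valueIsNull; // current behavior: null value only
            case X_PLUS_1:
                return tombstoneBit || valueIsNull; // transition: both
            default:
                return tombstoneBit; // x+2: tombstone bit only
        }
    }
}
```

Under these assumptions, a null value sent by client(x+2) without calling
setTombstone() is kept by Broker(x+2), which is the behavior change that
step 3c) says needs a warning.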

We should compare this migration plan with the migration plan for the magic
byte bump and go with whichever looks better.
I am just worried that if we go down the magic byte route, unless I am
missing something, it sounds like Kafka will be stuck supporting both the
null value and the tombstone bit for log compaction forever, which does not
look like a good end state.

Thanks,

Mayuresh




On Wed, Nov 16, 2016 at 9:32 AM, Mayuresh Gharat  wrote:

> Hi Ismael,
>
> That's a very good point which I might have not considered earlier.
>
> Here is a plan that I can think of:
>
> Stage 1) The broker from now on, up converts the message to have the
> tombstone marker. The log compaction thread does log compaction based on
> both null and tombstone marker. This is our transition period.
> Stage 2) The next release we only say that log compaction is based on
> tombstone marker. (Open source kafka makes this as a policy). By this time,
> the organization which is moving to this release will be sure that they
> have gone through the entire transition period.
>
> My only goal of doing this is that Kafka clearly specifies the end state
> about what log compaction means (is it null value or a tombstone marker,
> but not both).
>
> What do you think?
>
> Thanks,
>
> Mayuresh
> .
>
> On Wed, Nov 16, 2016 at 9:17 AM, Ismael Juma  wrote:
>
>> One comment below.
>>
>> On Wed, Nov 16, 2016 at 5:08 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com
>> > wrote:
>>
>> >- If we don't bump up the magic byte, on the broker side, the broker
>> >will always have to look at both tombstone bit and the value when do
>> the
>> >compaction. Assuming we do not bump up the magic byte,
>> >imagine the broker sees a message which does not have a tombstone bit
>> >set. The broker does not know when the message was produced (i.e.
>> > whether
>> >the message has been up converted or not), it has to take a further
>> > look at
>> >the value to see if it is null or not in order to determine if it is
>> a
>> >tombstone. The same logic has to be put on the consumer as well
>> because
>> > the
>> >consumer does not know if the message has been up converted or not.
>> >   - If we upconvert while appending, this is not the case, right?
>>
>>
>> If I understand you correctly, this is not sufficient because the log may
>> have messages appended before it was upgraded to include KIP-87.
>>
>> Ismael
>>
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-16 Thread Mayuresh Gharat
Hi Ismael,

That's a very good point which I might not have considered earlier.

Here is a plan that I can think of:

Stage 1) From now on, the broker up converts messages to have the
tombstone marker. The log compaction thread compacts based on both the null
value and the tombstone marker. This is our transition period.
Stage 2) In the next release, we say that log compaction is based only on
the tombstone marker (open source Kafka makes this a policy). By this time,
any organization moving to this release will be sure that it has gone
through the entire transition period.

My only goal in doing this is for Kafka to clearly specify the end state
for what log compaction means (either a null value or a tombstone marker,
but not both).

What do you think?

Thanks,

Mayuresh

On Wed, Nov 16, 2016 at 9:17 AM, Ismael Juma  wrote:

> One comment below.
>
> On Wed, Nov 16, 2016 at 5:08 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> >- If we don't bump up the magic byte, on the broker side, the broker
> >will always have to look at both tombstone bit and the value when do
> the
> >compaction. Assuming we do not bump up the magic byte,
> >imagine the broker sees a message which does not have a tombstone bit
> >set. The broker does not know when the message was produced (i.e.
> > whether
> >the message has been up converted or not), it has to take a further
> > look at
> >the value to see if it is null or not in order to determine if it is a
> >tombstone. The same logic has to be put on the consumer as well
> because
> > the
> >consumer does not know if the message has been up converted or not.
> >   - If we upconvert while appending, this is not the case, right?
>
>
> If I understand you correctly, this is not sufficient because the log may
> have messages appended before it was upgraded to include KIP-87.
>
> Ismael
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-16 Thread Ismael Juma
One comment below.

On Wed, Nov 16, 2016 at 5:08 PM, Mayuresh Gharat  wrote:

>- If we don't bump up the magic byte, on the broker side, the broker
>will always have to look at both tombstone bit and the value when do the
>compaction. Assuming we do not bump up the magic byte,
>imagine the broker sees a message which does not have a tombstone bit
>set. The broker does not know when the message was produced (i.e.
> whether
>the message has been up converted or not), it has to take a further
> look at
>the value to see if it is null or not in order to determine if it is a
>tombstone. The same logic has to be put on the consumer as well because
> the
>consumer does not know if the message has been up converted or not.
>   - If we upconvert while appending, this is not the case, right?


If I understand you correctly, this is not sufficient because the log may
have messages appended before it was upgraded to include KIP-87.
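
To illustrate the point, here is a minimal Java sketch of a log that
already holds a delete written before the upgrade. The `Rec` type and
`countDeletes` method are hypothetical and only model the two fields under
discussion; they are not Kafka classes.

```java
// Hypothetical model: a log that mixes pre-upgrade and post-upgrade messages.
final class PreUpgradeLogSketch {
    static final class Rec {
        final boolean tombstoneBit;
        final boolean valueIsNull;

        Rec(boolean tombstoneBit, boolean valueIsNull) {
            this.tombstoneBit = tombstoneBit;
            this.valueIsNull = valueIsNull;
        }
    }

    /** Counts messages the compactor would treat as deletes. */
    static int countDeletes(Rec[] log, boolean alsoCheckNullValue) {
        int deletes = 0;
        for (Rec r : log) {
            if (r.tombstoneBit || (alsoCheckNullValue && r.valueIsNull)) {
                deletes++;
            }
        }
        return deletes;
    }
}
```

With a log holding one pre-upgrade delete (null value, bit not set), one
up-converted delete (bit set), and one ordinary message, a check on the bit
alone finds one delete while a check on bit or null value finds two. The
pre-upgrade delete is the one that up conversion at append time never
touches.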

Ismael


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-16 Thread Mayuresh Gharat
ave to look at both tombstone bit and the
> value when do the compaction. Assuming we do not bump up the magic byte,
> imagine the broker sees a message which does not have a tombstone bit set.
> The broker does not know when the message was produced (i.e. whether the
> message has been up converted or not), it has to take a further look at the
> value to see if it is null or not in order to determine if it is a
> tombstone. The same logic has to be put on the consumer as well because the
> consumer does not know if the message has been up converted or not. With a
> magic value bump, the broker/consumer knows for sure there is no need to
> look at the value anymore if the tombstone bit is not set. So if we want to
> eventually only use tombstone bit, a magic value bump is necessary.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Nov 14, 2016 at 11:39 PM, Renu Tewari <tewa...@gmail.com> wrote:
>
> > Is upping the magic byte to 2 needed?
> >
> > In your example say
> > For broker api version at or above 0.10.2 the tombstone bit will be used
> > for log compaction deletion.
> > If the producerequest version is less than 0.10.2 and the message is
> null,
> > the broker will up convert to set the tombstone bit on
> > If the producerequest version is at or above 0.10.2 then the tombstone
> bit
> > value is unchanged
> > Either way the new version of broker only  uses the tombstone bit
> > internally.
> >
> > thanks
> > Renu
> >
> > On Mon, Nov 14, 2016 at 8:31 PM, Becket Qin <becket@gmail.com>
> wrote:
> >
> > > If we follow the current way of doing this format change, it would work
> > the
> > > following way:
> > >
> > > 0. Bump up the magic byte to 2 to indicate the tombstone bit is used.
> > >
> > > 1. On receiving a ProduceRequest, broker always convert the messages to
> > the
> > > configured message.format.version.
> > > 1.1 If the message version does not match the configured
> > > message.format.version, the broker will either up convert or down
> convert
> > > the message. In that case, users pay the performance cost of
> > re-compression
> > > if needed.
> > > 1.2 If the message version matches the configured
> message.format.version,
> > > the broker will not do the format conversion and user may save the
> > > re-compression cost if the message.format.version is on or above
> 0.10.0.
> > >
> > > 2. On receiving a FetchRequest, the broker check the FetchRequest
> version
> > > to see if the consumer supports the configured message.format.version
> or
> > > not. If the consumer does not support it, down conversion is required
> and
> > > zero copy is lost. Otherwise zero copy is used to return the
> > FetchResponse.
> > >
> > > Notice that setting message.format.version to 0.10.2 is equivalent to
> > > always up convert if needed, but that also means to always down convert
> > if
> > > there is an old consumer.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > >
> > >
> > > On Mon, Nov 14, 2016 at 1:43 PM, Michael Pearce <michael.pea...@ig.com
> >
> > > wrote:
> > >
> > > > I like the idea of up converting and then just having the logic to
> look
> > > > for tombstones. It makes that quite clean in nature.
> > > >
> > > > It's quite late here in the UK, so I fully understand / confirm I
> > > > understand what you propose could you write it on the kip wiki or
> fully
> > > > describe exactly how you see it working, so uk morning I could read
> > > through?
> > > >
> > > > Thanks all for the input on this it is appreciated.
> > > >
> > > >
> > > > Sent using OWA for iPhone
> > > > 
> > > > From: Mayuresh Gharat <gharatmayures...@gmail.com>
> > > > Sent: Monday, November 14, 2016 9:28:16 PM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > > >
> > > > Hi Michael,
> > > >
> > > > Just another thing that came up during my discussion with Renu and I
> > > wanted
> > > > to share it.
> > > >
> > > > Other thing we can do to handle a mixture of old and new clients is
> > when
> > > > once the new broker with this KIP is deployed, the new code should
>

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-15 Thread Becket Qin
Hi Renu,

Technically speaking, we may not need to bump up the magic value. A
tombstone would be a message with either the tombstone bit set OR a null
value (once all the clients have been upgraded, it automatically becomes
based only on the tombstone bit). However, this leads to a few subtle issues.

1. Migration. In practice, the up-conversion may not be used until all the
clients are upgraded. Kafka requires the client version to be no newer than
the server version for compatibility. That means the new server code is
always deployed while clients are still running old code. When rolling out
the new server code, we will keep the message format at 0.10.1. We cannot
use message format version 0.10.2 yet, because if we do the brokers will
suddenly lose zero copy. (This is because all the clients are still
running old code. If we bump up the message version to 0.10.2, the broker
will have to look at the message to see if down conversion is needed.)
Later on, when most of the consumers have upgraded, we will bump up the
message format version to 0.10.2. So the broker cannot always up convert
and depend on the tombstone bit, even with the new code.

2. Long term efficiency. If we don't bump up the magic byte, the broker
will always have to look at both the tombstone bit and the value when doing
compaction. Imagine the broker sees a message that does not have the
tombstone bit set. The broker does not know when the message was produced
(i.e. whether the message has been up converted or not), so it has to take
a further look at the value to see whether it is null in order to determine
if it is a tombstone. The same logic has to be put in the consumer as well,
because the consumer does not know whether the message has been up
converted either. With a magic value bump, the broker/consumer knows for
sure that there is no need to look at the value if the tombstone bit is not
set. So if we want to eventually use only the tombstone bit, a magic value
bump is necessary.
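
The difference in point 2 can be sketched as two predicates. This is a
minimal Java illustration, not Kafka code: the class and method names are
hypothetical, and the magic value 2 is the bump proposed earlier in this
thread.

```java
// Hypothetical sketch of the tombstone check with and without a magic bump.
final class TombstoneCheckSketch {
    /** Magic value proposed in the thread for messages that carry the bit. */
    static final byte MAGIC_WITH_TOMBSTONE_BIT = 2;

    /** No bump: a cleared bit proves nothing, so the value is always checked. */
    static boolean isTombstoneNoBump(boolean tombstoneBit, boolean valueIsNull) {
        return tombstoneBit || valueIsNull;
    }

    /** With a bump: the magic byte selects exactly one rule per message. */
    static boolean isTombstoneWithBump(byte magic,
                                       boolean tombstoneBit,
                                       boolean valueIsNull) {
        if (magic >= MAGIC_WITH_TOMBSTONE_BIT) {
            return tombstoneBit; // new format: the value is never inspected
        }
        return valueIsNull; // old format: null value keeps its old meaning
    }
}
```

With the bump, a magic 2 message with a null value and the bit cleared is
not a tombstone, while the same payload under an older magic value still
is. Each magic value keeps a single interpretation.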

Thanks,

Jiangjie (Becket) Qin

On Mon, Nov 14, 2016 at 11:39 PM, Renu Tewari <tewa...@gmail.com> wrote:

> Is upping the magic byte to 2 needed?
>
> In your example say
> For broker api version at or above 0.10.2 the tombstone bit will be used
> for log compaction deletion.
> If the producerequest version is less than 0.10.2 and the message is null,
> the broker will up convert to set the tombstone bit on
> If the producerequest version is at or above 0.10.2 then the tombstone bit
> value is unchanged
> Either way the new version of broker only  uses the tombstone bit
> internally.
>
> thanks
> Renu
>
> On Mon, Nov 14, 2016 at 8:31 PM, Becket Qin <becket@gmail.com> wrote:
>
> > If we follow the current way of doing this format change, it would work
> the
> > following way:
> >
> > 0. Bump up the magic byte to 2 to indicate the tombstone bit is used.
> >
> > 1. On receiving a ProduceRequest, broker always convert the messages to
> the
> > configured message.format.version.
> > 1.1 If the message version does not match the configured
> > message.format.version, the broker will either up convert or down convert
> > the message. In that case, users pay the performance cost of
> re-compression
> > if needed.
> > 1.2 If the message version matches the configured message.format.version,
> > the broker will not do the format conversion and user may save the
> > re-compression cost if the message.format.version is on or above 0.10.0.
> >
> > 2. On receiving a FetchRequest, the broker check the FetchRequest version
> > to see if the consumer supports the configured message.format.version or
> > not. If the consumer does not support it, down conversion is required and
> > zero copy is lost. Otherwise zero copy is used to return the
> FetchResponse.
> >
> > Notice that setting message.format.version to 0.10.2 is equivalent to
> > always up convert if needed, but that also means to always down convert
> if
> > there is an old consumer.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> > On Mon, Nov 14, 2016 at 1:43 PM, Michael Pearce <michael.pea...@ig.com>
> > wrote:
> >
> > > I like the idea of up converting and then just having the logic to look
> > > for tombstones. It makes that quite clean in nature.
> > >
> > > It's quite late here in the UK, so I fully understand / confirm I
> > > understand what you propose could you write it on the kip wiki or fully
> > > describe exactly how you see it working, so uk morning I could read
> > through?
> > >
> > > Thanks all for the input on this it is appreciated.
> > >
> > >
> > > Sent u

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-14 Thread Renu Tewari
Is upping the magic byte to 2 needed?

In your example, say:
For broker API version at or above 0.10.2, the tombstone bit will be used
for log compaction deletion.
If the ProduceRequest version is less than 0.10.2 and the message value is
null, the broker will up convert to set the tombstone bit.
If the ProduceRequest version is at or above 0.10.2, the tombstone bit
value is left unchanged.
Either way, the new version of the broker uses only the tombstone bit
internally.
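
As a rough Java sketch of the rule described above (the class, method, and
parameter names are hypothetical; the boolean `producerKnowsTombstoneBit`
stands in for "ProduceRequest version at or above 0.10.2"):

```java
// Hypothetical sketch of up conversion on the produce path.
final class ProduceUpConvertSketch {
    /** Tombstone bit the broker would store for an incoming message. */
    static boolean tombstoneBitToStore(boolean producerKnowsTombstoneBit,
                                       boolean bitFromProducer,
                                       boolean valueIsNull) {
        if (!producerKnowsTombstoneBit) {
            // Old producer: a null value is up converted to a set bit.
            return valueIsNull;
        }
        // New producer: the bit is stored as sent.
        return bitFromProducer;
    }
}
```

This only covers messages that arrive after the new broker is deployed. It
says nothing about messages that are already in the log.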

thanks
Renu

On Mon, Nov 14, 2016 at 8:31 PM, Becket Qin <becket@gmail.com> wrote:

> If we follow the current way of doing this format change, it would work the
> following way:
>
> 0. Bump up the magic byte to 2 to indicate the tombstone bit is used.
>
> 1. On receiving a ProduceRequest, broker always convert the messages to the
> configured message.format.version.
> 1.1 If the message version does not match the configured
> message.format.version, the broker will either up convert or down convert
> the message. In that case, users pay the performance cost of re-compression
> if needed.
> 1.2 If the message version matches the configured message.format.version,
> the broker will not do the format conversion and user may save the
> re-compression cost if the message.format.version is on or above 0.10.0.
>
> 2. On receiving a FetchRequest, the broker check the FetchRequest version
> to see if the consumer supports the configured message.format.version or
> not. If the consumer does not support it, down conversion is required and
> zero copy is lost. Otherwise zero copy is used to return the FetchResponse.
>
> Notice that setting message.format.version to 0.10.2 is equivalent to
> always up convert if needed, but that also means to always down convert if
> there is an old consumer.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> On Mon, Nov 14, 2016 at 1:43 PM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > I like the idea of up converting and then just having the logic to look
> > for tombstones. It makes that quite clean in nature.
> >
> > It's quite late here in the UK, so I fully understand / confirm I
> > understand what you propose could you write it on the kip wiki or fully
> > describe exactly how you see it working, so uk morning I could read
> through?
> >
> > Thanks all for the input on this it is appreciated.
> >
> >
> > Sent using OWA for iPhone
> > ________________________
> > From: Mayuresh Gharat <gharatmayures...@gmail.com>
> > Sent: Monday, November 14, 2016 9:28:16 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> >
> > Hi Michael,
> >
> > Just another thing that came up during my discussion with Renu and I
> wanted
> > to share it.
> >
> > Other thing we can do to handle a mixture of old and new clients is when
> > once the new broker with this KIP is deployed, the new code should check
> > the request version from older producer we can up convert it with a
> > tombstone marker when appending the message to the log. This is similar
> to
> > down converting messages for older clients.
> >
> > If this is possible then the broker in this case has to rely only on the
> > tombstone marker for log compaction. Using this approach we preserve the
> > description of when to update the magic byte as described here :
> > https://kafka.apache.org/documentation#messageformat (1 byte "magic"
> > identifier to allow format changes).
> >
> > In stage 2, if we don't want open source kafka to make the decision of
> > deprecation of null value for log compaction (which is the approach I
> would
> > prefer as an end state) since there are some concerns on this, individual
> > companies/orgs can then have a horizontal initiative at their convenience
> > to move to stage 2 by asking all app users to upgrade to new kafka
> clients.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Mon, Nov 14, 2016 at 11:50 AM, Becket Qin <becket@gmail.com>
> wrote:
> >
> > > Michael,
> > >
> > > Yes, I am OK with stage 1. We can discuss about stage 2 later but this
> > > sounds really an organization specific decision to deprecate an API. It
> > > does not seem a general need in open source Kafka to only support
> > tombstone
> > > bit , which is a bad thing for people who are still running old clients
> > in
> > > the community. This is exactly why we want to have magic byte bump -
> > there
> > > should be only one definitive way to interpret a message with a given
> > magic
>

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-14 Thread Becket Qin
If we follow the current way of doing this format change, it would work as
follows:

0. Bump up the magic byte to 2 to indicate the tombstone bit is used.

1. On receiving a ProduceRequest, the broker always converts the messages
to the configured message.format.version.
1.1 If the message version does not match the configured
message.format.version, the broker will either up convert or down convert
the message. In that case, users pay the performance cost of re-compression
if needed.
1.2 If the message version matches the configured message.format.version,
the broker will not do the format conversion, and users may save the
re-compression cost if the message.format.version is at or above 0.10.0.

2. On receiving a FetchRequest, the broker checks the FetchRequest version
to see whether the consumer supports the configured message.format.version.
If the consumer does not support it, down conversion is required and zero
copy is lost. Otherwise, zero copy is used to return the FetchResponse.

Notice that setting message.format.version to 0.10.2 is equivalent to
always up converting if needed, but that also means always down converting
if there is an old consumer.
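
Steps 1 and 2 boil down to two comparisons against the configured format.
A minimal Java sketch, assuming formats are compared by magic value (the
class and method names are hypothetical, not broker code):

```java
// Hypothetical sketch of the conversion decisions in steps 1 and 2.
final class FormatConversionSketch {
    /** Step 1: convert on produce only when the magic values differ. */
    static boolean needsConversionOnProduce(int messageMagic, int configuredMagic) {
        return messageMagic != configuredMagic;
    }

    /** Step 2: zero copy survives only if the consumer reads the configured format. */
    static boolean canUseZeroCopy(int maxMagicConsumerSupports, int configuredMagic) {
        return maxMagicConsumerSupports >= configuredMagic;
    }
}
```

For example, with the configured magic at 2, a magic 1 producer triggers up
conversion on every append, and a consumer that only understands magic 1
forces down conversion and loses zero copy on every fetch.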

Thanks,

Jiangjie (Becket) Qin



On Mon, Nov 14, 2016 at 1:43 PM, Michael Pearce <michael.pea...@ig.com>
wrote:

> I like the idea of up converting and then just having the logic to look
> for tombstones. It makes that quite clean in nature.
>
> It's quite late here in the UK, so I fully understand / confirm I
> understand what you propose could you write it on the kip wiki or fully
> describe exactly how you see it working, so uk morning I could read through?
>
> Thanks all for the input on this it is appreciated.
>
>
> Sent using OWA for iPhone
> 
> From: Mayuresh Gharat <gharatmayures...@gmail.com>
> Sent: Monday, November 14, 2016 9:28:16 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> Hi Michael,
>
> Just another thing that came up during my discussion with Renu and I wanted
> to share it.
>
> Other thing we can do to handle a mixture of old and new clients is when
> once the new broker with this KIP is deployed, the new code should check
> the request version from older producer we can up convert it with a
> tombstone marker when appending the message to the log. This is similar to
> down converting messages for older clients.
>
> If this is possible then the broker in this case has to rely only on the
> tombstone marker for log compaction. Using this approach we preserve the
> description of when to update the magic byte as described here :
> https://kafka.apache.org/documentation#messageformat (1 byte "magic"
> identifier to allow format changes).
>
> In stage 2, if we don't want open source kafka to make the decision of
> deprecation of null value for log compaction (which is the approach I would
> prefer as an end state) since there are some concerns on this, individual
> companies/orgs can then have a horizontal initiative at their convenience
> to move to stage 2 by asking all app users to upgrade to new kafka clients.
>
> Thanks,
>
> Mayuresh
>
> On Mon, Nov 14, 2016 at 11:50 AM, Becket Qin <becket@gmail.com> wrote:
>
> > Michael,
> >
> > Yes, I am OK with stage 1. We can discuss about stage 2 later but this
> > sounds really an organization specific decision to deprecate an API. It
> > does not seem a general need in open source Kafka to only support
> tombstone
> > bit , which is a bad thing for people who are still running old clients
> in
> > the community. This is exactly why we want to have magic byte bump -
> there
> > should be only one definitive way to interpret a message with a given
> magic
> > byte. We should avoid re-define the interpretation of a message with an
> > existing magic byte. The interpretation of an old message format should
> not
> > be changed and should always be supported until deprecated.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Nov 14, 2016 at 11:28 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > I am not sure about "If I understand correctly, you want to let the
> > broker
> > > to reject requests
> > > from old clients to ensure everyone in an organization has upgraded,
> > > right?"
> > >
> > > I don't think we will be rejecting requests. What phase2 (stage 2)
> meant
> > > was we will only do log compaction based on tombstone marker and
> nothing
> > > else.
> > > I am not sure about making this configurable, as a configuration makes
> it
> > > sound li

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-14 Thread Mayuresh Gharat
Hi Michael,

Just another thing that came up during my discussion with Renu that I
wanted to share.

Another thing we can do to handle a mixture of old and new clients: once
the new broker with this KIP is deployed, the new code should check the
request version, and for requests from older producers it can up convert
the message with a tombstone marker when appending it to the log. This is
similar to down converting messages for older clients.

If this is possible, then the broker only has to rely on the tombstone
marker for log compaction. Using this approach we preserve the description
of when to update the magic byte as described here:
https://kafka.apache.org/documentation#messageformat (1 byte "magic"
identifier to allow format changes).

In stage 2, if we don't want open source Kafka to make the decision to
deprecate the null value for log compaction (which is the end state I would
prefer), since there are some concerns about this, individual
companies/orgs can run a horizontal initiative at their convenience to move
to stage 2 by asking all app users to upgrade to the new Kafka clients.

Thanks,

Mayuresh

On Mon, Nov 14, 2016 at 11:50 AM, Becket Qin <becket@gmail.com> wrote:

> Michael,
>
> Yes, I am OK with stage 1. We can discuss about stage 2 later but this
> sounds really an organization specific decision to deprecate an API. It
> does not seem a general need in open source Kafka to only support tombstone
> bit , which is a bad thing for people who are still running old clients in
> the community. This is exactly why we want to have magic byte bump - there
> should be only one definitive way to interpret a message with a given magic
> byte. We should avoid re-define the interpretation of a message with an
> existing magic byte. The interpretation of an old message format should not
> be changed and should always be supported until deprecated.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Nov 14, 2016 at 11:28 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > I am not sure about "If I understand correctly, you want to let the
> broker
> > to reject requests
> > from old clients to ensure everyone in an organization has upgraded,
> > right?"
> >
> > I don't think we will be rejecting requests. What phase2 (stage 2) meant
> > was we will only do log compaction based on tombstone marker and nothing
> > else.
> > I am not sure about making this configurable, as a configuration makes it
> > sound like the broker has the capability to support null and tombstone
> > marker bit, for log compaction, for life long.
> >
> > I agree with Michael's idea of delivering phase 1 (stage1) right now. It
> > will also be good to put a note saying that we will be deprecating the
> > older way of log compaction in next release.
> > Phase 2 (stage 2) will actually only rely on tombstone marker for doing
> log
> > compaction.
> > So this actually meets our end state of having tombstone marker for log
> > compaction support.
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Mon, Nov 14, 2016 at 11:09 AM, Michael Pearce <michael.pea...@ig.com>
> > wrote:
> >
> > > I'm ok with this, but I'd like to at least get phase 1 in, for the next
> > > release, this is what I'm very keen for,
> > >
> > > As such shall we say we have
> > > Kip-87a that delivers phase 1
> > >
> > > And then
> > > Kip-87b in release after that delivers phase 2 but has a dependency on
> > > support of deprecating old api versions
> > >
> > > How does this approach sound?
> > >
> > >
> > > 
> > > From: Becket Qin <becket@gmail.com>
> > > Sent: Monday, November 14, 2016 6:14:44 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > Hey Michael and Mayuresh,
> > >
> > > If I understand correctly, you want to let the broker to reject
> requests
> > > from old clients to ensure everyone in an organization has upgraded,
> > right?
> > > This is essentially deprecating an old protocol. I agree it is useful
> and
> > > that is why we have that baked in KIP-35. However, we haven't thought
> > > through how to deprecate an API so far. From broker's point of view.,
> it
> > > will always support all the old protocols according to Kafka's widely
> > known
> > > compatibility guarantee. If people decide to deprecate an API within an
> > > organization, we can provide some additi

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-14 Thread Becket Qin
Michael,

Yes, I am OK with stage 1. We can discuss stage 2 later, but it really
sounds like an organization-specific decision to deprecate an API. There does
not seem to be a general need in open source Kafka to support only the
tombstone bit, which would be a bad thing for people in the community who are
still running old clients. This is exactly why we want a magic byte bump: there
should be only one definitive way to interpret a message with a given magic
byte. We should avoid redefining the interpretation of a message with an
existing magic byte. The interpretation of an old message format should not
be changed and should always be supported until deprecated.
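
The "one definitive interpretation per magic byte" rule can be sketched roughly as below. This is an illustration only: the function name and the `TOMBSTONE_FLAG` bit position are hypothetical, and the magic values follow the numbering used elsewhere in this thread (0 and 1 have no tombstone flag, 2 has one).

```python
from typing import Optional

# Hypothetical bit position for the tombstone marker in the attributes byte.
TOMBSTONE_FLAG = 0x20


def is_tombstone(magic: int, attributes: int, value: Optional[bytes]) -> bool:
    """Interpret a message strictly according to its own magic byte."""
    if magic < 2:
        # Old formats: a null value is the only way to express a delete.
        return value is None
    # New format: only the attribute bit matters, so a null value is legal data.
    return bool(attributes & TOMBSTONE_FLAG)
```

The point of the design is that the meaning of magic 0 and 1 messages never changes, while magic 2 messages get their own fixed meaning.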

Thanks,

Jiangjie (Becket) Qin

On Mon, Nov 14, 2016 at 11:28 AM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:

> I am not sure about "If I understand correctly, you want to let the broker
> to reject requests
> from old clients to ensure everyone in an organization has upgraded,
> right?"
>
> I don't think we will be rejecting requests. What phase2 (stage 2) meant
> was we will only do log compaction based on tombstone marker and nothing
> else.
> I am not sure about making this configurable, as a configuration makes it
> sound like the broker has the capability to support null and tombstone
> marker bit, for log compaction, for life long.
>
> I agree with Michael's idea of delivering phase 1 (stage1) right now. It
> will also be good to put a note saying that we will be deprecating the
> older way of log compaction in next release.
> Phase 2 (stage 2) will actually only rely on tombstone marker for doing log
> compaction.
> So this actually meets our end state of having tombstone marker for log
> compaction support.
>
> Thanks,
>
> Mayuresh
>
>
> On Mon, Nov 14, 2016 at 11:09 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > I'm ok with this, but I'd like to at least get phase 1 in, for the next
> > release, this is what I'm very keen for,
> >
> > As such shall we say we have
> > Kip-87a that delivers phase 1
> >
> > And then
> > Kip-87b in release after that delivers phase 2 but has a dependency on
> > support of deprecating old api versions
> >
> > How does this approach sound?
> >
> >
> > ________________
> > From: Becket Qin <becket@gmail.com>
> > Sent: Monday, November 14, 2016 6:14:44 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> >
> > Hey Michael and Mayuresh,
> >
> > If I understand correctly, you want to let the broker to reject requests
> > from old clients to ensure everyone in an organization has upgraded,
> right?
> > This is essentially deprecating an old protocol. I agree it is useful and
> > that is why we have that baked in KIP-35. However, we haven't thought
> > through how to deprecate an API so far. From broker's point of view., it
> > will always support all the old protocols according to Kafka's widely
> known
> > compatibility guarantee. If people decide to deprecate an API within an
> > organization, we can provide some additional configuration to let people
> do
> > this, which is not a bad idea but seems better to be discussed in another
> > KIP.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Nov 14, 2016 at 8:52 AM, Michael Pearce <michael.pea...@ig.com>
> > wrote:
> >
> > > I agree with Mayuresh.
> > >
> > > I don't see how having a magic byte helps here.
> > >
> > > What we are saying is that on day 1 after an upgrade both tombstone
> flag
> > > or a null value will be treated as a marker to delete on compacted
> topic.
> > >
> > > During this time we expect organisations to migrate themselves over
> onto
> > > the new producers and consumers that call the new Api, changing their
> > > application logic. during this point producers should start setting the
> > > tombstone marker and consumers to start behaving based on the tombstone
> > > marker.
> > >
> > > Only after all producers and consumers are upgraded then we enter
> phase 2
> > > disabling the use of a null value as a delete marker. And the old apis
> > > would not be allowed to be called as the broker now expects all clients
> > to
> > > be using latest api only.
> > >
> > > At this second phase stage only also we can support a null value in a
> > > compacted topic without it being treated as deletion. We solely base on
> > the
> > > tombstone marker.
> > >
> > >
>

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-14 Thread Mayuresh Gharat
I am not sure about "If I understand correctly, you want to let the broker
to reject requests
from old clients to ensure everyone in an organization has upgraded,
right?"

I don't think we will be rejecting requests. What phase 2 (stage 2) meant
was that we will do log compaction based only on the tombstone marker and
nothing else.
I am not sure about making this configurable, as a configuration makes it
sound like the broker will be able to support both the null value and the
tombstone marker bit for log compaction forever.

I agree with Michael's idea of delivering phase 1 (stage 1) right now. It
would also be good to add a note saying that we will be deprecating the
older way of log compaction in the next release.
Phase 2 (stage 2) will rely only on the tombstone marker for log
compaction.
So this meets our end state of having tombstone marker based log
compaction support.

Thanks,

Mayuresh


On Mon, Nov 14, 2016 at 11:09 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> I'm ok with this, but I'd like to at least get phase 1 in, for the next
> release, this is what I'm very keen for,
>
> As such shall we say we have
> Kip-87a that delivers phase 1
>
> And then
> Kip-87b in release after that delivers phase 2 but has a dependency on
> support of deprecating old api versions
>
> How does this approach sound?
>
>
> 
> From: Becket Qin <becket@gmail.com>
> Sent: Monday, November 14, 2016 6:14:44 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> Hey Michael and Mayuresh,
>
> If I understand correctly, you want to let the broker to reject requests
> from old clients to ensure everyone in an organization has upgraded, right?
> This is essentially deprecating an old protocol. I agree it is useful and
> that is why we have that baked in KIP-35. However, we haven't thought
> through how to deprecate an API so far. From broker's point of view., it
> will always support all the old protocols according to Kafka's widely known
> compatibility guarantee. If people decide to deprecate an API within an
> organization, we can provide some additional configuration to let people do
> this, which is not a bad idea but seems better to be discussed in another
> KIP.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Nov 14, 2016 at 8:52 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > I agree with Mayuresh.
> >
> > I don't see how having a magic byte helps here.
> >
> > What we are saying is that on day 1 after an upgrade both tombstone flag
> > or a null value will be treated as a marker to delete on compacted topic.
> >
> > During this time we expect organisations to migrate themselves over onto
> > the new producers and consumers that call the new Api, changing their
> > application logic. during this point producers should start setting the
> > tombstone marker and consumers to start behaving based on the tombstone
> > marker.
> >
> > Only after all producers and consumers are upgraded then we enter phase 2
> > disabling the use of a null value as a delete marker. And the old apis
> > would not be allowed to be called as the broker now expects all clients
> to
> > be using latest api only.
> >
> > At this second phase stage only also we can support a null value in a
> > compacted topic without it being treated as deletion. We solely base on
> the
> > tombstone marker.
> >
> >
> > 
> > From: Mayuresh Gharat <gharatmayures...@gmail.com>
> > Sent: Saturday, November 12, 2016 2:18:19 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> >
> > I think "In second stage we move on to supporting only the attribute flag
> > for log
> > compaction." means that it will no longer support request from older
> > clients.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Fri, Nov 11, 2016 at 10:02 AM, Becket Qin <becket@gmail.com>
> wrote:
> >
> > > Hey Michael,
> > >
> > > The way Kafka implements backwards compatibility is to let the brokers
> > > support old protocols. So the brokers have to support older clients
> that
> > do
> > > not understand the new attribute bit. That means we will not be able to
> > get
> > > away with the null value as a tombstone. So we need a good definition
> of
> > > whether we should treat it as a tombstone or not. This is why I think
> we
> > > need a magic value bump.
> > >
> > > My confusion on 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-14 Thread Michael Pearce
I'm OK with this, but I'd like to at least get phase 1 into the next
release; that is what I'm very keen on.

As such, shall we say we have
KIP-87a, which delivers phase 1,

and then
KIP-87b, in the release after that, which delivers phase 2 but depends on
support for deprecating old API versions?

How does this approach sound?



From: Becket Qin <becket@gmail.com>
Sent: Monday, November 14, 2016 6:14:44 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

Hey Michael and Mayuresh,

If I understand correctly, you want to let the broker to reject requests
from old clients to ensure everyone in an organization has upgraded, right?
This is essentially deprecating an old protocol. I agree it is useful and
that is why we have that baked in KIP-35. However, we haven't thought
through how to deprecate an API so far. From broker's point of view., it
will always support all the old protocols according to Kafka's widely known
compatibility guarantee. If people decide to deprecate an API within an
organization, we can provide some additional configuration to let people do
this, which is not a bad idea but seems better to be discussed in another
KIP.

Thanks,

Jiangjie (Becket) Qin

On Mon, Nov 14, 2016 at 8:52 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> I agree with Mayuresh.
>
> I don't see how having a magic byte helps here.
>
> What we are saying is that on day 1 after an upgrade both tombstone flag
> or a null value will be treated as a marker to delete on compacted topic.
>
> During this time we expect organisations to migrate themselves over onto
> the new producers and consumers that call the new Api, changing their
> application logic. during this point producers should start setting the
> tombstone marker and consumers to start behaving based on the tombstone
> marker.
>
> Only after all producers and consumers are upgraded then we enter phase 2
> disabling the use of a null value as a delete marker. And the old apis
> would not be allowed to be called as the broker now expects all clients to
> be using latest api only.
>
> At this second phase stage only also we can support a null value in a
> compacted topic without it being treated as deletion. We solely base on the
> tombstone marker.
>
>
> 
> From: Mayuresh Gharat <gharatmayures...@gmail.com>
> Sent: Saturday, November 12, 2016 2:18:19 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> I think "In second stage we move on to supporting only the attribute flag
> for log
> compaction." means that it will no longer support request from older
> clients.
>
> Thanks,
>
> Mayuresh
>
> On Fri, Nov 11, 2016 at 10:02 AM, Becket Qin <becket@gmail.com> wrote:
>
> > Hey Michael,
> >
> > The way Kafka implements backwards compatibility is to let the brokers
> > support old protocols. So the brokers have to support older clients that
> do
> > not understand the new attribute bit. That means we will not be able to
> get
> > away with the null value as a tombstone. So we need a good definition of
> > whether we should treat it as a tombstone or not. This is why I think we
> > need a magic value bump.
> >
> > My confusion on the second stage is exactly about "we want to end up just
> > supporting tombstone marker (not both)", or in the original statement:
> "2)
> > In second stage we move on to supporting only the attribute flag for log
> > compaction." - We will always support the null value as a tombstone as
> long
> > as the message format version (i.e. magic byte) is less than 2 for the
> > reason I mentioned above.
> >
> > Although the way to interpret the message bytes at producing time can be
> > inferred from the the ProduceRequest version, this version information
> > won't be stored in the broker. So when an old consumer comes, we need to
> > decide whether we have to do down conversion or not.. With a clear
> existing
> > configuration of message version config (i.e. magic value), we know for
> > sure when to down convert the messages to adapt to older clients.
> Otherwise
> > we will have to always scan all the messages. It would probably work but
> > relies on guess or inference.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Fri, Nov 11, 2016 at 8:42 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > Sounds good Michael.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Fri, Nov 11, 2016 a

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-14 Thread Becket Qin
Hey Michael and Mayuresh,

If I understand correctly, you want to let the broker reject requests
from old clients to ensure everyone in an organization has upgraded, right?
This is essentially deprecating an old protocol. I agree it is useful, and
that is why we have that baked into KIP-35. However, we haven't thought
through how to deprecate an API so far. From the broker's point of view, it
will always support all the old protocols, according to Kafka's widely known
compatibility guarantee. If people decide to deprecate an API within an
organization, we can provide some additional configuration to let them do
this, which is not a bad idea but seems better discussed in another
KIP.

Thanks,

Jiangjie (Becket) Qin

On Mon, Nov 14, 2016 at 8:52 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> I agree with Mayuresh.
>
> I don't see how having a magic byte helps here.
>
> What we are saying is that on day 1 after an upgrade both tombstone flag
> or a null value will be treated as a marker to delete on compacted topic.
>
> During this time we expect organisations to migrate themselves over onto
> the new producers and consumers that call the new Api, changing their
> application logic. during this point producers should start setting the
> tombstone marker and consumers to start behaving based on the tombstone
> marker.
>
> Only after all producers and consumers are upgraded then we enter phase 2
> disabling the use of a null value as a delete marker. And the old apis
> would not be allowed to be called as the broker now expects all clients to
> be using latest api only.
>
> At this second phase stage only also we can support a null value in a
> compacted topic without it being treated as deletion. We solely base on the
> tombstone marker.
>
>
> 
> From: Mayuresh Gharat <gharatmayures...@gmail.com>
> Sent: Saturday, November 12, 2016 2:18:19 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> I think "In second stage we move on to supporting only the attribute flag
> for log
> compaction." means that it will no longer support request from older
> clients.
>
> Thanks,
>
> Mayuresh
>
> On Fri, Nov 11, 2016 at 10:02 AM, Becket Qin <becket@gmail.com> wrote:
>
> > Hey Michael,
> >
> > The way Kafka implements backwards compatibility is to let the brokers
> > support old protocols. So the brokers have to support older clients that
> do
> > not understand the new attribute bit. That means we will not be able to
> get
> > away with the null value as a tombstone. So we need a good definition of
> > whether we should treat it as a tombstone or not. This is why I think we
> > need a magic value bump.
> >
> > My confusion on the second stage is exactly about "we want to end up just
> > supporting tombstone marker (not both)", or in the original statement:
> "2)
> > In second stage we move on to supporting only the attribute flag for log
> > compaction." - We will always support the null value as a tombstone as
> long
> > as the message format version (i.e. magic byte) is less than 2 for the
> > reason I mentioned above.
> >
> > Although the way to interpret the message bytes at producing time can be
> > inferred from the the ProduceRequest version, this version information
> > won't be stored in the broker. So when an old consumer comes, we need to
> > decide whether we have to do down conversion or not.. With a clear
> existing
> > configuration of message version config (i.e. magic value), we know for
> > sure when to down convert the messages to adapt to older clients.
> Otherwise
> > we will have to always scan all the messages. It would probably work but
> > relies on guess or inference.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Fri, Nov 11, 2016 at 8:42 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > Sounds good Michael.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Fri, Nov 11, 2016 at 12:48 AM, Michael Pearce <
> michael.pea...@ig.com>
> > > wrote:
> > >
> > > > @Mayuresh i don't think you've missed anything -
> > > >
> > > > as per earlier in the discussion.
> > > >
> > > > We're providing new api versions, but not planning on bumping the
> magic
> > > > number as there is no structural changes, we are simply using up a
> new
> > > > attribute bit (as like adding new compression support just uses

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-14 Thread Michael Pearce
I agree with Mayuresh.

I don't see how having a magic byte helps here.

What we are saying is that on day 1 after an upgrade, either the tombstone
flag or a null value will be treated as a delete marker on a compacted topic.

During this time we expect organisations to migrate themselves over onto the
new producers and consumers that call the new API, changing their application
logic. During this period producers should start setting the tombstone marker
and consumers should start behaving based on the tombstone marker.

Only after all producers and consumers are upgraded do we enter phase 2,
disabling the use of a null value as a delete marker. The old APIs would
no longer be allowed to be called, as the broker now expects all clients to
be using the latest API only.

Only at this second phase can we support a null value in a compacted
topic without it being treated as a deletion. We rely solely on the tombstone
marker.
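
The difference between the two phases can be illustrated with a toy compaction pass. This is a simplified sketch, not broker code: `Message`, `compact` and the `phase` parameter are hypothetical, and it ignores details of real compaction such as how long delete markers are retained.

```python
from typing import Dict, List, NamedTuple, Optional


class Message(NamedTuple):
    key: str
    value: Optional[bytes]
    tombstone: bool = False  # the new attribute flag


def compact(log: List[Message], phase: int) -> Dict[str, Optional[bytes]]:
    """Keep the latest value per key, dropping keys whose latest message
    is a delete marker under the rules of the given phase."""
    latest: Dict[str, Message] = {}
    for msg in log:
        latest[msg.key] = msg

    def is_delete(msg: Message) -> bool:
        if phase == 1:
            # Phase 1: either the flag or a null value means delete.
            return msg.tombstone or msg.value is None
        # Phase 2: only the flag means delete; null is ordinary data.
        return msg.tombstone

    return {k: m.value for k, m in latest.items() if not is_delete(m)}
```

For a log whose latest message for a key has a null value and no flag, phase 1 removes the key while phase 2 keeps it with a null value.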



From: Mayuresh Gharat <gharatmayures...@gmail.com>
Sent: Saturday, November 12, 2016 2:18:19 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

I think "In second stage we move on to supporting only the attribute flag
for log
compaction." means that it will no longer support request from older
clients.

Thanks,

Mayuresh

On Fri, Nov 11, 2016 at 10:02 AM, Becket Qin <becket@gmail.com> wrote:

> Hey Michael,
>
> The way Kafka implements backwards compatibility is to let the brokers
> support old protocols. So the brokers have to support older clients that do
> not understand the new attribute bit. That means we will not be able to get
> away with the null value as a tombstone. So we need a good definition of
> whether we should treat it as a tombstone or not. This is why I think we
> need a magic value bump.
>
> My confusion on the second stage is exactly about "we want to end up just
> supporting tombstone marker (not both)", or in the original statement: "2)
> In second stage we move on to supporting only the attribute flag for log
> compaction." - We will always support the null value as a tombstone as long
> as the message format version (i.e. magic byte) is less than 2 for the
> reason I mentioned above.
>
> Although the way to interpret the message bytes at producing time can be
> inferred from the the ProduceRequest version, this version information
> won't be stored in the broker. So when an old consumer comes, we need to
> decide whether we have to do down conversion or not.. With a clear existing
> configuration of message version config (i.e. magic value), we know for
> sure when to down convert the messages to adapt to older clients. Otherwise
> we will have to always scan all the messages. It would probably work but
> relies on guess or inference.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Nov 11, 2016 at 8:42 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Sounds good Michael.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Fri, Nov 11, 2016 at 12:48 AM, Michael Pearce <michael.pea...@ig.com>
> > wrote:
> >
> > > @Mayuresh i don't think you've missed anything -
> > >
> > > as per earlier in the discussion.
> > >
> > > We're providing new api versions, but not planning on bumping the magic
> > > number as there is no structural changes, we are simply using up a new
> > > attribute bit (as like adding new compression support just uses up
> > > additional attribute bits)
> > >
> > >
> > > I think also difference between null vs 0 length byte array is covered.
> > >
> > > @Becket,
> > >
> > > The two stage approach is because we want to end up just supporting
> > > tombstone marker (not both)
> > >
> > > But we accept we need to allow organisations and systems a period of
> > > transition (this is what stage 1 provides)
> > >
> > >
> > >
> > >
> > > 
> > > From: Mayuresh Gharat <gharatmayures...@gmail.com>
> > > Sent: Thursday, November 10, 2016 8:57 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > I am not sure if we are bumping magicByte value as the KIP does not
> > mention
> > > it (If I didn't miss anything).
> > >
> > > @Nacho : I did not understand what you meant by : Conversion from one
> to
> > > the other may be possible but I would rather not have to do it.
> > >
> > > You will have to do the conversion for the scenario

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-11 Thread Mayuresh Gharat
I think "In second stage we move on to supporting only the attribute flag
for log compaction." means that it will no longer support requests from older
clients.

Thanks,

Mayuresh

On Fri, Nov 11, 2016 at 10:02 AM, Becket Qin <becket@gmail.com> wrote:

> Hey Michael,
>
> The way Kafka implements backwards compatibility is to let the brokers
> support old protocols. So the brokers have to support older clients that do
> not understand the new attribute bit. That means we will not be able to get
> away with the null value as a tombstone. So we need a good definition of
> whether we should treat it as a tombstone or not. This is why I think we
> need a magic value bump.
>
> My confusion on the second stage is exactly about "we want to end up just
> supporting tombstone marker (not both)", or in the original statement: "2)
> In second stage we move on to supporting only the attribute flag for log
> compaction." - We will always support the null value as a tombstone as long
> as the message format version (i.e. magic byte) is less than 2 for the
> reason I mentioned above.
>
> Although the way to interpret the message bytes at producing time can be
> inferred from the the ProduceRequest version, this version information
> won't be stored in the broker. So when an old consumer comes, we need to
> decide whether we have to do down conversion or not.. With a clear existing
> configuration of message version config (i.e. magic value), we know for
> sure when to down convert the messages to adapt to older clients. Otherwise
> we will have to always scan all the messages. It would probably work but
> relies on guess or inference.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Nov 11, 2016 at 8:42 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Sounds good Michael.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Fri, Nov 11, 2016 at 12:48 AM, Michael Pearce <michael.pea...@ig.com>
> > wrote:
> >
> > > @Mayuresh i don't think you've missed anything -
> > >
> > > as per earlier in the discussion.
> > >
> > > We're providing new api versions, but not planning on bumping the magic
> > > number as there is no structural changes, we are simply using up a new
> > > attribute bit (as like adding new compression support just uses up
> > > additional attribute bits)
> > >
> > >
> > > I think also difference between null vs 0 length byte array is covered.
> > >
> > > @Becket,
> > >
> > > The two stage approach is because we want to end up just supporting
> > > tombstone marker (not both)
> > >
> > > But we accept we need to allow organisations and systems a period of
> > > transition (this is what stage 1 provides)
> > >
> > >
> > >
> > >
> > > 
> > > From: Mayuresh Gharat <gharatmayures...@gmail.com>
> > > Sent: Thursday, November 10, 2016 8:57 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > I am not sure if we are bumping magicByte value as the KIP does not
> > mention
> > > it (If I didn't miss anything).
> > >
> > > @Nacho : I did not understand what you meant by : Conversion from one
> to
> > > the other may be possible but I would rather not have to do it.
> > >
> > > You will have to do the conversion for the scenario that Becket has
> > > mentioned above where an old consumer talks to the new broker, right?
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > >
> > > On Thu, Nov 10, 2016 at 11:54 AM, Becket Qin <becket@gmail.com>
> > wrote:
> > >
> > > > Nacho,
> > > >
> > > > In Kafka protocol, a negative length is null, a zero length means
> empty
> > > > byte array.
> > > >
> > > > I am still confused about the two stages. It seems that the broker
> only
> > > > needs to understand three versions of messages.
> > > > 1. MagicValue=0 - no timestamp, absolute offset, no tombstone flag
> > > > 2. MagicValue=1 - with timestamp, relative offsets, no tombstone flag
> > > (null
> > > > value = tombstone)
> > > > 3. MagicValue=2 - with timestamp, relative offsets, tombstone flag
> > > >
> > > > We are talking about two flavors for 3:
> > > > 3.1 tombstone flag set = tombstone (Allo

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-11 Thread Becket Qin
Hey Michael,

The way Kafka implements backwards compatibility is to let the brokers
support old protocols. So the brokers have to support older clients that do
not understand the new attribute bit. That means we will not be able to do
away with the null value as a tombstone. So we need a clear definition of
when a message should be treated as a tombstone. This is why I think we
need a magic value bump.

My confusion about the second stage is exactly about "we want to end up just
supporting tombstone marker (not both)", or in the original statement: "2)
In second stage we move on to supporting only the attribute flag for log
compaction." We will always support the null value as a tombstone as long
as the message format version (i.e. magic byte) is less than 2, for the
reason I mentioned above.

Although the way to interpret the message bytes at produce time can be
inferred from the ProduceRequest version, this version information
won't be stored in the broker. So when an old consumer comes along, we need to
decide whether we have to down-convert or not. With a clear, existing
configuration of the message format version (i.e. magic value), we know for
sure when to down-convert the messages to adapt to older clients. Otherwise
we will have to always scan all the messages. That would probably work, but
it relies on guesswork or inference.
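
A rough sketch of why a stored magic value makes the down-conversion decision simple. All names here (`StoredMessage`, `needs_down_conversion`, `down_convert`, `TOMBSTONE_FLAG`) are hypothetical, and mapping a flagged tombstone to a null value for older consumers is an assumption made for illustration, not something the KIP text in this thread confirms.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical bit position for the tombstone marker in the attributes byte.
TOMBSTONE_FLAG = 0x20


@dataclass(frozen=True)
class StoredMessage:
    magic: int
    attributes: int
    value: Optional[bytes]


def needs_down_conversion(stored_magic: int, consumer_max_magic: int) -> bool:
    """A plain comparison of magic values; no scan of message contents."""
    return stored_magic > consumer_max_magic


def down_convert(msg: StoredMessage, consumer_max_magic: int) -> StoredMessage:
    if not needs_down_conversion(msg.magic, consumer_max_magic):
        return msg
    flagged = msg.magic >= 2 and bool(msg.attributes & TOMBSTONE_FLAG)
    return StoredMessage(
        magic=consumer_max_magic,
        # Older formats do not define the tombstone bit, so clear it.
        attributes=msg.attributes & ~TOMBSTONE_FLAG,
        # Assumption: express the delete the old way, as a null value.
        value=None if flagged else msg.value,
    )
```

Without a magic value per message, the broker would have to inspect every message's attributes and value to guess how it was produced, which is the scanning cost mentioned above.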

Thanks,

Jiangjie (Becket) Qin

On Fri, Nov 11, 2016 at 8:42 AM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> Sounds good Michael.
>
> Thanks,
>
> Mayuresh
>
> On Fri, Nov 11, 2016 at 12:48 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > @Mayuresh i don't think you've missed anything -
> >
> > as per earlier in the discussion.
> >
> > We're providing new api versions, but not planning on bumping the magic
> > number as there is no structural changes, we are simply using up a new
> > attribute bit (as like adding new compression support just uses up
> > additional attribute bits)
> >
> >
> > I think also difference between null vs 0 length byte array is covered.
> >
> > @Becket,
> >
> > The two stage approach is because we want to end up just supporting
> > tombstone marker (not both)
> >
> > But we accept we need to allow organisations and systems a period of
> > transition (this is what stage 1 provides)
> >
> >
> >
> >
> > ________________
> > From: Mayuresh Gharat <gharatmayures...@gmail.com>
> > Sent: Thursday, November 10, 2016 8:57 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> >
> > I am not sure if we are bumping magicByte value as the KIP does not
> mention
> > it (If I didn't miss anything).
> >
> > @Nacho : I did not understand what you meant by : Conversion from one to
> > the other may be possible but I would rather not have to do it.
> >
> > You will have to do the conversion for the scenario that Becket has
> > mentioned above where an old consumer talks to the new broker, right?
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Thu, Nov 10, 2016 at 11:54 AM, Becket Qin <becket@gmail.com>
> wrote:
> >
> > > Nacho,
> > >
> > > In Kafka protocol, a negative length is null, a zero length means empty
> > > byte array.
> > >
> > > I am still confused about the two stages. It seems that the broker only
> > > needs to understand three versions of messages.
> > > 1. MagicValue=0 - no timestamp, absolute offset, no tombstone flag
> > > 2. MagicValue=1 - with timestamp, relative offsets, no tombstone flag
> > (null
> > > value = tombstone)
> > > 3. MagicValue=2 - with timestamp, relative offsets, tombstone flag
> > >
> > > We are talking about two flavors for 3:
> > > 3.1 tombstone flag set = tombstone (Allows a key with null value in the
> > > compacted topics)
> > > 3.2 tombstone flag set OR null value = tombstone ( Do not allow a key
> > with
> > > null value in the compacted topics)
> > >
> > > No matter which flavor we choose, we just need to stick to that way of
> > > interpretation, right? Why would we need a second stage?
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Thu, Nov 10, 2016 at 10:37 AM, Ignacio Solis <iso...@igso.net>
> wrote:
> > >
> > > > A quick differentiation I would like to make is null is not the same
> as
> > > > size 0.
> > > >
> > > > Many times these are made equal, but they're not.  When serializing
> >

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-11 Thread Mayuresh Gharat
Sounds good Michael.

Thanks,

Mayuresh

On Fri, Nov 11, 2016 at 12:48 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> @Mayuresh i don't think you've missed anything -
>
> as per earlier in the discussion.
>
> We're providing new api versions, but not planning on bumping the magic
> number as there is no structural changes, we are simply using up a new
> attribute bit (as like adding new compression support just uses up
> additional attribute bits)
>
>
> I think also difference between null vs 0 length byte array is covered.
>
> @Becket,
>
> The two stage approach is because we want to end up just supporting
> tombstone marker (not both)
>
> But we accept we need to allow organisations and systems a period of
> transition (this is what stage 1 provides)
>
>
>
>
> 
> From: Mayuresh Gharat <gharatmayures...@gmail.com>
> Sent: Thursday, November 10, 2016 8:57 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> I am not sure if we are bumping magicByte value as the KIP does not mention
> it (If I didn't miss anything).
>
> @Nacho : I did not understand what you meant by : Conversion from one to
> the other may be possible but I would rather not have to do it.
>
> You will have to do the conversion for the scenario that Becket has
> mentioned above where an old consumer talks to the new broker, right?
>
> Thanks,
>
> Mayuresh
>
>
> On Thu, Nov 10, 2016 at 11:54 AM, Becket Qin <becket@gmail.com> wrote:
>
> > Nacho,
> >
> > In Kafka protocol, a negative length is null, a zero length means empty
> > byte array.
> >
> > I am still confused about the two stages. It seems that the broker only
> > needs to understand three versions of messages.
> > 1. MagicValue=0 - no timestamp, absolute offset, no tombstone flag
> > 2. MagicValue=1 - with timestamp, relative offsets, no tombstone flag
> (null
> > value = tombstone)
> > 3. MagicValue=2 - with timestamp, relative offsets, tombstone flag
> >
> > We are talking about two flavors for 3:
> > 3.1 tombstone flag set = tombstone (Allows a key with null value in the
> > compacted topics)
> > 3.2 tombstone flag set OR null value = tombstone ( Do not allow a key
> with
> > null value in the compacted topics)
> >
> > No matter which flavor we choose, we just need to stick to that way of
> > interpretation, right? Why would we need a second stage?
> >
> > Jiangjie (Becket) Qin
> >
> > On Thu, Nov 10, 2016 at 10:37 AM, Ignacio Solis <iso...@igso.net> wrote:
> >
> > > A quick differentiation I would like to make is null is not the same as
> > > size 0.
> > >
> > > Many times these are made equal, but they're not.  When serializing
> data,
> > > we have to make a choice in null values and many times these are
> > translated
> > > to zero length blobs. This is not really the same thing.
> > >
> > > From this perspective, what can Kafka represent and what is considered
> a
> > > valid value?
> > >
> > > If the user sends a byte array of length 0, or if the serializer sends
> > > something of length 0, this should be a valid value. It is not kafka's
> > job
> > > to determine what the user is trying to send. For all we know, the user
> > has
> > > a really good compression serializer that sends 0 bytes when nothing
> has
> > > changed.
> > >
> > > If the user is allowed to send null then some behavior should be
> defined.
> > > However, this should semantically be different than sending a command.
> It
> > > is possible for a null value could signify some form of delete, like
> > > "delete all messages with this key". However, if kafka has a goal to
> > write
> > > good, readable code, then this should not be allowed.
> > >
> > > A delete or a purge is a call that can have certain side effects or
> > > encounter errors that are unrelated to a send call.
> > >
> > > A couple of bullets from the Kafka style guide (
> > > http://kafka.apache.org/coding-guide.html ):
> > >
> > > - Clear code is preferable to comments. When possible make your naming
> so
> > > good you don't need comments.
> > > - Logging, configuration, and public APIs are our "UI". Make them
> pretty,
> > > consistent, and usable.
> > >
> > > If you want sendTombstone() functionality make the protocol reflect
> that.
> > >
> > > Righ

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-11 Thread Michael Pearce
@Mayuresh I don't think you've missed anything -

as per earlier in the discussion.

We're providing new API versions, but not planning on bumping the magic number
as there are no structural changes; we are simply using up a new attribute bit
(just as adding new compression support uses up additional attribute bits).


I think the difference between null vs 0 length byte array is also covered.

@Becket,

The two stage approach is because we want to end up supporting just the
tombstone marker (not both).

But we accept we need to allow organisations and systems a period of transition
(this is what stage 1 provides).
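
To illustrate the attribute bit idea, here is a small Java sketch. In the
v1 message format the lowest three bits of the attributes byte hold the
compression codec and the fourth bit holds the timestamp type. The bit
position chosen for the tombstone flag below is an assumption for
illustration only, and the class name is hypothetical:

```java
// Hypothetical sketch of packing a tombstone flag into the attributes byte.
final class AttributeBits {
    static final int COMPRESSION_MASK = 0x07;    // bits 0-2: compression codec
    static final int TIMESTAMP_TYPE_MASK = 0x08; // bit 3: timestamp type
    static final int TOMBSTONE_MASK = 0x10;      // bit 4: assumed position, illustration only

    /** Returns the attributes with the tombstone bit set, leaving the other bits untouched. */
    static byte withTombstone(byte attributes) {
        return (byte) (attributes | TOMBSTONE_MASK);
    }

    /** True if the (assumed) tombstone bit is set. */
    static boolean hasTombstone(byte attributes) {
        return (attributes & TOMBSTONE_MASK) != 0;
    }

    /** Extracts the compression codec id from the low three bits. */
    static int compressionCodec(byte attributes) {
        return attributes & COMPRESSION_MASK;
    }

    public static void main(String[] args) {
        byte attributes = 0x02; // codec id 2, no tombstone
        byte flagged = withTombstone(attributes);
        System.out.println(hasTombstone(attributes));
        System.out.println(hasTombstone(flagged));
        System.out.println(compressionCodec(flagged));
    }
}
```

Setting the bit does not change the layout of the message, which is the
argument made above for not bumping the magic number; the counter-argument
in this thread is about the changed semantics rather than the structure.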





From: Mayuresh Gharat <gharatmayures...@gmail.com>
Sent: Thursday, November 10, 2016 8:57 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

I am not sure if we are bumping magicByte value as the KIP does not mention
it (If I didn't miss anything).

@Nacho : I did not understand what you meant by : Conversion from one to
the other may be possible but I would rather not have to do it.

You will have to do the conversion for the scenario that Becket has
mentioned above where an old consumer talks to the new broker, right?

Thanks,

Mayuresh


On Thu, Nov 10, 2016 at 11:54 AM, Becket Qin <becket@gmail.com> wrote:

> Nacho,
>
> In Kafka protocol, a negative length is null, a zero length means empty
> byte array.
>
> I am still confused about the two stages. It seems that the broker only
> needs to understand three versions of messages.
> 1. MagicValue=0 - no timestamp, absolute offset, no tombstone flag
> 2. MagicValue=1 - with timestamp, relative offsets, no tombstone flag (null
> value = tombstone)
> 3. MagicValue=2 - with timestamp, relative offsets, tombstone flag
>
> We are talking about two flavors for 3:
> 3.1 tombstone flag set = tombstone (Allows a key with null value in the
> compacted topics)
> 3.2 tombstone flag set OR null value = tombstone ( Do not allow a key with
> null value in the compacted topics)
>
> No matter which flavor we choose, we just need to stick to that way of
> interpretation, right? Why would we need a second stage?
>
> Jiangjie (Becket) Qin
>
> On Thu, Nov 10, 2016 at 10:37 AM, Ignacio Solis <iso...@igso.net> wrote:
>
> > A quick differentiation I would like to make is null is not the same as
> > size 0.
> >
> > Many times these are made equal, but they're not.  When serializing data,
> > we have to make a choice in null values and many times these are
> translated
> > to zero length blobs. This is not really the same thing.
> >
> > From this perspective, what can Kafka represent and what is considered a
> > valid value?
> >
> > If the user sends a byte array of length 0, or if the serializer sends
> > something of length 0, this should be a valid value. It is not kafka's
> job
> > to determine what the user is trying to send. For all we know, the user
> has
> > a really good compression serializer that sends 0 bytes when nothing has
> > changed.
> >
> > If the user is allowed to send null then some behavior should be defined.
> > However, this should semantically be different than sending a command. It
> > is possible for a null value could signify some form of delete, like
> > "delete all messages with this key". However, if kafka has a goal to
> write
> > good, readable code, then this should not be allowed.
> >
> > A delete or a purge is a call that can have certain side effects or
> > encounter errors that are unrelated to a send call.
> >
> > A couple of bullets from the Kafka style guide (
> > http://kafka.apache.org/coding-guide.html ):
> >
> > - Clear code is preferable to comments. When possible make your naming so
> > good you don't need comments.
> > - Logging, configuration, and public APIs are our "UI". Make them pretty,
> > consistent, and usable.
> >
> > If you want sendTombstone() functionality make the protocol reflect that.
> >
> > Right now we're struggling because we did not have a clear separation of
> > concerns.
> >
> > I don't have a good solution for deployment. I'm in favor of a harsh
> road:
> > - Add the flag/variable/header for tombstone
> > - Add an API for tombstone behavior if needed
> > - Error if somebody tries to API send null
> > - Accept if somebody tries to API send byte[] length 0
> > - Rev broker version
> > - broker accept flag/variable/header as tombstone
> > - broker accept zero length values as normal messages
> >
> > The more gentle road would be:
> > - Add the flag/variable/header for tombstone
> > - Add 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-10 Thread Mayuresh Gharat
 2016 at 11:01 AM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com
> > > > wrote:
> > >
> > > > I see the reasoning and might be inclined to agree a bit :
> > > > If we go to stage 2, the only difference is that we can theoretically
> > > > support a null value non-tombstone message in a log compacted topic,
> > but
> > > I
> > > > am not sure if that has any use case.
> > > >
> > > > But as an end goal I see that kafka should clearly specify what it
> > means
> > > by
> > > > a tombstone : is it the attribute flag OR is it the null value. If we
> > > just
> > > > do stage 1, I don't think we are defining the end-goal completely.
> > > > Again this is more about semantics of correctness of end state.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Wed, Nov 9, 2016 at 10:49 AM, Becket Qin <becket@gmail.com>
> > > wrote:
> > > >
> > > > > I am not sure if we need the second stage. Wouldn't it be enough to
> > say
> > > > > that a message is a tombstone if one of the following is true?
> > > > > 1. tombstone flag is set.
> > > > > 2. value is null.
> > > > >
> > > > > If we go to stage 2, the only difference is that we can
> theoretically
> > > > > support a null value non-tombstone message in a log compacted
> topic,
> > > but
> > > > I
> > > > > am not sure if that has any use case.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > >
> > > > > On Wed, Nov 9, 2016 at 9:23 AM, Mayuresh Gharat <
> > > > > gharatmayures...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > I think it will be a good idea. +1
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mayuresh
> > > > > >
> > > > > > On Wed, Nov 9, 2016 at 9:13 AM, Michael Pearce <
> > > michael.pea...@ig.com>
> > > > > > wrote:
> > > > > >
> > > > > > > +1 Mayuresh, I think this is a good solution/strategy.
> > > > > > >
> > > > > > > Shall we update the KIP with this? Becket/Jun/Joel any comments
> > to
> > > > add
> > > > > > > before we do?
> > > > > > >
> > > > > > > On 08/11/2016, 17:29, "Mayuresh Gharat" <
> > > gharatmayures...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > I think the migration can be done in 2 stages :
> > > > > > >
> > > > > > > 1) In first stage the broker should understand the
> attribute
> > > flag
> > > > > as
> > > > > > > well
> > > > > > > as Null for the value for log compaction.
> > > > > > > 2) In second stage we move on to supporting only the
> > attribute
> > > > flag
> > > > > > > for log
> > > > > > > compaction.
> > > > > > >
> > > > > > > I agree with Becket that for older clients (consumers) the
> > > broker
> > > > > > might
> > > > > > > have to down convert a message that has the attribute flag
> > set
> > > > for
> > > > > > log
> > > > > > > compacting but has a non null value. But this should be in
> > > first
> > > > > > stage.
> > > > > > > Once all the clients have upgraded (clients start
> recognizing
> > > the
> > > > > > > attribute
> > > > > > > flag), we can move the broker to stage 2.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Mayuresh
> > > > > > >
> > > > > > > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> > > > > > michael.pea...@ig.com
> > > > > > > >
> > > > > > > wrote:

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-10 Thread Becket Qin
>
> > wrote:
> > >
> > > > I am not sure if we need the second stage. Wouldn't it be enough to
> say
> > > > that a message is a tombstone if one of the following is true?
> > > > 1. tombstone flag is set.
> > > > 2. value is null.
> > > >
> > > > If we go to stage 2, the only difference is that we can theoretically
> > > > support a null value non-tombstone message in a log compacted topic,
> > but
> > > I
> > > > am not sure if that has any use case.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > >
> > > > On Wed, Nov 9, 2016 at 9:23 AM, Mayuresh Gharat <
> > > > gharatmayures...@gmail.com>
> > > > wrote:
> > > >
> > > > > I think it will be a good idea. +1
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mayuresh
> > > > >
> > > > > On Wed, Nov 9, 2016 at 9:13 AM, Michael Pearce <
> > michael.pea...@ig.com>
> > > > > wrote:
> > > > >
> > > > > > +1 Mayuresh, I think this is a good solution/strategy.
> > > > > >
> > > > > > Shall we update the KIP with this? Becket/Jun/Joel any comments
> to
> > > add
> > > > > > before we do?
> > > > > >
> > > > > > On 08/11/2016, 17:29, "Mayuresh Gharat" <
> > gharatmayures...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > I think the migration can be done in 2 stages :
> > > > > >
> > > > > > 1) In first stage the broker should understand the attribute
> > flag
> > > > as
> > > > > > well
> > > > > > as Null for the value for log compaction.
> > > > > > 2) In second stage we move on to supporting only the
> attribute
> > > flag
> > > > > > for log
> > > > > > compaction.
> > > > > >
> > > > > > I agree with Becket that for older clients (consumers) the
> > broker
> > > > > might
> > > > > > have to down convert a message that has the attribute flag
> set
> > > for
> > > > > log
> > > > > > compacting but has a non null value. But this should be in
> > first
> > > > > stage.
> > > > > > Once all the clients have upgraded (clients start recognizing
> > the
> > > > > > attribute
> > > > > > flag), we can move the broker to stage 2.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mayuresh
> > > > > >
> > > > > > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> > > > > michael.pea...@ig.com
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Also we can add further guidance:
> > > > > > >
> > > > > > > To  avoid the below caveat to organisations by promoting of
> > > > > > upgrading all
> > > > > > > consumers first before relying on producing tombstone
> > messages
> > > > with
> > > > > > data
> > > > > > >
> > > > > > > Sent using OWA for iPhone
> > > > > > > 
> > > > > > > From: Michael Pearce
> > > > > > > Sent: Tuesday, November 8, 2016 8:03:32 AM
> > > > > > > To: dev@kafka.apache.org
> > > > > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone
> Flag
> > > > > > >
> > > > > > > Thanks Jun on the feedback, I think I understand the
> > > issue/point
> > > > > now.
> > > > > > >
> > > > > > > We def can add that on older client version if tombstone
> > marker
> > > > > make
> > > > > > the
> > > > > > > value null to preserve behaviour.
> > > > > > >
> > > > > > > There is one caveats to this:
> > > > > > >
> > > >

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-10 Thread Ignacio Solis
>
> > > > > On 08/11/2016, 17:29, "Mayuresh Gharat" <
> gharatmayures...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > I think the migration can be done in 2 stages :
> > > > >
> > > > > 1) In first stage the broker should understand the attribute
> flag
> > > as
> > > > > well
> > > > > as Null for the value for log compaction.
> > > > > 2) In second stage we move on to supporting only the attribute
> > flag
> > > > > for log
> > > > > compaction.
> > > > >
> > > > > I agree with Becket that for older clients (consumers) the
> broker
> > > > might
> > > > > have to down convert a message that has the attribute flag set
> > for
> > > > log
> > > > > compacting but has a non null value. But this should be in
> first
> > > > stage.
> > > > > Once all the clients have upgraded (clients start recognizing
> the
> > > > > attribute
> > > > > flag), we can move the broker to stage 2.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mayuresh
> > > > >
> > > > > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> > > > michael.pea...@ig.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Also we can add further guidance:
> > > > > >
> > > > > > To  avoid the below caveat to organisations by promoting of
> > > > > upgrading all
> > > > > > consumers first before relying on producing tombstone
> messages
> > > with
> > > > > data
> > > > > >
> > > > > > Sent using OWA for iPhone
> > > > > > 
> > > > > > From: Michael Pearce
> > > > > > Sent: Tuesday, November 8, 2016 8:03:32 AM
> > > > > > To: dev@kafka.apache.org
> > > > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > > > > >
> > > > > > Thanks Jun on the feedback, I think I understand the
> > issue/point
> > > > now.
> > > > > >
> > > > > > We def can add that on older client version if tombstone
> marker
> > > > make
> > > > > the
> > > > > > value null to preserve behaviour.
> > > > > >
> > > > > > There is one caveats to this:
> > > > > >
> > > > > > * we have to be clear that data is lost if reading via old
> > > > > client/message
> > > > > > format - I don't think this is a big issue as mostly the
> > idea/use
> > > > > case is
> > > > > > around meta data transport as such would only be as bad as
> > > current
> > > > > situation
> > > > > >
> > > > > > Re having configurable broker this was to handle cases like
> you
> > > > > described
> > > > > > but in another way by allowing organisation choose the
> > behaviour
> > > of
> > > > > the
> > > > > > compaction per broker or per topic so they could manage their
> > > > > transition to
> > > > > > using tombstone markers.
> > > > > >
> > > > > > On hind sight it maybe easier to just upgrade and downgrade
> the
> > > > > messages
> > > > > > on version as you propose.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Sent using OWA for iPhone
> > > > > > 
> > > > > > From: Jun Rao <j...@confluent.io>
> > > > > > Sent: Tuesday, November 8, 2016 12:34:41 AM
> > > > > > To: dev@kafka.apache.org
> > > > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > > > > >
> > > > > > For the use case, one potential use case is fo

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-10 Thread Mayuresh Gharat
  > > as
> > > > > well
> > > > > as Null for the value for log compaction.
> > > > > 2) In second stage we move on to supporting only the
> attribute
> > flag
> > > > > for log
> > > > > compaction.
> > > > >
> > > > > I agree with Becket that for older clients (consumers)
> the broker
> > > > might
> > > > > have to down convert a message that has the attribute
> flag set
> > for
> > > > log
> > > > > compacting but has a non null value. But this should
> be in first
> > > > stage.
> > > > > Once all the clients have upgraded (clients start
> recognizing the
> > > > > attribute
> > > > > flag), we can move the broker to stage 2.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mayuresh
> > > > >
> > > > > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> > > > michael.pea...@ig.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Also we can add further guidance:
> > > > > >
> > > > > > To  avoid the below caveat to organisations by
> promoting of
> > > > > upgrading all
> > > > > > consumers first before relying on producing
> tombstone messages
> > > with
> > > > > data
> > > > > >
> > > > > > Sent using OWA for iPhone
> > > > > > 
> > > > > > From: Michael Pearce
> > > > > > Sent: Tuesday, November 8, 2016 8:03:32 AM
> > > > > > To: dev@kafka.apache.org
> > > > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction
> Tombstone Flag
> > > > > >
> > > > > > Thanks Jun on the feedback, I think I understand the
> > issue/point
> > > > now.
> > > > > >
> > > > > > We def can add that on older client version if
> tombstone marker
> > > > make
> > > > > the
> > > > > > value null to preserve behaviour.
> > > > > >
> > > > > > There is one caveats to this:
> > > > > >
> > > > > > * we have to be clear that data is lost if reading
> via old
> > > > > client/message
> > > > > > format - I don't think this is a big issue as mostly
> the
> > idea/use
> > > > > case is
> > > > > > around meta data transport as such would only be as
> bad as
> > > current
> > > > > situation
> > > > > >
> > > > > > Re having configurable broker this was to handle
> cases like you
> > > > > described
> > > > > > but in another way by allowing organisation choose
> the
> > behaviour
> > > of
> > > > > the
> > > > > > compaction per broker or per topic so they could
> manage their
> > > > > transition to
> > > > > > using tombstone markers.
> > > > > >
> > > > > > On hind sight it maybe easier to just upgrade and
> downgrade the
> > > > > messages
> > > > > > on version as you propose.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Sent using OWA for iPhone
> > > > > > 
> > > > > > From: Jun Rao <j...

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-10 Thread Michael Pearce
; compacting but has a non null value. But this should be in 
first
> > > stage.
> > > > Once all the clients have upgraded (clients start 
recognizing the
> > > > attribute
> > > > flag), we can move the broker to stage 2.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> > > michael.pea...@ig.com
> > > > >
> > > > wrote:
> > > >
> > > > > Also we can add further guidance:
> > > > >
> > > > > To  avoid the below caveat to organisations by promoting 
of
> > > > upgrading all
> > > > > consumers first before relying on producing tombstone 
messages
> > with
> > > > data
> > > > >
> > > > > Sent using OWA for iPhone
> > > > > 
> > > > > From: Michael Pearce
> > > > > Sent: Tuesday, November 8, 2016 8:03:32 AM
> > > > > To: dev@kafka.apache.org
> > > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone 
Flag
> > > > >
> > > > > Thanks Jun on the feedback, I think I understand the
> issue/point
> > > now.
> > > > >
> > > > > We def can add that on older client version if tombstone 
marker
> > > make
> > > > the
> > > > > value null to preserve behaviour.
> > > > >
> > > > > There is one caveats to this:
> > > > >
> > > > > * we have to be clear that data is lost if reading via old
> > > > client/message
> > > > > format - I don't think this is a big issue as mostly the
> idea/use
> > > > case is
> > > > > around meta data transport as such would only be as bad as
> > current
> > > > situation
> > > > >
> > > > > Re having configurable broker this was to handle cases 
like you
> > > > described
> > > > > but in another way by allowing organisation choose the
> behaviour
> > of
> > > > the
> > > > > compaction per broker or per topic so they could manage 
their
> > > > transition to
> > > > > using tombstone markers.
> > > > >
> > > > > On hind sight it maybe easier to just upgrade and 
downgrade the
> > > > messages
> > > > > on version as you propose.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Sent using OWA for iPhone
> > > > > 
> > > > > From: Jun Rao <j...@confluent.io>
> > > > > Sent: Tuesday, November 8, 2016 12:34:41 AM
> > > > > To: dev@kafka.apache.org
> > > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone 
Flag
> > > > >
> > > > > For the use case, one potential use case is for schema
> > > registration.
> > > > For
> > > > > example, in Avro, a null value corresponds to a Null 
schema.
> So,
> > if
> > > > you
> > > > > want to be able to keep the schema id in a delete 
message, the
> > > value
> > > > can't
> > > > > be null. We could get around this issue by specializing 
null
> > value
> > > > during
> > > > > schema registration though.
> > > > >
> > > > > Now for the proposed changes. We probably should preserve
>

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-10 Thread Michael Pearce
Agree on this point by Radai. I'm happy having a two stage roll out as suggested
by yourself Mayuresh, so for a period we have both, as obviously a transition
period will be required.
But I think the end goal is that we should eventually rely on just the tombstone
marker (as in the remove method denoted by Radai).

Also, new installs should default to stage 2.
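
The two stages can be sketched as a single predicate, assuming a
hypothetical `Stage` setting (the names below are illustrative and do not
correspond to an existing Kafka config):

```java
// Hypothetical sketch of the two-stage tombstone interpretation.
final class TwoStageTombstone {
    enum Stage { STAGE_1, STAGE_2 }

    /**
     * Stage 1: either the flag or a null value marks a delete (transition period).
     * Stage 2: only the flag marks a delete, so a null value is an ordinary value.
     */
    static boolean isTombstone(Stage stage, boolean tombstoneFlag, byte[] value) {
        switch (stage) {
            case STAGE_1:
                return tombstoneFlag || value == null;
            case STAGE_2:
                return tombstoneFlag;
            default:
                throw new IllegalArgumentException("Unknown stage: " + stage);
        }
    }

    public static void main(String[] args) {
        // A null value with no flag is a delete in stage 1 but not in stage 2.
        System.out.println(isTombstone(Stage.STAGE_1, false, null));
        System.out.println(isTombstone(Stage.STAGE_2, false, null));
    }
}
```

The only case where the stages disagree is a null value without the flag,
which is exactly the case Becket questions the use of.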

On 10/11/2016, 05:03, "radai" <radai.rosenbl...@gmail.com> wrote:

my personal opinion - a log compacted topic is basically a kv-store, so a
map API.
map.put(key, null) is not the same as map.remove(key), which to me means a
null value should not represent a delete. a delete should be explicit
(meaning flag).


On Wed, Nov 9, 2016 at 11:01 AM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> I see the reasoning and might be inclined to agree a bit :
> If we go to stage 2, the only difference is that we can theoretically
> support a null value non-tombstone message in a log compacted topic, but I
> am not sure if that has any use case.
>
> But as an end goal I see that kafka should clearly specify what it means 
by
> a tombstone : is it the attribute flag OR is it the null value. If we just
> do stage 1, I don't think we are defining the end-goal completely.
> Again this is more about semantics of correctness of end state.
>
> Thanks,
>
> Mayuresh
>
> On Wed, Nov 9, 2016 at 10:49 AM, Becket Qin <becket@gmail.com> wrote:
>
> > I am not sure if we need the second stage. Wouldn't it be enough to say
> > that a message is a tombstone if one of the following is true?
> > 1. tombstone flag is set.
> > 2. value is null.
> >
> > If we go to stage 2, the only difference is that we can theoretically
> > support a null value non-tombstone message in a log compacted topic, but
> I
> > am not sure if that has any use case.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> > On Wed, Nov 9, 2016 at 9:23 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com>
> > wrote:
> >
> > > I think it will be a good idea. +1
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Wed, Nov 9, 2016 at 9:13 AM, Michael Pearce <michael.pea...@ig.com>
> > > wrote:
> > >
> > > > +1 Mayuresh, I think this is a good solution/strategy.
> > > >
> > > > Shall we update the KIP with this? Becket/Jun/Joel any comments to
> add
> > > > before we do?
> > > >
> > > > On 08/11/2016, 17:29, "Mayuresh Gharat" <gharatmayures...@gmail.com>
> > > > wrote:
> > > >
> > > > I think the migration can be done in 2 stages :
> > > >
> > > > 1) In first stage the broker should understand the attribute 
flag
> > as
> > > > well
> > > > as Null for the value for log compaction.
> > > > 2) In second stage we move on to supporting only the attribute
> flag
> > > > for log
> > > > compaction.
> > > >
> > > > I agree with Becket that for older clients (consumers) the 
broker
> > > might
> > > > have to down convert a message that has the attribute flag set
> for
> > > log
> > > > compacting but has a non null value. But this should be in first
> > > stage.
> > > > Once all the clients have upgraded (clients start recognizing 
the
> > > > attribute
> > > > flag), we can move the broker to stage 2.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> > > michael.pea...@ig.com
> > > > >
> > > > wrote:
> > > >
> > > > > Also we can add further guidance:
> > > > >
> > > > > To  avoid the below caveat to organisations by promoting of
> > > > upgrading all
> > > > > consumers first before relying on producing tombstone messages
> > with
> > > > data
> > > > >
> > > > > Sent using OWA 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-09 Thread radai
my personal opinion - a log compacted topic is basically a kv-store, so a
map API.
map.put(key, null) is not the same as map.remove(key), which to me means a
null value should not represent a delete. a delete should be explicit
(meaning flag).
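
A short Java illustration of the map analogy, using `java.util.HashMap`
(which permits null values). The class and helper names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrates that storing a null value and removing a key are different operations.
final class PutNullVsRemove {
    /** After put(key, null) the key is still present, mapped to null. */
    static boolean presentAfterPutNull() {
        Map<String, byte[]> store = new HashMap<>();
        store.put("k", null);
        return store.containsKey("k");
    }

    /** After remove(key) the key is gone entirely. */
    static boolean presentAfterRemove() {
        Map<String, byte[]> store = new HashMap<>();
        store.put("k", new byte[] {1});
        store.remove("k");
        return store.containsKey("k");
    }

    public static void main(String[] args) {
        System.out.println(presentAfterPutNull()); // true
        System.out.println(presentAfterRemove());  // false
    }
}
```

With an explicit flag, a compacted topic could make the same distinction
between "key maps to null" and "key deleted".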


On Wed, Nov 9, 2016 at 11:01 AM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> I see the reasoning and might be inclined to agree a bit :
> If we go to stage 2, the only difference is that we can theoretically
> support a null value non-tombstone message in a log compacted topic, but I
> am not sure if that has any use case.
>
> But as an end goal I see that kafka should clearly specify what it means by
> a tombstone : is it the attribute flag OR is it the null value. If we just
> do stage 1, I don't think we are defining the end-goal completely.
> Again this is more about semantics of correctness of end state.
>
> Thanks,
>
> Mayuresh
>
> On Wed, Nov 9, 2016 at 10:49 AM, Becket Qin <becket@gmail.com> wrote:
>
> > I am not sure if we need the second stage. Wouldn't it be enough to say
> > that a message is a tombstone if one of the following is true?
> > 1. tombstone flag is set.
> > 2. value is null.
> >
> > If we go to stage 2, the only difference is that we can theoretically
> > support a null value non-tombstone message in a log compacted topic, but
> I
> > am not sure if that has any use case.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> > On Wed, Nov 9, 2016 at 9:23 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com>
> > wrote:
> >
> > > I think it will be a good idea. +1
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Wed, Nov 9, 2016 at 9:13 AM, Michael Pearce <michael.pea...@ig.com>
> > > wrote:
> > >
> > > > +1 Mayuresh, I think this is a good solution/strategy.
> > > >
> > > > Shall we update the KIP with this? Becket/Jun/Joel any comments to
> add
> > > > before we do?
> > > >
> > > > On 08/11/2016, 17:29, "Mayuresh Gharat" <gharatmayures...@gmail.com>
> > > > wrote:
> > > >
> > > > I think the migration can be done in 2 stages :
> > > >
> > > > 1) In first stage the broker should understand the attribute flag
> > as
> > > > well
> > > > as Null for the value for log compaction.
> > > > 2) In second stage we move on to supporting only the attribute
> flag
> > > > for log
> > > > compaction.
> > > >
> > > > I agree with Becket that for older clients (consumers) the broker
> > > might
> > > > have to down convert a message that has the attribute flag set
> for
> > > log
> > > > compacting but has a non null value. But this should be in first
> > > stage.
> > > > Once all the clients have upgraded (clients start recognizing the
> > > > attribute
> > > > flag), we can move the broker to stage 2.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> > > michael.pea...@ig.com
> > > > >
> > > > wrote:
> > > >
> > > > > Also we can add further guidance:
> > > > >
> > > > > To  avoid the below caveat to organisations by promoting of
> > > > upgrading all
> > > > > consumers first before relying on producing tombstone messages
> > with
> > > > data
> > > > >
> > > > > Sent using OWA for iPhone
> > > > > 
> > > > > From: Michael Pearce
> > > > > Sent: Tuesday, November 8, 2016 8:03:32 AM
> > > > > To: dev@kafka.apache.org
> > > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > > > >
> > > > > Thanks Jun on the feedback, I think I understand the
> issue/point
> > > now.
> > > > >
> > > > > We def can add that on older client version if tombstone marker
> > > make
> > > > the
> > > > > value null to preserve behaviour.
> > > > >
> > > > > There is one caveats to this:
> > > > >
> > > > > * we have to be cle

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-09 Thread Mayuresh Gharat
I see the reasoning and am somewhat inclined to agree with this part:
"If we go to stage 2, the only difference is that we can theoretically
support a null value non-tombstone message in a log compacted topic, but I
am not sure if that has any use case."

But as an end goal, I think Kafka should clearly specify what it means by
a tombstone: is it the attribute flag OR is it the null value? If we just
do stage 1, I don't think we are defining the end goal completely.
Again, this is more about the semantic correctness of the end state.

Thanks,

Mayuresh

On Wed, Nov 9, 2016 at 10:49 AM, Becket Qin <becket@gmail.com> wrote:

> I am not sure if we need the second stage. Wouldn't it be enough to say
> that a message is a tombstone if one of the following is true?
> 1. tombstone flag is set.
> 2. value is null.
>
> If we go to stage 2, the only difference is that we can theoretically
> support a null value non-tombstone message in a log compacted topic, but I
> am not sure if that has any use case.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Wed, Nov 9, 2016 at 9:23 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > I think it will be a good idea. +1
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Wed, Nov 9, 2016 at 9:13 AM, Michael Pearce <michael.pea...@ig.com>
> > wrote:
> >
> > > +1 Mayuresh, I think this is a good solution/strategy.
> > >
> > > Shall we update the KIP with this? Becket/Jun/Joel any comments to add
> > > before we do?
> > >
> > > On 08/11/2016, 17:29, "Mayuresh Gharat" <gharatmayures...@gmail.com>
> > > wrote:
> > >
> > > I think the migration can be done in 2 stages :
> > >
> > > 1) In first stage the broker should understand the attribute flag
> as
> > > well
> > > as Null for the value for log compaction.
> > > 2) In second stage we move on to supporting only the attribute flag
> > > for log
> > > compaction.
> > >
> > > I agree with Becket that for older clients (consumers) the broker
> > might
> > > have to down convert a message that has the attribute flag set for
> > log
> > > compacting but has a non null value. But this should be in first
> > stage.
> > > Once all the clients have upgraded (clients start recognizing the
> > > attribute
> > > flag), we can move the broker to stage 2.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> > michael.pea...@ig.com
> > > >
> > > wrote:
> > >
> > > > Also we can add further guidance:
> > > >
> > > > To  avoid the below caveat to organisations by promoting of
> > > upgrading all
> > > > consumers first before relying on producing tombstone messages
> with
> > > data
> > > >
> > > > Sent using OWA for iPhone
> > > > 
> > > > From: Michael Pearce
> > > > Sent: Tuesday, November 8, 2016 8:03:32 AM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > > >
> > > > Thanks Jun on the feedback, I think I understand the issue/point
> > now.
> > > >
> > > > We def can add that on older client version if tombstone marker
> > make
> > > the
> > > > value null to preserve behaviour.
> > > >
> > > > There is one caveats to this:
> > > >
> > > > * we have to be clear that data is lost if reading via old
> > > client/message
> > > > format - I don't think this is a big issue as mostly the idea/use
> > > case is
> > > > around meta data transport as such would only be as bad as
> current
> > > situation
> > > > >
> > > > > Re having configurable broker this was to handle cases like you
> > > described
> > > > but in another way by allowing organisation choose the behaviour
> of
> > > the
> > > > compaction per broker or per topic so they could manage their
> > > transition to
> > > > using tombstone markers.
> > > >
> > > > On hind sight it maybe easier to just upgrade and downgrade the
> > > messages
> > > > on version as

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-09 Thread Becket Qin
I am not sure if we need the second stage. Wouldn't it be enough to say
that a message is a tombstone if one of the following is true?
1. tombstone flag is set.
2. value is null.

If we go to stage 2, the only difference is that we can theoretically
support a null value non-tombstone message in a log compacted topic, but I
am not sure if that has any use case.
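
The difference between the two rules can be sketched as below. This is only an illustration: the flag bit (0x10) and all names are assumptions, since the thread does not say which attribute bit would be used, and none of this is Kafka's actual code:

```java
final class TombstoneRule {
    // Hypothetical bit for the tombstone flag in the attributes byte.
    // The value 0x10 is an assumption made for this sketch.
    static final int TOMBSTONE_FLAG = 0x10;

    // Stage-1 style rule: a message is a tombstone if the flag is set OR the value is null.
    static boolean flagOrNull(int attributes, byte[] value) {
        return (attributes & TOMBSTONE_FLAG) != 0 || value == null;
    }

    // Stage-2 style rule: only the flag decides, a null value is just a value.
    static boolean flagOnly(int attributes) {
        return (attributes & TOMBSTONE_FLAG) != 0;
    }

    public static void main(String[] args) {
        int noFlag = 0;
        byte[] nullValue = null;
        // The only case where the two rules disagree: flag not set, value null.
        System.out.println("flag-or-null rule: " + flagOrNull(noFlag, nullValue));
        System.out.println("flag-only rule:    " + flagOnly(noFlag));
    }
}
```

The two rules disagree only for a message with a null value and no flag, which is the "null value non-tombstone message" case mentioned above.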

Thanks,

Jiangjie (Becket) Qin


On Wed, Nov 9, 2016 at 9:23 AM, Mayuresh Gharat <gharatmayures...@gmail.com>
wrote:

> I think it will be a good idea. +1
>
> Thanks,
>
> Mayuresh
>
> On Wed, Nov 9, 2016 at 9:13 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > +1 Mayuresh, I think this is a good solution/strategy.
> >
> > Shall we update the KIP with this? Becket/Jun/Joel any comments to add
> > before we do?
> >
> > On 08/11/2016, 17:29, "Mayuresh Gharat" <gharatmayures...@gmail.com>
> > wrote:
> >
> > I think the migration can be done in 2 stages :
> >
> > 1) In first stage the broker should understand the attribute flag as
> > well
> > as Null for the value for log compaction.
> > 2) In second stage we move on to supporting only the attribute flag
> > for log
> > compaction.
> >
> > I agree with Becket that for older clients (consumers) the broker
> might
> > have to down convert a message that has the attribute flag set for
> log
> > compacting but has a non null value. But this should be in first
> stage.
> > Once all the clients have upgraded (clients start recognizing the
> > attribute
> > flag), we can move the broker to stage 2.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <
> michael.pea...@ig.com
> > >
> > wrote:
> >
> > > Also we can add further guidance:
> > >
> > > To  avoid the below caveat to organisations by promoting of
> > upgrading all
> > > consumers first before relying on producing tombstone messages with
> > data
> > >
> > > Sent using OWA for iPhone
> > > 
> > > From: Michael Pearce
> > > Sent: Tuesday, November 8, 2016 8:03:32 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > Thanks Jun on the feedback, I think I understand the issue/point
> now.
> > >
> > > We def can add that on older client version if tombstone marker
> make
> > the
> > > value null to preserve behaviour.
> > >
> > > There is one caveats to this:
> > >
> > > * we have to be clear that data is lost if reading via old
> > client/message
> > > format - I don't think this is a big issue as mostly the idea/use
> > case is
> > > around meta data transport as such would only be as bad as current
> > situation
> > >
> > > Re having configurable broker this was to handle cases like you
> > described
> > > but in another way by allowing organisation choose the behaviour of
> > the
> > > compaction per broker or per topic so they could manage their
> > transition to
> > > using tombstone markers.
> > >
> > > On hind sight it maybe easier to just upgrade and downgrade the
> > messages
> > > on version as you propose.
> > >
> > >
> > >
> > >
> > >
> > >
> > > Sent using OWA for iPhone
> > > 
> > > From: Jun Rao <j...@confluent.io>
> > > Sent: Tuesday, November 8, 2016 12:34:41 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > For the use case, one potential use case is for schema
> registration.
> > For
> > > example, in Avro, a null value corresponds to a Null schema. So, if
> > you
> > > want to be able to keep the schema id in a delete message, the
> value
> > can't
> > > be null. We could get around this issue by specializing null value
> > during
> > > schema registration though.
> > >
> > > Now for the proposed changes. We probably should preserve client
> > > compatibility. If a client application is sending a null value to

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-09 Thread Mayuresh Gharat
I think it will be a good idea. +1

Thanks,

Mayuresh

On Wed, Nov 9, 2016 at 9:13 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> +1 Mayuresh, I think this is a good solution/strategy.
>
> Shall we update the KIP with this? Becket/Jun/Joel any comments to add
> before we do?
>
> On 08/11/2016, 17:29, "Mayuresh Gharat" <gharatmayures...@gmail.com>
> wrote:
>
> I think the migration can be done in 2 stages :
>
> 1) In first stage the broker should understand the attribute flag as
> well
> as Null for the value for log compaction.
> 2) In second stage we move on to supporting only the attribute flag
> for log
> compaction.
>
> I agree with Becket that for older clients (consumers) the broker might
> have to down convert a message that has the attribute flag set for log
> compacting but has a non null value. But this should be in first stage.
> Once all the clients have upgraded (clients start recognizing the
> attribute
> flag), we can move the broker to stage 2.
>
> Thanks,
>
> Mayuresh
>
> On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <michael.pea...@ig.com
> >
> wrote:
>
> > Also we can add further guidance:
> >
> > To  avoid the below caveat to organisations by promoting of
> upgrading all
> > consumers first before relying on producing tombstone messages with
> data
> >
> > Sent using OWA for iPhone
> > ________________________
> > From: Michael Pearce
> > Sent: Tuesday, November 8, 2016 8:03:32 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> >
> > Thanks Jun on the feedback, I think I understand the issue/point now.
> >
> > We def can add that on older client version if tombstone marker make
> the
> > value null to preserve behaviour.
> >
> > There is one caveats to this:
> >
> > * we have to be clear that data is lost if reading via old
> client/message
> > format - I don't think this is a big issue as mostly the idea/use
> case is
> > around meta data transport as such would only be as bad as current
> situation
> >
> > Re having configurable broker this was to handle cases like you
> described
> > but in another way by allowing organisation choose the behaviour of
> the
> > compaction per broker or per topic so they could manage their
> transition to
> > using tombstone markers.
> >
> > On hind sight it maybe easier to just upgrade and downgrade the
> messages
> > on version as you propose.
> >
> >
> >
> >
> >
> >
> > Sent using OWA for iPhone
> > 
> > From: Jun Rao <j...@confluent.io>
> > Sent: Tuesday, November 8, 2016 12:34:41 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> >
> > For the use case, one potential use case is for schema registration.
> For
> > example, in Avro, a null value corresponds to a Null schema. So, if
> you
> > want to be able to keep the schema id in a delete message, the value
> can't
> > be null. We could get around this issue by specializing null value
> during
> > schema registration though.
> >
> > Now for the proposed changes. We probably should preserve client
> > compatibility. If a client application is sending a null value to a
> > compacted topic, ideally, it should work the same after the client
> > upgrades.
> >
> > I am not sure about making the tombstone marker configurable,
> especially at
> > the topic level. Should we allow users to change the config values
> back and
> > forth, and what would be the implication?
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Nov 7, 2016 at 10:48 AM, Becket Qin <becket@gmail.com>
> wrote:
> >
> > > Hi Michael,
> > >
> > > Yes, changing the logic in the log cleaner makes sense. There
> could be
> > some
> > > other thing worth thinking (e.g. the message size change after
> > conversion),
> > > though.
> > >
> > > The scenario I was thinking is the following:
> > > Imagine a distributed caching system built on top of Kafka. A user
> is
> > > consuming 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-09 Thread Michael Pearce
+1 Mayuresh, I think this is a good solution/strategy.

Shall we update the KIP with this? Becket/Jun/Joel any comments to add before 
we do?

On 08/11/2016, 17:29, "Mayuresh Gharat" <gharatmayures...@gmail.com> wrote:

I think the migration can be done in 2 stages :

1) In first stage the broker should understand the attribute flag as well
as Null for the value for log compaction.
2) In second stage we move on to supporting only the attribute flag for log
compaction.

I agree with Becket that for older clients (consumers) the broker might
have to down convert a message that has the attribute flag set for log
compacting but has a non null value. But this should be in first stage.
Once all the clients have upgraded (clients start recognizing the attribute
flag), we can move the broker to stage 2.

Thanks,

Mayuresh

On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Also we can add further guidance:
>
> To  avoid the below caveat to organisations by promoting of upgrading all
> consumers first before relying on producing tombstone messages with data
>
> Sent using OWA for iPhone
> 
> From: Michael Pearce
> Sent: Tuesday, November 8, 2016 8:03:32 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> Thanks Jun on the feedback, I think I understand the issue/point now.
>
> We def can add that on older client version if tombstone marker make the
> value null to preserve behaviour.
>
> There is one caveats to this:
>
> * we have to be clear that data is lost if reading via old client/message
> format - I don't think this is a big issue as mostly the idea/use case is
> around meta data transport as such would only be as bad as current situation
>
> Re having configurable broker this was to handle cases like you described
> but in another way by allowing organisation choose the behaviour of the
> compaction per broker or per topic so they could manage their transition to
> using tombstone markers.
>
> On hind sight it maybe easier to just upgrade and downgrade the messages
> on version as you propose.
>
>
>
>
>
>
> Sent using OWA for iPhone
> 
> From: Jun Rao <j...@confluent.io>
> Sent: Tuesday, November 8, 2016 12:34:41 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> For the use case, one potential use case is for schema registration. For
> example, in Avro, a null value corresponds to a Null schema. So, if you
> want to be able to keep the schema id in a delete message, the value can't
> be null. We could get around this issue by specializing null value during
> schema registration though.
>
> Now for the proposed changes. We probably should preserve client
> compatibility. If a client application is sending a null value to a
> compacted topic, ideally, it should work the same after the client
> upgrades.
>
> I am not sure about making the tombstone marker configurable, especially at
> the topic level. Should we allow users to change the config values back and
> forth, and what would be the implication?
>
> Thanks,
>
> Jun
>
> On Mon, Nov 7, 2016 at 10:48 AM, Becket Qin <becket@gmail.com> wrote:
>
> > Hi Michael,
> >
> > Yes, changing the logic in the log cleaner makes sense. There could be
> some
> > other thing worth thinking (e.g. the message size change after
> conversion),
> > though.
> >
> > The scenario I was thinking is the following:
> > Imagine a distributed caching system built on top of Kafka. A user is
> > consuming from a topic and it is guaranteed that if the user consume to
> the
> > log end it will get the latest value for all the keys. Currently if the
> > consumer sees a null value it knows the key has been removed. Now let's
> say
> > we rolled out this change. And the producer applies a message with the
> > tombstone flag set, but the value was not null. When we append that
> message
> > to the log I suppose we will not do the down conversion if the broker has
> > set the message.format.version to the latest. Because the log cleaner
> won't
> > touch the active log segment, so that message will be sitting in the
> active
> > segment as

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-08 Thread Mayuresh Gharat
I think the migration can be done in 2 stages:

1) In the first stage, the broker should treat both the attribute flag and a
null value as a tombstone for log compaction.
2) In the second stage, we move on to supporting only the attribute flag for
log compaction.

I agree with Becket that for older clients (consumers) the broker might
have to down-convert a message that has the tombstone attribute flag set
but a non-null value. But this should happen in the first stage.
Once all the clients have upgraded (clients start recognizing the attribute
flag), we can move the broker to stage 2.
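
The stage-1 down-conversion for older consumers could look roughly like this. It is a sketch under stated assumptions: the magic values (1 for the current format, 2 for a format carrying the tombstone flag) and the Message class are hypothetical stand-ins, not Kafka's real record classes or conversion code:

```java
final class DownConvertSketch {
    // Hypothetical magic values; the real numbers would come from the KIP.
    static final byte MAGIC_OLD = 1;
    static final byte MAGIC_WITH_FLAG = 2;

    // Minimal stand-in for a stored message.
    static final class Message {
        final byte magic;
        final boolean tombstoneFlag;
        final byte[] value;

        Message(byte magic, boolean tombstoneFlag, byte[] value) {
            this.magic = magic;
            this.tombstoneFlag = tombstoneFlag;
            this.value = value;
        }
    }

    // Returns the message in a form that a consumer speaking consumerMagic can interpret.
    static Message forConsumer(Message stored, byte consumerMagic) {
        if (consumerMagic >= MAGIC_WITH_FLAG || stored.magic < MAGIC_WITH_FLAG) {
            return stored; // the consumer understands the flag, or there is nothing to convert
        }
        // An old consumer only recognises a delete through a null value,
        // so a flagged message loses its payload on the way down.
        byte[] value = stored.tombstoneFlag ? null : stored.value;
        return new Message(MAGIC_OLD, false, value);
    }

    public static void main(String[] args) {
        Message flagged = new Message(MAGIC_WITH_FLAG, true, new byte[] {42});
        Message seenByOld = forConsumer(flagged, MAGIC_OLD);
        System.out.println("old consumer sees null value: " + (seenByOld.value == null));
        System.out.println("new consumer sees payload:    " + (forConsumer(flagged, MAGIC_WITH_FLAG).value != null));
    }
}
```

The conversion has to happen when the message is served, since the message may be consumed from the active segment before the log cleaner ever touches it.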

Thanks,

Mayuresh

On Tue, Nov 8, 2016 at 12:06 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Also we can add further guidance:
>
> To  avoid the below caveat to organisations by promoting of upgrading all
> consumers first before relying on producing tombstone messages with data
>
> Sent using OWA for iPhone
> 
> From: Michael Pearce
> Sent: Tuesday, November 8, 2016 8:03:32 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> Thanks Jun on the feedback, I think I understand the issue/point now.
>
> We def can add that on older client version if tombstone marker make the
> value null to preserve behaviour.
>
> There is one caveats to this:
>
> * we have to be clear that data is lost if reading via old client/message
> format - I don't think this is a big issue as mostly the idea/use case is
> around meta data transport as such would only be as bad as current situation
>
> Re having configurable broker this was to handle cases like you described
> but in another way by allowing organisation choose the behaviour of the
> compaction per broker or per topic so they could manage their transition to
> using tombstone markers.
>
> On hind sight it maybe easier to just upgrade and downgrade the messages
> on version as you propose.
>
>
>
>
>
>
> Sent using OWA for iPhone
> 
> From: Jun Rao <j...@confluent.io>
> Sent: Tuesday, November 8, 2016 12:34:41 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> For the use case, one potential use case is for schema registration. For
> example, in Avro, a null value corresponds to a Null schema. So, if you
> want to be able to keep the schema id in a delete message, the value can't
> be null. We could get around this issue by specializing null value during
> schema registration though.
>
> Now for the proposed changes. We probably should preserve client
> compatibility. If a client application is sending a null value to a
> compacted topic, ideally, it should work the same after the client
> upgrades.
>
> I am not sure about making the tombstone marker configurable, especially at
> the topic level. Should we allow users to change the config values back and
> forth, and what would be the implication?
>
> Thanks,
>
> Jun
>
> On Mon, Nov 7, 2016 at 10:48 AM, Becket Qin <becket@gmail.com> wrote:
>
> > Hi Michael,
> >
> > Yes, changing the logic in the log cleaner makes sense. There could be
> some
> > other thing worth thinking (e.g. the message size change after
> conversion),
> > though.
> >
> > The scenario I was thinking is the following:
> > Imagine a distributed caching system built on top of Kafka. A user is
> > consuming from a topic and it is guaranteed that if the user consume to
> the
> > log end it will get the latest value for all the keys. Currently if the
> > consumer sees a null value it knows the key has been removed. Now let's
> say
> > we rolled out this change. And the producer applies a message with the
> > tombstone flag set, but the value was not null. When we append that
> message
> > to the log I suppose we will not do the down conversion if the broker has
> > set the message.format.version to the latest. Because the log cleaner
> won't
> > touch the active log segment, so that message will be sitting in the
> active
> > segment as is. Now when a consumer that hasn't upgraded yet consumes that
> > tombstone message in the active segment, it seems that the broker will
> need
> > to down convert that message to remove the value, right? In this case, we
> > cannot wait for the log cleaner to do the down conversion because that
> > message may have already been consumed before the log compaction happens.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> > On Mon, Nov 7, 2016 at 9:59 AM, Michael Pearce <michael.pea...@ig.com>
> > wrote:
> >
> > > 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-08 Thread Michael Pearce
Also we can add further guidance:

To avoid the caveat below, we can recommend that organisations upgrade all
consumers first, before relying on producing tombstone messages with data.


From: Michael Pearce
Sent: Tuesday, November 8, 2016 8:03:32 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

Thanks Jun on the feedback, I think I understand the issue/point now.

We def can add that on older client version if tombstone marker make the value 
null to preserve behaviour.

There is one caveats to this:

* we have to be clear that data is lost if reading via old client/message 
format - I don't think this is a big issue as mostly the idea/use case is 
around meta data transport as such would only be as bad as current situation

Re having configurable broker this was to handle cases like you described but 
in another way by allowing organisation choose the behaviour of the compaction 
per broker or per topic so they could manage their transition to using 
tombstone markers.

On hind sight it maybe easier to just upgrade and downgrade the messages on 
version as you propose.







From: Jun Rao <j...@confluent.io>
Sent: Tuesday, November 8, 2016 12:34:41 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

For the use case, one potential use case is for schema registration. For
example, in Avro, a null value corresponds to a Null schema. So, if you
want to be able to keep the schema id in a delete message, the value can't
be null. We could get around this issue by specializing null value during
schema registration though.

Now for the proposed changes. We probably should preserve client
compatibility. If a client application is sending a null value to a
compacted topic, ideally, it should work the same after the client upgrades.

I am not sure about making the tombstone marker configurable, especially at
the topic level. Should we allow users to change the config values back and
forth, and what would be the implication?

Thanks,

Jun

On Mon, Nov 7, 2016 at 10:48 AM, Becket Qin <becket@gmail.com> wrote:

> Hi Michael,
>
> Yes, changing the logic in the log cleaner makes sense. There could be some
> other thing worth thinking (e.g. the message size change after conversion),
> though.
>
> The scenario I was thinking is the following:
> Imagine a distributed caching system built on top of Kafka. A user is
> consuming from a topic and it is guaranteed that if the user consume to the
> log end it will get the latest value for all the keys. Currently if the
> consumer sees a null value it knows the key has been removed. Now let's say
> we rolled out this change. And the producer applies a message with the
> tombstone flag set, but the value was not null. When we append that message
> to the log I suppose we will not do the down conversion if the broker has
> set the message.format.version to the latest. Because the log cleaner won't
> touch the active log segment, so that message will be sitting in the active
> segment as is. Now when a consumer that hasn't upgraded yet consumes that
> tombstone message in the active segment, it seems that the broker will need
> to down convert that message to remove the value, right? In this case, we
> cannot wait for the log cleaner to do the down conversion because that
> message may have already been consumed before the log compaction happens.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> On Mon, Nov 7, 2016 at 9:59 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > Hi Becket,
> >
> > We were thinking more about having the logic that’s in the method
> > shouldRetainMessage configurable via http://kafka.apache.org/
> > documentation.html#brokerconfigs  at a broker/topic level. And then
> scrap
> > auto converting the message, and allow organisations to manage the
> rollout
> > of enabling of the feature.
> > (this isn’t in documentation but in response to the discussion thread as
> > an alternative approach to roll out the feature)
> >
> > Does this make any more sense?
> >
> > Thanks
> > Mike
> >
> > On 11/3/16, 2:27 PM, "Becket Qin" <becket@gmail.com> wrote:
> >
> > Hi Michael,
> >
> > Do you mean using a new configuration it is just the exiting
> > message.format.version config? It seems the message.format.version
> > config
> > is enough in this case. And the default value would always be the
> > latest
> > version.
> >
> > > Message version migration would be handled as like in KIP-32
> >
> > Also just want to c

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-08 Thread Michael Pearce
Thanks Jun for the feedback, I think I understand the issue/point now.

We can definitely add that: for older client versions, if the tombstone marker
is set, we make the value null to preserve behaviour.

There is one caveat to this:

* we have to be clear that data is lost if reading via an old client/message
format. I don't think this is a big issue, as the idea/use case is mostly
around metadata transport, so it would only be as bad as the current situation.

Re having a configurable broker: this was to handle cases like the one you
described, but in another way, by allowing organisations to choose the
compaction behaviour per broker or per topic so they could manage their
transition to using tombstone markers.

In hindsight it may be easier to just upgrade and downgrade the messages on
version as you propose.







From: Jun Rao <j...@confluent.io>
Sent: Tuesday, November 8, 2016 12:34:41 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

For the use case, one potential use case is for schema registration. For
example, in Avro, a null value corresponds to a Null schema. So, if you
want to be able to keep the schema id in a delete message, the value can't
be null. We could get around this issue by specializing null value during
schema registration though.

Now for the proposed changes. We probably should preserve client
compatibility. If a client application is sending a null value to a
compacted topic, ideally, it should work the same after the client upgrades.

I am not sure about making the tombstone marker configurable, especially at
the topic level. Should we allow users to change the config values back and
forth, and what would be the implication?

Thanks,

Jun

On Mon, Nov 7, 2016 at 10:48 AM, Becket Qin <becket@gmail.com> wrote:

> Hi Michael,
>
> Yes, changing the logic in the log cleaner makes sense. There could be some
> other thing worth thinking (e.g. the message size change after conversion),
> though.
>
> The scenario I was thinking is the following:
> Imagine a distributed caching system built on top of Kafka. A user is
> consuming from a topic and it is guaranteed that if the user consume to the
> log end it will get the latest value for all the keys. Currently if the
> consumer sees a null value it knows the key has been removed. Now let's say
> we rolled out this change. And the producer applies a message with the
> tombstone flag set, but the value was not null. When we append that message
> to the log I suppose we will not do the down conversion if the broker has
> set the message.format.version to the latest. Because the log cleaner won't
> touch the active log segment, so that message will be sitting in the active
> segment as is. Now when a consumer that hasn't upgraded yet consumes that
> tombstone message in the active segment, it seems that the broker will need
> to down convert that message to remove the value, right? In this case, we
> cannot wait for the log cleaner to do the down conversion because that
> message may have already been consumed before the log compaction happens.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> On Mon, Nov 7, 2016 at 9:59 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > Hi Becket,
> >
> > We were thinking more about having the logic that’s in the method
> > shouldRetainMessage configurable via http://kafka.apache.org/
> > documentation.html#brokerconfigs  at a broker/topic level. And then
> scrap
> > auto converting the message, and allow organisations to manage the
> rollout
> > of enabling of the feature.
> > (this isn’t in documentation but in response to the discussion thread as
> > an alternative approach to roll out the feature)
> >
> > Does this make any more sense?
> >
> > Thanks
> > Mike
> >
> > On 11/3/16, 2:27 PM, "Becket Qin" <becket@gmail.com> wrote:
> >
> > Hi Michael,
> >
> > Do you mean using a new configuration it is just the exiting
> > message.format.version config? It seems the message.format.version
> > config
> > is enough in this case. And the default value would always be the
> > latest
> > version.
> >
> > > Message version migration would be handled as like in KIP-32
> >
> > Also just want to confirm on this. Today if an old consumer consumes
> a
> > log
> > compacted topic and sees an empty value, it knows that is a
> tombstone.
> > After we start to use the attribute bit, a tombstone message can
> have a
> > non-empty value. So by "like in KIP-32" you mean we will remove the
> > value
> > to down convert the message 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-07 Thread Jun Rao
As for the use case, one potential one is schema registration. For
example, in Avro, a null value corresponds to a Null schema. So, if you
want to be able to keep the schema id in a delete message, the value can't
be null. We could get around this issue by specializing null value during
schema registration though.

Now for the proposed changes. We probably should preserve client
compatibility. If a client application is sending a null value to a
compacted topic, ideally, it should work the same after the client upgrades.

I am not sure about making the tombstone marker configurable, especially at
the topic level. Should we allow users to change the config values back and
forth, and what would be the implication?

Thanks,

Jun


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-07 Thread Becket Qin
Hi Michael,

Yes, changing the logic in the log cleaner makes sense. There could be some
other things worth thinking about (e.g. the message size change after
conversion), though.

The scenario I was thinking of is the following:
Imagine a distributed caching system built on top of Kafka. A user is
consuming from a topic and it is guaranteed that if the user consumes to the
log end it will get the latest value for all the keys. Currently, if the
consumer sees a null value it knows the key has been removed. Now let's say
we rolled out this change, and the producer sends a message with the
tombstone flag set but a non-null value. When we append that message
to the log I suppose we will not do the down conversion if the broker has
set the message.format.version to the latest. Because the log cleaner won't
touch the active log segment, that message will be sitting in the active
segment as is. Now when a consumer that hasn't upgraded yet consumes that
tombstone message in the active segment, it seems that the broker will need
to down convert that message to remove the value, right? In this case, we
cannot wait for the log cleaner to do the down conversion because that
message may have already been consumed before the log compaction happens.
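
A minimal sketch of this scenario in Python (not Kafka code) may help show
what fetch-time down conversion would have to do. The Record type, its
tombstone field, and down_convert_for_old_consumer are hypothetical names
made up for illustration; the only assumption taken from the thread is that
an old consumer recognises a delete solely by a null value.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a Kafka message; not a real Kafka class."""
    key: bytes
    value: Optional[bytes]
    tombstone: bool = False  # the proposed flag (assumed field name)


def down_convert_for_old_consumer(record: Record) -> Record:
    """Return the record in the form an old consumer can interpret.

    Old consumers only understand "null value means delete", so a flagged
    tombstone that still carries a value has the value stripped. The old
    format has no flag, so it is cleared as well.
    """
    if record.tombstone:
        return replace(record, value=None, tombstone=False)
    return record


flagged = Record(key=b"user-1", value=b"audit-info", tombstone=True)
converted = down_convert_for_old_consumer(flagged)
print(converted.value)  # None
```

The point of the sketch is that this conversion has to happen on the fetch
path, since the record may still sit in the active segment, untouched by the
log cleaner, when the old consumer reads it.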

Thanks,

Jiangjie (Becket) Qin




Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-07 Thread Michael Pearce
Hi Becket,

We were thinking more about making the logic in the method 
shouldRetainMessage configurable via 
http://kafka.apache.org/documentation.html#brokerconfigs at a broker/topic 
level, and then scrapping auto-conversion of the message, so that organisations 
can manage the rollout of the feature themselves.
(This isn’t in the documentation; it is a response to the discussion thread, as 
an alternative approach to rolling out the feature.)

Does this make any more sense?

Thanks
Mike


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-03 Thread Becket Qin
Hi Michael,

Do you mean using a new configuration, or is it just the existing
message.format.version config? It seems the message.format.version config
is enough in this case. And the default value would always be the latest
version.

> Message version migration would be handled as like in KIP-32

Also just want to confirm on this. Today if an old consumer consumes a log
compacted topic and sees an empty value, it knows that is a tombstone.
After we start to use the attribute bit, a tombstone message can have a
non-empty value. So by "like in KIP-32" you mean we will remove the value
to down convert the message if the consumer version is old, right?

Thanks.

Jiangjie (Becket) Qin


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-02 Thread Michael Pearce
Hi Joel , et al.

Any comments on the below idea to handle roll out / compatibility of this 
feature, using a configuration? 

Does it make sense/clear?
Does it add value?
Do we want to enforce flag by default, or value by default, or both?

Cheers
Mike


On 10/27/16, 4:47 PM, "Michael Pearce"  wrote:

Thanks, James, I think this is a really good addition to the KIP details. 
Please feel free to amend the wiki and add the use cases, plus any others you 
think of. I definitely think it's worthwhile documenting. If you can’t, let me 
know and I'll add them next week (just leaving for a long weekend off).

Re Joel's and others' comments about upgrade and compatibility:

Rather than trying to auto-manage this, maybe we add a configuration option, 
at both server and per-topic level, to control how the server logic works out 
whether a record is a tombstone record.

e.g.

key = compation.tombstone.marker

value options:

value   (continues to use null value as tombstone marker)
flag (expects to use the tombstone flag)
value_or_flag (if either is true it treats the record as a tombstone)

This way, on upgrade, users can keep the current behavior and slowly migrate 
to the new one: a transition period using value_or_flag, and finally flag only, 
if an organization wishes to use null values without them being treated as a 
tombstone marker (use case noted below).

Having both a global broker-level setting and a topic-level override also 
allows some flexibility here.
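
The three option values can be sketched as a small decision function. This is
only an illustration in Python, not broker code: TombstoneMarker,
is_tombstone and effective_marker are hypothetical names, and the rule that
a topic-level setting overrides the broker default is an assumption based on
how other Kafka topic overrides behave.

```python
from enum import Enum
from typing import Optional


class TombstoneMarker(Enum):
    """The three option values proposed in the thread."""
    VALUE = "value"
    FLAG = "flag"
    VALUE_OR_FLAG = "value_or_flag"


def is_tombstone(marker: TombstoneMarker, value: Optional[bytes], flag: bool) -> bool:
    """Decide whether a record counts as a tombstone under the given setting."""
    if marker is TombstoneMarker.VALUE:
        return value is None          # current behaviour: null value only
    if marker is TombstoneMarker.FLAG:
        return flag                   # new behaviour: flag only
    return value is None or flag      # VALUE_OR_FLAG: transition period


def effective_marker(broker_default: TombstoneMarker,
                     topic_override: Optional[TombstoneMarker]) -> TombstoneMarker:
    """Assumed precedence: a topic-level setting wins over the broker default."""
    return topic_override if topic_override is not None else broker_default
```

Under "flag", a record with a null value and no flag is retained like any
other message, which is the use case that motivates the final migration step.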

Cheers
Mike






On 10/27/16, 8:03 AM, "James Cheng"  wrote:

This KIP would definitely address a gap in the current functionality, 
where you currently can't have a tombstone with any associated content.

That said, I'd like to talk about use cases, to make sure that this is 
in fact useful. The KIP should be updated with whatever use cases we come up 
with.

First of all, an observation: When we speak about log compaction, we 
typically think of "the latest message for a key is retained". In that respect, 
a delete tombstone (i.e. a message with a null payload) is treated the same as 
any other Kafka message: the latest message is retained. It doesn't matter 
whether the latest message is null, or if the latest message has actual 
content. In all cases, the last message is retained.

The only way a delete tombstone is treated differently from other Kafka 
messages is that it automatically disappears after a while. The time of 
deletion is specified using delete.retention.ms.
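
The two rules above (latest message per key is retained; tombstones
disappear after a while) can be sketched as a toy model in Python. This is a
deliberate simplification and not how the log cleaner is implemented: Msg and
compact are hypothetical names, tombstone age is measured here from an
assumed per-message timestamp, and segments are ignored entirely.

```python
from typing import Dict, List, NamedTuple, Optional


class Msg(NamedTuple):
    key: str
    value: Optional[str]   # None marks a delete tombstone (current behaviour)
    timestamp_ms: int


def compact(log: List[Msg], now_ms: int, delete_retention_ms: int) -> List[Msg]:
    """Keep only the latest message per key, then drop expired tombstones."""
    latest: Dict[str, int] = {}
    for offset, msg in enumerate(log):
        latest[msg.key] = offset  # later offsets overwrite earlier ones
    kept = [log[offset] for offset in sorted(latest.values())]
    return [
        m for m in kept
        if m.value is not None or now_ms - m.timestamp_ms <= delete_retention_ms
    ]
```

In this model a tombstone is retained exactly like any other latest message
until its age exceeds delete_retention_ms, at which point it is the only kind
of message that is removed.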

So what we're really talking about is, do we want to support messages 
in a log-compacted topic that auto-delete themselves after a while?

In a thread from 2015, there was a discussion on first-class support of 
headers between Roger Hoover, Felix GV, Jun Rao, and I. See thread at 
https://groups.google.com/d/msg/confluent-platform/8xPbjyUE_7E/yQ1AeCufL_gJ . 
In that thread, Jun raised a good question that I didn't have a good answer for 
at the time: If a message is going to auto-delete itself after a while, how 
important was the message? That is, what information did the message contain 
that was important *for a while* but not so important that it needed to be kept 
around forever?

Some use cases that I can think of:

1) Traceability. I would like to know who issued this delete tombstone. 
It might include the hostname or IP of the producer of the delete.
2) Timestamps. I would like to know when this delete was issued. This 
use case is already addressed by the availability of per-message timestamps 
that came in 0.10.0.
3) Data provenance. I hope I'm using this phrase correctly, but what I 
mean is, where did this delete come from? What processing job emitted it? What 
input to the processing job caused this delete to be produced? For example, if 
a record in topic A was processed and caused a delete tombstone to be emitted 
to topic B, I might like the offset of the topic A message to be attached to 
the topic B message.
4) Distributed tracing for stream topologies. This might be a slight 
repeat of the above use cases. In the microservices world, we can generate 
call-graphs of webservices using tools like Zipkin/opentracing.io, or 
something homegrown like 
https://engineering.linkedin.com/distributed-service-call-graph/real-time-distributed-tracing-website-performance-and-efficiency .
I can imagine that you might want to do something similar for stream 
processing topologies, where stream processing jobs carry along and forward 
along a globally 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-02 Thread Michael Pearce
Hi James,

Currently we would send our message-headers wrapper, so that some 
platform/audit data is available even on a delete. Having a custom wrapper 
means there are some internal platform pieces that we currently cannot open 
source or share, but would love to (see kip-82).

This is the data from our message-headers wrapper that our tooling and 
platform still need on a delete:

General:

Transaction GUIDs, used by our APM tooling to track and trace a transaction 
through multiple systems.
ClusterId (we set this ourselves on old brokers and are looking to transition 
to the new clusterId available in 0.10.1.0). This is used by our 
inter-datacenter replication tool, which keeps our clusters in multiple DCs 
and regions in sync where needed.
   We can do this because we design our system so that a single key for a 
piece of data is transacted in only one DC (barring DC failure), while 
different keys can be transacted in other DCs. DCs still need a full global 
view, and apps are agnostic to their DC (the platform handles this).
   This system is always on and we alert on any consumer lag, so we can 
expect to get every event (even with compaction).
   This system reconciles on restart if we find consumer lag greater than a 
predetermined time value (we set this to under a minute).
   
We also have the following setup for some flows:

App -> compacted topic -> replicator -> event topic

Apps use Kafka compacted topics as k,v stores instead of the likes of 
Postgres. Just as with Bottled Water or any CDC solution, you want to capture 
every change (including deletes). Even on a DB delete there is business 
metadata you want to capture: the traditional who, what, where and when. That 
data needs to be captured when deleting data, and likewise we still need to 
replicate it to our event topics for storage, audit and replay needs.

Cheers
Mike

On 10/28/16, 8:14 AM, "James Cheng" <wushuja...@gmail.com> wrote:

Michael,

What information would you want to include on a delete tombstone, and what 
would you use it for?

-James

Sent from my iPhone

> On Oct 27, 2016, at 9:17 PM, Michael Pearce <michael.pea...@ig.com> wrote:
> 
> Hi Jay,
> 
> I think the issue that Konstantin mentioned in the kip-82 thread, and that
> we also have at IG, is a clear use case.
> 
> Many companies are using message wrappers. These are useful for the use
> cases listed in kip-82 (I don't think I need to reiterate the large list
> here), and many of these need the headers even on a null value.
> 
> The issue this then causes is that you cannot send these messages onto a
> compacted topic and ever have a delete/tombstone. So companies are doing
> things like a double send: one message with an envelope so it gets
> transported, followed by an empty message. Or they have a separate process
> looking for the empty envelope and pushing back an empty-value record to
> make the broker tombstone it off. As mentioned in the kip-82 thread, these
> cause nasty race issues and prod issues.
> 
> LinkedIn were also very clear that if you use compaction there, you
> currently cannot use their managed Kafka services that rely on their
> headers implementation. This was flagged in the kip-82 discussion also.
> 
> For streams it would be fairly easy to keep the current behaviour: when
> sending a null value, have logic to also add the delete marker. The same
> would go for any framework built on Kafka that wants to keep the same
> logic; Spark and Samza come to mind.
> 
> Likewise, as noted, we could make this configurable globally and at topic
> level, as per this thread where we are discussing the section about
> compatibility and rollout.
> 
> 
> 
> Rgds
> Mike
> 
> From: Jay Kreps <j...@confluent.io>
> Sent: Thursday, October 27, 2016 10:54:45 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> 
> I kind of agree with James that it is a bit questionable how valuable any
> data in a delete marker can be since it will be deleted somewhat
> nondeterministically.
> 
> Let's definitely ensure the change is worth the resulting pain and
> additional complexity in the data model.
> 
> I think the two things we maybe conflated in the original compaction work
> was the semantics of the message and its retention policy (I'm not sure,
> but maybe).
> 
> In some sense a normal Kafka topic is a stream of pure appends (inserts). A
> compacted topic is a series of revisions to the keyed entity--updates or
> deletes.
> 
> Currently the semantics of the messages a

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-01 Thread Mayuresh Gharat
On a side note, a use case that came to mind: if we ever plan to add
headers to Kafka, we could provide compaction based on headers and make
the compaction policy pluggable on the broker side. This might give more
flexibility for doing compaction in the future.

Thanks,

Mayuresh




-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-01 Thread Becket Qin
My two cents on bumping the magic value.

Although the broker can infer from the request version whether it should
read the new attribute bit or not, currently the request version
is not propagated to the replica manager.
Bumping the magic value does not seem to be a big overhead in this case. It
also helps avoid the potential ambiguity. Maybe it is better to bump the
magic value.
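
To illustrate why a magic bump removes the ambiguity, here is a rough Python
sketch in which the decision uses only what is stored in the message itself,
so no request version needs to reach the code that reads the log. The bit
position and the magic threshold are made-up values for illustration, not
taken from the KIP, and is_tombstone_by_magic is a hypothetical name.

```python
from typing import Optional

TOMBSTONE_BIT = 0x10      # hypothetical attribute bit, for illustration only
TOMBSTONE_MIN_MAGIC = 2   # hypothetical first magic value that defines the bit


def is_tombstone_by_magic(magic: int, attributes: int, value: Optional[bytes]) -> bool:
    """Interpret a stored message using only fields written in the message."""
    if magic >= TOMBSTONE_MIN_MAGIC:
        # New format: the attribute bit is defined, so it alone decides.
        return bool(attributes & TOMBSTONE_BIT)
    # Older formats: the bit has no defined meaning; fall back to null value.
    return value is None
```

Whether a null value should still count as a tombstone under the new magic
value is a separate question from the thread; the sketch picks the flag-only
reading just to keep the example short.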

Thanks,

Jiangjie (Becket) Qin

On Mon, Oct 31, 2016 at 4:17 PM, Jay Kreps  wrote:

> Xavier,
>
> Yeah I think that post KIP-58 it is possible to depend on the delivery of
> messages in compacted topics, if you override the default compaction time.
> Prior to that it was true that you could control the delete retention, but
> any message including the tombstone could be compacted away prior to
> delivery. That was what I meant by non-deterministic. Now that we have that
> KIP I agree that at least it can be given an SLA like any other message
> (which was what I was meaning by deterministic).
>
> -Jay
>
> On Fri, Oct 28, 2016 at 10:15 AM, Xavier Léauté 
> wrote:
>
> > >
> > > I kind of agree with James that it is a bit questionable how valuable
> > > any data in a delete marker can be since it will be deleted somewhat
> > > nondeterministically.
> > >
> >
> > One could argue that even in normal topics, assuming a time-based log
> > retention policy is in place, any message will be deleted somewhat
> > nondeterministically, so why treat the compacted ones any differently?
> > To me at least, the retention setting for delete messages seems to be
> > the counterpart to the time-based retention setting for normal topics.
> >
> > > Currently the semantics of the messages are in the eye of the
> > > beholder--you can choose to interpret a stream as either being appends
> > > or revisions as you choose. This proposal is changing that so that the
> > > semantics are determined by the sender.
> >
> >
> > Let's imagine someone wanted to augment this stream to include audit logs
> > for each record update, e.g. which user made the change. One would want
> > to include that information as part of the message, and have the ability
> > to mark a deletion.
> >
> > I don't think it changes the semantics in this case, you can still choose
> > to interpret the data as a stream of audit log entries (inserts),
> > ignoring the tombstone flag, or you can interpret it as a table modeling
> > only the latest version of each record. Whether a compacted or normal
> > topic is used shouldn't matter to the sender.
> >
>


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-31 Thread Jay Kreps
Xavier,

Yeah I think that post KIP-58 it is possible to depend on the delivery of
messages in compacted topics, if you override the default compaction time.
Prior to that it was true that you could control the delete retention, but
any message including the tombstone could be compacted away prior to
delivery. That was what I meant by non-deterministic. Now that we have that
KIP I agree that at least it can be given an SLA like any other message
(which was what I was meaning by deterministic).

-Jay

On Fri, Oct 28, 2016 at 10:15 AM, Xavier Léauté  wrote:

> >
> > I kind of agree with James that it is a bit questionable how valuable any
> > data in a delete marker can be since it will be deleted somewhat
> > nondeterministically.
> >
>
> One could argue that even in normal topics, assuming a time-based log
> retention policy is in place, any message will be deleted somewhat
> nondeterministically, so why treat the compacted ones any differently? To me
> at least, the retention setting for delete messages seems to be the
> counterpart to the time-based retention setting for normal topics.
>
> > Currently the semantics of the messages are in the eye of the beholder--you
> > can choose to interpret a stream as either being appends or revisions as
> > you choose. This proposal is changing that so that the semantics are
> > determined by the sender.
>
>
> Let's imagine someone wanted to augment this stream to include audit logs
> for each record update, e.g. which user made the change. One would want to
> include that information as part of the message, and have the ability to
> mark a deletion.
>
> I don't think it changes the semantics in this case, you can still choose
> to interpret the data as a stream of audit log entries (inserts), ignoring
> the tombstone flag, or you can interpret it as a table modeling only the
> latest version of each record. Whether a compacted or normal topic is used
> shouldn't matter to the sender.
>


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-28 Thread Xavier Léauté
>
> I kind of agree with James that it is a bit questionable how valuable any
> data in a delete marker can be since it will be deleted somewhat
> nondeterministically.
>

One could argue that even in normal topics, assuming a time-based log
retention policy is in place, any message will be deleted somewhat
nondeterministically, so why treat the compacted ones any differently? To me
at least, the retention setting for delete messages seems to be the
counterpart to the time-based retention setting for normal topics.

> Currently the semantics of the messages are in the eye of the beholder--you
> can choose to interpret a stream as either being appends or revisions as
> you choose. This proposal is changing that so that the semantics are
> determined by the sender.


Let's imagine someone wanted to augment this stream to include audit logs
for each record update, e.g. which user made the change. One would want to
include that information as part of the message, and have the ability to
mark a deletion.

I don't think it changes the semantics in this case, you can still choose
to interpret the data as a stream of audit log entries (inserts), ignoring
the tombstone flag, or you can interpret it as a table modeling only the
latest version of each record. Whether a compacted or normal topic is used
shouldn't matter to the sender.


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-28 Thread James Cheng
Michael,

What information would you want to include on a delete tombstone, and what 
would you use it for?

-James

> On Oct 27, 2016, at 9:17 PM, Michael Pearce <michael.pea...@ig.com> wrote:
> 
> Hi Jay,
> 
> I think use case that is the issue that Konstantin mentioned in the kip-82 
> thread , and also we have at IG is clear use case.
> 
> Many companies are using message wrappers, these are useful because as per 
> kip-82 see their use cases (I don't think I need to re iterate the large list 
> here) many of these need the headers even on a null value.
> 
> The issue this though then causes is that you cannot send these messages onto 
> a compacted topic and ever have a delete/tombstone. And so companies are 
> doing things like double send one with an message envelope so it gets 
> transported followed by and empty message. Or by having a seperate process 
> looking for the empty envelope and pushing back an empty value record to make 
> the broker tombstone it off. As mentioned in the kip-82 thread these cause 
> nasty race issues and prod issues.
> 
> LinkedIn were also very clear in if you use compaction currently there they 
> cannot use their managed Kafka services that rely on their headers 
> implementation. This was flagged in the kip-82 discussion also.
> 
> For streams it would be fairly easy to keep its current behaviour by when 
> sending a null value to have logic to also add the delete marker. This would 
> be the same for any framework built on Kafka if their desire was to keep the 
> same logic spark and samza come to mind.
> 
> Like wise as noted, we could make this configurable globally and topic level 
> as per this thread where we are discussing the section about comparability 
> and rollout.
> 
> 
> 
> Rgds
> Mike
> 
> From: Jay Kreps <j...@confluent.io>
> Sent: Thursday, October 27, 2016 10:54:45 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> 
> I kind of agree with James that it is a bit questionable how valuable any
> data in a delete marker can be since it will be deleted somewhat
> nondeterministically.
> 
> Let's definitely ensure the change is worth the resulting pain and
> additional complexity in the data model.
> 
> I think the two things we maybe conflated in the original compaction work
> was the semantics of the message and its retention policy (I'm not sure,
> but maybe).
> 
> In some sense a normal Kafka topic is a stream of pure appends (inserts). A
> compacted topic is a series of revisions to the keyed entity--updates or
> deletes.
> 
> Currently the semantics of the messages are in the eye of the beholder--you
> can choose to interpret a stream as either being appends or revisions as
> you choose. This proposal is changing that so that the semantics are
> determined by the sender.
> 
> So in Kafka Streams you could have a KTable of "customer account updates"
> which would model the latest version of each customer account record; you
> could also have a KStream which would model the stream of updates being
> made to customer account records. You can create either of these off the
> same topic---the semantics come from your interpretation not the data
> itself. Both of these interpretations actually make sense: if you want to
> count the number of accounts in a given geographical region you want to
> compute that off the KTable, if you want to count the number of account
> modifications you want to compute that off the KStream.
> 
> This proposal changes this slightly. Now we are saying the semantics of the
> message are set by the sender. I'm not sure if this is better or worse--it
> seems a little at odds with what we are doing in streams. If we are going
> to change it I wonder if there aren't actually three types of message
> {INSERT, UPDATE, DELETE}. (And by update I really mean "upsert"). Compacted
> topics only make sense if your topic contains only UPDATE and/or DELETE
> messages. "Normal" topics are pure inserts.
> 
> If we are making this change how does it affect streams? What happens if I
> send a DELETE message to a non-compacted topic where deletes are impossible?
> 
> -Jay
> 
> 
> 
> On Tue, Oct 25, 2016 at 9:09 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
> 
>> Hi All,
>> 
>> I would like to discuss the following KIP proposal:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
>> 
>> This is off the back of the discussion on KIP-82  / KIP meeting where it
>> was agreed to separate this issue and feature. See:
>> http://mai

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-27 Thread Michael Pearce
*compatibility

From: Michael Pearce <michael.pea...@ig.com>
Sent: Friday, October 28, 2016 5:17:07 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

Hi Jay,

I think use case that is the issue that Konstantin mentioned in the kip-82 
thread , and also we have at IG is clear use case.

Many companies are using message wrappers, these are useful because as per 
kip-82 see their use cases (I don't think I need to re iterate the large list 
here) many of these need the headers even on a null value.

The issue this though then causes is that you cannot send these messages onto a 
compacted topic and ever have a delete/tombstone. And so companies are doing 
things like double send one with an message envelope so it gets transported 
followed by and empty message. Or by having a seperate process looking for the 
empty envelope and pushing back an empty value record to make the broker 
tombstone it off. As mentioned in the kip-82 thread these cause nasty race 
issues and prod issues.

LinkedIn were also very clear in if you use compaction currently there they 
cannot use their managed Kafka services that rely on their headers 
implementation. This was flagged in the kip-82 discussion also.

For streams it would be fairly easy to keep its current behaviour by when 
sending a null value to have logic to also add the delete marker. This would be 
the same for any framework built on Kafka if their desire was to keep the same 
logic spark and samza come to mind.

Like wise as noted, we could make this configurable globally and topic level as 
per this thread where we are discussing the section about comparability and 
rollout.



Rgds
Mike

From: Jay Kreps <j...@confluent.io>
Sent: Thursday, October 27, 2016 10:54:45 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

I kind of agree with James that it is a bit questionable how valuable any
data in a delete marker can be since it will be deleted somewhat
nondeterministically.

Let's definitely ensure the change is worth the resulting pain and
additional complexity in the data model.

I think the two things we maybe conflated in the original compaction work
was the semantics of the message and its retention policy (I'm not sure,
but maybe).

In some sense a normal Kafka topic is a stream of pure appends (inserts). A
compacted topic is a series of revisions to the keyed entity--updates or
deletes.

Currently the semantics of the messages are in the eye of the beholder--you
can choose to interpret a stream as either being appends or revisions as
you choose. This proposal is changing that so that the semantics are
determined by the sender.

So in Kafka Streams you could have a KTable of "customer account updates"
which would model the latest version of each customer account record; you
could also have a KStream which would model the stream of updates being
made to customer account records. You can create either of these off the
same topic---the semantics come from your interpretation not the data
itself. Both of these interpretations actually make sense: if you want to
count the number of accounts in a given geographical region you want to
compute that off the KTable, if you want to count the number of account
modifications you want to compute that off the KStream.

This proposal changes this slightly. Now we are saying the semantics of the
message are set by the sender. I'm not sure if this is better or worse--it
seems a little at odds with what we are doing in streams. If we are going
to change it I wonder if there aren't actually three types of message
{INSERT, UPDATE, DELETE}. (And by update I really mean "upsert"). Compacted
topics only make sense if your topic contains only UPDATE and/or DELETE
messages. "Normal" topics are pure inserts.

If we are making this change how does it affect streams? What happens if I
send a DELETE message to a non-compacted topic where deletes are impossible?

-Jay



On Tue, Oct 25, 2016 at 9:09 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Hi All,
>
> I would like to discuss the following KIP proposal:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
>
> This is off the back of the discussion on KIP-82  / KIP meeting where it
> was agreed to separate this issue and feature. See:
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-EZsBsNyDcKr=g...@mail.gmail.com%3e
>
> Thanks
> Mike
>
> The information contained in this email is strictly confidential and for
> the use of the addressee only, unless otherwise indicated. If you are not
> the intended recipient, please do not read, copy, use or disclose to others
> this message or any attachment. Please also no

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-27 Thread Michael Pearce
Hi Jay,

I think the use case is clear: it is the issue that Konstantin mentioned in the
KIP-82 thread, and one we also have at IG.

Many companies are using message wrappers. These are useful for the use cases
listed in KIP-82 (I don't think I need to reiterate the large list here), and
many of those need the headers even on a null value.

The issue this causes, though, is that you cannot send these messages onto a
compacted topic and ever have a delete/tombstone. So companies are doing
things like a double send: one message with an envelope so it gets transported,
followed by an empty message. Or they have a separate process looking for the
empty envelope and pushing back an empty-value record to make the broker
tombstone it off. As mentioned in the KIP-82 thread, these cause nasty race
conditions and production issues.

LinkedIn were also very clear that teams using compaction there currently
cannot use the managed Kafka services that rely on their headers
implementation. This was also flagged in the KIP-82 discussion.

For Streams it would be fairly easy to keep the current behaviour: when
sending a null value, have logic that also adds the delete marker. The same
would apply to any framework built on Kafka that wants to keep the same logic;
Spark and Samza come to mind.

Likewise, as noted, we could make this configurable globally and at topic level,
as per this thread where we are discussing the section about compatibility and
rollout.
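
As a rough sketch of the point above about frameworks keeping their current
behaviour: a thin layer in front of the producer could set the delete marker
whenever the value is null. The Record type, its tombstone field and the
function name are all hypothetical and only illustrate the idea; this is not
an existing client API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    key: bytes
    value: Optional[bytes]
    tombstone: bool = False   # hypothetical flag proposed by this KIP


def with_legacy_delete_semantics(record: Record) -> Record:
    """Keep today's behaviour: a null value always implies a delete."""
    if record.value is None:
        return Record(record.key, None, tombstone=True)
    return record
```

Producers that want a tombstone carrying a payload would skip this layer and
set the flag themselves.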



Rgds
Mike

From: Jay Kreps <j...@confluent.io>
Sent: Thursday, October 27, 2016 10:54:45 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

I kind of agree with James that it is a bit questionable how valuable any
data in a delete marker can be since it will be deleted somewhat
nondeterministically.

Let's definitely ensure the change is worth the resulting pain and
additional complexity in the data model.

I think the two things we maybe conflated in the original compaction work
was the semantics of the message and its retention policy (I'm not sure,
but maybe).

In some sense a normal Kafka topic is a stream of pure appends (inserts). A
compacted topic is a series of revisions to the keyed entity--updates or
deletes.

Currently the semantics of the messages are in the eye of the beholder--you
can choose to interpret a stream as either being appends or revisions as
you choose. This proposal is changing that so that the semantics are
determined by the sender.

So in Kafka Streams you could have a KTable of "customer account updates"
which would model the latest version of each customer account record; you
could also have a KStream which would model the stream of updates being
made to customer account records. You can create either of these off the
same topic---the semantics come from your interpretation not the data
itself. Both of these interpretations actually make sense: if you want to
count the number of accounts in a given geographical region you want to
compute that off the KTable, if you want to count the number of account
modifications you want to compute that off the KStream.

This proposal changes this slightly. Now we are saying the semantics of the
message are set by the sender. I'm not sure if this is better or worse--it
seems a little at odds with what we are doing in streams. If we are going
to change it I wonder if there aren't actually three types of message
{INSERT, UPDATE, DELETE}. (And by update I really mean "upsert"). Compacted
topics only make sense if your topic contains only UPDATE and/or DELETE
messages. "Normal" topics are pure inserts.

If we are making this change how does it affect streams? What happens if I
send a DELETE message to a non-compacted topic where deletes are impossible?

-Jay



On Tue, Oct 25, 2016 at 9:09 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Hi All,
>
> I would like to discuss the following KIP proposal:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
>
> This is off the back of the discussion on KIP-82  / KIP meeting where it
> was agreed to separate this issue and feature. See:
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-EZsBsNyDcKr=g...@mail.gmail.com%3e
>
> Thanks
> Mike
>

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-27 Thread Jay Kreps
I kind of agree with James that it is a bit questionable how valuable any
data in a delete marker can be since it will be deleted somewhat
nondeterministically.

Let's definitely ensure the change is worth the resulting pain and
additional complexity in the data model.

I think the two things we maybe conflated in the original compaction work
was the semantics of the message and its retention policy (I'm not sure,
but maybe).

In some sense a normal Kafka topic is a stream of pure appends (inserts). A
compacted topic is a series of revisions to the keyed entity--updates or
deletes.

Currently the semantics of the messages are in the eye of the beholder--you
can choose to interpret a stream as either being appends or revisions as
you choose. This proposal is changing that so that the semantics are
determined by the sender.

So in Kafka Streams you could have a KTable of "customer account updates"
which would model the latest version of each customer account record; you
could also have a KStream which would model the stream of updates being
made to customer account records. You can create either of these off the
same topic---the semantics come from your interpretation not the data
itself. Both of these interpretations actually make sense: if you want to
count the number of accounts in a given geographical region you want to
compute that off the KTable, if you want to count the number of account
modifications you want to compute that off the KStream.
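
To make the two readings concrete, here is a small sketch in plain Python
(not the Kafka Streams API) that derives both answers from the same sequence
of messages. The account keys and regions are made-up sample data.

```python
# (key, region) updates in arrival order on one topic.
updates = [
    ("acct-1", "EU"),
    ("acct-2", "US"),
    ("acct-1", "US"),   # acct-1 moves region
]

# KStream-style reading: every message is an independent event.
modification_count = len(updates)

# KTable-style reading: later messages replace earlier ones for the same key.
latest = {}
for key, region in updates:
    latest[key] = region

accounts_per_region = {}
for region in latest.values():
    accounts_per_region[region] = accounts_per_region.get(region, 0) + 1
```

Counting over the stream sees three modifications, while counting over the
table sees two accounts, both currently in "US".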

This proposal changes this slightly. Now we are saying the semantics of the
message are set by the sender. I'm not sure if this is better or worse--it
seems a little at odds with what we are doing in streams. If we are going
to change it I wonder if there aren't actually three types of message
{INSERT, UPDATE, DELETE}. (And by update I really mean "upsert"). Compacted
topics only make sense if your topic contains only UPDATE and/or DELETE
messages. "Normal" topics are pure inserts.

If we are making this change how does it affect streams? What happens if I
send a DELETE message to a non-compacted topic where deletes are impossible?

-Jay



On Tue, Oct 25, 2016 at 9:09 AM, Michael Pearce 
wrote:

> Hi All,
>
> I would like to discuss the following KIP proposal:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
>
> This is off the back of the discussion on KIP-82  / KIP meeting where it
> was agreed to separate this issue and feature. See:
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-EZsBsNyDcKr=g...@mail.gmail.com%3e
>
> Thanks
> Mike
>
> The information contained in this email is strictly confidential and for
> the use of the addressee only, unless otherwise indicated. If you are not
> the intended recipient, please do not read, copy, use or disclose to others
> this message or any attachment. Please also notify the sender by replying
> to this email or by telephone (+44(020 7896 0011) and then delete the email
> and any copies of it. Opinions, conclusion (etc) that do not relate to the
> official business of this company shall be understood as neither given nor
> endorsed by it. IG is a trading name of IG Markets Limited (a company
> registered in England and Wales, company number 04008957) and IG Index
> Limited (a company registered in England and Wales, company number
> 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
> London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
> Index Limited (register number 114059) are authorised and regulated by the
> Financial Conduct Authority.
>


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-27 Thread Michael Pearce
Thanks, James. I think this is a really good addition to the KIP details.
Please feel free to amend the wiki and add the use cases, plus any others you
think of. I definitely think it's worthwhile documenting. If you can't, let me
know and I'll add them next week (just leaving for a long weekend off).

Re Joel's and others' comments about upgrade and compatibility:

Rather than trying to auto-manage this, maybe we make a configuration option,
at both server and per-topic level, to control how the server logic works out
whether a record is a tombstone record.

e.g.

key = compaction.tombstone.marker

value options:

value   (continues to use null value as tombstone marker)
flag (expects to use the tombstone flag)
value_or_flag (if either is true it treats the record as a tombstone)

This way, on upgrade users can keep the current behavior and slowly migrate to
the new one: a transition period using value_or_flag, and finally flag only, if
an organization wishes to use null values without them being treated as a
tombstone marker (use case noted below).

Having it both global broker level and topic override also allows some 
flexibility here.
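
A minimal sketch of how the three proposed modes could be evaluated for a
single record. The function name and its arguments are hypothetical; only the
mode names value, flag and value_or_flag come from the proposal above.

```python
def is_tombstone_for_mode(mode, value, flag):
    """Apply the proposed setting to one record.

    mode  -- "value", "flag" or "value_or_flag"
    value -- record payload, None when null
    flag  -- True when the tombstone attribute bit is set
    """
    if mode == "value":
        return value is None
    if mode == "flag":
        return flag
    if mode == "value_or_flag":
        return value is None or flag
    raise ValueError("unknown tombstone marker mode: %r" % mode)
```

With mode "flag", a null value without the flag is kept as an ordinary record,
which is the case an organization would migrate towards.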

Cheers
Mike






On 10/27/16, 8:03 AM, "James Cheng"  wrote:

This KIP would definitely address a gap in the current functionality, where 
you currently can't have a tombstone with any associated content.

That said, I'd like to talk about use cases, to make sure that this is in 
fact useful. The KIP should be updated with whatever use cases we come up with.

First of all, an observation: When we speak about log compaction, we 
typically think of "the latest message for a key is retained". In that respect, 
a delete tombstone (i.e. a message with a null payload) is treated the same as 
any other Kafka message: the latest message is retained. It doesn't matter 
whether the latest message is null, or if the latest message has actual 
content. In all cases, the last message is retained.

The only way a delete tombstone is treated differently from other Kafka 
messages is that it automatically disappears after a while. The time of 
deletion is specified using delete.retention.ms.

So what we're really talking about is, do we want to support messages in a 
log-compacted topic that auto-delete themselves after a while?

In a thread from 2015, there was a discussion on first-class support of
headers between Roger Hoover, Felix GV, Jun Rao, and me. See the thread at
https://groups.google.com/d/msg/confluent-platform/8xPbjyUE_7E/yQ1AeCufL_gJ .
In that thread, Jun raised a good question that I didn't have a good answer for
at the time: If a message is going to auto-delete itself after a while, how 
important was the message? That is, what information did the message contain 
that was important *for a while* but not so important that it needed to be kept 
around forever?

Some use cases that I can think of:

1) Traceability. I would like to know who issued this delete tombstone. It
might include the hostname, IP of the producer of the delete.
2) Timestamps. I would like to know when this delete was issued. This use 
case is already addressed by the availability of per-message timestamps that 
came in 0.10.0
3) Data provenance. I hope I'm using this phrase correctly, but what I mean 
is, where did this delete come from? What processing job emitted it? What input 
to the processing job caused this delete to be produced? For example, if a 
record in topic A was processed and caused a delete tombstone to be emitted to 
topic B, I might like the offset of the topic A message to be attached to the 
topic B message.
4) Distributed tracing for stream topologies. This might be a slight repeat
of the above use cases. In the microservices world, we can generate call-graphs
of webservices using tools like Zipkin/opentracing.io, or something homegrown
like
https://engineering.linkedin.com/distributed-service-call-graph/real-time-distributed-tracing-website-performance-and-efficiency .
I can imagine that you might want to do something similar for stream
processing topologies, where stream processing jobs carry and forward a
globally unique identifier, and a distributed topology graph is generated.
5) Cases where processing a delete requires data that is not available in 
the message key. I'm not sure I have a good example of this, though. One 
hand-wavy example might be where I am publishing documents into Kafka where the 
documentId is the message key, and the text contents of the document are in the 
message body. And I have a consuming job that does some analytics on the 
message body. If that document gets deleted, then the consuming job might 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-27 Thread James Cheng
This KIP would definitely address a gap in the current functionality, where you 
currently can't have a tombstone with any associated content.

That said, I'd like to talk about use cases, to make sure that this is in fact 
useful. The KIP should be updated with whatever use cases we come up with.

First of all, an observation: When we speak about log compaction, we typically 
think of "the latest message for a key is retained". In that respect, a delete 
tombstone (i.e. a message with a null payload) is treated the same as any other 
Kafka message: the latest message is retained. It doesn't matter whether the 
latest message is null, or if the latest message has actual content. In all 
cases, the last message is retained.

The only way a delete tombstone is treated differently from other Kafka 
messages is that it automatically disappears after a while. The time of 
deletion is specified using delete.retention.ms.
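
The observation above can be sketched as a toy compaction pass. This is an
idealised model for illustration only: the function name and tuple layout are
made up, and the real log cleaner works per segment and measures tombstone
age differently.

```python
def compact(log, now_ms, delete_retention_ms):
    """Return the log after one idealised compaction pass.

    log is a list of (offset, key, value, timestamp_ms) tuples in offset
    order; a value of None is a delete tombstone.
    """
    latest_offset = {}
    for offset, key, _value, _ts in log:
        latest_offset[key] = offset        # later entries overwrite earlier ones

    survivors = []
    for offset, key, value, ts in log:
        if latest_offset[key] != offset:
            continue                       # superseded by a newer message
        if value is None and now_ms - ts > delete_retention_ms:
            continue                       # tombstone has outlived its retention
        survivors.append((offset, key, value, ts))
    return survivors
```

A tombstone is retained like any other latest message until it is older than
the retention window, after which the key disappears from the log entirely.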

So what we're really talking about is, do we want to support messages in a 
log-compacted topic that auto-delete themselves after a while?

In a thread from 2015, there was a discussion on first-class support of headers
between Roger Hoover, Felix GV, Jun Rao, and me. See the thread at
https://groups.google.com/d/msg/confluent-platform/8xPbjyUE_7E/yQ1AeCufL_gJ .
In that thread, Jun raised a good question that I didn't have a good answer for
at the time: If a message is going to auto-delete itself after a while, how 
important was the message? That is, what information did the message contain 
that was important *for a while* but not so important that it needed to be kept 
around forever?

Some use cases that I can think of:

1) Traceability. I would like to know who issued this delete tombstone. It might
include the hostname, IP of the producer of the delete.
2) Timestamps. I would like to know when this delete was issued. This use case 
is already addressed by the availability of per-message timestamps that came in 
0.10.0
3) Data provenance. I hope I'm using this phrase correctly, but what I mean is, 
where did this delete come from? What processing job emitted it? What input to 
the processing job caused this delete to be produced? For example, if a record 
in topic A was processed and caused a delete tombstone to be emitted to topic 
B, I might like the offset of the topic A message to be attached to the topic B 
message.
4) Distributed tracing for stream topologies. This might be a slight repeat of
the above use cases. In the microservices world, we can generate call-graphs of
webservices using tools like Zipkin/opentracing.io, or something homegrown like
https://engineering.linkedin.com/distributed-service-call-graph/real-time-distributed-tracing-website-performance-and-efficiency .
I can imagine that you might want to do something similar for stream
processing topologies, where stream processing jobs carry and forward a
globally unique identifier, and a distributed topology graph is generated.
5) Cases where processing a delete requires data that is not available in the 
message key. I'm not sure I have a good example of this, though. One hand-wavy 
example might be where I am publishing documents into Kafka where the 
documentId is the message key, and the text contents of the document are in the 
message body. And I have a consuming job that does some analytics on the 
message body. If that document gets deleted, then the consuming job might need 
the original message body in order to "delete" that message's impact from the 
analytics. But I'm not sure that is a great example. If the consumer was 
worried about that, the consumer would probably keep the original message 
around, stored by primary key. And then all it would need from a delete message 
would be the primary key of the message.

Do people think these are valid use cases?

What are other use cases that people can think of?

-James

> On Oct 26, 2016, at 3:46 PM, Mayuresh Gharat  
> wrote:
> 
> +1 @Joel.
> I think it would be really great to have a clear migration plan on the KIP
> for upgrading and downgrading servers and clients, along with the handling
> of the issues that Joel mentioned.
> 
> Thanks,
> 
> Mayuresh
> 
> On Wed, Oct 26, 2016 at 3:31 PM, Joel Koshy  wrote:
> 
>> I'm not sure why it would be useful, but it should be theoretically
>> possible if the attribute bit alone is enough to mark a tombstone. OTOH, we
>> could consider that as invalid if we wish. These are relevant details that
>> I think should be added to the KIP.
>> 
>> Also, in the few odd scenarios that I mentioned we should also consider
>> that fetches could be coming from other yet-to-be-upgraded brokers in a
>> cluster that is being upgraded. So we would 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Mayuresh Gharat
+1 @Joel.
I think it would be really great to have a clear migration plan on the KIP
for upgrading and downgrading servers and clients, along with the handling
of the issues that Joel mentioned.

Thanks,

Mayuresh

On Wed, Oct 26, 2016 at 3:31 PM, Joel Koshy  wrote:

> I'm not sure why it would be useful, but it should be theoretically
> possible if the attribute bit alone is enough to mark a tombstone. OTOH, we
> could consider that as invalid if we wish. These are relevant details that
> I think should be added to the KIP.
>
> Also, in the few odd scenarios that I mentioned we should also consider
> that fetches could be coming from other yet-to-be-upgraded brokers in a
> cluster that is being upgraded. So we would probably want to continue to
> support nulls as tombstones or down-convert in a way that we are sure works
> with least surprise to fetchers.
>
> There is a slightly vague statement under "Compatibility, Deprecation, and
> Migration Plan" that could benefit from more details: *Logic would base on
> current behavior of null value or if tombstone flag set to true, as such
> wouldn't impact any existing flows simply allow new producers to make use
> of the feature*. It is unclear to me based on that whether you would
> interpret null as a tombstone if the tombstone attribute bit is off.
>
> On Wed, Oct 26, 2016 at 3:10 PM, Xavier Léauté 
> wrote:
>
> > Does this mean that starting with V4 requests we would allow storing null
> > messages in compacted topics? The KIP should probably clarify the behavior
> > for null messages where the tombstone flag is not set.
> >
> > On Wed, Oct 26, 2016 at 1:32 AM Magnus Edenhill 
> > wrote:
> >
> > > 2016-10-25 21:36 GMT+02:00 Nacho Solis :
> > >
> > > > I think you probably require a MagicByte bump if you expect correct
> > > > behavior of the system as a whole.
> > > >
> > > > From a client perspective you want to make sure that when you
> deliver a
> > > > message that the broker supports the feature you're expecting
> > > > (compaction).  So, depending on the behavior of the broker on
> > > encountering
> > > > a previously undefined bit flag I would suggest making some change to
> > > make
> > > > certain that flag-based compaction is supported.  I'm going to guess
> > that
> > > > the MagicByte would do this.
> > > >
> > >
> > > I don't believe this is needed since it is already attributed through
> > > the request's API version.
> > >
> > > Producer:
> > >  * if a client sends ProduceRequest V4 then attributes.bit5 indicates a
> > > tombstone
> > >  * if a client sends ProduceRequest <V4 then attributes.bit5 is ignored
> > > and value==null indicates a tombstone
> > >  * in both cases the on-disk messages are stored with attributes.bit5
> > > (I assume?)
> > >
> > > Consumer:
> > >  * if a client sends FetchRequest V4 messages are sendfile():ed
> > > directly from disk (with attributes.bit5)
> > >  * if a client sends FetchRequest <V4 messages are
> > > translated from attributes.bit5 to value=null as required.
> > >
> > >
> > > That's my understanding anyway, please correct me if I'm wrong.
> > >
> > > /Magnus
> > >
> > >
> > >
> > > > On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill <
> mag...@edenhill.se>
> > > > wrote:
> > > >
> > > > > It is safe to assume that a previously undefined attributes bit
> will
> > be
> > > > > unset in protocol requests from existing clients, if not, such a
> > client
> > > > is
> > > > > already violating the protocol and needs to be fixed.
> > > > >
> > > > > So I dont see a need for a MagicByte bump, both broker and client
> has
> > > the
> > > > > information it needs to construct or parse the message according to
> > > > request
> > > > > version.
> > > > >
> > > > >
> > > > > 2016-10-25 18:48 GMT+02:00 Michael Pearce :
> > > > >
> > > > > > Hi Magnus,
> > > > > >
> > > > > > I was wondering if I even needed to change those also, as
> > technically
> > > > > > we’re just making use of a non used attribute bit, but im not
> 100%
> > > that
> > > > > it
> > > > > > be always false currently.
> > > > > >
> > > > > > If someone can say 100% it will already be set false with current
> > and
> > > > > > historic bit wise masking techniques used over the time, we could
> > do
> > > > away
> > > > > > with both, and simply just start to use it. Unfortunately I don’t
> > > have
> > > > > that
> > > > > > historic knowledge so was hoping it would be flagged up in this
> > > > > discussion
> > > > > > thread ☺
> > > > > >
> > > > > > Cheers
> > > > > > Mike
> > > > > >
> > > > > > On 10/25/16, 5:36 PM, "Magnus Edenhill" 
> > wrote:
> > > > > >
> > > > > > Hi Michael,
> > > > > >
> > > > > > With the version bumps for Produce and Fetch requests, do you
> > > > really
> > > > > > need
> > > > > > to bump MagicByte too?
> > > > > >
> > > > > > Regards,
> > > > > > Magnus
> > > > > >
> > > > > >
> > > > > > 2016-10-25 18:09 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Joel Koshy
I'm not sure why it would be useful, but it should be theoretically
possible if the attribute bit alone is enough to mark a tombstone. OTOH, we
could consider that as invalid if we wish. These are relevant details that
I think should be added to the KIP.

Also, in the few odd scenarios that I mentioned we should also consider
that fetches could be coming from other yet-to-be-upgraded brokers in a
cluster that is being upgraded. So we would probably want to continue to
support nulls as tombstones or down-convert in a way that we are sure works
with least surprise to fetchers.

There is a slightly vague statement under "Compatibility, Deprecation, and
Migration Plan" that could benefit from more details: *Logic would base on
current behavior of null value or if tombstone flag set to true, as such
wouldn't impact any existing flows simply allow new producers to make use
of the feature*. It is unclear to me based on that whether you would
interpret null as a tombstone if the tombstone attribute bit is off.
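
The ambiguity can be made explicit with a small truth table. The sketch below is an illustration in Python, not code from the KIP or from Kafka; the three function names are made up here and stand for the current behaviour and the two possible readings of the KIP sentence quoted above.

```python
from itertools import product


def is_tombstone_legacy(value_is_null: bool, tombstone_bit: bool) -> bool:
    """Current behaviour: a null value marks a tombstone; the bit is ignored."""
    return value_is_null


def is_tombstone_either(value_is_null: bool, tombstone_bit: bool) -> bool:
    """Reading 1 of the KIP text: null value OR flag set marks a tombstone."""
    return value_is_null or tombstone_bit


def is_tombstone_flag_only(value_is_null: bool, tombstone_bit: bool) -> bool:
    """Reading 2: only the flag decides, so a null value could be stored."""
    return tombstone_bit


def disagreements():
    """List the (value_is_null, tombstone_bit) cases where the two readings differ."""
    return [
        (null, bit)
        for null, bit in product([False, True], repeat=2)
        if is_tombstone_either(null, bit) != is_tombstone_flag_only(null, bit)
    ]


print(disagreements())  # [(True, False)]
```

The two readings differ only for a null value with the bit off, which is the case the KIP would need to spell out.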

On Wed, Oct 26, 2016 at 3:10 PM, Xavier Léauté  wrote:

> Does this mean that starting with V4 requests we would allow storing null
> messages in compacted topics? The KIP should probably clarify the behavior
> for null messages where the tombstone flag is not set.
>
> On Wed, Oct 26, 2016 at 1:32 AM Magnus Edenhill 
> wrote:
>
> > 2016-10-25 21:36 GMT+02:00 Nacho Solis :
> >
> > > I think you probably require a MagicByte bump if you expect correct
> > > behavior of the system as a whole.
> > >
> > > From a client perspective you want to make sure that when you deliver a
> > > message that the broker supports the feature you're expecting
> > > (compaction).  So, depending on the behavior of the broker on
> > encountering
> > > a previously undefined bit flag I would suggest making some change to
> > make
> > > certain that flag-based compaction is supported.  I'm going to guess
> that
> > > the MagicByte would do this.
> > >
> >
> > I don't believe this is needed since it is already attributed through the
> > request's API version.
> >
> > Producer:
> >  * if a client sends ProduceRequest V4 then attributes.bit5 indicates a
> > tombstone
> >  * if a client sends ProduceRequest <V4 then attributes.bit5 is ignored
> > and value==null indicates a tombstone
> >  * in both cases the on-disk messages are stored with attributes.bit5 (I
> > assume?)
> >
> > Consumer:
> >  * if a client sends FetchRequest V4 messages are sendfile():ed directly
> > from disk (with attributes.bit5)
> >  * if a client sends FetchRequest <V4 messages are
> > translated from attributes.bit5 to value=null as required.
> >
> >
> > That's my understanding anyway, please correct me if I'm wrong.
> >
> > /Magnus
> >
> >
> >
> > > On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill 
> > > wrote:
> > >
> > > > It is safe to assume that a previously undefined attributes bit will
> be
> > > > unset in protocol requests from existing clients, if not, such a
> client
> > > is
> > > > already violating the protocol and needs to be fixed.
> > > >
> > > > So I dont see a need for a MagicByte bump, both broker and client has
> > the
> > > > information it needs to construct or parse the message according to
> > > request
> > > > version.
> > > >
> > > >
> > > > 2016-10-25 18:48 GMT+02:00 Michael Pearce :
> > > >
> > > > > Hi Magnus,
> > > > >
> > > > > I was wondering if I even needed to change those also, as
> technically
> > > > > we’re just making use of a non used attribute bit, but im not 100%
> > that
> > > > it
> > > > > be always false currently.
> > > > >
> > > > > If someone can say 100% it will already be set false with current
> and
> > > > > historic bit wise masking techniques used over the time, we could
> do
> > > away
> > > > > with both, and simply just start to use it. Unfortunately I don’t
> > have
> > > > that
> > > > > historic knowledge so was hoping it would be flagged up in this
> > > > discussion
> > > > > thread ☺
> > > > >
> > > > > Cheers
> > > > > Mike
> > > > >
> > > > > On 10/25/16, 5:36 PM, "Magnus Edenhill" 
> wrote:
> > > > >
> > > > > Hi Michael,
> > > > >
> > > > > With the version bumps for Produce and Fetch requests, do you
> > > really
> > > > > need
> > > > > to bump MagicByte too?
> > > > >
> > > > > Regards,
> > > > > Magnus
> > > > >
> > > > >
> > > > > 2016-10-25 18:09 GMT+02:00 Michael Pearce <
> michael.pea...@ig.com
> > >:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > I would like to discuss the following KIP proposal:
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > 87+-+Add+Compaction+Tombstone+Flag
> > > > > >
> > > > > > This is off the back of the discussion on KIP-82  / KIP
> meeting
> > > > > where it
> > > > > > was agreed to separate this issue and feature. See:
> > > > > > 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Xavier Léauté
Does this mean that starting with V4 requests we would allow storing null
messages in compacted topics? The KIP should probably clarify the behavior
for null messages where the tombstone flag is not set.

On Wed, Oct 26, 2016 at 1:32 AM Magnus Edenhill  wrote:

> 2016-10-25 21:36 GMT+02:00 Nacho Solis :
>
> > I think you probably require a MagicByte bump if you expect correct
> > behavior of the system as a whole.
> >
> > From a client perspective you want to make sure that when you deliver a
> > message that the broker supports the feature you're expecting
> > (compaction).  So, depending on the behavior of the broker on
> encountering
> > a previously undefined bit flag I would suggest making some change to
> make
> > certain that flag-based compaction is supported.  I'm going to guess that
> > the MagicByte would do this.
> >
>
> I don't believe this is needed since it is already attributed through the
> request's API version.
>
> Producer:
>  * if a client sends ProduceRequest V4 then attributes.bit5 indicates a
> tombstone
>  * if a client sends ProduceRequest <V4 then attributes.bit5 is ignored
> and value==null indicates a tombstone
>  * in both cases the on-disk messages are stored with attributes.bit5 (I
> assume?)
>
> Consumer:
>  * if a client sends FetchRequest V4 messages are sendfile():ed directly
> from disk (with attributes.bit5)
>  * if a client sends FetchRequest <V4 messages are
> translated from attributes.bit5 to value=null as required.
>
>
> That's my understanding anyway, please correct me if I'm wrong.
>
> /Magnus
>
>
>
> > On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill 
> > wrote:
> >
> > > It is safe to assume that a previously undefined attributes bit will be
> > > unset in protocol requests from existing clients, if not, such a client
> > is
> > > already violating the protocol and needs to be fixed.
> > >
> > > So I dont see a need for a MagicByte bump, both broker and client has
> the
> > > information it needs to construct or parse the message according to
> > request
> > > version.
> > >
> > >
> > > 2016-10-25 18:48 GMT+02:00 Michael Pearce :
> > >
> > > > Hi Magnus,
> > > >
> > > > I was wondering if I even needed to change those also, as technically
> > > > we’re just making use of an unused attribute bit, but I'm not 100% sure
> > > > that it is always false currently.
> > > >
> > > > If someone can say 100% that it will already be set to false with current
> > > > and historic bitwise masking techniques used over time, we could do away
> > > > with both, and simply start to use it. Unfortunately I don’t have that
> > > > historic knowledge, so was hoping it would be flagged up in this
> > > > discussion thread ☺
> > > >
> > > > Cheers
> > > > Mike
> > > >
> > > > On 10/25/16, 5:36 PM, "Magnus Edenhill"  wrote:
> > > >
> > > > Hi Michael,
> > > >
> > > > With the version bumps for Produce and Fetch requests, do you
> > really
> > > > need
> > > > to bump MagicByte too?
> > > >
> > > > Regards,
> > > > Magnus
> > > >
> > > >
> > > > 2016-10-25 18:09 GMT+02:00 Michael Pearce  >:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I would like to discuss the following KIP proposal:
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 87+-+Add+Compaction+Tombstone+Flag
> > > > >
> > > > > This is off the back of the discussion on KIP-82  / KIP meeting
> > > > where it
> > > > > was agreed to separate this issue and feature. See:
> > > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.
> > > > > mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-
> > > > > EZsBsNyDcKr=g...@mail.gmail.com%3e
> > > > >
> > > > > Thanks
> > > > > Mike
> > > > >
> > > > > The information contained in this email is strictly
> confidential
> > > and
> > > > for
> > > > > the use of the addressee only, unless otherwise indicated. If
> you
> > > > are not
> > > > > the intended recipient, please do not read, copy, use or
> disclose
> > > to
> > > > others
> > > > > this message or any attachment. Please also notify the sender
> by
> > > > replying
> > > > > to this email or by telephone (+44(020 7896 0011) and then
> delete
> > > > the email
> > > > > and any copies of it. Opinions, conclusion (etc) that do not
> > relate
> > > > to the
> > > > > official business of this company shall be understood as
> neither
> > > > given nor
> > > > > endorsed by it. IG is a trading name of IG Markets Limited (a
> > > company
> > > > > registered in England and Wales, company number 04008957) and
> IG
> > > > Index
> > > > > Limited (a company registered in England and Wales, company
> > number
> > > > > 01190902). Registered address at Cannon Bridge House, 25
> Dowgate
> > > > Hill,
> > > > > London EC4R 2YA. Both IG Markets Limited (register number
> 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Joel Koshy
A magic byte bump would be required for the addition of new fields or the
removal of existing fields. Changing the interpretation of an existing
field (e.g., switching from absolute to relative offsets) almost always
needs a magic byte bump as well.

One concern Nacho alluded to (I think) is if a newer client sends a
tombstone with the tombstone attribute set (and non-null value field) to an
older broker. While it is true that a higher magic byte would cause the
(older) broker to reject the message, the fact that it is sending a newer
version of the produce request which the broker does not accept will also
cause it to be rejected. So that safeguard already exists for the broker to
indicate that it does not support flag-based compaction.

So strictly speaking we could do without a magic byte bump. However in this
case, we are also changing the interpretation of the value field. Up until
now, a null entry in the value field of the message would cause it to be
interpreted as a tombstone. With this proposal, that is no longer the case.
In fact if an older client issues a fetch request for a tombstone on a
newer broker we should decide on whether we should down-convert it to the
older format and consider replacing the value field (if non-null) to be
null. Another weird scenario is if there is a non-tombstone message with a
null value. An older client that fetches this message would be unable to
tell whether it is a tombstone or not.
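
One way to picture the down-conversion question is the Python sketch below. It is an illustration under stated assumptions, not broker code: `StoredMessage` and `for_fetch` are hypothetical names, the cut-over at fetch version 4 is taken from the "V4" used earlier in this thread, and raising an error for the ambiguous case is just one possible policy among those the KIP could choose.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Assumption: the "V4" mentioned in this thread is the first tombstone-aware version.
TOMBSTONE_AWARE_FETCH_VERSION = 4


@dataclass(frozen=True)
class StoredMessage:
    """Hypothetical view of a message as stored on a newer broker."""
    key: bytes
    value: Optional[bytes]
    tombstone: bool


def for_fetch(msg: StoredMessage, fetch_version: int) -> StoredMessage:
    """Return the message as a client speaking fetch_version should see it."""
    if fetch_version >= TOMBSTONE_AWARE_FETCH_VERSION:
        return msg  # newer clients understand the flag; pass through unchanged
    if msg.tombstone:
        # Older clients only know "null value == tombstone", so drop the
        # value (if any) and the flag they cannot interpret.
        return replace(msg, value=None, tombstone=False)
    if msg.value is None:
        # Non-tombstone with a null value: an older client cannot tell this
        # apart from a tombstone. This sketch refuses rather than guess.
        raise ValueError("null-valued non-tombstone is ambiguous for old clients")
    return msg
```

For example, `for_fetch(StoredMessage(b"k", b"v", True), 3)` yields a message with a null value, which an older client reads as a tombstone.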

Also, under "rejected alternatives" - I'm not a huge fan of using headers
(should they materialize) to mark tombstones. I strongly prefer the
attribute bit or even the existing mechanism of nulls in payloads over
custom headers.

Thanks,

Joel

On Wed, Oct 26, 2016 at 11:23 AM, Magnus Edenhill 
wrote:

> Hi Renu,
>
> that is not completely true, the LZ4 compression codec was added without a
> MagicByte bump.
> (LZ4 might be a bad example though since this feature was added without
> updating the protocol docs..)
>
> Unless the broker needs the MagicByte internally (for translating old logs
> on disk or whatever)
> I dont see a need for bumping the MagicByte when just adding a bit
> *definition* to an existing field.
>
> With just one MagicByte bump so far it is hard to say what is right or
> wrong, but this might be
> a good time for us to decide on some rules.
>
> I dont have a strong opinion, either way is fine with me.
>
> Regards,
> Magnus
>
>
> 2016-10-26 20:01 GMT+02:00 Renu Tewari :
>
> > +1 @Magnus.
> > It is also in line with traditional use of the magic field to indicate a
> > change in the format of the message. Thus a change in the magic field
> > indicates a different "schema" which in this case would reflect adding a
> > new field, removing a field or changing the type of fields etc.
> >
> > The version number bump does not change the message format but just the
> > interpretation of the values.
> >
> > Going with what @Michael suggested on keeping it simple we don't need to
> > add a new field to indicate a tombstone and therefore require a change in
> > the magic byte field. The attribute bit is sufficient for indicating the
> > new interpretation of attribute.bit5 to indicate a tombstone.
> >
> > Bumping the version number and changing the attribute bit keeps it
> simple.
> >
> >
> > Thanks
> > Renu
> >
> >
> >
> > On Wed, Oct 26, 2016 at 10:09 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > I see the reasoning Magnus described.
> > > If you check the docs https://kafka.apache.org/documentation#messageformat,
> > > it says:
> > >
> > > 1 byte "magic" identifier to allow format changes, value is 0 or 1
> > >
> > > Moreover, as per the comments in the code:
> > > /**
> > >  * The "magic" value
> > >  * When magic value is 0, the message uses absolute offset and does not
> > >  * have a timestamp field.
> > >  * When magic value is 1, the message uses relative offset and has a
> > >  * timestamp field.
> > >  */
> > >
> > > Since timeStamp was added as an actual field we bumped the magic byte
> > > to 1 for this change.
> > > But since we are not adding an actual field, we can do away with bumping
> > > up the magic byte.
> > >
> > > If we really want to go the standard route of bumping up the magic byte
> > > for any change to message format we should actually add a new field for
> > > handling log compaction instead of just using the attribute field, which
> > > might sound like overkill.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Oct 26, 2016 at 1:32 AM, Magnus Edenhill 
> > > wrote:
> > >
> > > > 2016-10-25 21:36 GMT+02:00 Nacho Solis  >:
> > > >
> > > > > I think you probably require a MagicByte bump if you expect correct
> > > > > behavior of the system as a whole.
> > > > >
> > > > > From a client perspective you want to make sure that when you
> > deliver a
> > > > > 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Mayuresh Gharat
I agree with Magnus about deciding on rules for when to update.
If we can handle such changes by request versioning, it would be good to
understand the use cases for bumping up the magic byte, or vice versa.
Unless I am misunderstanding something, having two versioning schemes
seems a little weird.

Thanks,

Mayuresh

On Wed, Oct 26, 2016 at 11:23 AM, Magnus Edenhill 
wrote:

> Hi Renu,
>
> that is not completely true, the LZ4 compression codec was added without a
> MagicByte bump.
> (LZ4 might be a bad example though since this feature was added without
> updating the protocol docs..)
>
> Unless the broker needs the MagicByte internally (for translating old logs
> on disk or whatever)
> I dont see a need for bumping the MagicByte when just adding a bit
> *definition* to an existing field.
>
> With just one MagicByte bump so far it is hard to say what is right or
> wrong, but this might be
> a good time for us to decide on some rules.
>
> I dont have a strong opinion, either way is fine with me.
>
> Regards,
> Magnus
>
>
> 2016-10-26 20:01 GMT+02:00 Renu Tewari :
>
> > +1 @Magnus.
> > It is also in line with traditional use of the magic field to indicate a
> > change in the format of the message. Thus a change in the magic field
> > indicates a different "schema" which in this case would reflect adding a
> > new field, removing a field or changing the type of fields etc.
> >
> > The version number bump does not change the message format but just the
> > interpretation of the values.
> >
> > Going with what @Michael suggested on keeping it simple we don't need to
> > add a new field to indicate a tombstone and therefore require a change in
> > the magic byte field. The attribute bit is sufficient for indicating the
> > new interpretation of attribute.bit5 to indicate a tombstone.
> >
> > Bumping the version number and changing the attribute bit keeps it
> simple.
> >
> >
> > Thanks
> > Renu
> >
> >
> >
> > On Wed, Oct 26, 2016 at 10:09 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > I see the reasoning Magnus described.
> > > If you check the docs https://kafka.apache.org/documentation#messageformat,
> > > it says:
> > >
> > > 1 byte "magic" identifier to allow format changes, value is 0 or 1
> > >
> > > Moreover, as per the comments in the code:
> > > /**
> > >  * The "magic" value
> > >  * When magic value is 0, the message uses absolute offset and does not
> > >  * have a timestamp field.
> > >  * When magic value is 1, the message uses relative offset and has a
> > >  * timestamp field.
> > >  */
> > >
> > > Since timeStamp was added as an actual field we bumped the magic byte
> > > to 1 for this change.
> > > But since we are not adding an actual field, we can do away with bumping
> > > up the magic byte.
> > >
> > > If we really want to go the standard route of bumping up the magic byte
> > > for any change to message format we should actually add a new field for
> > > handling log compaction instead of just using the attribute field, which
> > > might sound like overkill.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Oct 26, 2016 at 1:32 AM, Magnus Edenhill 
> > > wrote:
> > >
> > > > 2016-10-25 21:36 GMT+02:00 Nacho Solis  >:
> > > >
> > > > > I think you probably require a MagicByte bump if you expect correct
> > > > > behavior of the system as a whole.
> > > > >
> > > > > From a client perspective you want to make sure that when you
> > deliver a
> > > > > message that the broker supports the feature you're expecting
> > > > > (compaction).  So, depending on the behavior of the broker on
> > > > encountering
> > > > > a previously undefined bit flag I would suggest making some change
> to
> > > > make
> > > > > certain that flag-based compaction is supported.  I'm going to
> guess
> > > that
> > > > > the MagicByte would do this.
> > > > >
> > > >
> > > > I don't believe this is needed since it is already attributed through
> > > > the request's API version.
> > > >
> > > > Producer:
> > > >  * if a client sends ProduceRequest V4 then attributes.bit5
> > > > indicates a tombstone
> > > >  * if a client sends ProduceRequest <V4 then attributes.bit5 is ignored
> > > > and value==null indicates a tombstone
> > > >  * in both cases the on-disk messages are stored with attributes.bit5
> > > > (I assume?)
> > > >
> > > > Consumer:
> > > >  * if a client sends FetchRequest V4 messages are sendfile():ed
> > > > directly from disk (with attributes.bit5)
> > > >  * if a client sends FetchRequest <V4 messages are
> > > > translated from attributes.bit5 to value=null as required.
> > > >
> > > >
> > > > That's my understanding anyway, please correct me if I'm wrong.
> > > >
> > > > /Magnus
> > > >
> > > >
> > > >
> > > > > On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill <
> > mag...@edenhill.se>
> > > > > wrote:
> > > > >
> > > > > > It is safe to 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Magnus Edenhill
Hi Renu,

that is not completely true, the LZ4 compression codec was added without a
MagicByte bump.
(LZ4 might be a bad example though, since this feature was added without
updating the protocol docs.)

Unless the broker needs the MagicByte internally (for translating old logs
on disk or whatever),
I don't see a need for bumping the MagicByte when just adding a bit
*definition* to an existing field.

With just one MagicByte bump so far it is hard to say what is right or
wrong, but this might be a good time for us to decide on some rules.

I don't have a strong opinion; either way is fine with me.

Regards,
Magnus


2016-10-26 20:01 GMT+02:00 Renu Tewari :

> +1 @Magnus.
> It is also in line with traditional use of the magic field to indicate a
> change in the format of the message. Thus a change in the magic field
> indicates a different "schema" which in this case would reflect adding a
> new field, removing a field or changing the type of fields etc.
>
> The version number bump does not change the message format but just the
> interpretation of the values.
>
> Going with what @Michael suggested on keeping it simple we don't need to
> add a new field to indicate a tombstone and therefore require a change in
> the magic byte field. The attribute bit is sufficient for indicating the
> new interpretation of attribute.bit5 to indicate a tombstone.
>
> Bumping the version number and changing the attribute bit keeps it simple.
>
>
> Thanks
> Renu
>
>
>
> On Wed, Oct 26, 2016 at 10:09 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > I see the reasoning Magnus described.
> > If you check the docs https://kafka.apache.org/documentation#messageformat,
> > it says:
> >
> > 1 byte "magic" identifier to allow format changes, value is 0 or 1
> >
> > Moreover, as per the comments in the code:
> > /**
> >  * The "magic" value
> >  * When magic value is 0, the message uses absolute offset and does not
> >  * have a timestamp field.
> >  * When magic value is 1, the message uses relative offset and has a
> >  * timestamp field.
> >  */
> >
> > Since timeStamp was added as an actual field we bumped the magic byte
> > to 1 for this change.
> > But since we are not adding an actual field, we can do away with bumping
> > up the magic byte.
> >
> > If we really want to go the standard route of bumping up the magic byte
> > for any change to message format we should actually add a new field for
> > handling log compaction instead of just using the attribute field, which
> > might sound like overkill.
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> >
> >
> >
> >
> > On Wed, Oct 26, 2016 at 1:32 AM, Magnus Edenhill 
> > wrote:
> >
> > > 2016-10-25 21:36 GMT+02:00 Nacho Solis :
> > >
> > > > I think you probably require a MagicByte bump if you expect correct
> > > > behavior of the system as a whole.
> > > >
> > > > From a client perspective you want to make sure that when you
> deliver a
> > > > message that the broker supports the feature you're expecting
> > > > (compaction).  So, depending on the behavior of the broker on
> > > encountering
> > > > a previously undefined bit flag I would suggest making some change to
> > > make
> > > > certain that flag-based compaction is supported.  I'm going to guess
> > that
> > > > the MagicByte would do this.
> > > >
> > >
> > > I don't believe this is needed since it is already attributed through
> > > the request's API version.
> > >
> > > Producer:
> > >  * if a client sends ProduceRequest V4 then attributes.bit5 indicates a
> > > tombstone
> > >  * if a client sends ProduceRequest <V4 then attributes.bit5 is ignored
> > > and value==null indicates a tombstone
> > >  * in both cases the on-disk messages are stored with attributes.bit5
> > > (I assume?)
> > >
> > > Consumer:
> > >  * if a client sends FetchRequest V4 messages are sendfile():ed
> > > directly from disk (with attributes.bit5)
> > >  * if a client sends FetchRequest <V4 messages are
> > > translated from attributes.bit5 to value=null as required.
> > >
> > >
> > > That's my understanding anyway, please correct me if I'm wrong.
> > >
> > > /Magnus
> > >
> > >
> > >
> > > > On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill <
> mag...@edenhill.se>
> > > > wrote:
> > > >
> > > > > It is safe to assume that a previously undefined attributes bit
> will
> > be
> > > > > unset in protocol requests from existing clients, if not, such a
> > client
> > > > is
> > > > > already violating the protocol and needs to be fixed.
> > > > >
> > > > > So I dont see a need for a MagicByte bump, both broker and client
> has
> > > the
> > > > > information it needs to construct or parse the message according to
> > > > request
> > > > > version.
> > > > >
> > > > >
> > > > > 2016-10-25 18:48 GMT+02:00 Michael Pearce :
> > > > >
> > > > > > Hi Magnus,
> > > > > >
> > > > > > I was wondering if I even needed to change those also, as
> > technically
> > > > > > we’re 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Renu Tewari
+1 @Magnus.
It is also in line with traditional use of the magic field to indicate a
change in the format of the message. Thus a change in the magic field
indicates a different "schema" which in this case would reflect adding a
new field, removing a field or changing the type of fields etc.

The version number bump does not change the message format but just the
interpretation of the values.

Going with what @Michael suggested on keeping it simple, we don't need to
add a new field to indicate a tombstone, and therefore don't require a change
in the magic byte field. The new interpretation of attribute.bit5 is
sufficient to indicate a tombstone.

Bumping the version number and changing the attribute bit keeps it simple.
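
As an illustration of what "changing the attribute bit" amounts to, the Python sketch below sets, clears and tests bit 5 of the one-byte attributes field without disturbing the low three bits that carry the compression codec. It assumes "bit5" is counted from zero (mask 0x20); the constant and function names are made up for this example and are not Kafka identifiers.

```python
TOMBSTONE_BIT = 5                     # assumption: "bit5" is counted from 0
TOMBSTONE_MASK = 1 << TOMBSTONE_BIT   # 0b0010_0000 == 0x20
COMPRESSION_MASK = 0x07               # low 3 bits carry the compression codec


def set_tombstone(attributes: int) -> int:
    """Return the attributes byte with the tombstone bit switched on."""
    return (attributes | TOMBSTONE_MASK) & 0xFF


def clear_tombstone(attributes: int) -> int:
    """Return the attributes byte with the tombstone bit switched off."""
    return attributes & ~TOMBSTONE_MASK & 0xFF


def is_tombstone(attributes: int) -> bool:
    """True if the tombstone bit is set in the attributes byte."""
    return bool(attributes & TOMBSTONE_MASK)


attrs = 0x02                   # e.g. codec id 2 in the low bits, no tombstone
marked = set_tombstone(attrs)  # 0x22: codec bits untouched, bit 5 set
```

Because the other bits are preserved, an existing message that never set bit 5 reads as a non-tombstone under the new definition.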


Thanks
Renu



On Wed, Oct 26, 2016 at 10:09 AM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:

> I see the reasoning Magnus described.
> If you check the docs https://kafka.apache.org/documentation#messageformat,
> it says:
>
> 1 byte "magic" identifier to allow format changes, value is 0 or 1
>
> Moreover, as per the comments in the code:
> /**
>  * The "magic" value
>  * When magic value is 0, the message uses absolute offset and does not
>  * have a timestamp field.
>  * When magic value is 1, the message uses relative offset and has a
>  * timestamp field.
>  */
>
> Since timeStamp was added as an actual field we bumped the magic byte
> to 1 for this change.
> But since we are not adding an actual field, we can do away with bumping up
> the magic byte.
>
> If we really want to go the standard route of bumping up the magic byte for
> any change to message format we should actually add a new field for
> handling log compaction instead of just using the attribute field, which
> might sound like overkill.
>
> Thanks,
>
> Mayuresh
>
>
>
>
>
>
> On Wed, Oct 26, 2016 at 1:32 AM, Magnus Edenhill 
> wrote:
>
> > 2016-10-25 21:36 GMT+02:00 Nacho Solis :
> >
> > > I think you probably require a MagicByte bump if you expect correct
> > > behavior of the system as a whole.
> > >
> > > From a client perspective you want to make sure that when you deliver a
> > > message that the broker supports the feature you're expecting
> > > (compaction).  So, depending on the behavior of the broker on
> > encountering
> > > a previously undefined bit flag I would suggest making some change to
> > make
> > > certain that flag-based compaction is supported.  I'm going to guess
> that
> > > the MagicByte would do this.
> > >
> >
> > I dont believe this is needed since it is already attributed through the
> > request's API version.
> >
> > Producer:
> >  * if a client sends ProduceRequest V4 then attributes.bit5 indicates a
> > tombstone
> >  * if a client sends ProduceRequest <V4 [...]
> > and value==null indicates a tombstone
> >  * in both cases the on-disk messages are stored with attributes.bit5 (I
> > assume?)
> >
> > Consumer:
> >  * if a clients sends FetchRequest V4 messages are sendfile():ed directly
> > from disk (with attributes.bit5)
> >  * if a client sends FetchRequest <V4 [...]
> > translated from attributes.bit5 to value=null as required.
> >
> >
> > That's my understanding anyway, please correct me if I'm wrong.
> >
> > /Magnus
> >
> >
> >
> > > On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill 
> > > wrote:
> > >
> > > > It is safe to assume that a previously undefined attributes bit will
> be
> > > > unset in protocol requests from existing clients, if not, such a
> client
> > > is
> > > > already violating the protocol and needs to be fixed.
> > > >
> > > > So I dont see a need for a MagicByte bump, both broker and client has
> > the
> > > > information it needs to construct or parse the message according to
> > > request
> > > > version.
> > > >
> > > >
> > > > 2016-10-25 18:48 GMT+02:00 Michael Pearce :
> > > >
> > > > > Hi Magnus,
> > > > >
> > > > > I was wondering if I even needed to change those also, as
> technically
> > > > > we’re just making use of a non used attribute bit, but im not 100%
> > that
> > > > it
> > > > > be always false currently.
> > > > >
> > > > > If someone can say 100% it will already be set false with current
> and
> > > > > historic bit wise masking techniques used over the time, we could
> do
> > > away
> > > > > with both, and simply just start to use it. Unfortunately I don’t
> > have
> > > > that
> > > > > historic knowledge so was hoping it would be flagged up in this
> > > > discussion
> > > > > thread ☺
> > > > >
> > > > > Cheers
> > > > > Mike
> > > > >
> > > > > On 10/25/16, 5:36 PM, "Magnus Edenhill" 
> wrote:
> > > > >
> > > > > Hi Michael,
> > > > >
> > > > > With the version bumps for Produce and Fetch requests, do you
> > > really
> > > > > need
> > > > > to bump MagicByte too?
> > > > >
> > > > > Regards,
> > > > > Magnus
> > > > >
> > > > >
> > > > > 2016-10-25 18:09 GMT+02:00 Michael 

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Mayuresh Gharat
I see the reasoning Magnus described.
If you check the docs https://kafka.apache.org/documentation#messageformat,
it says :

1 byte "magic" identifier to allow format changes, value is 0 or 1

Moreover as per comments in the code :
/**
 * The "magic" value
 * When magic value is 0, the message uses absolute offset and does not
 * have a timestamp field.
 * When magic value is 1, the message uses relative offset and has a
 * timestamp field.
 */

Since the timestamp was added as an actual field, we bumped the magic byte
to 1 for that change.
But since we are not adding an actual field here, we can do away with bumping
the magic byte.

If we really want to go the standard route of bumping the magic byte for
any change to the message format, we should actually add a new field for
handling log compaction instead of just using the attribute field, which
might be overkill.
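
To make the role of the magic byte concrete, here is a minimal sketch of a
reader that branches on it. The header layout used (4-byte CRC, 1-byte magic,
1-byte attributes, then an 8-byte timestamp only when magic is 1) is an
assumption for illustration, and the class name is hypothetical.

```java
import java.nio.ByteBuffer;

final class MessageHeaderSketch {
    static final byte MAGIC_V0 = 0;
    static final byte MAGIC_V1 = 1;
    static final long NO_TIMESTAMP = -1L;

    final byte magic;
    final byte attributes;
    final long timestamp;

    MessageHeaderSketch(byte magic, byte attributes, long timestamp) {
        this.magic = magic;
        this.attributes = attributes;
        this.timestamp = timestamp;
    }

    // Reads the fixed header fields; the magic byte decides whether a
    // timestamp field follows the attributes byte.
    static MessageHeaderSketch read(ByteBuffer buf) {
        buf.getInt();                 // CRC, skipped (not verified in this sketch)
        byte magic = buf.get();
        byte attributes = buf.get();
        long timestamp;
        switch (magic) {
            case MAGIC_V0:
                timestamp = NO_TIMESTAMP;   // v0 has no timestamp field
                break;
            case MAGIC_V1:
                timestamp = buf.getLong();  // v1 added the timestamp field
                break;
            default:
                throw new IllegalArgumentException("Unknown magic value: " + magic);
        }
        return new MessageHeaderSketch(magic, attributes, timestamp);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(14);
        buf.putInt(0).put(MAGIC_V1).put((byte) 0).putLong(1477400000000L);
        buf.flip();
        MessageHeaderSketch h = read(buf);
        System.out.println(h.magic + " " + h.timestamp);
    }
}
```

A tombstone bit in the attributes byte would not add a branch here, which is
the argument above for leaving the magic byte alone.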

Thanks,

Mayuresh






On Wed, Oct 26, 2016 at 1:32 AM, Magnus Edenhill  wrote:

> 2016-10-25 21:36 GMT+02:00 Nacho Solis :
>
> > I think you probably require a MagicByte bump if you expect correct
> > behavior of the system as a whole.
> >
> > From a client perspective you want to make sure that when you deliver a
> > message that the broker supports the feature you're expecting
> > (compaction).  So, depending on the behavior of the broker on
> encountering
> > a previously undefined bit flag I would suggest making some change to
> make
> > certain that flag-based compaction is supported.  I'm going to guess that
> > the MagicByte would do this.
> >
>
> I dont believe this is needed since it is already attributed through the
> request's API version.
>
> Producer:
>  * if a client sends ProduceRequest V4 then attributes.bit5 indicates a
> tombstone
>  * if a client sends ProduceRequest <V4 [...]
> and value==null indicates a tombstone
>  * in both cases the on-disk messages are stored with attributes.bit5 (I
> assume?)
>
> Consumer:
>  * if a clients sends FetchRequest V4 messages are sendfile():ed directly
> from disk (with attributes.bit5)
>  * if a client sends FetchRequest <V4 [...]
> translated from attributes.bit5 to value=null as required.
>
>
> That's my understanding anyway, please correct me if I'm wrong.
>
> /Magnus
>
>
>
> > On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill 
> > wrote:
> >
> > > It is safe to assume that a previously undefined attributes bit will be
> > > unset in protocol requests from existing clients, if not, such a client
> > is
> > > already violating the protocol and needs to be fixed.
> > >
> > > So I dont see a need for a MagicByte bump, both broker and client has
> the
> > > information it needs to construct or parse the message according to
> > request
> > > version.
> > >
> > >
> > > 2016-10-25 18:48 GMT+02:00 Michael Pearce :
> > >
> > > > Hi Magnus,
> > > >
> > > > I was wondering if I even needed to change those also, as technically
> > > > we’re just making use of a non used attribute bit, but im not 100%
> that
> > > it
> > > > be always false currently.
> > > >
> > > > If someone can say 100% it will already be set false with current and
> > > > historic bit wise masking techniques used over the time, we could do
> > away
> > > > with both, and simply just start to use it. Unfortunately I don’t
> have
> > > that
> > > > historic knowledge so was hoping it would be flagged up in this
> > > discussion
> > > > thread ☺
> > > >
> > > > Cheers
> > > > Mike
> > > >
> > > > On 10/25/16, 5:36 PM, "Magnus Edenhill"  wrote:
> > > >
> > > > Hi Michael,
> > > >
> > > > With the version bumps for Produce and Fetch requests, do you
> > really
> > > > need
> > > > to bump MagicByte too?
> > > >
> > > > Regards,
> > > > Magnus
> > > >
> > > >
> > > > 2016-10-25 18:09 GMT+02:00 Michael Pearce  >:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I would like to discuss the following KIP proposal:
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 87+-+Add+Compaction+Tombstone+Flag
> > > > >
> > > > > This is off the back of the discussion on KIP-82  / KIP meeting
> > > > where it
> > > > > was agreed to separate this issue and feature. See:
> > > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.
> > > > > mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-
> > > > > EZsBsNyDcKr=g...@mail.gmail.com%3e
> > > > >
> > > > > Thanks
> > > > > Mike
> > > > >

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-26 Thread Magnus Edenhill
2016-10-25 21:36 GMT+02:00 Nacho Solis :

> I think you probably require a MagicByte bump if you expect correct
> behavior of the system as a whole.
>
> From a client perspective you want to make sure that when you deliver a
> message that the broker supports the feature you're expecting
> (compaction).  So, depending on the behavior of the broker on encountering
> a previously undefined bit flag I would suggest making some change to make
> certain that flag-based compaction is supported.  I'm going to guess that
> the MagicByte would do this.
>

I don't believe this is needed, since it is already conveyed by the
request's API version.

Producer:
 * if a client sends ProduceRequest V4 then attributes.bit5 indicates a
tombstone
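 * if a client sends ProduceRequest <V4 [...]
and value==null indicates a tombstone
 * in both cases the on-disk messages are stored with attributes.bit5 (I
assume?)

Consumer:
 * if a client sends FetchRequest V4 messages are sendfile():ed directly
from disk (with attributes.bit5)
 * if a client sends FetchRequest <V4 [...]
translated from attributes.bit5 to value=null as required.


That's my understanding anyway, please correct me if I'm wrong.

/Magnus



> On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill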
> wrote:
>
> > It is safe to assume that a previously undefined attributes bit will be
> > unset in protocol requests from existing clients, if not, such a client
> is
> > already violating the protocol and needs to be fixed.
> >
> > So I dont see a need for a MagicByte bump, both broker and client has the
> > information it needs to construct or parse the message according to
> request
> > version.
> >
> >
> > 2016-10-25 18:48 GMT+02:00 Michael Pearce :
> >
> > > Hi Magnus,
> > >
> > > I was wondering if I even needed to change those also, as technically
> > > we’re just making use of a non used attribute bit, but im not 100% that
> > it
> > > be always false currently.
> > >
> > > If someone can say 100% it will already be set false with current and
> > > historic bit wise masking techniques used over the time, we could do
> away
> > > with both, and simply just start to use it. Unfortunately I don’t have
> > that
> > > historic knowledge so was hoping it would be flagged up in this
> > discussion
> > > thread ☺
> > >
> > > Cheers
> > > Mike
> > >
> > > On 10/25/16, 5:36 PM, "Magnus Edenhill"  wrote:
> > >
> > > Hi Michael,
> > >
> > > With the version bumps for Produce and Fetch requests, do you
> really
> > > need
> > > to bump MagicByte too?
> > >
> > > Regards,
> > > Magnus
> > >
> > >
> > > 2016-10-25 18:09 GMT+02:00 Michael Pearce :
> > >
> > > > Hi All,
> > > >
> > > > I would like to discuss the following KIP proposal:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 87+-+Add+Compaction+Tombstone+Flag
> > > >
> > > > This is off the back of the discussion on KIP-82  / KIP meeting
> > > where it
> > > > was agreed to separate this issue and feature. See:
> > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.
> > > > mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-
> > > > EZsBsNyDcKr=g...@mail.gmail.com%3e
> > > >
> > > > Thanks
> > > > Mike
> > > >

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-25 Thread Michael Pearce
If we can be sure, as Magnus mentions, then I'd actually prefer not to bump
the magic byte: since the flag defaults to false, there is no behaviour
change.

I had included the magic byte and API version changes only because I was
uncertain whether the flags can be relied on to be false by default.

The less change and the less intrusive we are, the better IMO (this was my
intent on KIP-82 also). Here we can simply update the protocol to state that
the attribute bit is used for the tombstone flag, make the server-side change,
and update the Java client.

After all, the attributes, as mentioned in the KIP-82 meetings, are
essentially for core Kafka header flags, so as per our argument for headers we
should be able to allocate and use them without a wire protocol change. It
would be damn ugly if we had to do this every time we needed to support a new
header value for a feature. If this isn't true, then I think that's a further
argument for headers (though I think it is true, and as such we should be
able to).

Likewise, I think the typical upgrade path so far has been to upgrade the
brokers and then the clients.





From: Nacho Solis <nso...@linkedin.com.INVALID>
Sent: Tuesday, October 25, 2016 8:36:24 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

I think you probably require a MagicByte bump if you expect correct
behavior of the system as a whole.

From a client perspective you want to make sure that when you deliver a
message that the broker supports the feature you're expecting
(compaction).  So, depending on the behavior of the broker on encountering
a previously undefined bit flag I would suggest making some change to make
certain that flag-based compaction is supported.  I'm going to guess that
the MagicByte would do this.

While we're at it, we might as well see if anything else needs to be
changed.

BTW, this could potentially be handled differently in other circumstances,
like when you're expecting the flag to be interpreted end-to-end.

Nacho

(I haven't looked at the specific code so maybe my assumptions are wrong)

On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill <mag...@edenhill.se>
wrote:

> It is safe to assume that a previously undefined attributes bit will be
> unset in protocol requests from existing clients, if not, such a client is
> already violating the protocol and needs to be fixed.
>
> So I dont see a need for a MagicByte bump, both broker and client has the
> information it needs to construct or parse the message according to request
> version.
>
>
> 2016-10-25 18:48 GMT+02:00 Michael Pearce <michael.pea...@ig.com>:
>
> > Hi Magnus,
> >
> > I was wondering if I even needed to change those also, as technically
> > we’re just making use of a non used attribute bit, but im not 100% that
> it
> > be always false currently.
> >
> > If someone can say 100% it will already be set false with current and
> > historic bit wise masking techniques used over the time, we could do away
> > with both, and simply just start to use it. Unfortunately I don’t have
> that
> > historic knowledge so was hoping it would be flagged up in this
> discussion
> > thread ☺
> >
> > Cheers
> > Mike
> >
> > On 10/25/16, 5:36 PM, "Magnus Edenhill" <mag...@edenhill.se> wrote:
> >
> > Hi Michael,
> >
> > With the version bumps for Produce and Fetch requests, do you really
> > need
> > to bump MagicByte too?
> >
> > Regards,
> > Magnus
> >
> >
> > 2016-10-25 18:09 GMT+02:00 Michael Pearce <michael.pea...@ig.com>:
> >
> > > Hi All,
> > >
> > > I would like to discuss the following KIP proposal:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 87+-+Add+Compaction+Tombstone+Flag
> > >
> > > This is off the back of the discussion on KIP-82  / KIP meeting
> > where it
> > > was agreed to separate this issue and feature. See:
> > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.
> > > mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-
> > > EZsBsNyDcKr=g...@mail.gmail.com%3e
> > >
> > > Thanks
> > > Mike
> > >

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-25 Thread Nacho Solis
I think you probably require a MagicByte bump if you expect correct
behavior of the system as a whole.

From a client perspective you want to make sure that when you deliver a
message that the broker supports the feature you're expecting
(compaction).  So, depending on the behavior of the broker on encountering
a previously undefined bit flag I would suggest making some change to make
certain that flag-based compaction is supported.  I'm going to guess that
the MagicByte would do this.

While we're at it, we might as well see if anything else needs to be
changed.

BTW, this could potentially be handled differently in other circumstances,
like when you're expecting the flag to be interpreted end-to-end.

Nacho

(I haven't looked at the specific code so maybe my assumptions are wrong)

On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill 
wrote:

> It is safe to assume that a previously undefined attributes bit will be
> unset in protocol requests from existing clients, if not, such a client is
> already violating the protocol and needs to be fixed.
>
> So I dont see a need for a MagicByte bump, both broker and client has the
> information it needs to construct or parse the message according to request
> version.
>
>
> 2016-10-25 18:48 GMT+02:00 Michael Pearce :
>
> > Hi Magnus,
> >
> > I was wondering if I even needed to change those also, as technically
> > we’re just making use of a non used attribute bit, but im not 100% that
> it
> > be always false currently.
> >
> > If someone can say 100% it will already be set false with current and
> > historic bit wise masking techniques used over the time, we could do away
> > with both, and simply just start to use it. Unfortunately I don’t have
> that
> > historic knowledge so was hoping it would be flagged up in this
> discussion
> > thread ☺
> >
> > Cheers
> > Mike
> >
> > On 10/25/16, 5:36 PM, "Magnus Edenhill"  wrote:
> >
> > Hi Michael,
> >
> > With the version bumps for Produce and Fetch requests, do you really
> > need
> > to bump MagicByte too?
> >
> > Regards,
> > Magnus
> >
> >
> > 2016-10-25 18:09 GMT+02:00 Michael Pearce :
> >
> > > Hi All,
> > >
> > > I would like to discuss the following KIP proposal:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 87+-+Add+Compaction+Tombstone+Flag
> > >
> > > This is off the back of the discussion on KIP-82  / KIP meeting
> > where it
> > > was agreed to separate this issue and feature. See:
> > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.
> > > mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-
> > > EZsBsNyDcKr=g...@mail.gmail.com%3e
> > >
> > > Thanks
> > > Mike
> > >

Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-25 Thread Magnus Edenhill
It is safe to assume that a previously undefined attributes bit will be
unset in protocol requests from existing clients; if not, such a client is
already violating the protocol and needs to be fixed.

So I don't see a need for a MagicByte bump: both broker and client have the
information they need to construct or parse the message according to the
request version.
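
A minimal sketch of what such request-version-based handling could look like
on the broker side. It assumes that V4 of the Produce and Fetch requests is
the first version to carry the tombstone flag in attributes.bit5 and that
older versions signal a tombstone with a null value; the class, the Record
type and the method names are hypothetical, not Kafka APIs.

```java
final class TombstoneTranslation {
    static final int TOMBSTONE_MASK = 1 << 5;
    // Assumption: V4 is the first request version that understands the flag.
    static final int FIRST_FLAG_VERSION = 4;

    // Hypothetical stand-in for a message: just attributes and a value.
    static final class Record {
        final byte attributes;
        final byte[] value;

        Record(byte attributes, byte[] value) {
            this.attributes = attributes;
            this.value = value;
        }
    }

    // Produce path: bring an incoming record into the flagged stored form.
    static Record fromProduce(int requestVersion, Record in) {
        if (requestVersion >= FIRST_FLAG_VERSION) {
            return in;  // new clients set the flag themselves
        }
        // Old clients: a null value is the tombstone marker.
        boolean tombstone = in.value == null;
        byte attrs = (byte) (tombstone
                ? (in.attributes | TOMBSTONE_MASK)
                : (in.attributes & ~TOMBSTONE_MASK));
        return new Record(attrs, in.value);
    }

    // Fetch path: translate the stored form for the requesting client.
    static Record toFetch(int requestVersion, Record stored) {
        if (requestVersion >= FIRST_FLAG_VERSION) {
            return stored;  // can be sent as stored
        }
        // Old clients: clear the bit and express the tombstone as value == null.
        boolean tombstone = (stored.attributes & TOMBSTONE_MASK) != 0;
        byte attrs = (byte) (stored.attributes & ~TOMBSTONE_MASK);
        return new Record(attrs, tombstone ? null : stored.value);
    }
}
```

One consequence of this design is that a flagged tombstone that also carries
a value would reach an older consumer with the value dropped.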


2016-10-25 18:48 GMT+02:00 Michael Pearce :

> Hi Magnus,
>
> I was wondering if I even needed to change those also, as technically
> we’re just making use of a non used attribute bit, but im not 100% that it
> be always false currently.
>
> If someone can say 100% it will already be set false with current and
> historic bit wise masking techniques used over the time, we could do away
> with both, and simply just start to use it. Unfortunately I don’t have that
> historic knowledge so was hoping it would be flagged up in this discussion
> thread ☺
>
> Cheers
> Mike
>
> On 10/25/16, 5:36 PM, "Magnus Edenhill"  wrote:
>
> Hi Michael,
>
> With the version bumps for Produce and Fetch requests, do you really
> need
> to bump MagicByte too?
>
> Regards,
> Magnus
>
>
> 2016-10-25 18:09 GMT+02:00 Michael Pearce :
>
> > Hi All,
> >
> > I would like to discuss the following KIP proposal:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 87+-+Add+Compaction+Tombstone+Flag
> >
> > This is off the back of the discussion on KIP-82  / KIP meeting
> where it
> > was agreed to separate this issue and feature. See:
> > http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.
> > mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-
> > EZsBsNyDcKr=g...@mail.gmail.com%3e
> >
> > Thanks
> > Mike
> >
>


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-25 Thread Michael Pearce
Hi Magnus,

I was wondering whether I even needed to change those, as technically we're
just making use of an unused attribute bit, but I'm not 100% sure that it is
always false currently.

If someone can say with 100% certainty that it will already be set to false,
given the current and historic bitwise masking techniques used over time, we
could do away with both and simply start to use it. Unfortunately I don't have
that historic knowledge, so I was hoping it would be flagged up in this
discussion thread ☺
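
One way to test that concern is to check whether any bit outside the
currently defined ones is set. The sketch below assumes the defined bits are
the lowest three (compression codec) and the fourth (timestamp type); the
class and method names are hypothetical.

```java
final class ReservedBitsCheck {
    // Assumption: bits 0-2 hold the compression codec, bit 3 the timestamp type.
    static final int COMPRESSION_CODEC_MASK = 0x07;
    static final int TIMESTAMP_TYPE_MASK = 0x08;
    static final int DEFINED_MASK = COMPRESSION_CODEC_MASK | TIMESTAMP_TYPE_MASK;

    // True when a bit that the assumed format leaves undefined is set.
    static boolean hasUndefinedBitsSet(byte attributes) {
        return (attributes & 0xFF & ~DEFINED_MASK) != 0;
    }

    public static void main(String[] args) {
        System.out.println(hasUndefinedBitsSet((byte) 0x0F)); // false
        System.out.println(hasUndefinedBitsSet((byte) 0x20)); // true
    }
}
```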

Cheers
Mike

On 10/25/16, 5:36 PM, "Magnus Edenhill"  wrote:

Hi Michael,

With the version bumps for Produce and Fetch requests, do you really need
to bump MagicByte too?

Regards,
Magnus


2016-10-25 18:09 GMT+02:00 Michael Pearce :

> Hi All,
>
> I would like to discuss the following KIP proposal:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 87+-+Add+Compaction+Tombstone+Flag
>
> This is off the back of the discussion on KIP-82  / KIP meeting where it
> was agreed to separate this issue and feature. See:
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.
> mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-
> EZsBsNyDcKr=g...@mail.gmail.com%3e
>
> Thanks
> Mike
>
>




Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-25 Thread Magnus Edenhill
Hi Michael,

With the version bumps for Produce and Fetch requests, do you really need
to bump MagicByte too?

Regards,
Magnus


2016-10-25 18:09 GMT+02:00 Michael Pearce :

> Hi All,
>
> I would like to discuss the following KIP proposal:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 87+-+Add+Compaction+Tombstone+Flag
>
> This is off the back of the discussion on KIP-82  / KIP meeting where it
> was agreed to separate this issue and feature. See:
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.
> mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-
> EZsBsNyDcKr=g...@mail.gmail.com%3e
>
> Thanks
> Mike
>
>


[DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-10-25 Thread Michael Pearce
Hi All,

I would like to discuss the following KIP proposal:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag

This follows on from the discussion on KIP-82 and the KIP meeting, where it
was agreed to separate this issue and feature. See:
http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-EZsBsNyDcKr=g...@mail.gmail.com%3e

Thanks
Mike

The information contained in this email is strictly confidential and for the 
use of the addressee only, unless otherwise indicated. If you are not the 
intended recipient, please do not read, copy, use or disclose to others this 
message or any attachment. Please also notify the sender by replying to this 
email or by telephone (+44(020 7896 0011) and then delete the email and any 
copies of it. Opinions, conclusion (etc) that do not relate to the official 
business of this company shall be understood as neither given nor endorsed by 
it. IG is a trading name of IG Markets Limited (a company registered in England 
and Wales, company number 04008957) and IG Index Limited (a company registered 
in England and Wales, company number 01190902). Registered address at Cannon 
Bridge House, 25 Dowgate Hill, London EC4R 2YA. Both IG Markets Limited 
(register number 195355) and IG Index Limited (register number 114059) are 
authorised and regulated by the Financial Conduct Authority.