Hi James,
We currently send our message headers wrapper even on a delete, so that some
platform/audit data is still available. We find that having a custom wrapper
means there are some internal platform pieces we currently cannot open source
or share, but would love to (see KIP-82).
This is the data from our message headers wrapper that we still need on a
delete for our tooling and platform needs:
General:
Transaction GUIDs, used by our APM tooling to track and trace a transaction
through multiple systems.
ClusterId (we have this custom-set on old brokers and are looking to
transition to the new clusterId available in 0.10.1.0). This is used by our
inter-datacenter replication tool, which keeps our clusters in multiple DCs
and regions in sync where needed.
  - We can do this because we design our system so that a single key for a
    piece of data is transacted in only one DC (unless there is a DC failure),
    while different keys can be transacted in other DCs. The DCs still need a
    full global view, and from the app's perspective the app is agnostic to
    its DC (the platform handles this).
  - This system is always on and we alert on any consumer lag, so we can
    expect to get every event (even on compaction).
  - This system reconciles on restart if we find consumer lag greater than a
    predetermined time value x (we set this to under a minute).
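
A minimal Python sketch of why a wrapper and a tombstone do not mix today. The
`Envelope` class, its field names and the JSON encoding are hypothetical and
only illustrate the idea; they are not the actual wrapper described above. The
one Kafka fact relied on is that log compaction treats only a null record
value as a delete marker.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Envelope:
    """Hypothetical message wrapper: platform headers plus an optional payload."""
    transaction_guid: str
    cluster_id: str
    payload: Optional[str]  # None means "delete this key"

    def serialize(self) -> bytes:
        # The headers are always written, so the result is never null,
        # even when the payload itself is None.
        return json.dumps({
            "transaction_guid": self.transaction_guid,
            "cluster_id": self.cluster_id,
            "payload": self.payload,
        }).encode("utf-8")


def is_broker_tombstone(value: Optional[bytes]) -> bool:
    # Compaction only recognises a null record value as a delete marker.
    return value is None


# A delete that still carries its audit headers inside the value.
delete_event = Envelope("guid-123", "dc1-cluster", payload=None)
wrapped_value = delete_event.serialize()
```

Here `is_broker_tombstone(wrapped_value)` is false although the payload is
empty, so a wrapped delete is never compacted away as a tombstone.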
We also have the following setup for some flows:
App -> compacted topic -> replicator -> event topic
Apps use Kafka compacted topics as k,v stores instead of the likes of
Postgres. Just as per the details in Bottled Water or any CDC solution, you
want to capture every change (including deletes). Even on a DB delete there is
some business metadata you want to capture: the traditional who, what, where
and when set of data. This a) needs to be captured when deleting data, and
b) likewise still needs to be replicated to our event topics for storage,
audit and replay needs.
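
The flow above can be sketched as a toy model in Python. This is an
illustration under stated assumptions, not Kafka's implementation: `Change`,
`compact` and `replicate` are hypothetical names, and the sketch simply
assumes the audit fields can travel alongside a null value, which is the point
under discussion in this thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Change:
    key: str
    value: Optional[str]  # None models a delete
    who: str              # audit metadata that must survive a delete
    when: str


def compact(log: List[Change]) -> Dict[str, str]:
    """Toy compacted topic: latest value per key, deleted keys dropped."""
    state: Dict[str, str] = {}
    for change in log:
        if change.value is None:
            state.pop(change.key, None)
        else:
            state[change.key] = change.value
    return state


def replicate(log: List[Change]) -> List[dict]:
    """Toy replicator: copy every change, deletes included, to the event topic."""
    return [
        {
            "key": c.key,
            "op": "delete" if c.value is None else "upsert",
            "who": c.who,
            "when": c.when,
        }
        for c in log
    ]


log = [
    Change("account-1", "v1", who="alice", when="2016-10-28T08:00Z"),
    Change("account-2", "v1", who="bob", when="2016-10-28T08:05Z"),
    Change("account-1", None, who="carol", when="2016-10-28T08:10Z"),
]
kv_store = compact(log)   # what the app sees as its k,v store
events = replicate(log)   # what the event topic keeps for audit and replay
```

The k,v view ends up holding only `account-2`, while the event topic keeps all
three changes, including who deleted `account-1` and when.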
Cheers
Mike
On 10/28/16, 8:14 AM, "James Cheng" <[email protected]> wrote:
Michael,
What information would you want to include on a delete tombstone, and what
would you use it for?
-James
> On Oct 27, 2016, at 9:17 PM, Michael Pearce <[email protected]> wrote:
>
> Hi Jay,
>
> I think the use case behind the issue that Konstantin mentioned in the
> KIP-82 thread, and that we also have at IG, is a clear use case.
>
> Many companies are using message wrappers. These are useful for the use
> cases listed in KIP-82 (I don't think I need to reiterate the large list
> here), and many of those use cases need the headers even on a null value.
>
> The issue this then causes is that you cannot send these messages onto a
> compacted topic and ever have a delete/tombstone. So companies are doing
> things like a double send: one message with an envelope so that it gets
> transported, followed by an empty message. Or they have a separate process
> looking for the empty envelope and pushing back an empty-value record to
> make the broker tombstone it off. As mentioned in the KIP-82 thread, these
> cause nasty race issues and prod issues.
>
> LinkedIn were also very clear that if you use compaction there currently,
> you cannot use their managed Kafka services that rely on their headers
> implementation. This was also flagged in the KIP-82 discussion.
>
> For Streams it would be fairly easy to keep the current behaviour: when
> sending a null value, have logic that also adds the delete marker. The same
> would apply to any framework built on Kafka that wants to keep the same
> logic; Spark and Samza come to mind.
>
> Likewise, as noted, we could make this configurable globally and at topic
> level, as per this thread where we are discussing the section about
> compatibility and rollout.
>
>
>
> Rgds
> Mike
> ________________________________________
> From: Jay Kreps <[email protected]>
> Sent: Thursday, October 27, 2016 10:54:45 PM
> To: [email protected]
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> I kind of agree with James that it is a bit questionable how valuable any
> data in a delete marker can be since it will be deleted somewhat
> nondeterministically.
>
> Let's definitely ensure the change is worth the resulting pain and
> additional complexity in the data model.
>
> I think the two things we maybe conflated in the original compaction work
> were the semantics of the message and its retention policy (I'm not sure,
> but maybe).
>
> In some sense a normal Kafka topic is a stream of pure appends (inserts).
> A compacted topic is a series of revisions to the keyed entity--updates or
> deletes.
>
> Currently the semantics of the messages are in the eye of the beholder--you
> can choose to interpret a stream as either being appends or revisions as
> you choose. This proposal is changing that so that the semantics are
> determined by the sender.
>
> So in Kafka Streams you could have a KTable of "customer account updates"
> which would model the latest version of each customer account record; you
> could also have a KStream which would model the stream of updates being
> made to customer account records. You can create either of these off the
> same topic---the semantics come from your interpretation not the data
> itself. Both of these interpretations actually make sense: if you want to
> count the number of accounts in a given geographical region you want to
> compute that off the KTable, if you want to count the number of account
> modifications you want to compute that off the KStream.
>
> This proposal changes this slightly. Now we are saying the semantics of the
> message are set by the sender. I'm not sure if this is better or worse--it
> seems a little at odds with what we are doing in streams. If we are going
> to change it I wonder if there aren't actually three types of message
> {INSERT, UPDATE, DELETE}. (And by update I really mean "upsert"). Compacted
> topics only make sense if your topic contains only UPDATE and/or DELETE
> messages. "Normal" topics are pure inserts.
>
> If we are making this change how does it affect streams? What happens if I
> send a DELETE message to a non-compacted topic where deletes are impossible?
>
> -Jay
>
>
>
> On Tue, Oct 25, 2016 at 9:09 AM, Michael Pearce <[email protected]>
> wrote:
>
>> Hi All,
>>
>> I would like to discuss the following KIP proposal:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
>>
>> This is off the back of the discussion on KIP-82 / KIP meeting where it
>> was agreed to separate this issue and feature. See:
>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-[email protected]%3e
>>
>> Thanks
>> Mike
>>
>> The information contained in this email is strictly confidential and for
>> the use of the addressee only, unless otherwise indicated. If you are not
>> the intended recipient, please do not read, copy, use or disclose to others
>> this message or any attachment. Please also notify the sender by replying
>> to this email or by telephone (+44(020 7896 0011) and then delete the email
>> and any copies of it. Opinions, conclusion (etc) that do not relate to the
>> official business of this company shall be understood as neither given nor
>> endorsed by it. IG is a trading name of IG Markets Limited (a company
>> registered in England and Wales, company number 04008957) and IG Index
>> Limited (a company registered in England and Wales, company number
>> 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>> London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>> Index Limited (register number 114059) are authorised and regulated by the
>> Financial Conduct Authority.