Re: [DISCUSS] KIP-82 - Add Record Headers

Jun Rao Wed, 30 Nov 2016 18:51:07 -0800

Hi, Michael,

In order to answer the first two questions, it would be helpful if we could
identify 1 or 2 strong use cases for headers in the space for third-party
vendors. For use cases within an organization, one could always use other
approaches such as company-wide containers to get by without headers. I
went through the use cases in the KIP and in Radai's wiki (
https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers).
The following are the ones that I understand and that could be in the
third-party use case category.


A. content-type
It seems that in general, content-type should be set at the topic level.
Not sure if mixing messages with different content types should be
encouraged.

B. schema id
Since the value is mostly useless without schema id, it seems that storing
the schema id together with serialized bytes in the value is better?

C. per message encryption
One drawback of this approach is that it significantly reduces the
effectiveness of compression, which happens on a set of serialized
messages. An alternative is to enable SSL for wire encryption and rely on
the storage system (e.g. LUKS) for at rest encryption.

D. cluster ID for mirroring across Kafka clusters
This is actually interesting. Today, to avoid introducing cycles when doing
mirroring across data centers, one would either have to set up two Kafka
clusters (a local and an aggregate) per data center or rename topics.
Neither is ideal. With headers, the producer could tag each message with
the producing cluster ID in the header. MirrorMaker could then avoid
mirroring messages to a cluster if they are tagged with the same cluster id.
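
As a rough sketch of the tagging idea above (all names here are hypothetical;
this is not an existing MirrorMaker or client API, and the header key is
assumed to be a string purely for readability), the producer-side tag and the
mirroring-side filter could look like:

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator

# Hypothetical header key; KIP-82 had not settled on a key type or registry.
ORIGIN_CLUSTER_HEADER = "origin-cluster-id"


@dataclass
class Record:
    """Toy stand-in for a Kafka record with headers (not a real client class)."""
    key: bytes
    value: bytes
    headers: Dict[str, bytes] = field(default_factory=dict)


def tag_with_origin(record: Record, cluster_id: str) -> Record:
    """Producer side: stamp the producing cluster ID unless already present."""
    record.headers.setdefault(ORIGIN_CLUSTER_HEADER, cluster_id.encode("utf-8"))
    return record


def records_to_mirror(records: Iterable[Record], target_cluster_id: str) -> Iterator[Record]:
    """Mirroring side: skip records that originated in the target cluster."""
    target = target_cluster_id.encode("utf-8")
    for record in records:
        if record.headers.get(ORIGIN_CLUSTER_HEADER) != target:
            yield record
```

Because the tag is only set when absent, a record keeps its original cluster
ID as it is mirrored onward, which is what lets the filter break the cycle.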

However, an alternative approach is to introduce something like hierarchical
topics and store messages from different clusters in different partitions
under the same topic. This approach avoids filtering out unneeded data and
makes offset preserving easier to support. It may make compaction trickier
though since the same key may show up in different partitions.

E. record-level lineage
For example, a source connector could store in the message the metadata
(e.g. UUID) of the source record. Similarly, if a stream job transforms
messages from topic A to topic B, the library could include the source
message offset in the header of each transformed message. Not sure
how widely useful record-level lineage is though since the overhead could
be significant.

F. auditing metadata
We could put things like clientId/host/user in the header in each message
for auditing. These metadata are really at the producer level though. So, a
more efficient way is to only include a "producerId" per message and send
the producerId -> metadata mapping independently. KIP-98 is actually
proposing including such a producerId natively in the message.
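
To illustrate the producerId indirection described above, here is a minimal
sketch. The registry class, its methods, and the metadata fields are
assumptions made up for illustration; they are not taken from KIP-98 or from
any Kafka API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ProducerMetadata:
    """Producer-level audit fields (illustrative selection)."""
    client_id: str
    host: str
    user: str


class ProducerRegistry:
    """Hypothetical side channel holding the producerId -> metadata mapping."""

    def __init__(self) -> None:
        self._by_id: Dict[int, ProducerMetadata] = {}

    def register(self, producer_id: int, metadata: ProducerMetadata) -> None:
        # Sent once per producer, independently of the message stream.
        self._by_id[producer_id] = metadata

    def lookup(self, producer_id: int) -> Optional[ProducerMetadata]:
        return self._by_id.get(producer_id)


def audit(message_producer_id: int, registry: ProducerRegistry) -> str:
    """Resolve the small per-message id into full audit metadata."""
    meta = registry.lookup(message_producer_id)
    if meta is None:
        return f"producer {message_producer_id}: unknown"
    return f"producer {message_producer_id}: {meta.client_id}@{meta.host} ({meta.user})"
```

Each message then carries only a small integer, while the clientId/host/user
strings are transmitted once per producer rather than once per message.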

So, overall, I am not yet fully convinced that there are strong third-party
use cases for headers. Perhaps we could discuss a bit more to establish one
or two really convincing use cases.

Another orthogonal question is whether headers should be exposed in stream
processing systems such as Kafka Streams, Samza, and Spark Streaming.
Currently, those systems just deal with key/value pairs. Should we expose
headers there as a third element too, or somehow map headers to the key or value?

Thanks,

Jun


On Tue, Nov 29, 2016 at 3:35 AM, Michael Pearce <[email protected]>
wrote:

> I assume that, after a week, there are no remaining concerns with
> points 1 and 2, and that we now have agreement that headers are useful and
> needed in Kafka. As such, if put to a KIP vote, this wouldn’t be a reason to
> reject.
>
> @Ignacio on point 4).
> I think, for the purpose of moving this KIP forward, we can state that the
> key will be a 4-byte space that is naturally interpreted as an
> Int32 (if namespacing is later wanted, you can easily split this into two
> int16 spaces). I don’t believe this makes any difference to the wire
> protocol implementation. Is this reasonable to all?
>
> On 5), as per point 4, I am therefore happy to keep 32 bits.
>
>
>
>
>
>
> On 18/11/2016, 20:34, "[email protected] on behalf of Ignacio
> Solis" <[email protected] on behalf of [email protected]> wrote:
>
>     Summary:
>
>     3) Yes - Header value as byte[]
>
>     4a) Int,Int - No
>     4b) Int - Yes
>     4c) String - Reluctant maybe
>
>     5) I believe the header system should take a single int.  I think 32 bits
>     is a good size; if you want to interpret this as two 16-bit numbers in
>     the layer above, go right ahead.  If somebody wants to argue for 16 bits
>     or 64 bits of header key space, I would listen.
>
>
>     Discussion:
>     Dividing the key space into sub_key_1 and sub_key_2 makes no sense to me
>     at this layer.  Are we going to start providing APIs to get all the
>     sub_key_1s? Or all the sub_key_2s?  If there are no distinguishing
>     functions that are applied to each one, then they should be a single
>     value.  At this layer all we're doing is equality.
>     If the above layer wants to interpret this as 2, 3 or more values, that's
>     a different question.  I personally think it's all one keyspace that is
>     getting assigned using some structure, but if you want to sub-assign
>     parts of it then that's fine.
>
>     The same discussion applies to strings.  If somebody argued for
>     strings, would we be arguing to divide the strings with dots ('.') as a
>     requirement?  Would we want them to give us the different name segments
>     separately?  Would we be performing any actions on this key other than
>     matching?
>
>     Nacho
>
>
>
>     On Fri, Nov 18, 2016 at 9:30 AM, Michael Pearce <[email protected]>
>     wrote:
>
>     > #jay #jun any concerns on 1 and 2 still?
>     >
>     > @all
>     > To get this moving along a bit more, I'd also like to get clarity on
>     > the last points below:
>     >
>     > 3) I believe we're all roughly happy with the header value being a
>     > byte[]?
>     >
>     > 4) I believe consensus has been for a namespace-based int approach
>     > {int,int} for the key. Any objections if this is what we go with?
>     >
>     > 5) If the assumption in (4) is correct and we have {int,int} keys,
>     > should both ints be int16 or int32?
>     > I'm for them being int16 (2 bytes), as the combined space is 4 bytes as
>     > per the original, which gives plenty of combinations for the
>     > foreseeable future and keeps the overhead small.
>     >
>     > Do we see any benefit in another kip call to discuss these at all?
>     >
>     > Cheers
>     > Mike
>     > ________________________________________
>     > From: K Burstev <[email protected]>
>     > Sent: Friday, November 18, 2016 7:07:07 AM
>     > To: [email protected]
>     > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>     >
>     > For what it is worth also i agree. As a user:
>     >
>     >  1) Yes - Headers are worthwhile
>     >  2) Yes - Headers should be a top level option
>     >
>     > 14.11.2016, 21:15, "Ignacio Solis" <[email protected]>:
>     > > 1) Yes - Headers are worthwhile
>     > > 2) Yes - Headers should be a top level option
>     > >
>     > > On Mon, Nov 14, 2016 at 9:16 AM, Michael Pearce <[email protected]>
>     > > wrote:
>     > >
>     > >>  Hi Roger,
>     > >>
>     > >>  The KIP details/examples cover the original proposal for key spacing,
>     > >>  not the new namespace idea mentioned in the discussion.
>     > >>
>     > >>  We will need to update the KIP when we get agreement that this is a
>     > >>  better approach (which seems to be the case, if I have understood the
>     > >>  general feeling in the conversation).
>     > >>
>     > >>  Re the variable ints, at a very early stage we did think about this. I
>     > >>  think the added complexity isn't worth the saving. If we want to
>     > >>  reduce overheads and size, I'd rather go with int16 (2 bytes) keys, as
>     > >>  it keeps it simple.
>     > >>
>     > >>  On the note of no headers: as per the KIP, we use an attribute bit to
>     > >>  denote whether headers are present, so there is currently zero
>     > >>  overhead if headers are not used.
>     > >>
>     > >>  I think, as radai mentions, it would be good first to get clarity on
>     > >>  whether we now have general consensus that (1) headers are worthwhile
>     > >>  and useful, and (2) we want them as a top level entity.
>     > >>
>     > >>  Just to state the obvious, I believe (1) headers are worthwhile and
>     > >>  (2) agree on them as a top level entity.
>     > >>
>     > >>  Cheers
>     > >>  Mike
>     > >>  ________________________________________
>     > >>  From: Roger Hoover <[email protected]>
>     > >>  Sent: Wednesday, November 9, 2016 9:10:47 PM
>     > >>  To: [email protected]
>     > >>  Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>     > >>
>     > >>  Sorry for going a little in the weeds but thanks for the replies
>     > >>  regarding varint.
>     > >>
>     > >>  Agreed that a prefix and {int, int} can be the same. It doesn't look
>     > >>  like that's what the KIP is saying in the "Open" section. The example
>     > >>  shows 2100001 for New Relic and 210002 for App Dynamics, implying that
>     > >>  the New Relic organization will have only a single header id to work
>     > >>  with. Or is 2100001 a prefix? The main point of a namespace or prefix
>     > >>  is to reduce the overhead of config mapping or registration depending
>     > >>  on how namespaces/prefixes are managed.
>     > >>
>     > >>  Would love to hear more feedback on the higher-level questions
>     > >>  though...
>     > >>
>     > >>  Cheers,
>     > >>
>     > >>  Roger
>     > >>
>     > >>  On Wed, Nov 9, 2016 at 11:38 AM, radai <
> [email protected]>
>     > wrote:
>     > >>
>     > >>  > I think this discussion is getting a bit into the weeds on technical
>     > >>  > implementation details.
>     > >>  > I'd like to step back a minute and try and establish where we are in
>     > >>  > the larger picture:
>     > >>  >
>     > >>  > (re-wording nacho's last paragraph)
>     > >>  > 1. are we all in agreement that headers are a worthwhile and useful
>     > >>  > addition to have? this was contested early on
>     > >>  > 2. are we all in agreement on headers as top level entity vs headers
>     > >>  > squirreled-away in V?
>     > >>  >
>     > >>  > are there still concerns around these 2 points (#jay? #jun?)?
>     > >>  >
>     > >>  > (and now back to our normal programming ...)
>     > >>  >
>     > >>  > varints are nice. having said that, it's adding complexity (see
>     > >>  > https://github.com/addthis/stream-lib/blob/master/src/main/java/com/clearspring/analytics/util/Varint.java
>     > >>  > as 1st google result) and would require anyone writing other clients
>     > >>  > (C? Python? Go? Bash? ;-) ) to get/implement the same, and for
>     > >>  > relatively little gain (int vs string is order of magnitude, this
>     > >>  > isn't).
>     > >>  >
>     > >>  > int namespacing vs {int, int} namespacing are basically the same
>     > >>  > thing - you're just namespacing an int64 and giving people whole 2^32
>     > >>  > ranges at a time. the part i like about this is letting people have a
>     > >>  > large swath of numbers with one registration so they don't have to
>     > >>  > come back for every single plugin/header they want to "reserve".
>     > >>  >
>     > >>  >
>     > >>  > On Wed, Nov 9, 2016 at 11:01 AM, Roger Hoover <
>     > [email protected]>
>     > >>  > wrote:
>     > >>  >
>     > >>  > > Since some of the debate has been about overhead + performance, I'm
>     > >>  > > wondering if we have considered a varint encoding (
>     > >>  > > https://developers.google.com/protocol-buffers/docs/encoding#varints)
>     > >>  > > for the header length field (int32 in the proposal) and for header
>     > >>  > > ids? If you don't use headers, the overhead would be a single byte,
>     > >>  > > and each header id < 128 would also need only a single byte?
>     > >>  > >
>     > >>  > >
>     > >>  > >
>     > >>  > > On Wed, Nov 9, 2016 at 6:43 AM, radai <
> [email protected]>
>     > >>  > wrote:
>     > >>  > >
>     > >>  > > > @magnus - and very dangerous (youre essentially
> downloading and
>     > >>  > executing
>     > >>  > > > arbitrary code off the internet on your servers ... bad
> idea
>     > without
>     > >>  a
>     > >>  > > > sandbox, even with)
>     > >>  > > >
>     > >>  > > > as for it being a purely administrative task - i disagree.
>     > >>  > > >
>     > >>  > > > i wish it would, really, because then my earlier point on
> the
>     > >>  > complexity
>     > >>  > > of
>     > >>  > > > the remapping process would be invalid, but at linkedin,
> for
>     > example,
>     > >>  > we
>     > >>  > > > (the team im in) run kafka as a service. we dont really
> know
>     > what our
>     > >>  > > users
>     > >>  > > > (developing applications that use kafka) are up to at any
> given
>     > >>  moment.
>     > >>  > > it
>     > >>  > > > is very possible (given the existence of headers and a
>     > corresponding
>     > >>  > > plugin
>     > >>  > > > ecosystem) for some application to "equip" their producers
> and
>     > >>  > consumers
>     > >>  > > > with the required plugin without us knowing. i dont mean
> to imply
>     > >>  thats
>     > >>  > > > bad, i just want to make the point that its not as simple
>     > keeping it
>     > >>  in
>     > >>  > > > sync across a large-enough organization.
>     > >>  > > >
>     > >>  > > >
>     > >>  > > > On Wed, Nov 9, 2016 at 6:17 AM, Magnus Edenhill <
>     > [email protected]>
>     > >>  > > > wrote:
>     > >>  > > >
>     > >>  > > > > I think there is a piece missing in the Strings
> discussion,
>     > where
>     > >>  > > > > pro-Stringers
>     > >>  > > > > reason that by providing unique string identifiers for
> each
>     > header
>     > >>  > > > > everything will just
>     > >>  > > > > magically work for all parts of the stream pipeline.
>     > >>  > > > >
>     > >>  > > > > But the strings dont mean anything by themselves, and
> while we
>     > >>  could
>     > >>  > > > > probably envision
>     > >>  > > > > some auto plugin loader that downloads, compiles, links
> and
>     > runs
>     > >>  > > plugins
>     > >>  > > > > on-demand
>     > >>  > > > > as soon as they're seen by a consumer, I dont really see
> a
>     > use-case
>     > >>  > for
>     > >>  > > > > something
>     > >>  > > > > so dynamic (and fragile) in practice.
>     > >>  > > > >
>     > >>  > > > > In the real world an application will be configured with
> a set
>     > of
>     > >>  > > plugins
>     > >>  > > > > to either add (producer)
>     > >>  > > > > or read (consumer) headers.
>     > >>  > > > > This is an administrative task based on what features a
> client
>     > >>  > > > > needs/provides and results in
>     > >>  > > > > some sort of configuration to enable and configure the
> desired
>     > >>  > plugins.
>     > >>  > > > >
>     > >>  > > > > Since this needs to be kept somewhat in sync across an
>     > organisation
>     > >>  > > > (there
>     > >>  > > > > is no point in having producers
>     > >>  > > > > add headers no consumers will read, and vice versa), the
> added
>     > >>  > > complexity
>     > >>  > > > > of assigning an id namespace
>     > >>  > > > > for each plugin as it is being configured should be
> tolerable.
>     > >>  > > > >
>     > >>  > > > >
>     > >>  > > > > /Magnus
>     > >>  > > > >
>     > >>  > > > > 2016-11-09 13:06 GMT+01:00 Michael Pearce <
>     > [email protected]>:
>     > >>  > > > >
>     > >>  > > > > > Just following/catching up on what seems to be an
> active
>     > night :)
>     > >>  > > > > >
>     > >>  > > > > > @Radai sorry if it may seem obvious but what does MD
> stand
>     > for?
>     > >>  > > > > >
>     > >>  > > > > > My take on String vs Int:
>     > >>  > > > > >
>     > >>  > > > > > I will state first I am pro Int (16 or 32).
>     > >>  > > > > >
>     > >>  > > > > > I do though playing devils advocate see a big plus
> with the
>     > >>  > argument
>     > >>  > > of
>     > >>  > > > > > String keys, this is around integrating into an
> existing
>     > >>  > eco-system.
>     > >>  > > > > >
>     > >>  > > > > > As many other systems use String based headers (Flume,
> JMS)
>     > it
>     > >>  > makes
>     > >>  > > > it
>     > >>  > > > > > much easier for these to be incorporated/integrated
> into.
>     > >>  > > > > >
>     > >>  > > > > > How with Int based headers could we provide a
> way/guidance to
>     > >>  make
>     > >>  > > this
>     > >>  > > > > > integration simple / easy with transition flows over to
>     > kafka?
>     > >>  > > > > >
>     > >>  > > > > > * tough luck buddy you're on your own
>     > >>  > > > > > * simply hash the string into int code and hope for no
>     > collisions
>     > >>  > > (how
>     > >>  > > > to
>     > >>  > > > > > convert back though?)
>     > >>  > > > > > * http2 style as mentioned by nacho.
>     > >>  > > > > >
>     > >>  > > > > > cheers,
>     > >>  > > > > > Mike
>     > >>  > > > > >
>     > >>  > > > > >
>     > >>  > > > > > ________________________________________
>     > >>  > > > > > From: radai <[email protected]>
>     > >>  > > > > > Sent: Wednesday, November 9, 2016 8:12 AM
>     > >>  > > > > > To: [email protected]
>     > >>  > > > > > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>     > >>  > > > > >
>     > >>  > > > > > thinking about it some more, the best way to transmit
> the
>     > header
>     > >>  > > > > remapping
>     > >>  > > > > > data to consumers would be to put it in the MD response
>     > payload,
>     > >>  so
>     > >>  > > > maybe
>     > >>  > > > > > it should be discussed now.
>     > >>  > > > > >
>     > >>  > > > > >
>     > >>  > > > > > On Wed, Nov 9, 2016 at 12:09 AM, radai <
>     > >>  [email protected]
>     > >>  > >
>     > >>  > > > > wrote:
>     > >>  > > > > >
>     > >>  > > > > > > im not opposed to the idea of namespace mapping. all
> im
>     > saying
>     > >>  is
>     > >>  > > > that
>     > >>  > > > > > its
>     > >>  > > > > > > not part of the "mvp" and, since it requires no wire
> format
>     > >>  > change,
>     > >>  > > > can
>     > >>  > > > > > > always be added later.
>     > >>  > > > > > > also, its not as simple as just configuring MM to do
> the
>     > >>  > transform:
>     > >>  > > > > lets
>     > >>  > > > > > > say i've implemented large message support as
> {666,1} and
>     > on
>     > >>  some
>     > >>  > > > > mirror
>     > >>  > > > > > > target cluster its been remapped to {999,1}. the
> consumer
>     > >>  plugin
>     > >>  > > code
>     > >>  > > > > > would
>     > >>  > > > > > > also need to be told to look for the large message
> "part X
>     > of
>     > >>  Y"
>     > >>  > > > header
>     > >>  > > > > > > under {999,1}. doable, but tricky.
>     > >>  > > > > > >
>     > >>  > > > > > > On Tue, Nov 8, 2016 at 10:29 PM, Gwen Shapira <
>     > >>  [email protected]
>     > >>  > >
>     > >>  > > > > wrote:
>     > >>  > > > > > >
>     > >>  > > > > > >> While you can do whatever you want with a namespace
> and
>     > your
>     > >>  > code,
>     > >>  > > > > > >> what I'd expect is for each app to make namespaces
>     > configurable...
>     > >>  > > > > > >>
>     > >>  > > > > > >> So if I accidentally used 666 for my HR department,
> and
>     > still
>     > >>  > want
>     > >>  > > > to
>     > >>  > > > > > >> run RadaiApp, I can config "namespace=42" for
> RadaiApp and
>     > >>  > > > everything
>     > >>  > > > > > >> will look normal.
>     > >>  > > > > > >>
>     > >>  > > > > > >> This means you only need to sync usage inside your
> own
>     > >>  > > organization.
>     > >>  > > > > > >> Still hard, but somewhat easier than syncing with
> the
>     > entire
>     > >>  > > world.
>     > >>  > > > > > >>
>     > >>  > > > > > >> On Tue, Nov 8, 2016 at 10:07 PM, radai <
>     > >>  > > [email protected]>
>     > >>  > > > > > >> wrote:
>     > >>  > > > > > >> > and we can start with {namespace, id} and no
> re-mapping
>     > >>  > support
>     > >>  > > > and
>     > >>  > > > > > >> always
>     > >>  > > > > > >> > add it later on if/when collisions actually
> happen (i
>     > dont
>     > >>  > think
>     > >>  > > > > > they'd
>     > >>  > > > > > >> be
>     > >>  > > > > > >> > a problem).
>     > >>  > > > > > >> >
>     > >>  > > > > > >> > every interested party (so orgs or individuals)
> could
>     > then
>     > >>  > > > register
>     > >>  > > > > a
>     > >>  > > > > > >> > prefix (0 = reserved, 1 = confluent ... 666 = me
> :-) )
>     > and
>     > >>  do
>     > >>  > > > > whatever
>     > >>  > > > > > >> with
>     > >>  > > > > > >> > the 2nd ID - so once linkedin registers, say 3,
> then
>     > >>  linkedin
>     > >>  > > devs
>     > >>  > > > > are
>     > >>  > > > > > >> free
>     > >>  > > > > > >> > to use {3, *} with a reasonable expectation not to
>     > collide
>     > >>  with
>     > >>  > > > > > anything
>     > >>  > > > > > >> > else. further partitioning of that * becomes
> linkedin's
>     > >>  > problem,
>     > >>  > > > but
>     > >>  > > > > > the
>     > >>  > > > > > >> > "upstream registration" of a namespace only has to
>     > happen
>     > >>  > once.
>     > >>  > > > > > >> >
>     > >>  > > > > > >> > On Tue, Nov 8, 2016 at 9:03 PM, James Cheng <
>     > >>  > > [email protected]
>     > >>  > > > >
>     > >>  > > > > > >> wrote:
>     > >>  > > > > > >> >
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >> > On Nov 8, 2016, at 5:54 PM, Gwen Shapira <
>     > >>  > [email protected]>
>     > >>  > > > > > wrote:
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >> > Thank you so much for this clear and fair
> summary of
>     > the
>     > >>  > > > > arguments.
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >> > I'm in favor of ints. Not a deal-breaker, but
> in
>     > favor.
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >> > Even more in favor of Magnus's decentralized
>     > suggestion
>     > >>  > with
>     > >>  > > > > > Roger's
>     > >>  > > > > > >> >> > tweak: add a namespace for headers. This will
> allow
>     > each
>     > >>  > app
>     > >>  > > to
>     > >>  > > > > > just
>     > >>  > > > > > >> >> > use whatever IDs it wants internally, and then
> let
>     > the
>     > >>  > admin
>     > >>  > > > > > >> deploying
>     > >>  > > > > > >> >> > the app figure out an available namespace ID
> for the
>     > app
>     > >>  to
>     > >>  > > > live
>     > >>  > > > > > in.
>     > >>  > > > > > >> >> > So io.confluent.schema-registry can be
> namespace
>     > 0x01 on
>     > >>  my
>     > >>  > > > > > >> deployment
>     > >>  > > > > > >> >> > and 0x57 on yours, and the poor guys
> developing the
>     > app
>     > >>  > don't
>     > >>  > > > > need
>     > >>  > > > > > to
>     > >>  > > > > > >> >> > worry about that.
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >> Gwen, if I understand your example right, an
>     > application
>     > >>  > > deployer
>     > >>  > > > > > might
>     > >>  > > > > > >> >> decide to use 0x01 in one deployment, and that
> means
>     > that
>     > >>  > once
>     > >>  > > > the
>     > >>  > > > > > >> message
>     > >>  > > > > > >> >> is written into the broker, it will be saved on
> the
>     > broker
>     > >>  > with
>     > >>  > > > > that
>     > >>  > > > > > >> >> specific namespace (0x01).
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >> If you were to mirror that message into another
>     > cluster,
>     > >>  the
>     > >>  > > 0x01
>     > >>  > > > > > would
>     > >>  > > > > > >> >> accompany the message, right? What if the
> deployers of
>     > the
>     > >>  > same
>     > >>  > > > app
>     > >>  > > > > > in
>     > >>  > > > > > >> the
>     > >>  > > > > > >> >> other cluster uses 0x57? They won't understand
> each
>     > other?
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >> I'm not sure that's an avoidable problem. I
> think it
>     > simply
>     > >>  > > means
>     > >>  > > > > > that
>     > >>  > > > > > >> in
>     > >>  > > > > > >> >> order to share data, you have to also have a
> shared
>     > (agreed
>     > >>  > > upon)
>     > >>  > > > > > >> >> understanding of what the namespaces mean. Which
> I
>     > think
>     > >>  > makes
>     > >>  > > > > sense,
>     > >>  > > > > > >> >> because the alternate (sharing *nothing* at all)
> would
>     > mean
>     > >>  > > that
>     > >>  > > > > > there
>     > >>  > > > > > >> >> would be no way to understand each other.
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >> -James
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >> > Gwen
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >> > On Tue, Nov 8, 2016 at 4:23 PM, radai <
>     > >>  > > > > [email protected]>
>     > >>  > > > > > >> >> wrote:
>     > >>  > > > > > >> >> >> +1 for sean's document. it covers pretty much
> all
>     > the
>     > >>  > > > trade-offs
>     > >>  > > > > > and
>     > >>  > > > > > >> >> >> provides concrete figures to argue about :-)
>     > >>  > > > > > >> >> >> (nit-picking - used the same xkcd twice, also
> trove
>     > has
>     > >>  > been
>     > >>  > > > > > >> superseded
>     > >>  > > > > > >> >> for
>     > >>  > > > > > >> >> >> purposes of high performance collections:
> look at
>     > >>  > > > > > >> >> >> https://github.com/leventov/Koloboke)
>     > >>  > > > > > >> >> >>
>     > >>  > > > > > >> >> >> so to sum up the string vs int debate:
>     > >>  > > > > > >> >> >>
>     > >>  > > > > > >> >> >> performance - you can do 140k ops/sec _per
> thread_
>     > with
>     > >>  > > string
>     > >>  > > > > > >> headers.
>     > >>  > > > > > >> >> you
>     > >>  > > > > > >> >> >> could do x2-3 better with ints. there's no
> arguing
>     > the
>     > >>  > > > relative
>     > >>  > > > > > diff
>     > >>  > > > > > >> >> >> between the two, there's only the question of
>     > whether or
>     > >>  > not
>     > >>  > > > > _the
>     > >>  > > > > > >> rest
>     > >>  > > > > > >> >> of
>     > >>  > > > > > >> >> >> kafka_ operates fast enough to care. if we
> want to
>     > make
>     > >>  > > > choices
>     > >>  > > > > > >> solely
>     > >>  > > > > > >> >> >> based on performance we need ints. if we are
>     > willing to
>     > >>  > > > > > >> >> settle/compromise
>     > >>  > > > > > >> >> >> for a nicer (to some) API then strings are
> good
>     > enough
>     > >>  for
>     > >>  > > the
>     > >>  > > > > > >> current
>     > >>  > > > > > >> >> >> state of affairs.
>     > >>  > > > > > >> >> >>
>     > >>  > > > > > >> >> >> message size - with batching and compression
> it
>     > comes
>     > >>  down
>     > >>  > > to
>     > >>  > > > a
>     > >>  > > > > > ~5%
>     > >>  > > > > > >> >> >> difference (internal testing, not in the doc.
> maybe
>     > >>  would
>     > >>  > > help
>     > >>  > > > > > >> adding if
>     > >>  > > > > > >> >> >> this becomes a point of contention?). this
> means it
>     > wont
>     > >>  > > > really
>     > >>  > > > > > >> affect
>     > >>  > > > > > >> >> >> kafka in "throughput mode" (large, compressed
>     > batches).
>     > >>  in
>     > >>  > > > "low
>     > >>  > > > > > >> latency"
>     > >>  > > > > > >> >> >> mode (meaning less/no batching and
> compression) the
>     > >>  > > difference
>     > >>  > > > > can
>     > >>  > > > > > >> be
>     > >>  > > > > > >> >> >> extreme (it'll easily be an order of
> magnitude with
>     > >>  small
>     > >>  > > > > payloads
>     > >>  > > > > > >> like
>     > >>  > > > > > >> >> >> stock ticks and header keys of the form
>     > >>  > > > > > >> >> >> "com.acme.infraTeam.kafka.hiMom.auditPlugin").
> we
>     > have
>     > >>  a
>     > >>  > > few
>     > >>  > > > > such
>     > >>  > > > > > >> >> topics at
>     > >>  > > > > > >> >> >> linkedin where actual payloads are ~2 ints
> and are
>     > >>  > eclipsed
>     > >>  > > by
>     > >>  > > > > our
>     > >>  > > > > > >> >> in-house
>     > >>  > > > > > >> >> >> audit "header" which is why we liked ints to
> begin
>     > with.
>     > >>  > > > > > >> >> >>
>     > >>  > > > > > >> >> >> "ease of use" - strings would probably still
> require
>     > >>  > _some_
>     > >>  > > > > degree
>     > >>  > > > > > >> of
>     > >>  > > > > > >> >> >> partitioning by convention (imagine if
> everyone
>     > used the
>     > >>  > key
>     > >>  > > > > > >> "infra"...)
>     > >>  > > > > > >> >> >> but its very intuitive for java devs to do
> anyway
>     > >>  > > > > (reverse-domain
>     > >>  > > > > > is
>     > >>  > > > > > >> >> >> ingrained into java developers at a young age
> :-) ).
>     > >>  also
>     > >>  > > most
>     > >>  > > > > > java
>     > >>  > > > > > >> devs
>     > >>  > > > > > >> >> >> find Map<String, whatever> more intuitive than
>     > >>  > Map<Integer,
>     > >>  > > > > > >> whatever> -
>     > >>  > > > > > >> >> >> probably because of other text-based
> protocols like
>     > >>  http.
>     > >>  > > ints
>     > >>  > > > > > would
>     > >>  > > > > > >> >> >> require a number registry. if you think number
>     > >>  registries
>     > >>  > > are
>     > >>  > > > > hard
>     > >>  > > > > > >> just
>     > >>  > > > > > >> >> >> look at the wiki page for KIPs (specifically
> the
>     > number
>     > >>  > for
>     > >>  > > > next
>     > >>  > > > > > >> >> available
>     > >>  > > > > > >> >> >> KIP) and think again - we are probably talking
>     > about the
>     > >>  > > same
>     > >>  > > > > > >> volume of
>     > >>  > > > > > >> >> >> requests. also this would only be "required"
> (good
>     > >>  > > > citizenship,
>     > >>  > > > > > more
>     > >>  > > > > > >> >> like)
>     > >>  > > > > > >> >> >> if you want to publish your plugin for others
> to
>     > use.
>     > >>  > within
>     > >>  > > > > your
>     > >>  > > > > > >> org do
>     > >>  > > > > > >> >> >> whatever you want - just know that if you use
> [some
>     > >>  > > "reserved"
>     > >>  > > > > > >> range]
>     > >>  > > > > > >> >> and a
>     > >>  > > > > > >> >> >> future kafka update breaks it its your
> problem.
>     > RTFM.
>     > >>  > > > > > >> >> >>
>     > >>  > > > > > >> >> >> personally im in favor of ints.
>     > >>  > > > > > >> >> >>
>     > >>  > > > > > >> >> >> having said that (and like nacho) I will
> settle if
>     > int
>     > >>  vs
>     > >>  > > > string
>     > >>  > > > > > >> remains
>     > >>  > > > > > >> >> >> the only obstacle to this.
>     > >>  > > > > > >> >> >>
>     > >>  > > > > > >> >> >> On Tue, Nov 8, 2016 at 3:53 PM, Nacho Solis
>     > >>  > > > > > >> <[email protected]
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >> >> wrote:
>     > >>  > > > > > >> >> >>
>     > >>  > > > > > >> >> >>> I think it's well known I've been pushing
> for ints
>     > >>  (and I
>     > >>  > > > could
>     > >>  > > > > > >> switch
>     > >>  > > > > > >> >> to
>     > >>  > > > > > >> >> >>> 16 bit shorts if pressed).
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>> - efficient (space)
>     > >>  > > > > > >> >> >>> - efficient (processing)
>     > >>  > > > > > >> >> >>> - easily partitionable
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>> However, if the only thing that is keeping
> us from
>     > >>  > adopting
>     > >>  > > > > > >> headers is
>     > >>  > > > > > >> >> the
>     > >>  > > > > > >> >> >>> use of strings vs ints as keys, then I would
> cave
>     > in
>     > >>  and
>     > >>  > > > accept
>     > >>  > > > > > >> >> strings. If
>     > >>  > > > > > >> >> >>> we do so, I would like to limit string keys
> to 128
>     > >>  bytes
>     > >>  > in
>     > >>  > > > > > length.
>     > >>  > > > > > >> >> This
>     > >>  > > > > > >> >> >>> way 1) I could use a 3 letter string if I
> wanted
>     > >>  > > (effectively
>     > >>  > > > > > >> using 4
>     > >>  > > > > > >> >> total
>     > >>  > > > > > >> >> >>> bytes), 2) limit overall impact of possible
> keys
>     > (don't
>     > >>  > > > really
>     > >>  > > > > > want
>     > >>  > > > > > >> >> people
>     > >>  > > > > > >> >> >>> to send a 16K header string key).
>     > >>  > > > > > >> >> >>>
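A rough sketch of the string-key limit Nacho describes above. It assumes a single length byte followed by UTF-8 bytes, which is what makes a 3-letter key cost 4 bytes in total; the thread does not fix the actual encoding, and the names `MAX_KEY_BYTES` and `encode_key` are made up for illustration.

```python
MAX_KEY_BYTES = 128  # limit proposed in the thread for string header keys


def encode_key(key: str) -> bytes:
    """Encode a header key as one length byte plus UTF-8 bytes (assumed layout)."""
    raw = key.encode("utf-8")
    if len(raw) > MAX_KEY_BYTES:
        raise ValueError(f"header key is {len(raw)} bytes, limit is {MAX_KEY_BYTES}")
    # 128 fits in one unsigned byte, so the length prefix never overflows.
    return bytes([len(raw)]) + raw


if __name__ == "__main__":
    encoded = encode_key("abc")
    print(len(encoded), encoded)
```

With this layout an oversized key such as a 16K string is rejected on the producing side instead of being written to the log.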
>     > >>  > > > > > >> >> >>> Nacho
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>> On Tue, Nov 8, 2016 at 3:35 PM, Gwen Shapira
> <
>     > >>  > > > > [email protected]>
>     > >>  > > > > > >> >> wrote:
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>>> Forgot to mention: Thank you for
> quantifying the
>     > >>  > > trade-off -
>     > >>  > > > > it
>     > >>  > > > > > is
>     > >>  > > > > > >> >> >>>> helpful and important regardless of what we
> end up
>     > >>  > > deciding.
>     > >>  > > > > > >> >> >>>>
>     > >>  > > > > > >> >> >>>> On Tue, Nov 8, 2016 at 3:12 PM, Sean
> McCauliff
>     > >>  > > > > > >> >> >>>> <[email protected]> wrote:
>     > >>  > > > > > >> >> >>>>> On Tue, Nov 8, 2016 at 2:15 PM, Gwen
> Shapira <
>     > >>  > > > > > [email protected]>
>     > >>  > > > > > >> >> >>> wrote:
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>>> Since Kafka specifically targets
>     > high-throughput,
>     > >>  > > > > low-latency
>     > >>  > > > > > >> >> >>>>>> use-cases, I don't think we should trade
> them
>     > off
>     > >>  that
>     > >>  > > > > easily.
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>> I find this kind of design goal not to be
>     > really
>     > >>  > > helpful
>     > >>  > > > > > unless
>     > >>  > > > > > >> >> it's
>     > >>  > > > > > >> >> >>>>> quantified in some way. Because it's always
>     > possible
>     > >>  to
>     > >>  > > > argue
>     > >>  > > > > > >> against
>     > >>  > > > > > >> >> >>>>> something as either being not performant
> or just
>     > an
>     > >>  > > > > > >> implementation
>     > >>  > > > > > >> >> >>>> detail.
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>> This is a single-threaded benchmark, so
> all the
>     > >>  > > > measurements
>     > >>  > > > > > are
>     > >>  > > > > > >> per
>     > >>  > > > > > >> >> >>>>> thread.
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>> For 1M messages/s/thread if header keys
> are int
>     > and
>     > >>  > you
>     > >>  > > > had
>     > >>  > > > > > >> even a
>     > >>  > > > > > >> >> >>>> single
>     > >>  > > > > > >> >> >>>>> header key, value pair then it's still
> about 2^-2
>     > >>  > > > > microseconds
>     > >>  > > > > > >> which
>     > >>  > > > > > >> >> >>>> means
>     > >>  > > > > > >> >> >>>>> you only have another 0.75 microseconds to
> do
>     > >>  > everything
>     > >>  > > > else
>     > >>  > > > > > you
>     > >>  > > > > > >> >> want
>     > >>  > > > > > >> >> >>> to
>     > >>  > > > > > >> >> >>>>> do with a message (1M messages/s means 1
> micro
>     > second
>     > >>  > per
>     > >>  > > > > > >> message).
>     > >>  > > > > > >> >> >>> With
>     > >>  > > > > > >> >> >>>>> string header keys there is still 0.5 micro
>     > seconds
>     > >>  to
>     > >>  > > > > process
>     > >>  > > > > > a
>     > >>  > > > > > >> >> >>> message.
>     > >>  > > > > > >> >> >>>>>
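The per-message budget arithmetic above can be written out as a small calculation. The 0.25 microsecond cost for int keys is the 2^-2 figure quoted in the message; the 0.5 microsecond cost for string keys is inferred from the stated remaining budget, not taken from the benchmark document. The function name is hypothetical.

```python
def remaining_budget_us(messages_per_second: float, header_cost_us: float) -> float:
    """Time left per message, in microseconds, after header parsing on one thread."""
    per_message_us = 1_000_000 / messages_per_second  # 1M msgs/s -> 1 us per message
    return per_message_us - header_cost_us


if __name__ == "__main__":
    rate = 1_000_000                  # messages per second per thread
    int_key_cost_us = 2 ** -2         # 0.25 us, as quoted in the thread
    string_key_cost_us = 0.5          # inferred from the 0.5 us remainder quoted
    print("int keys:   ", remaining_budget_us(rate, int_key_cost_us))
    print("string keys:", remaining_budget_us(rate, string_key_cost_us))
```

At lower rates the same header cost takes a much smaller share of the budget, which is the point being argued on both sides here.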
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>> I love strings as much as the next guy (we
> had
>     > them
>     > >>  in
>     > >>  > > > > Flume),
>     > >>  > > > > > >> but I
>     > >>  > > > > > >> >> >>>>>> was convinced by Magnus/Michael/Radai that
>     > strings
>     > >>  > don't
>     > >>  > > > > > >> actually
>     > >>  > > > > > >> >> have
>     > >>  > > > > > >> >> >>>>>> strong benefits as opposed to ints
> (you'll need
>     > a
>     > >>  > string
>     > >>  > > > > > >> registry
>     > >>  > > > > > >> >> >>>>>> anyway - otherwise, how will you know
> what does
>     > the
>     > >>  > > > > > "profile_id"
>     > >>  > > > > > >> >> >>>>>> header refers to?) and I want to keep
> closer to
>     > our
>     > >>  > > > original
>     > >>  > > > > > >> design
>     > >>  > > > > > >> >> >>>>>> goals for Kafka.
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>> "confluent.profile_id"
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>> If someone likes strings in the headers
> and
>     > doesn't
>     > >>  do
>     > >>  > > > > > millions
>     > >>  > > > > > >> of
>     > >>  > > > > > >> >> >>>>>> messages a sec, they probably have lots
> of other
>     > >>  > systems
>     > >>  > > > > they
>     > >>  > > > > > >> can
>     > >>  > > > > > >> >> use
>     > >>  > > > > > >> >> >>>>>> instead.
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>> None of them will scale like Kafka.
> Horizontal
>     > >>  scaling
>     > >>  > > is
>     > >>  > > > > > still
>     > >>  > > > > > >> >> good.
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>>
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>> On Tue, Nov 8, 2016 at 1:22 PM, Sean
> McCauliff
>     > >>  > > > > > >> >> >>>>>> <[email protected]> wrote:
>     > >>  > > > > > >> >> >>>>>>> +1 for String keys.
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>> I've been doing some benchmarking and it
> seems
>     > like
>     > >>  > the
>     > >>  > > > > > speedup
>     > >>  > > > > > >> for
>     > >>  > > > > > >> >> >>>> using
>     > >>  > > > > > >> >> >>>>>>> integer keys is about 2-5 depending on
> the
>     > length
>     > >>  of
>     > >>  > > the
>     > >>  > > > > > >> strings
>     > >>  > > > > > >> >> and
>     > >>  > > > > > >> >> >>>> what
>     > >>  > > > > > >> >> >>>>>>> collections are being used. The overall
> amount
>     > of
>     > >>  > time
>     > >>  > > > > spent
>     > >>  > > > > > >> >> >>> parsing
>     > >>  > > > > > >> >> >>>> a
>     > >>  > > > > > >> >> >>>>>> set
>     > >>  > > > > > >> >> >>>>>>> of header key, value pairs probably does
> not
>     > matter
>     > >>  > > > unless
>     > >>  > > > > > you
>     > >>  > > > > > >> are
>     > >>  > > > > > >> >> >>>>>> getting
>     > >>  > > > > > >> >> >>>>>>> close to 1M messages per consumer. In
> which
>     > case
>     > >>  > > > probably
>     > >>  > > > > > >> don't
>     > >>  > > > > > >> >> use
>     > >>  > > > > > >> >> >>>>>>> headers. There is also the option to use
> very
>     > >>  short
>     > >>  > > > > strings;
>     > >>  > > > > > >> some
>     > >>  > > > > > >> >> >>>> that
>     > >>  > > > > > >> >> >>>>>> are
>     > >>  > > > > > >> >> >>>>>>> even shorter than integers.
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>> Partitioning the string key space will be
>     > easier
>     > >>  than
>     > >>  > > > > > >> partitioning
>     > >>  > > > > > >> >> >>> an
>     > >>  > > > > > >> >> >>>>>>> integer key space. We won't need a global
>     > registry.
>     > >>  > > > Kafka
>     > >>  > > > > > >> >> >>> internally
>     > >>  > > > > > >> >> >>>> can
>     > >>  > > > > > >> >> >>>>>>> reserve some prefix like "_" as its
> namespace.
>     > >>  > > Everyone
>     > >>  > > > > else
>     > >>  > > > > > >> can
>     > >>  > > > > > >> >> >>> use
>     > >>  > > > > > >> >> >>>>>> their
>     > >>  > > > > > >> >> >>>>>>> company or project name as namespace
> prefix and
>     > >>  life
>     > >>  > > > should
>     > >>  > > > > > be
>     > >>  > > > > > >> >> good.
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>> Here's the link to some of the
> benchmarking
>     > info:
>     > >>  > > > > > >> >> >>>>>>> https://docs.google.com/document/d/1tfT-
>     > >>  > > > > > >> >> >>>> 6SZdnKOLyWGDH82kS30PnUkmgb7nPL
>     > >>  > > > > > >> >> >>>>>> dw6p65pAI/edit?usp=sharing
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>> --
>     > >>  > > > > > >> >> >>>>>>> Sean McCauliff
>     > >>  > > > > > >> >> >>>>>>> Staff Software Engineer
>     > >>  > > > > > >> >> >>>>>>> Kafka
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>> [email protected]
>     > >>  > > > > > >> >> >>>>>>> linkedin.com/in/sean-mccauliff-b563192
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>> On Mon, Nov 7, 2016 at 11:51 PM, Michael
>     > Pearce <
>     > >>  > > > > > >> >> >>>> [email protected]>
>     > >>  > > > > > >> >> >>>>>>> wrote:
>     > >>  > > > > > >> >> >>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> +1 on this slimmer version of our
> proposal
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> I def think the Id space we can reduce
> from
>     > the
>     > >>  > > proposed
>     > >>  > > > > > >> >> >>>> int32(4bytes)
>     > >>  > > > > > >> >> >>>>>>>> down to int16(2bytes) it saves on space
> and as
>     > >>  > headers
>     > >>  > > > we
>     > >>  > > > > > >> wouldn't
>     > >>  > > > > > >> >> >>>>>> expect
>     > >>  > > > > > >> >> >>>>>>>> the number of headers being used
> concurrently
>     > >>  being
>     > >>  > > that
>     > >>  > > > > > high.
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> I would wonder if we should make the
> value
>     > byte
>     > >>  > array
>     > >>  > > > > length
>     > >>  > > > > > >> still
>     > >>  > > > > > >> >> >>>> int32
>     > >>  > > > > > >> >> >>>>>>>> though, as this is the standard max array
>     > length in
>     > >>  > > Java
>     > >>  > > > > > saying
>     > >>  > > > > > >> >> that
>     > >>  > > > > > >> >> >>>> it
>     > >>  > > > > > >> >> >>>>>> is a
>     > >>  > > > > > >> >> >>>>>>>> header and I guess limiting the size is
>     > sensible
>     > >>  and
>     > >>  > > > would
>     > >>  > > > > > >> work
>     > >>  > > > > > >> >> for
>     > >>  > > > > > >> >> >>>> all
>     > >>  > > > > > >> >> >>>>>> the
>     > >>  > > > > > >> >> >>>>>>>> use cases we have in mind so happy with
>     > limiting
>     > >>  > this.
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Do people generally concur on Magnus's
> slimmer
>     > >>  > > version?
>     > >>  > > > > > >> Anyone see
>     > >>  > > > > > >> >> >>>> any
>     > >>  > > > > > >> >> >>>>>>>> issues if we moved from int32 to int16?
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Re configurable ids per plugin over a
> global
>     > >>  > registry
>     > >>  > > > also
>     > >>  > > > > > >> would
>     > >>  > > > > > >> >> >>> work
>     > >>  > > > > > >> >> >>>>>> for
>     > >>  > > > > > >> >> >>>>>>>> us. As such if this has better
> consensus over
>     > the
>     > >>  > > > > proposed
>     > >>  > > > > > >> global
>     > >>  > > > > > >> >> >>>>>> registry
>     > >>  > > > > > >> >> >>>>>>>> I'd be happy to change that.
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> I was already sold on ints over strings
> for
>     > keys
>     > >>  ;)
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Cheers
>     > >>  > > > > > >> >> >>>>>>>> Mike
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> ______________________________
> __________
>     > >>  > > > > > >> >> >>>>>>>> From: Magnus Edenhill <
> [email protected]>
>     > >>  > > > > > >> >> >>>>>>>> Sent: Monday, November 7, 2016 10:10:21
> PM
>     > >>  > > > > > >> >> >>>>>>>> To: [email protected]
>     > >>  > > > > > >> >> >>>>>>>> Subject: Re: [DISCUSS] KIP-82 - Add
> Record
>     > Headers
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Hi,
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> I'm +1 for adding generic message
> headers,
>     > but I
>     > >>  do
>     > >>  > > > share
>     > >>  > > > > > the
>     > >>  > > > > > >> >> >>>> concerns
>     > >>  > > > > > >> >> >>>>>>>> previously aired on this thread and
> during
>     > the KIP
>     > >>  > > > > meeting.
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> So let me propose a slimmer alternative
> that
>     > does
>     > >>  > not
>     > >>  > > > > > require
>     > >>  > > > > > >> any
>     > >>  > > > > > >> >> >>>> sort
>     > >>  > > > > > >> >> >>>>>> of
>     > >>  > > > > > >> >> >>>>>>>> global header registry, does not affect
> broker
>     > >>  > > > performance
>     > >>  > > > > > or
>     > >>  > > > > > >> >> >>>>>> operations,
>     > >>  > > > > > >> >> >>>>>>>> and adds as little overhead as possible.
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Message
>     > >>  > > > > > >> >> >>>>>>>> ------------
>     > >>  > > > > > >> >> >>>>>>>> The protocol Message type is extended
> with a
>     > >>  Headers
>     > >>  > > > array
>     > >>  > > > > > >> >> consisting
>     > >>  > > > > > >> >> >>>> of
>     > >>  > > > > > >> >> >>>>>>>> Tags, where a Tag is defined as:
>     > >>  > > > > > >> >> >>>>>>>> int16 Id
>     > >>  > > > > > >> >> >>>>>>>> int16 Len // binary_data length
>     > >>  > > > > > >> >> >>>>>>>> binary_data[Len] // opaque binary data
>     > >>  > > > > > >> >> >>>>>>>>
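A minimal sketch of the Tag layout proposed above (int16 Id, int16 Len, then Len bytes of opaque data). It assumes big-endian signed int16 fields, in line with the rest of the Kafka wire protocol, and it leaves out any array-count prefix for the Headers array since the proposal does not spell one out. `encode_tags` and `decode_tags` are illustrative names, not an existing client API.

```python
import struct

INT16_MAX = 0x7FFF
TAG_HEADER = struct.Struct(">hh")  # big-endian: int16 Id, int16 Len


def encode_tags(tags):
    """Serialize (tag_id, payload) pairs back to back."""
    out = bytearray()
    for tag_id, payload in tags:
        if not 0 <= tag_id <= INT16_MAX:
            raise ValueError(f"tag id {tag_id} does not fit in a non-negative int16")
        if len(payload) > INT16_MAX:
            raise ValueError("payload too long for an int16 length field")
        out += TAG_HEADER.pack(tag_id, len(payload))
        out += payload
    return bytes(out)


def decode_tags(buf):
    """Parse a buffer produced by encode_tags into (tag_id, payload) pairs."""
    tags = []
    offset = 0
    while offset < len(buf):
        tag_id, length = TAG_HEADER.unpack_from(buf, offset)
        offset += TAG_HEADER.size
        payload = bytes(buf[offset:offset + length])
        if length < 0 or len(payload) != length:
            raise ValueError("truncated or corrupt tag")
        tags.append((tag_id, payload))
        offset += length
    return tags


if __name__ == "__main__":
    wire = encode_tags([(1000, b"src"), (1001, b"")])
    print(wire.hex(), decode_tags(wire))
```

Each tag costs 4 bytes of framing regardless of the payload, which is the overhead figure being compared against string keys elsewhere in the thread.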
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Ids
>     > >>  > > > > > >> >> >>>>>>>> ---
>     > >>  > > > > > >> >> >>>>>>>> The Id space is not centrally managed,
> so
>     > whenever
>     > >>  > an
>     > >>  > > > > > >> application
>     > >>  > > > > > >> >> >>>> needs
>     > >>  > > > > > >> >> >>>>>> to
>     > >>  > > > > > >> >> >>>>>>>> add headers, or use an eco-system
> plugin that
>     > >>  does,
>     > >>  > > its
>     > >>  > > > Id
>     > >>  > > > > > >> >> >>> allocation
>     > >>  > > > > > >> >> >>>>>> will
>     > >>  > > > > > >> >> >>>>>>>> need to be manually configured.
>     > >>  > > > > > >> >> >>>>>>>> This moves the allocation concern from
> the
>     > global
>     > >>  > > space
>     > >>  > > > > down
>     > >>  > > > > > >> to
>     > >>  > > > > > >> >> >>>>>>>> organization level and avoids the risk
> for id
>     > >>  > > conflicts.
>     > >>  > > > > > >> >> >>>>>>>> Example pseudo-config for some app:
>     > >>  > > > > > >> >> >>>>>>>> sometrackerplugin.tag.sourcev3.id=1000
>     > >>  > > > > > >> >> >>>>>>>> dbthing.tag.tablename.id=1001
>     > >>  > > > > > >> >> >>>>>>>> myschemareg.tag.schemaname.id=1002
>     > >>  > > > > > >> >> >>>>>>>> myschemareg.tag.schemaversion.id=1003
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Each header-writing or header-reading
> plugin
>     > must
>     > >>  > > > provide
>     > >>  > > > > > >> means
>     > >>  > > > > > >> >> >>>>>> (typically
>     > >>  > > > > > >> >> >>>>>>>> through configuration) to specify the
> tag for
>     > each
>     > >>  > > > header
>     > >>  > > > > it
>     > >>  > > > > > >> uses.
>     > >>  > > > > > >> >> >>>>>> Defaults
>     > >>  > > > > > >> >> >>>>>>>> should be avoided.
>     > >>  > > > > > >> >> >>>>>>>> A consumer silently ignores tags it
> does not
>     > have
>     > >>  a
>     > >>  > > > > mapping
>     > >>  > > > > > >> for
>     > >>  > > > > > >> >> >>>> (since
>     > >>  > > > > > >> >> >>>>>> the
>     > >>  > > > > > >> >> >>>>>>>> binary_data can't be parsed without
> knowing
>     > what
>     > >>  it
>     > >>  > > is).
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Id range 0..999 is reserved for future
> use by
>     > the
>     > >>  > > broker
>     > >>  > > > > and
>     > >>  > > > > > >> must
>     > >>  > > > > > >> >> >>>> not be
>     > >>  > > > > > >> >> >>>>>>>> used by plugins.
>     > >>  > > > > > >> >> >>>>>>>>
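One possible reading of the Id rules above, sketched in code: ids come from per-application configuration, the range 0..999 is refused as reserved for the broker, and a consumer drops any tag it has no mapping for. The `key=value` config syntax follows the pseudo-config in the message; `load_tag_ids` and `known_headers` are hypothetical helpers, and rejecting duplicate ids is an added assumption.

```python
RESERVED_MAX_ID = 999   # 0..999 reserved for future broker use, per the proposal
INT16_MAX = 0x7FFF


def load_tag_ids(config_lines):
    """Parse 'name=id' lines into a name -> id mapping, enforcing the reserved range."""
    name_to_id = {}
    used_ids = set()
    for line in config_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        tag_id = int(value)
        if tag_id <= RESERVED_MAX_ID or tag_id > INT16_MAX:
            raise ValueError(f"{name.strip()}: id {tag_id} is outside the plugin range")
        if tag_id in used_ids:
            raise ValueError(f"{name.strip()}: id {tag_id} is already allocated")
        used_ids.add(tag_id)
        name_to_id[name.strip()] = tag_id
    return name_to_id


def known_headers(tags, name_to_id):
    """Keep only the tags this consumer has a mapping for; ignore the rest silently."""
    id_to_name = {tag_id: name for name, tag_id in name_to_id.items()}
    return {id_to_name[tag_id]: payload
            for tag_id, payload in tags if tag_id in id_to_name}


if __name__ == "__main__":
    ids = load_tag_ids([
        "sometrackerplugin.tag.sourcev3.id=1000",
        "dbthing.tag.tablename.id=1001",
    ])
    print(known_headers([(1000, b"v3"), (4242, b"unknown")], ids))
```

Because the mapping lives in configuration, two organizations can run the same plugin with different ids, at the cost of having to keep producer and consumer configs in agreement.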
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Broker
>     > >>  > > > > > >> >> >>>>>>>> ---------
>     > >>  > > > > > >> >> >>>>>>>> The broker does not process the tags
> (other
>     > than
>     > >>  the
>     > >>  > > > > > standard
>     > >>  > > > > > >> >> >>>> protocol
>     > >>  > > > > > >> >> >>>>>>>> syntax verification), it simply stores
> and
>     > >>  forwards
>     > >>  > > them
>     > >>  > > > > as
>     > >>  > > > > > >> opaque
>     > >>  > > > > > >> >> >>>> data.
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Standard message translation (removal of
>     > Headers)
>     > >>  > > kicks
>     > >>  > > > in
>     > >>  > > > > > for
>     > >>  > > > > > >> >> >>> older
>     > >>  > > > > > >> >> >>>>>>>> clients.
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Why not string ids?
>     > >>  > > > > > >> >> >>>>>>>> -------------------------
>     > >>  > > > > > >> >> >>>>>>>> String ids might seem like a good idea,
> but:
>     > >>  > > > > > >> >> >>>>>>>> * does not really solve uniqueness
>     > >>  > > > > > >> >> >>>>>>>> * consumes a lot of space (2 byte string
>     > length +
>     > >>  > > > string,
>     > >>  > > > > > per
>     > >>  > > > > > >> >> >>>> header)
>     > >>  > > > > > >> >> >>>>>> to
>     > >>  > > > > > >> >> >>>>>>>> be meaningful
>     > >>  > > > > > >> >> >>>>>>>> * doesn't really say anything about how to
> parse the
>     > >>  tag's
>     > >>  > > > data,
>     > >>  > > > > > so
>     > >>  > > > > > >> it
>     > >>  > > > > > >> >> >>> is
>     > >>  > > > > > >> >> >>>> in
>     > >>  > > > > > >> >> >>>>>>>> effect useless on its own.
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> Regards,
>     > >>  > > > > > >> >> >>>>>>>> Magnus
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>> 2016-11-07 18:32 GMT+01:00 Michael
> Pearce <
>     > >>  > > > > > >> [email protected]
>     > >>  > > > > > >> >> >:
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> Hi Roger,
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> Thanks for the support.
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> I think the key thing is to have a
> common key
>     > >>  space
>     > >>  > > to
>     > >>  > > > > make
>     > >>  > > > > > >> an
>     > >>  > > > > > >> >> >>>>>> ecosystem,
>     > >>  > > > > > >> >> >>>>>>>>> there does have to be some level of
> contract
>     > for
>     > >>  > > people
>     > >>  > > > > to
>     > >>  > > > > > >> play
>     > >>  > > > > > >> >> >>>>>> nicely.
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> Having map<String, byte[]> or as per
> current
>     > >>  > proposed
>     > >>  > > > in
>     > >>  > > > > > kip
>     > >>  > > > > > >> of
>     > >>  > > > > > >> >> >>>>>> having a
>     > >>  > > > > > >> >> >>>>>>>>> numerical key space of map<int,
> byte[]> is a
>     > >>  level
>     > >>  > > of
>     > >>  > > > > the
>     > >>  > > > > > >> >> >>> contract
>     > >>  > > > > > >> >> >>>>>> that
>     > >>  > > > > > >> >> >>>>>>>>> most people would expect.
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> I think the example in a previous
> comment
>     > someone
>     > >>  > > else
>     > >>  > > > > made
>     > >>  > > > > > >> >> >>>> linking to
>     > >>  > > > > > >> >> >>>>>>>> AWS
>     > >>  > > > > > >> >> >>>>>>>>> blog and also implemented api where
>     > originally
>     > >>  they
>     > >>  > > > > didn't
>     > >>  > > > > > >> have a
>     > >>  > > > > > >> >> >>>>>> header
>     > >>  > > > > > >> >> >>>>>>>>> space but now they do, where keys are
>     > uniform but
>     > >>  > the
>     > >>  > > > > value
>     > >>  > > > > > >> can
>     > >>  > > > > > >> >> >>> be
>     > >>  > > > > > >> >> >>>>>>>> string,
>     > >>  > > > > > >> >> >>>>>>>>> int, anything is a good example.
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> Having a custom MetadataSerializer is
>     > something
>     > >>  we
>     > >>  > > had
>     > >>  > > > > > played
>     > >>  > > > > > >> >> >>> with,
>     > >>  > > > > > >> >> >>>>>> but
>     > >>  > > > > > >> >> >>>>>>>>> discounted the idea, as if you wanted
>     > everyone to
>     > >>  > > work
>     > >>  > > > > the
>     > >>  > > > > > >> same
>     > >>  > > > > > >> >> >>>> way in
>     > >>  > > > > > >> >> >>>>>>>> the
>     > >>  > > > > > >> >> >>>>>>>>> ecosystem, having to have this also
>     > customizable
>     > >>  > > makes
>     > >>  > > > > it a
>     > >>  > > > > > >> bit
>     > >>  > > > > > >> >> >>>>>> harder.
>     > >>  > > > > > >> >> >>>>>>>>> Think about making the whole message
> record
>     > >>  custom
>     > >>  > > > > > >> serializable,
>     > >>  > > > > > >> >> >>>> this
>     > >>  > > > > > >> >> >>>>>>>> would
>     > >>  > > > > > >> >> >>>>>>>>> make it fairly tricky (though it would
> not be
>     > >>  > > > impossible)
>     > >>  > > > > > to
>     > >>  > > > > > >> have
>     > >>  > > > > > >> >> >>>> made
>     > >>  > > > > > >> >> >>>>>>>> work
>     > >>  > > > > > >> >> >>>>>>>>> nicely. Having the value customizable
> we
>     > thought
>     > >>  > is a
>     > >>  > > > > > >> reasonable
>     > >>  > > > > > >> >> >>>>>> tradeoff
>     > >>  > > > > > >> >> >>>>>>>>> here of flexibility over contract of
>     > interaction
>     > >>  > > > between
>     > >>  > > > > > >> >> >>> different
>     > >>  > > > > > >> >> >>>>>>>> parties.
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> Is there a particular case or benefit
> of
>     > having
>     > >>  > > > > > serialization
>     > >>  > > > > > >> >> >>>>>>>> customizable
>     > >>  > > > > > >> >> >>>>>>>>> that you have in mind?
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> Saying this it is obviously something
> that
>     > could
>     > >>  be
>     > >>  > > > > > >> implemented,
>     > >>  > > > > > >> >> >>> if
>     > >>  > > > > > >> >> >>>>>> there
>     > >>  > > > > > >> >> >>>>>>>>> is a need. If we did go this avenue I
> think a
>     > >>  > > defaulted
>     > >>  > > > > > >> >> >>> serializer
>     > >>  > > > > > >> >> >>>>>>>>> implementation should exist so for the
> 80:20
>     > >>  rule,
>     > >>  > > > people
>     > >>  > > > > > can
>     > >>  > > > > > >> >> >>> just
>     > >>  > > > > > >> >> >>>>>> have
>     > >>  > > > > > >> >> >>>>>>>> the
>     > >>  > > > > > >> >> >>>>>>>>> broker and clients get default
> behavior.
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> Cheers
>     > >>  > > > > > >> >> >>>>>>>>> Mike
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> On 11/6/16, 5:25 PM, "radai" <
>     > >>  > > > [email protected]
>     > >>  > > > > >
>     > >>  > > > > > >> wrote:
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> making header _key_ serialization
>     > configurable
>     > >>  > > > > > potentially
>     > >>  > > > > > >> >> >>>>>> undermines
>     > >>  > > > > > >> >> >>>>>>>>> the
>     > >>  > > > > > >> >> >>>>>>>>> broad usefulness of the feature (any
> point
>     > >>  along
>     > >>  > > the
>     > >>  > > > > > path
>     > >>  > > > > > >> >> >>> must
>     > >>  > > > > > >> >> >>>> be
>     > >>  > > > > > >> >> >>>>>>>> able
>     > >>  > > > > > >> >> >>>>>>>>> to
>     > >>  > > > > > >> >> >>>>>>>>> read the header keys. the values may be
>     > >>  whatever
>     > >>  > > and
>     > >>  > > > > > >> require
>     > >>  > > > > > >> >> >>>> more
>     > >>  > > > > > >> >> >>>>>>>>> intimate
>     > >>  > > > > > >> >> >>>>>>>>> knowledge of the code that produced
> specific
>     > >>  > > > headers,
>     > >>  > > > > > but
>     > >>  > > > > > >> >> >>> keys
>     > >>  > > > > > >> >> >>>>>> should
>     > >>  > > > > > >> >> >>>>>>>>> be
>     > >>  > > > > > >> >> >>>>>>>>> universally readable).
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> it would also make it hard to write
> really
>     > >>  > > portable
>     > >>  > > > > > >> plugins -
>     > >>  > > > > > >> >> >>>> say
>     > >>  > > > > > >> >> >>>>>> i
>     > >>  > > > > > >> >> >>>>>>>>> wrote a
>     > >>  > > > > > >> >> >>>>>>>>> large message splitter/combiner - if i
> rely
>     > on
>     > >>  > key
>     > >>  > > > > > >> >> >>>> "largeMessage"
>     > >>  > > > > > >> >> >>>>>> and
>     > >>  > > > > > >> >> >>>>>>>>> values of the form "1/20" someone who
> uses
>     > >>  > > > (contrived
>     > >>  > > > > > >> >> >>> example)
>     > >>  > > > > > >> >> >>>>>>>>> Map<Byte[],
>     > >>  > > > > > >> >> >>>>>>>>> Double> wouldn't be able to re-use my
> code.
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> not the end of the world within an
>     > >>  > organization,
>     > >>  > > > but
>     > >>  > > > > > >> >> >>>>>> problematic if
>     > >>  > > > > > >> >> >>>>>>>>> you
>     > >>  > > > > > >> >> >>>>>>>>> want to enable an ecosystem
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> On Thu, Nov 3, 2016 at 2:04 PM, Roger
> Hoover
>     > <
>     > >>  > > > > > >> >> >>>>>> [email protected]
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> wrote:
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>> As others have laid out, I see strong
>     > reasons
>     > >>  for
>     > >>  > a
>     > >>  > > > > common
>     > >>  > > > > > >> >> >>>>>> message
>     > >>  > > > > > >> >> >>>>>>>>>> metadata structure for the Kafka
> ecosystem.
>     > In
>     > >>  > > > > > particular,
>     > >>  > > > > > >> >> >>>> I've
>     > >>  > > > > > >> >> >>>>>>>>> seen that
>     > >>  > > > > > >> >> >>>>>>>>>> even within a single organization,
>     > >>  infrastructure
>     > >>  > > > teams
>     > >>  > > > > > >> >> >>> often
>     > >>  > > > > > >> >> >>>>>> own
>     > >>  > > > > > >> >> >>>>>>>> the
>     > >>  > > > > > >> >> >>>>>>>>>> message metadata while application
> teams
>     > own the
>     > >>  > > > > > >> >> >>>>>> application-level
>     > >>  > > > > > >> >> >>>>>>>>> data
>     > >>  > > > > > >> >> >>>>>>>>>> format. Allowing metadata and content
> to
>     > have
>     > >>  > > > different
>     > >>  > > > > > >> >> >>>>>> structure
>     > >>  > > > > > >> >> >>>>>>>>> and
>     > >>  > > > > > >> >> >>>>>>>>>> evolve separately is very helpful for
> this.
>     > >>  > Also, I
>     > >>  > > > > think
>     > >>  > > > > > >> >> >>>>>> there's
>     > >>  > > > > > >> >> >>>>>>>> a
>     > >>  > > > > > >> >> >>>>>>>>> lot of
>     > >>  > > > > > >> >> >>>>>>>>>> value to having a common metadata
> structure
>     > >>  shared
>     > >>  > > > > across
>     > >>  > > > > > >> >> >>> the
>     > >>  > > > > > >> >> >>>>>> Kafka
>     > >>  > > > > > >> >> >>>>>>>>>> ecosystem so that tools which leverage
>     > metadata
>     > >>  > can
>     > >>  > > > more
>     > >>  > > > > > >> >> >>>> easily
>     > >>  > > > > > >> >> >>>>>> be
>     > >>  > > > > > >> >> >>>>>>>>> shared
>     > >>  > > > > > >> >> >>>>>>>>>> across organizations and integrated
>     > together.
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>> The question is, where does the
> metadata
>     > >>  structure
>     > >>  > > > > belong?
>     > >>  > > > > > >> >> >>>>>> Here's
>     > >>  > > > > > >> >> >>>>>>>>> my take:
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>> We change the Kafka wire and on-disk
> format
>     > >>  > from
>     > >>  > > a
>     > >>  > > > > > (key,
>     > >>  > > > > > >> >> >>>>>> value)
>     > >>  > > > > > >> >> >>>>>>>>> model to
>     > >>  > > > > > >> >> >>>>>>>>>> a (key, metadata, value) model where
> all
>     > three
>     > >>  are
>     > >>  > > > byte
>     > >>  > > > > > >> >> >>>> arrays
>     > >>  > > > > > >> >> >>>>>> from
>     > >>  > > > > > >> >> >>>>>>>>> the
>     > >>  > > > > > >> >> >>>>>>>>>> broker's point of view. The primary
> reason
>     > for
>     > >>  > this
>     > >>  > > is
>     > >>  > > > > > that
>     > >>  > > > > > >> >> >>>> it
>     > >>  > > > > > >> >> >>>>>>>>> provides a
>     > >>  > > > > > >> >> >>>>>>>>>> backward compatible migration path
> forward.
>     > >>  > > Producers
>     > >>  > > > > can
>     > >>  > > > > > >> >> >>>> start
>     > >>  > > > > > >> >> >>>>>>>>> populating
>     > >>  > > > > > >> >> >>>>>>>>>> metadata fields before all consumers
>     > understand
>     > >>  > the
>     > >>  > > > > > >> >> >>> metadata
>     > >>  > > > > > >> >> >>>>>>>>> structure.
>     > >>  > > > > > >> >> >>>>>>>>>> For people who already have custom
> envelope
>     > >>  > > > structures,
>     > >>  > > > > > >> >> >>> they
>     > >>  > > > > > >> >> >>>> can
>     > >>  > > > > > >> >> >>>>>>>>> populate
>     > >>  > > > > > >> >> >>>>>>>>>> their existing structure and the new
>     > structure
>     > >>  > for a
>     > >>  > > > > while
>     > >>  > > > > > >> >> >>> as
>     > >>  > > > > > >> >> >>>>>> they
>     > >>  > > > > > >> >> >>>>>>>>> make the
>     > >>  > > > > > >> >> >>>>>>>>>> transition.
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>> We could stop there and let the
> clients
>     > plug in
>     > >>  a
>     > >>  > > > > > >> >> >>>> KeySerializer,
>     > >>  > > > > > >> >> >>>>>>>>>> MetadataSerializer, and
> ValueSerializer but
>     > I
>     > >>  > think
>     > >>  > > it
>     > >>  > > > > would
>     > >>  > > > > > >> >> >>>> also
>     > >>  > > > > > >> >> >>>>>> be
>     > >>  > > > > > >> >> >>>>>>>>> useful to
>     > >>  > > > > > >> >> >>>>>>>>>> have a default MetadataSerializer that
>     > >>  implements
>     > >>  > a
>     > >>  > > > > > >> >> >>> key-value
>     > >>  > > > > > >> >> >>>>>> model
>     > >>  > > > > > >> >> >>>>>>>>> similar
>     > >>  > > > > > >> >> >>>>>>>>>> to AMQP or HTTP headers. Or we could
> go even
>     > >>  > > further
>     > >>  > > > > and
>     > >>  > > > > > >> >> >>>>>>>> prescribe a
>     > >>  > > > > > >> >> >>>>>>>>>> Map<String, byte[]> or Map<String,
> String>
>     > data
>     > >>  > > model
>     > >>  > > > > for
>     > >>  > > > > > >> >> >>>>>> headers
>     > >>  > > > > > >> >> >>>>>>>> in
>     > >>  > > > > > >> >> >>>>>>>>> the
>     > >>  > > > > > >> >> >>>>>>>>>> clients (while still allowing custom
>     > >>  serialization
>     > >>  > > of
>     > >>  > > > > the
>     > >>  > > > > > >> >> >>>> header
>     > >>  > > > > > >> >> >>>>>>>> data
>     > >>  > > > > > >> >> >>>>>>>>>> model).
>     > >>  > > > > > >> >> >>>>>>>>>>
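To make the Map<String, byte[]> idea above concrete, here is a minimal sketch of what a default header serializer could look like. Everything in it is an assumption for illustration: the class name HeaderMapCodecSketch, the length-prefixed byte layout, and the example header keys are hypothetical and are not taken from the KIP or from any Kafka API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: not part of Kafka and not the format proposed in KIP-82.
final class HeaderMapCodecSketch {

    // Assumed layout: int entryCount, then for each entry
    // int keyLength, UTF-8 key bytes, int valueLength, value bytes.
    static byte[] serialize(Map<String, byte[]> headers) {
        List<byte[]> keys = new ArrayList<>();
        int size = Integer.BYTES;
        for (Map.Entry<String, byte[]> e : headers.entrySet()) {
            byte[] k = e.getKey().getBytes(StandardCharsets.UTF_8);
            keys.add(k);
            size += 2 * Integer.BYTES + k.length + e.getValue().length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(headers.size());
        int i = 0;
        // values() iterates in the same order as entrySet() above.
        for (byte[] value : headers.values()) {
            byte[] k = keys.get(i++);
            buf.putInt(k.length).put(k);
            buf.putInt(value.length).put(value);
        }
        return buf.array();
    }

    // Reads the assumed layout back into an insertion-ordered map.
    static Map<String, byte[]> deserialize(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int count = buf.getInt();
        Map<String, byte[]> headers = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            byte[] k = new byte[buf.getInt()];
            buf.get(k);
            byte[] v = new byte[buf.getInt()];
            buf.get(v);
            headers.put(new String(k, StandardCharsets.UTF_8), v);
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, byte[]> headers = new LinkedHashMap<>();
        // Illustrative header keys; any String key would do.
        headers.put("content-type", "application/json".getBytes(StandardCharsets.UTF_8));
        headers.put("cluster-id", "dc1".getBytes(StandardCharsets.UTF_8));
        Map<String, byte[]> copy = deserialize(serialize(headers));
        System.out.println(new String(copy.get("cluster-id"), StandardCharsets.UTF_8));
    }
}
```

Keeping values as byte[] leaves their encoding to whoever owns each key, which is the flexibility the "custom serialization of the header data model" remark seems to be after; a Map<String, String> variant would only change the value handling.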
>     > >>  > > > > > >> >> >>>>>>>>>> I think this would address Radai's
>     > >>  > > > > > >> >> >>>>>>>>>> concerns:
>     > >>  > > > > > >> >> >>>>>>>>>> 1. All client code would not need to be
>     > >>  > > > > > >> >> >>>>>>>>>> updated to know about the container.
>     > >>  > > > > > >> >> >>>>>>>>>> 2. Middleware friendly clients would have
>     > >>  > > > > > >> >> >>>>>>>>>> a standard header data model to work
>     > >>  > > > > > >> >> >>>>>>>>>> with.
>     > >>  > > > > > >> >> >>>>>>>>>> 3. KIP is required both b/c of broker
>     > >>  > > > > > >> >> >>>>>>>>>> changes and because of client API
>     > >>  > > > > > >> >> >>>>>>>>>> changes.
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>> Cheers,
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>> Roger
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>> On Wed, Nov 2, 2016 at 4:38 PM, radai <[email protected]> wrote:
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>>> my biggest issues with a "standard"
>     > >>  > > > > > >> >> >>>>>>>>>>> wrapper format:
>     > >>  > > > > > >> >> >>>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>>> 1. _ALL_ client _CODE_ (as opposed to
>     > >>  > > > > > >> >> >>>>>>>>>>> kafka lib version) must be updated to
>     > >>  > > > > > >> >> >>>>>>>>>>> know about the container, because any
>     > >>  > > > > > >> >> >>>>>>>>>>> old naive code trying to directly
>     > >>  > > > > > >> >> >>>>>>>>>>> deserialize its own payload would keel
>     > >>  > > > > > >> >> >>>>>>>>>>> over and die (it needs to know to
>     > >>  > > > > > >> >> >>>>>>>>>>> deserialize a container, and then dig in
>     > >>  > > > > > >> >> >>>>>>>>>>> there for its payload).
>     > >>  > > > > > >> >> >>>>>>>>>>> 2. in order to write middleware-friendly
>     > >>  > > > > > >> >> >>>>>>>>>>> clients that utilize such a container
>     > >>  > > > > > >> >> >>>>>>>>>>> one would basically have to write their
>     > >>  > > > > > >> >> >>>>>>>>>>> own producer/consumer API on top of the
>     > >>  > > > > > >> >> >>>>>>>>>>> open source kafka one.
>     > >>  > > > > > >> >> >>>>>>>>>>> 3. if you were going to go with a wrapper
>     > >>  > > > > > >> >> >>>>>>>>>>> format you really don't need to bother
>     > >>  > > > > > >> >> >>>>>>>>>>> with a kip (just open source your own
>     > >>  > > > > > >> >> >>>>>>>>>>> client stack from #2 above so others
>     > >>  > > > > > >> >> >>>>>>>>>>> could stop re-inventing it)
>     > >>  > > > > > >> >> >>>>>>>>>>>
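Radai's first point, that existing consumer code breaks once values are wrapped in a container, can be sketched as follows. The wrapper layout (a length-prefixed header section followed by the payload) and all names here are assumptions made up for this illustration; they do not describe any actual Kafka or KIP-82 format.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a wrapper ("container") format; not a Kafka API.
final class WrapperFormatSketch {

    // Old consumer code: assumes the record value is exactly one 8-byte long.
    static long naiveDeserialize(byte[] value) {
        if (value.length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes, got " + value.length);
        }
        return ByteBuffer.wrap(value).getLong();
    }

    // Assumed wrapper layout: int headerLength, header bytes, then the payload.
    static byte[] wrap(byte[] headerBytes, byte[] payload) {
        return ByteBuffer.allocate(Integer.BYTES + headerBytes.length + payload.length)
                .putInt(headerBytes.length)
                .put(headerBytes)
                .put(payload)
                .array();
    }

    // Container-aware consumer code: skip the header section, return the payload.
    static byte[] unwrapPayload(byte[] wrapped) {
        ByteBuffer buf = ByteBuffer.wrap(wrapped);
        int headerLength = buf.getInt();
        buf.position(buf.position() + headerLength);
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] payload = ByteBuffer.allocate(Long.BYTES).putLong(42L).array();
        byte[] wrapped = wrap(new byte[] {1, 2, 3}, payload);

        // A consumer that knows about the container still gets its payload.
        System.out.println(naiveDeserialize(unwrapPayload(wrapped)));

        // A consumer that does not know about the container fails on the same bytes.
        try {
            naiveDeserialize(wrapped);
        } catch (IllegalArgumentException e) {
            System.out.println("naive consumer failed: " + e.getMessage());
        }
    }
}
```

The unwrapPayload step is exactly the extra knowledge every consumer would need under a wrapper approach, which is the cost being objected to.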
>     > >>  > > > > > >> >> >>>>>>>>>>> On Wed, Nov 2, 2016 at 4:25 PM, James Cheng <[email protected]> wrote:
>     > >>  > > > > > >> >> >>>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>>>> How exactly would this work? Or maybe
>     > >>  > > > > > >> >> >>>>>>>>>>>> that's out of scope for this email.
>     > >>  > > > > > >> >> >>>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>> The information contained in this email is strictly confidential and for
>     > >>  > > > > > >> >> >>>>>>>>> the use of the addressee only, unless otherwise indicated. If you are not
>     > >>  > > > > > >> >> >>>>>>>>> the intended recipient, please do not read, copy, use or disclose to others
>     > >>  > > > > > >> >> >>>>>>>>> this message or any attachment. Please also notify the sender by replying
>     > >>  > > > > > >> >> >>>>>>>>> to this email or by telephone (+44(020 7896 0011) and then delete the email
>     > >>  > > > > > >> >> >>>>>>>>> and any copies of it. Opinions, conclusion (etc) that do not relate to the
>     > >>  > > > > > >> >> >>>>>>>>> official business of this company shall be understood as neither given nor
>     > >>  > > > > > >> >> >>>>>>>>> endorsed by it. IG is a trading name of IG Markets Limited (a company
>     > >>  > > > > > >> >> >>>>>>>>> registered in England and Wales, company number 04008957) and IG Index
>     > >>  > > > > > >> >> >>>>>>>>> Limited (a company registered in England and Wales, company number
>     > >>  > > > > > >> >> >>>>>>>>> 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>     > >>  > > > > > >> >> >>>>>>>>> London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>     > >>  > > > > > >> >> >>>>>>>>> Index Limited (register number 114059) are authorised and regulated by the
>     > >>  > > > > > >> >> >>>>>>>>> Financial Conduct Authority.
>     > >>  > > > > > >> >> >>>>>>>>>
>     > >>  > > > > > >> >> >>>>>>>>
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>>> --
>     > >>  > > > > > >> >> >>>>>> Gwen Shapira
>     > >>  > > > > > >> >> >>>>>> Product Manager | Confluent
>     > >>  > > > > > >> >> >>>>>> 650.450.2760 | @gwenshap
>     > >>  > > > > > >> >> >>>>>> Follow us: Twitter | blog
>     > >>  > > > > > >> >> >>>>>>
>     > >>  > > > > > >> >> >>>>
>     > >>  > > > > > >> >> >>>>
>     > >>  > > > > > >> >> >>>>
>     > >>  > > > > > >> >> >>>>
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >>> --
>     > >>  > > > > > >> >> >>> Nacho (Ignacio) Solis
>     > >>  > > > > > >> >> >>> Kafka
>     > >>  > > > > > >> >> >>> [email protected]
>     > >>  > > > > > >> >> >>>
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >> >
>     > >>  > > > > > >> >>
>     > >>  > > > > > >> >>
>     > >>  > > > > > >>
>     > >>  > > > > > >>
>     > >>  > > > > > >>
>     > >>  > > > > > >>
>     > >>  > > > > > >
>     > >>  > > > > > >
>     > >>  > > > > >
>     > >>  > > > >
>     > >>  > > >
>     > >>  > >
>     > >>  >
>     > >
>     > > --
>     > > Nacho - Ignacio Solis - [email protected]
>     >
>
>
>
>     --
>     Nacho - Ignacio Solis - [email protected]
>
>
>
