Re: [DISCUSS] KIP-82 - Add Record Headers

Matthias J. Sax Wed, 14 Dec 2016 09:51:38 -0800

Yes and no. I did overload the term "control message".

EOS control messages are for client-broker communication and thus never
exposed to any application. And I think this is a good design because
the broker needs to understand those control messages. Thus, this should be
a protocol change.


The type of control messages I have in mind are for client-client
(application-application) communication and the broker is agnostic to
them. Thus, it should not be a protocol change.
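
To make the client-client case a bit more concrete, here is a minimal
sketch of the filtering a consumer client could do on the application's
behalf. It is only an illustration under assumptions: the Message class,
the "control.type" header key and the visible() helper are made-up names
(not part of Kafka or KIP-82), and headers are modelled as plain string
key/value pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ControlMessageFilter {

    // Hypothetical reserved header key that marks a client-to-client control message.
    public static final String CONTROL_KEY = "control.type";

    // Hypothetical record model: string headers plus a payload (may be empty).
    public static final class Message {
        public final Map<String, String> headers;
        public final String payload;

        public Message(Map<String, String> headers, String payload) {
            this.headers = headers;
            this.payload = payload;
        }
    }

    // Returns what the application would get back from poll():
    // regular messages always, control messages only when the
    // application signed up for that control type.
    public static List<Message> visible(List<Message> fetched, Set<String> signedUpTypes) {
        List<Message> out = new ArrayList<>();
        for (Message m : fetched) {
            String type = m.headers.get(CONTROL_KEY);
            if (type == null || signedUpTypes.contains(type)) {
                out.add(m);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Message> batch = List.of(
            new Message(Map.of(), "order-1"),
            new Message(Map.of(CONTROL_KEY, "shutdown"), ""));

        // An application that knows nothing about control messages.
        System.out.println(visible(batch, Set.of()).size());
        // The one application that signed up for "shutdown" markers.
        System.out.println(visible(batch, Set.of("shutdown")).size());
    }
}
```

The broker never looks at the header here; the skip/deliver decision is
made entirely in the client, which is why this would not need a
protocol change.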


-Matthias



On 12/14/16 9:42 AM, radai wrote:
> aren't control messages getting pushed as their own top level protocol
> change (and a fairly massive one) for the transactions KIP ?
> 
> On Tue, Dec 13, 2016 at 5:54 PM, Matthias J. Sax <[email protected]>
> wrote:
> 
>> Hi,
>>
>> I want to add a completely new angle to this discussion. For this, I
>> want to propose an extension for the headers feature that enables new
>> use cases -- and those new use cases might convince people to support
>> headers (of course including the larger scoped proposal).
>>
>> Extended Proposal:
>>
>> Allow messages with a certain header key to be special "control
>> messages" (w/ or w/o payload) that are not exposed to an application via
>> .poll().
>>
>> Thus, a consumer client would automatically skip over those messages. If
>> an application knows about embedded control messages, it can "sign up"
>> for those messages with the consumer client and either get a callback or
>> have the consumer's auto-drop for these messages disabled (allowing it to
>> consume those messages via poll()).
>>
>> (The details need further considerations/discussion. I just want to
>> sketch the main idea.)
>>
>> Usage:
>>
>> There is a shared topic (ie, used by multiple applications) and a
>> producer application wants to embed a special message in the topic for a
>> dedicated consumer application. Because only one application will
>> understand this message, it cannot be a regular message as this would
>> break all applications that do not understand this message. The producer
>> application would set a special metadata key and no consumer application
>> would see this control message by default because they did not enable
>> their consumer client to return this message in poll() (and the client
>> would just drop this message with the special metadata key). Only the single
>> application that should receive this message will subscribe to this
>> message on its consumer client and process it.
>>
>>
>> Concrete Use Case: Kafka Streams
>>
>> In Kafka Streams, we would like to propagate "control messages" from
>> subtopology to subtopology. There are multiple scenarios for which this
>> would be useful. For example, currently we do not guarantee a
>> "consistent shutdown" of an application. By this, I mean that input
>> records might not be completely processed by the whole topology because
>> the application shutdown happens "in between" and an intermediate result
>> gets "stuck" in an intermediate topic. Thus, a user would see a
>> committed offset of the source topic of the application, but no
>> corresponding result record in the output topic.
>>
>> Having "shutdown markers" would allow us to first stop the upstream
>> subtopology and write this marker into the intermediate topic, and the
>> downstream subtopology would only shut itself down after it sees the
>> "shutdown marker". Thus, we can guarantee on shutdown that no
>> "in-flight" messages get stuck in intermediate topics.
>>
>>
>> A similar usage would be for KIP-95 (Incremental Batch Processing).
>> There was a discussion about the proposed metadata topic, and we could
>> avoid this metadata topic if we had "control messages".
>>
>>
>> Right now, we cannot insert an "application control message" because
>> Kafka Streams does not own all topics it reads/writes and thus might
>> break other consumer applications (as described above) if we inject
>> random messages that are not understood by other apps.
>>
>>
>> Of course, one can work around "embedded control messages" by using an
>> additional topic to propagate control messages between applications (as
>> suggested in KIP-95 via a metadata topic for Kafka Streams). But there
>> are major concerns about adding this metadata topic in the KIP and this
>> shows that other applications that need a similar pattern might profit
>> from topic embedded "control messages", too.
>>
>>
>> One last important consideration: those "control messages" are used for
>> client to client communication and are not understood by the broker.
>> Thus, those messages should not be enabled within the message format
>> (cf. tombstone flag -- KIP-87). However, "client land" record headers
>> would be a nice way to implement them. Because KIP-82 did consider key
>> namespaces for metadata keys, this extension should not be its own KIP
>> but should be included in KIP-82 to reserve a namespace for "control
>> messages" in the first place.
>>
>>
>> Sorry for the long email... Looking forward to your feedback.
>>
>>
>> -Matthias
>>
>>
>>
>> On 12/8/16 12:12 AM, Michael Pearce wrote:
>>> Hi Jun
>>>
>>> 100) each time a transaction exits a jvm for a remote system (HTTP/JMS/
>> Hopefully one day kafka) the APM tools stitch in a unique id (though I
>> believe it contains the end2end uuid embedded in this id), on receiving the
>> message at the receiving JVM the apm code takes this out, and continues its
>> tracing on that new thread. Both JVM’s (and other languages the APM
>> tool supports) send this data async back to the central controllers where
>> the stitching together occurs. For this they need some header space for
>> them to put this id.
>>>
>>> 101) Yes indeed we have a business transaction Id in the payload. Though
>> this is a system level tracing, that we need to marry up. Also as per
>> note on end2end encryption we’d be unable to prove the flow if the payload
>> is encrypted as we’d not have access to this at certain points of the flow
>> through the infrastructure/platform.
>>>
>>>
>>> 103) As said we use this mechanism in IG very successfully, as stated
>> per key we guarantee the transaction producing app to handle the
>> transaction of a key at one DC unless at point of critical failure where we
>> have to flip processing to another. We care about key ordering.
>>> I disagree on the offset comment for the partition solution unless you
>> do full ISR, or expensive full XA transactions even with partitions you
>> cannot fully guarantee offsets would match.
>>>
>>> 105) Very much so, I need to have access at the platform level to the
>> other meta data all mentioned, without having to need to have access to the
>> encryption keys of the payload.
>>>
>>> 106)
>>> Technically yes for AZ/Region/Cluster, but then we’d need to have a
>> global producerId register which would be very hard to enforce/ensure is
>> current and correct, just to understand the message origins of its
>> region/az/cluster for routing.
>>> The client wrapper version, producerId can be the same, as obviously the
>> producer could upgrade its wrapper, as such we need to know what wrapper
>> version the message is created with.
>>> Likewise the IP address, as stated we can have our producer move, where
>> its IP would change.
>>>
>>> 107)
>>> UUID is set on the message by interceptors before actual producer
>> transport send. This is for platform level message dedupe guarantee, the
>> business payload should be agnostic to this. Please see
>> https://activemq.apache.org/artemis/docs/1.5.0/duplicate-detection.html
>> note this is not touching business payloads.
>>>
>>>
>>>
>>> On 06/12/2016, 18:22, "Jun Rao" <[email protected]> wrote:
>>>
>>>     Hi, Michael,
>>>
>>>     Thanks for the reply. I find it very helpful.
>>>
>>>     Data lineage:
>>>     100. I'd like to understand the APM use case a bit more. It sounds
>> like
>>>     that those APM plugins can generate a transaction id that we could
>>>     potentially put in the header of every message. How would you
>> typically
>>>     make use of such transaction ids? Are there other metadata
>> associated with
>>>     the transaction id and if so, how are they propagated downstream?
>>>
>>>     101. For the finance use case, if the concept of transaction is
>> important,
>>>     wouldn't it be typically included in the message payload instead of
>> as an
>>>     optional header field?
>>>
>>>     102. The data lineage that Atlas and Navigator support seems to be
>> at the
>>>     dataset level, not per record level? So, not sure if per message
>> headers
>>>     are relevant there.
>>>
>>>     Mirroring:
>>>     103. The benefit of using separate partitions is that it potentially
>> makes
>>>     it easy to preserve offsets during mirroring. This will make it
>> easier for
>>>     consumer to switch clusters. Currently, the consumers can switch
>> clusters
>>>     by using the timestampToOffset() api, but it has to deal with
>> duplicates.
>>>     Good point on the issue with log compaction and I am not sure how to
>> address
>>>     this. However, even if we mirror into the existing partitions, the
>> ordering
>>>     for messages generated from different clusters seems
>> non-deterministic
>>>     anyway. So, it seems that the consumers already have to deal with
>> that? If
>>>     a topic is compacted, does that mean which messages are preserved is
>> also
>>>     non-deterministic across clusters?
>>>
>>>     104. Good point on partition key.
>>>
>>>     End-to-end encryption:
>>>     105. So, it seems end-to-end encryption is useful. Are headers
>> useful there?
>>>
>>>     Auditing:
>>>     106. It seems other than the UUID, all other metadata are per
>> producer?
>>>
>>>     EOS:
>>>     107. How are those UUIDs generated? I am not sure if they can be
>> generated
>>>     in the producer library. An application may send messages through a
>> load
>>>     balancer and on retry, the same message could be routed to a
>> different
>>>     producer instance. So, it seems that the application has to generate
>> the
>>>     UUIDs. In that case, shouldn't the application just put the UUID in
>> the
>>>     payload?
>>>
>>>     Thanks,
>>>
>>>     Jun
>>>
>>>
>>>     On Fri, Dec 2, 2016 at 4:57 PM, Michael Pearce <
>> [email protected]>
>>>     wrote:
>>>
>>>     > Hi Jun.
>>>     >
>>>     > Per Transaction Tracing / Data Lineage.
>>>     >
>>>     > As Stated in the KIP this has the first use case of how many APM
>> tools now
>>>     > work.
>>>     > I would find it impossible for any one to argue this is not
>> important or a
>>>     > niche market as it has its own gartner report for this space. Such
>>>     > companies as Appdynamics, NewRelic, Dynatrace, Hawkular are but a
>> few.
>>>     >
>>>     > Likewise these APM tools can help very rapidly track down issues
>> and
>>>     > automatically capture metrics, perform actions based on unexpected
>> behavior
>>>     > to auto recover services.
>>>     >
>>>     > Before mentioning looking at aggregated stats, in these cases where
>>>     > actually on critical flows we cannot afford to have aggregated
>> rolled up
>>>     > stats only.
>>>     >
>>>     > With the APM tool we use it's actually able to detect a single
>> transaction
>>>     > failure and capture the thread traces in the JVM where it failed
>> and
>>>     > everything for us, to the point it sends us alerts where we have
>> this
>>>     > giving the line number of the code that caused it, the transaction
>> trace
>>>     > through all the services and endpoints (supported) up to the point
>> of
>>>     > failure, it can also capture the data in and out (so we can
>> replay).
>>>     > Because atm Kafka doesn’t support us being able to stitch in these
>> tracing
>>>     > transaction ids natively, we cannot get these benefits as such is
>> limiting
>>>     > our ability to support apps and monitor them to the same standards we
>> come to
>>>     > expect when on a kafka flow.
>>>     >
>>>     > This actually ties in with Data Lineage, as the same tracing can
>> be used
>>>     > to back stitch this. Essentially many times due to the sums of money
>>>     > involved there are disputes, and typically as a financial
>> institute the
>>>     > easiest and cleanest way to prove when disputes arise is to
>> present the
>>>     > actual flow and processes involved in a transaction.
>>>     >
>>>     > Likewise as Hadoop matures it's evident this case is important, as
>> tools
>>>     > such as Atlas (Hortonworks led) and Navigator (cloudera led) are
>> evident
>>>     > also I believe the importance here is very much NOT just a
>> financial issue.
>>>     >
>>>     > From a MDM point of view any company wanting to care about Data
>> Quality
>>>     > and Data Governance - Data Lineage is a key piece in this puzzle.
>>>     >
>>>     >
>>>     >
>>>     > RE Mirroring,
>>>     >
>>>     > As per the KIP in-fact this is exactly what we do re cluster id,
>> to mirror
>>>     > a network of clusters between AZ’s / Regions. We know a
>> transaction for a
>>>     > key will be done within a  AZ/Region, as such we know the write to
>> kafka
>>>     > would be ordered per key. But we need eventual view of that across
>> in our
>>>     > other regions/az’s. When we have complete AZ or Region failure we
>> know
>>>     > there will be a brief interruption whilst those transactions are
>> moved to
>>>     > another region but we expect after it to continue.
>>>     >
>>>     > As mentioned having separate Partitions to do this starts to get
>>>     > ugly/complicated for us:
>>>     > how would I do compaction where a key is in two partitions?
>>>     > How do we balance consumers so where multiple partitions with the
>> same key
>>>     > go to the same consumer
>>>     > What do you do if cluster 1 has 5 partitions but cluster 20 has 10
>> because
>>>     > it's larger kit in our more core DC’s, as such key to partition
>> mappings for
>>>     > consumers get even more complicated.
>>>     > What do you do if we add or remove a complete region
>>>     >
>>>     > Where as simple mirror will work we just need to ensure we don’t
>> have a
>>>     > cycle which we can do with clusterId.
>>>     >
>>>     > We even have started to look at shortest path mirror routing based
>> on
>>>     > clusterId, if we also had the region and az info on the originating
>>>     > message, this we have not implemented but some ideas come from
>> network
>>>     > routing, and also the dispatcher router in apache qpid.
>>>     >
>>>     > Also we need to have data perimeters e.g. certain data cannot leave
>>>     > certain countries borders. We want this all automated so that at
>> the
>>>     > platform level without having to touch or look at the business
>> data inside
>>>     > we can have headers we can put tags into so that we can ensure
>> this doesn’t
>>>     > occur when we mirror. (actually links in to data lineage / tracing
>> as again
>>>     > we need to tag messages at a platform level) Examples are we are
>> not
>>>     > allowed Private customer details to leave Switzerland, yet we need
>> those
>>>     > systems integrated.
>>>     >
>>>     > Lastly around mirroring we have a partionKey field, as the key
>> used for
>>>     > partitioning logic != compaction key all the time but we want to
>> preserve it
>>>     > for when we mirror so that if source cluster partition count !=
>> destination
>>>     > cluster partition count we can honour the same partitioning logic.
>>>     >
>>>     >
>>>     >
>>>     > RE End 2 End encryption
>>>     >
>>>     > As I believe mentioned just before, the solution you mention just
>> doesn’t
>>>     > cut the mustard these days with many regulators. An operations
>> person with
>>>     > access to the box should not be able to have access to the data.
>> Many now
>>>     > actually impose quite literally the implementation expected being
>> end2end
>>>     > encryption for certain data (Singapore for us is one that I am
>> most aware
>>>     > of). In fact we’re even now needing to encrypt the data and store the
>> keys in
>>>     > HSM modules.
>>>     >
>>>     > Likewise the performance penalty on encrypting decrypting as you
>> produce
>>>     > over wire, then again encrypt decrypt as the data is stored on the
>> brokers
>>>     > disks and back again, then again encrypted and decrypted back over
>> the wire
>>>     > each time for each consumer all adds up, ignoring this doubling
>> with mirror
>>>     > makers etc. simply encrypting the value once on write by the
>> client and
>>>     > again decrypting on consume by the consumer is far more
>> performant, but
>>>     > then the routing and platform meta data needs to be separate (thus
>> headers)
>>>     >
>>>     >
>>>     >
>>>     > RE Auditing:
>>>     >
>>>     > Our Auditing needs are:
>>>     > Producer Id,
>>>     > Origin Cluster Id that message first produced into
>>>     > Origin AZ – agreed we can derive this if we have cluster id, but
>> it makes
>>>     > resolving this for audit reporting a lot easier.
>>>     > Origin Region – agreed we can derive this if we have cluster id,
>> but it
>>>     > makes resolving this for audit reporting a lot easier.
>>>     > Unique Message Identification (this is not the same as transaction
>>>     > tracing) – note offset and partition are not the same, as when we
>> mirror or
>>>     > have for what ever system failure duplicate send,
>>>     > Custom Client wrapper version (where organizations have to wrap
>> the kafka
>>>     > client for added features) so we know what version of the wrapper
>> is used
>>>     > Producer IP address (in case of clients being in our vm/open stack
>> infra
>>>     > where they can move around, producer id will stay the same but
>> this would
>>>     > change)
>>>     >
>>>     >
>>>     >
>>>     > RE Once and only once delivery case
>>>     >
>>>     > Using the same Message UUID for auditing we can achieve this quite
>> simply.
>>>     >
>>>     > As per how some other brokers do this (cough qpid, artemis)
>> message uuid
>>>     > are used to dedupe where message is sent and produced but the
>> client didn’t
>>>     > receive the ack, and therefore replays the send, by having a
>> unique message
>>>     > id per message, this can be filtered out, on consumers where
>> message
>>>     > delivery may occur twice for what ever reasons a message uuid can
>> be used
>>>     > to remove duplicates being delivered, likewise we can do this in
>> the
>>>     > mirrormakers so if we detect a dupe message we can avoid
>> replicating it.
>>>     >
>>>     >
>>>     >
>>>     >
>>>     > Cheers
>>>     > Mike
>>>     >
>>>     >
>>>     >
>>>     > On 02/12/2016, 22:09, "Jun Rao" <[email protected]> wrote:
>>>     >
>>>     >     Since this KIP affects message format, wire protocol, apis, I
>> think
>>>     > it's
>>>     >     worth spending a bit more time to nail down the concrete use
>> cases. It
>>>     >     would be bad if we add this feature, but when we start
>> implementing it
>>>     > for say
>>>     >     mirroring, we then realize that header is not the best
>> approach.
>>>     > Initially,
>>>     >     I thought I was convinced of the use cases of headers and was
>> trying to
>>>     >     write down a few use cases to convince others. That's when I
>> became
>>>     > less
>>>     >     certain. For me to be convinced, I just want to see two strong
>> use
>>>     > cases
>>>     >     (instead of 10 maybe use cases) in the third-party space. The
>> reason is
>>>     >     that when we discussed the use cases within a company, often
>> it ends
>>>     > with
>>>     >     "we can't force everyone to use this standard since we may
>> have to
>>>     >     integrate with third-party tools".
>>>     >
>>>     >     At present, I am not sure why headers are useful for things
>> like
>>>     > schemaId
>>>     >     or encryption. In order to do anything useful to the value,
>> one needs
>>>     > to
>>>     >     know the schemaId or how data is encrypted, but header is
>> optional.
>>>     > But, I
>>>     >     can be convinced if someone (Radai, Sean, Todd?) provides more
>> details
>>>     > on
>>>     >     the argument.
>>>     >
>>>     >     I am not very sure header is the best approach for mirroring
>> either. If
>>>     >     someone has thought about this more, I'd be happy to hear.
>>>     >
>>>     >     I can see the data lineage use case. I am just not sure how
>> widely
>>>     >     applicable this is. If someone familiar with this space can
>> justify
>>>     > this is
>>>     >     a significant use case, say in the finance industry, this
>> would be a
>>>     > strong
>>>     >     use case.
>>>     >
>>>     >     I can see the auditing use case. I am just not sure if a native
>>>     > producer id
>>>     >     solves that problem. If there are additional metadata that's
>> worth
>>>     >     collecting but not covered by the producer id, that would make
>> this a
>>>     >     strong use case.
>>>     >
>>>     >     Thanks,
>>>     >
>>>     >     Jun
>>>     >
>>>     >
>>>     >     On Fri, Dec 2, 2016 at 1:41 PM, radai <
>> [email protected]>
>>>     > wrote:
>>>     >
>>>     >     > this KIP is about enabling headers, nothing more nothing
>> less - so
>>>     > no,
>>>     >     > broker-side use of headers is not in the KIP scope.
>>>     >     >
>>>     >     > obviously though, once you have headers potential use cases
>> could
>>>     > include
>>>     >     > broker-side header-aware interceptors (which would be the
>> topic of
>>>     > other
>>>     >     > future KIPs).
>>>     >     >
>>>     >     > a trivially clear use case (to me) would be using such
>> broker-side
>>>     >     > interceptors to enforce compliance with organizational
>> policies - it
>>>     > would
>>>     >     > make our SREs lives much easier if instead of retroactively
>>>     > discovering
>>>     >     > "rogue" topics/users those messages would have been rejected
>>>     > up-front.
>>>     >     >
>>>     >     > the kafka broker code is lacking any such extensibility
>> support
>>>     > (beyond
>>>     >     > maybe authorizer) which is why these use cases were left out
>> of the
>>>     > "case
>>>     >     > for headers" doc - broker extensibility is a separate
>> discussion.
>>>     >     >
>>>     >     > On Fri, Dec 2, 2016 at 12:59 PM, Gwen Shapira <
>> [email protected]>
>>>     > wrote:
>>>     >     >
>>>     >     > > Woah, I wasn't aware this is something we'll do. It wasn't
>> in the
>>>     > KIP,
>>>     >     > > right?
>>>     >     > >
>>>     >     > > I guess we could do it the same way ACLs currently work.
>>>     >     > > I had in mind something that will allow admins to apply
>> rules to
>>>     > the
>>>     >     > > new create/delete/config topic APIs. So Todd can decide to
>> reject
>>>     >     > > "create topic" requests that ask for more than 40
>> partitions, or
>>>     >     > > require exactly 3 replicas, or no more than 50GB partition
>> size,
>>>     > etc.
>>>     >     > >
>>>     >     > > ACLs were added a bit ad-hoc, if we are planning to apply
>> more
>>>     > rules
>>>     >     > > to requests (and I think we should), we may want a bit
>> more generic
>>>     >     > > design around that.
>>>     >     > >
>>>     >     > > On Fri, Dec 2, 2016 at 7:16 AM, radai <
>> [email protected]>
>>>     >     > wrote:
>>>     >     > > > "wouldn't you be in the business of making sure everyone
>> uses
>>>     > them
>>>     >     > > > properly?"
>>>     >     > > >
>>>     >     > > > that's where a broker-side plugin would come in handy - any
>> incoming
>>>     >     > message
>>>     >     > > > that does not conform to org policy (read - does not
>> have the
>>>     > proper
>>>     >     > > > headers) gets thrown out (with an error returned to user)
>>>     >     > > >
>>>     >     > > > On Thu, Dec 1, 2016 at 8:44 PM, Todd Palino <
>> [email protected]>
>>>     > wrote:
>>>     >     > > >
>>>     >     > > >> Come on, I’ve done at least 2 talks on this one :)
>>>     >     > > >>
>>>     >     > > >> Producing counts to a topic is part of it, but that’s
>> only
>>>     > part. So
>>>     >     > you
>>>     >     > > >> count you have 100 messages in topic A. When you mirror
>> topic A
>>>     > to
>>>     >     > > another
>>>     >     > > >> cluster, you have 99 messages. Where was your problem?
>> Or
>>>     > worse, you
>>>     >     > > have
>>>     >     > > >> 100 messages, but one producer duplicated messages and
>> another
>>>     > one
>>>     >     > lost
>>>     >     > > >> messages. You need details about where the message came
>> from in
>>>     > order
>>>     >     > to
>>>     >     > > >> pinpoint problems when they happen. Source producer
>> info, where
>>>     > it was
>>>     >     > > >> produced into your infrastructure, and when it was
>> produced.
>>>     > This
>>>     >     > > requires
>>>     >     > > >> you to add the information to the message.
>>>     >     > > >>
>>>     >     > > >> And yes, you still need to maintain your clients. So
>> maybe my
>>>     > original
>>>     >     > > >> example was not the best. My thoughts on not wanting to
>> be
>>>     > responsible
>>>     >     > > for
>>>     >     > > >> message formats stands, because that’s very much
>> separate from
>>>     > the
>>>     >     > > client.
>>>     >     > > >> As you know, we have our own internal client library
>> that can
>>>     > insert
>>>     >     > the
>>>     >     > > >> right headers, and right now inserts the right audit
>>>     > information into
>>>     >     > > the
>>>     >     > > >> message fields. If they exist, and assuming the message
>> is Avro
>>>     >     > encoded.
>>>     >     > > >> What if someone wants to use JSON instead for a good
>> reason?
>>>     > What if
>>>     >     > > user X
>>>     >     > > >> wants to encrypt messages, but user Y does not?
>> Maintaining the
>>>     > client
>>>     >     > > >> library is still much easier than maintaining the
>> message
>>>     > formats.
>>>     >     > > >>
>>>     >     > > >> -Todd
>>>     >     > > >>
>>>     >     > > >>
>>>     >     > > >> On Thu, Dec 1, 2016 at 6:21 PM, Gwen Shapira <
>> [email protected]
>>>     > >
>>>     >     > wrote:
>>>     >     > > >>
>>>     >     > > >> > Based on your last sentence, consider me convinced :)
>>>     >     > > >> >
>>>     >     > > >> > I get why headers are critical for Mirroring (you
>> need tags to
>>>     >     > prevent
>>>     >     > > >> > loops and sometimes to route messages to the correct
>>>     > destination).
>>>     >     > > >> > But why do you need headers to audit? We are auditing
>> by
>>>     > producing
>>>     >     > > >> > counts to a side topic (and I was under the
>> impression you do
>>>     > the
>>>     >     > > >> > same), so we never need to modify the message.
>>>     >     > > >> >
>>>     >     > > >> > Another thing - after we added headers, wouldn't you
>> be in the
>>>     >     > > >> > business of making sure everyone uses them properly?
>> Making
>>>     > sure
>>>     >     > > >> > everyone includes the right headers you need, not
>> using the
>>>     > header
>>>     >     > > >> > names you intend to use, etc. I don't think the
>> "policing"
>>>     > business
>>>     >     > > >> > will ever go away.
>>>     >     > > >> >
>>>     >     > > >> > On Thu, Dec 1, 2016 at 5:25 PM, Todd Palino <
>>>     > [email protected]>
>>>     >     > > wrote:
>>>     >     > > >> > > Got it. As an ops guy, I'm not very happy with the
>>>     > workaround.
>>>     >     > Avro
>>>     >     > > >> means
>>>     >     > > >> > > that I have to be concerned with the format of the
>> messages
>>>     > in
>>>     >     > > order to
>>>     >     > > >> > run
>>>     >     > > >> > > the infrastructure (audit, mirroring, etc.). That
>> means
>>>     > that I
>>>     >     > have
>>>     >     > > to
>>>     >     > > >> > > handle the schemas, and I have to enforce rules
>> about good
>>>     >     > formats.
>>>     >     > > >> This
>>>     >     > > >> > is
>>>     >     > > >> > > not something I want to be in the business of,
>> because I
>>>     > should be
>>>     >     > > able
>>>     >     > > >> > to
>>>     >     > > >> > > run a service infrastructure without needing to be
>> in the
>>>     > weeds of
>>>     >     > > >> > dealing
>>>     >     > > >> > > with customer data formats.
>>>     >     > > >> > >
>>>     >     > > >> > > Trust me, a sizable portion of my support time is
>> spent
>>>     > dealing
>>>     >     > with
>>>     >     > > >> > schema
>>>     >     > > >> > > issues. I really would like to get away from that.
>> Maybe
>>>     > I'd have
>>>     >     > > more
>>>     >     > > >> > time
>>>     >     > > >> > > for other hobbies. Like writing. ;)
>>>     >     > > >> > >
>>>     >     > > >> > > -Todd
>>>     >     > > >> > >
>>>     >     > > >> > > On Thu, Dec 1, 2016 at 4:04 PM Gwen Shapira <
>>>     > [email protected]>
>>>     >     > > wrote:
>>>     >     > > >> > >
>>>     >     > > >> > >> I'm pretty satisfied with the current workarounds
>> (Avro
>>>     > container
>>>     >     > > >> > >> format), so I'm not too excited about the extra
>> work
>>>     > required to
>>>     >     > do
>>>     >     > > >> > >> headers in Kafka. I absolutely don't mind it if
>> you do
>>>     > it...
>>>     >     > > >> > >> I think the Apache convention for "good idea, but
>> not
>>>     > willing to
>>>     >     > > put
>>>     >     > > >> > >> any work toward it" is +0.5? anyway, that's what I
>> was
>>>     > trying to
>>>     >     > > >> > >> convey :)
>>>     >     > > >> > >>
>>>     >     > > >> > >> On Thu, Dec 1, 2016 at 3:05 PM, Todd Palino <
>>>     > [email protected]>
>>>     >     > > >> wrote:
>>>     >     > > >> > >> > Well I guess my question for you, then, is what
>> is
>>>     > holding you
>>>     >     > > back
>>>     >     > > >> > from
>>>     >     > > >> > >> > full support for headers? What’s the bit that
>> you’re
>>>     > missing
>>>     >     > that
>>>     >     > > >> has
>>>     >     > > >> > you
>>>     >     > > >> > >> > under a full +1?
>>>     >     > > >> > >> >
>>>     >     > > >> > >> > -Todd
>>>     >     > > >> > >> >
>>>     >     > > >> > >> >
>>>     >     > > >> > >> > On Thu, Dec 1, 2016 at 1:59 PM, Gwen Shapira <
>>>     >     > [email protected]>
>>>     >     > > >> > wrote:
>>>     >     > > >> > >> >
>>>     >     > > >> > >> >> I know why people who support headers support
>> them, and
>>>     > I've
>>>     >     > > seen
>>>     >     > > >> > what
>>>     >     > > >> > >> >> the discussion is like.
>>>     >     > > >> > >> >>
>>>     >     > > >> > >> >> This is why I'm asking people who are against
>> headers
>>>     >     > > (especially
>>>     >     > > >> > >> >> committers) what will make them change their
>> mind - so
>>>     > we can
>>>     >     > > get
>>>     >     > > >> > this
>>>     >     > > >> > >> >> part over one way or another.
>>>     >     > > >> > >> >>
>>>     >     > > >> > >> >> If I sound frustrated it is not at Radai, Jun
>> or you
>>>     > (Todd)...
>>>     >     > > I am
>>>     >     > > >> > >> >> just looking for something concrete we can do
>> to move
>>>     > the
>>>     >     > > >> discussion
>>>     >     > > >> > >> >> along to the yummy design details (which is the
>>>     > argument I
>>>     >     > > really
>>>     >     > > >> am
>>>     >     > > >> > >> >> looking forward to).
>>>     >     > > >> > >> >>
>>>     >     > > >> > >> >> On Thu, Dec 1, 2016 at 1:53 PM, Todd Palino <
>>>     >     > [email protected]>
>>>     >     > > >> > wrote:
>>>     >     > > >> > >> >> > So, Gwen, to your question (even though I’m
>> not a
>>>     >     > > committer)...
>>>     >     > > >> > >> >> >
>>>     >     > > >> > >> >> > I have always been a strong supporter of
>> introducing
>>>     > the
>>>     >     > > concept
>>>     >     > > >> > of an
>>>     >     > > >> > >> >> > envelope to messages, which headers
>> accomplishes. The
>>>     >     > message
>>>     >     > > key
>>>     >     > > >> > is
>>>     >     > > >> > >> >> > already an example of a piece of envelope
>>>     > information. By
>>>     >     > > >> > providing a
>>>     >     > > >> > >> >> means
>>>     >     > > >> > >> >> > to do this within Kafka itself, and not
>> relying on
>>>     > use-case
>>>     >     > > >> > specific
>>>     >     > > >> > >> >> > implementations, you make it much easier for
>>>     > components to
>>>     >     > > >> > >> interoperate.
>>>     >     > > >> > >> >> It
>>>     >     > > >> > >> >> > simplifies development of all these things
>> (message
>>>     > routing,
>>>     >     > > >> > auditing,
>>>     >     > > >> > >> >> > encryption, etc.) because each one does not
>> have to
>>>     > reinvent
>>>     >     > > the
>>>     >     > > >> > >> wheel.
>>>     >     > > >> > >> >> >
>>>     >     > > >> > >> >> > It also makes it much easier from a client
>> point of
>>>     > view if
>>>     >     > > the
>>>     >     > > >> > >> headers
>>>     >     > > >> > >> >> are
>>>     >     > > >> > >> >> > defined as part of the protocol and/or
>> message format
>>>     > in
>>>     >     > > general
>>>     >     > > >> > >> because
>>>     >     > > >> > >> >> > you can easily produce and consume messages
>> without
>>>     > having
>>>     >     > to
>>>     >     > > >> take
>>>     >     > > >> > >> into
>>>     >     > > >> > >> >> > account specific cases. For example, I want
>> to route
>>>     >     > messages,
>>>     >     > > >> but
>>>     >     > > >> > >> >> client A
>>>     >     > > >> > >> >> > doesn’t support the way audit implemented
>> headers, and
>>>     >     > client
>>>     >     > > B
>>>     >     > > >> > >> doesn’t
>>>     >     > > >> > >> >> > support the way encryption or routing
>> implemented
>>>     > headers,
>>>     >     > so
>>>     >     > > now
>>>     >     > > >> > my
>>>     >     > > >> > >> >> > application has to create some really fragile
>> (my
>>>     >     > autocorrect
>>>     >     > > >> just
>>>     >     > > >> > >> tried
>>>     >     > > >> > >> >> to
>>>     >     > > >> > >> >> > make that “tragic”, which is probably
>> appropriate
>>>     > too) code
>>>     >     > to
>>>     >     > > >> > strip
>>>     >     > > >> > >> >> > everything off, rather than just consuming the
>>>     > messages,
>>>     >     > > picking
>>>     >     > > >> > out
>>>     >     > > >> > >> the
>>>     >     > > >> > >> >> 1
>>>     >     > > >> > >> >> > or 2 headers it’s interested in, and
>> performing its
>>>     >     > function.
>>>     >     > > >> > >> >> >
>>>     >     > > >> > >> >> > Honestly, this discussion has been going on
>> for a
>>>     > long time,
>>>     >     > > and
>>>     >     > > >> > it’s
>>>     >     > > >> > >> >> > always “Oh, you came up with 2 use cases, and
>> yeah,
>>>     > those
>>>     >     > use
>>>     >     > > >> cases
>>>     >     > > >> > >> are
>>>     >     > > >> > >> >> > real things that someone would want to do.
>> Here’s an
>>>     >     > alternate
>>>     >     > > >> way
>>>     >     > > >> > to
>>>     >     > > >> > >> >> > implement them so let’s not do headers.” If
>> we have a
>>>     > few
>>>     >     > use
>>>     >     > > >> cases
>>>     >     > > >> > >> that
>>>     >     > > >> > >> >> we
>>>     >     > > >> > >> >> > actually came up with, you can be sure that
>> over the
>>>     > next
>>>     >     > year
>>>     >     > > >> > >> there’s a
>>>     >     > > >> > >> >> > dozen others that we didn’t think of that
>> someone
>>>     > would like
>>>     >     > > to
>>>     >     > > >> > do. I
>>>     >     > > >> > >> >> > really think it’s time to stop rehashing this
>>>     > discussion and
>>>     >     > > >> > instead
>>>     >     > > >> > >> >> focus
>>>     >     > > >> > >> >> > on a workable standard that we can adopt.
>>>     >     > > >> > >> >> >
>>>     >     > > >> > >> >> > -Todd
>>>     >     > > >> > >> >> >
>>>     >     > > >> > >> >> >
>>>     >     > > >> > >> >> > On Thu, Dec 1, 2016 at 1:39 PM, Todd Palino <
>>>     >     > > [email protected]>
>>>     >     > > >> > >> wrote:
>>>     >     > > >> > >> >> >
>>>     >     > > >> > >> >> >> C. per message encryption
>>>     >     > > >> > >> >> >>> One drawback of this approach is that this
>>>     > significantly
>>>     >     > > reduces
>>>     >     > > >> > the
>>>     >     > > >> > >> >> >>> effectiveness of compression, which happens
>> on a
>>>     > set of
>>>     >     > > >> > serialized
>>>     >     > > >> > >> >> >>> messages. An alternative is to enable SSL
>> for wire
>>>     >     > > encryption
>>>     >     > > >> and
>>>     >     > > >> > >> rely
>>>     >     > > >> > >> >> on
>>>     >     > > >> > >> >> >>> the storage system (e.g. LUKS) for at rest
>>>     > encryption.
>>>     >     > > >> > >> >> >>
>>>     >     > > >> > >> >> >>
>>>     >     > > >> > >> >> >> Jun, this is not sufficient. While this does
>> cover
>>>     > the case
>>>     >     > > of
>>>     >     > > >> > >> removing
>>>     >     > > >> > >> >> a
>>>     >     > > >> > >> >> >> drive from the system, it will not satisfy
>> most
>>>     > compliance
>>>     >     > > >> > >> requirements
>>>     >     > > >> > >> >> for
>>>     >     > > >> > >> >> >> encryption of data as whoever has access to
>> the
>>>     > broker
>>>     >     > itself
>>>     >     > > >> > still
>>>     >     > > >> > >> has
>>>     >     > > >> > >> >> >> access to the unencrypted data. For
>> end-to-end
>>>     > encryption
>>>     >     > you
>>>     >     > > >> > need to
>>>     >     > > >> > >> >> >> encrypt at the producer, before it enters the
>>>     > system, and
>>>     >     > > >> decrypt
>>>     >     > > >> > at
>>>     >     > > >> > >> the
>>>     >     > > >> > >> >> >> consumer, after it exits the system.
>>>     >     > > >> > >> >> >>
>>>     >     > > >> > >> >> >> -Todd
>>>     >     > > >> > >> >> >>
>>>     >     > > >> > >> >> >>
>>>     >     > > >> > >> >> >> On Thu, Dec 1, 2016 at 1:03 PM, radai <
>>>     >     > > >> [email protected]
>>>     >     > > >> > >
>>>     >     > > >> > >> >> wrote:
>>>     >     > > >> > >> >> >>
>>>     >     > > >> > >> >> >>> another big plus of headers in the protocol
>> is that
>>>     > it
>>>     >     > would
>>>     >     > > >> > enable
>>>     >     > > >> > >> >> rapid
>>>     >     > > >> > >> >> >>> iteration on ideas outside of core kafka
>> and would
>>>     > reduce
>>>     >     > > the
>>>     >     > > >> > >> number of
>>>     >     > > >> > >> >> >>> future wire format changes required.
>>>     >     > > >> > >> >> >>>
>>>     >     > > >> > >> >> >>> a lot of what is currently a KIP represents
>> use
>>>     > cases that
>>>     >     > > are
>>>     >     > > >> > not
>>>     >     > > >> > >> 100%
>>>     >     > > >> > >> >> >>> relevant to all users, and some of them
>> require
>>>     > rather
>>>     >     > > invasive
>>>     >     > > >> > wire
>>>     >     > > >> > >> >> >>> protocol changes. i think a good recent
>> example of
>>>     > this is
>>>     >     > > >> > kip-98.
>>>     >     > > >> > >> >> >>> tx-utilizing traffic is expected to be a
>> very small
>>>     >     > > fraction of
>>>     >     > > >> > >> total
>>>     >     > > >> > >> >> >>> traffic and yet the changes are invasive.
>>>     >     > > >> > >> >> >>>
>>>     >     > > >> > >> >> >>> every such wire format change translates
>> into
>>>     > painful and
>>>     >     > > slow
>>>     >     > > >> > >> >> adoption of
>>>     >     > > >> > >> >> >>> new versions.
>>>     >     > > >> > >> >> >>>
>>>     >     > > >> > >> >> >>> i think a lot of functionality currently in
>> KIPs
>>>     > could be
>>>     >     > > "spun
>>>     >     > > >> > out"
>>>     >     > > >> > >> >> and
>>>     >     > > >> > >> >> >>> implemented as opt-in plugins transmitting
>> data over
>>>     >     > > headers.
>>>     >     > > >> > this
>>>     >     > > >> > >> >> would
>>>     >     > > >> > >> >> >>> keep the core wire format stable(r), core
>> codebase
>>>     >     > smaller,
>>>     >     > > and
>>>     >     > > >> > >> avoid
>>>     >     > > >> > >> >> the
>>>     >     > > >> > >> >> >>> "burden of proof" thats sometimes required
>> to prove
>>>     > a
>>>     >     > > certain
>>>     >     > > >> > >> feature
>>>     >     > > >> > >> >> is
>>>     >     > > >> > >> >> >>> useful enough for a wide-enough audience to
>> warrant
>>>     > a wire
>>>     >     > > >> format
>>>     >     > > >> > >> >> change
>>>     >     > > >> > >> >> >>> and code complexity additions.
>>>     >     > > >> > >> >> >>>
>>>     >     > > >> > >> >> >>> (to be clear - kip-98 goes beyond "mere"
>> wire format
>>>     >     > changes
>>>     >     > > >> and
>>>     >     > > >> > im
>>>     >     > > >> > >> not
>>>     >     > > >> > >> >> >>> saying it could have been completely done
>> with
>>>     > headers,
>>>     >     > but
>>>     >     > > >> > >> >> exactly-once
>>>     >     > > >> > >> >> >>> delivery certainly could)
>>>     >     > > >> > >> >> >>>
>>>     >     > > >> > >> >> >>> On Thu, Dec 1, 2016 at 11:20 AM, Gwen
>> Shapira <
>>>     >     > > >> [email protected]
>>>     >     > > >> > >
>>>     >     > > >> > >> >> wrote:
>>>     >     > > >> > >> >> >>>
>>>     >     > > >> > >> >> >>> > On Thu, Dec 1, 2016 at 10:24 AM, radai <
>>>     >     > > >> > >> [email protected]>
>>>     >     > > >> > >> >> >>> wrote:
>>>     >     > > >> > >> >> >>> > > "For use cases within an organization,
>> one could
>>>     >     > always
>>>     >     > > use
>>>     >     > > >> > >> other
>>>     >     > > >> > >> >> >>> > > approaches such as company-wise
>> containers"
>>>     >     > > >> > >> >> >>> > > this is what linkedin has traditionally
>> done
>>>     > but there
>>>     >     > > are
>>>     >     > > >> > now
>>>     >     > > >> > >> >> cases
>>>     >     > > >> > >> >> >>> > (read
>>>     >     > > >> > >> >> >>> > > - topics) where this is not acceptable.
>> this
>>>     > makes
>>>     >     > > headers
>>>     >     > > >> > >> useful
>>>     >     > > >> > >> >> even
>>>     >     > > >> > >> >> >>> > > within single orgs for cases where
>>>     >     > > one-container-fits-all
>>>     >     > > >> > cannot
>>>     >     > > >> > >> >> >>> apply.
>>>     >     > > >> > >> >> >>> > >
>>>     >     > > >> > >> >> >>> > > as for the particular use cases listed,
>> i dont
>>>     > want
>>>     >     > > this to
>>>     >     > > >> > >> devolve
>>>     >     > > >> > >> >> >>> to a
>>>     >     > > >> > >> >> >>> > > discussion of particular use cases - i
>> think its
>>>     >     > enough
>>>     >     > > >> that
>>>     >     > > >> > >> some
>>>     >     > > >> > >> >> of
>>>     >     > > >> > >> >> >>> them
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > I think a main point of contention is
>> that: We
>>>     >     > identified
>>>     >     > > a few
>>>     >     > > >> > >> >> >>> > use-cases where headers are useful, do we
>> want
>>>     > Kafka to
>>>     >     > > be a
>>>     >     > > >> > >> system
>>>     >     > > >> > >> >> >>> > that supports those use-cases?
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > For example, Jun said:
>>>     >     > > >> > >> >> >>> > "Not sure how widely useful record-level
>> lineage
>>>     > is
>>>     >     > though
>>>     >     > > >> > since
>>>     >     > > >> > >> the
>>>     >     > > >> > >> >> >>> > overhead could
>>>     >     > > >> > >> >> >>> > be significant."
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > We know NiFi supports record level
>> lineage. I
>>>     > don't
>>>     >     > think
>>>     >     > > it
>>>     >     > > >> > was
>>>     >     > > >> > >> >> >>> > developed for lols, I think it is safe to
>> assume
>>>     > that
>>>     >     > the
>>>     >     > > NSA
>>>     >     > > >> > >> needed
>>>     >     > > >> > >> >> >>> > that functionality. We also know that
>> certain
>>>     > financial
>>>     >     > > >> > institutes
>>>     >     > > >> > >> >> >>> > need to track tampering with records at a
>> record
>>>     > level
>>>     >     > and
>>>     >     > > >> > there
>>>     >     > > >> > >> are
>>>     >     > > >> > >> >> >>> > federal regulations that absolutely
>> require
>>>     > this.  They
>>>     >     > > also
>>>     >     > > >> > need
>>>     >     > > >> > >> to
>>>     >     > > >> > >> >> >>> > prove that routing apps that "touch" the
>>>     > messages and
>>>     >     > > >> either
>>>     >     > > >> > >> reads
>>>     >     > > >> > >> >> >>> > or updates headers couldn't have possibly
>>>     > modified the
>>>     >     > > >> payload
>>>     >     > > >> > >> >> itself.
>>>     >     > > >> > >> >> >>> > They use record level encryption to do
>> that -
>>>     > apps can
>>>     >     > > read
>>>     >     > > >> and
>>>     >     > > >> > >> >> >>> > (sometimes) modify headers but can't
>> touch the
>>>     > payload.
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > We can totally say "those are corner
>> cases and
>>>     > not worth
>>>     >     > > >> adding
>>>     >     > > >> > >> >> >>> > headers to Kafka for", they should use a
>> different
>>>     >     > pubsub
>>>     >     > > >> > message
>>>     >     > > >> > >> for
>>>     >     > > >> > >> >> >>> > that (Nifi or one of the other 1000 that
>> cater
>>>     >     > > specifically
>>>     >     > > >> to
>>>     >     > > >> > the
>>>     >     > > >> > >> >> >>> > financial industry).
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > But this gets us into a catch 22:
>>>     >     > > >> > >> >> >>> > If we discuss a specific use-case,
>> someone can
>>>     > always
>>>     >     > say
>>>     >     > > it
>>>     >     > > >> > isn't
>>>     >     > > >> > >> >> >>> > interesting enough for Kafka. If we
>> discuss more
>>>     > general
>>>     >     > > >> > trends,
>>>     >     > > >> > >> >> >>> > others can say "well, we are not sure any
>> of them
>>>     > really
>>>     >     > > >> needs
>>>     >     > > >> > >> >> headers
>>>     >     > > >> > >> >> >>> > specifically. This is just hand waving
>> and not
>>>     >     > > interesting.".
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > I think discussing use-cases in specifics
>> is super
>>>     >     > > important
>>>     >     > > >> to
>>>     >     > > >> > >> >> decide
>>>     >     > > >> > >> >> >>> > implementation details for headers (my
>> use-cases
>>>     > lean
>>>     >     > > toward
>>>     >     > > >> > >> >> numerical
>>>     >     > > >> > >> >> >>> > keys with namespaces and object values,
>> others
>>>     > differ),
>>>     >     > > but I
>>>     >     > > >> > >> think
>>>     >     > > >> > >> >> we
>>>     >     > > >> > >> >> >>> > need to answer the general "Are we going
>> to have
>>>     >     > headers"
>>>     >     > > >> > question
>>>     >     > > >> > >> >> >>> > first.
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > I'd love to hear from the other
>> committers in the
>>>     >     > > discussion:
>>>     >     > > >> > >> >> >>> > What would it take to convince you that
>> headers
>>>     > in Kafka
>>>     >     > > are
>>>     >     > > >> a
>>>     >     > > >> > >> good
>>>     >     > > >> > >> >> >>> > idea in general, so we can move ahead and
>> try to
>>>     > agree
>>>     >     > on
>>>     >     > > the
>>>     >     > > >> > >> >> details?
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > I feel like we keep moving the goal posts
>> and
>>>     > this is
>>>     >     > > truly
>>>     >     > > >> > >> >> exhausting.
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > For the record, I mildly support adding
>> headers
>>>     > to Kafka
>>>     >     > > >> > (+0.5?).
>>>     >     > > >> > >> >> >>> > The community can continue to find
>> workarounds to
>>>     > the
>>>     >     > > issue
>>>     >     > > >> and
>>>     >     > > >> > >> there
>>>     >     > > >> > >> >> >>> > are some benefits to keeping the message
>> format
>>>     > and
>>>     >     > > clients
>>>     >     > > >> > >> simpler.
>>>     >     > > >> > >> >> >>> > But I see the usefulness of headers to
>> many
>>>     > use-cases
>>>     >     > and
>>>     >     > > if
>>>     >     > > >> we
>>>     >     > > >> > >> can
>>>     >     > > >> > >> >> >>> > find a good and generally useful way to
>> add it to
>>>     > Kafka,
>>>     >     > > it
>>>     >     > > >> > will
>>>     >     > > >> > >> make
>>>     >     > > >> > >> >> >>> > Kafka easier to use for many - worthy
>> goal in my
>>>     > eyes.
>>>     >     > > >> > >> >> >>> >
>>>     >     > > >> > >> >> >>> > > are interesting/feasible, but:
>>>     >     > > >> > >> >> >>> > > A+B. i think there are use cases for
>> polyglot
>>>     > topics.
>>>     >     > > >> > >> especially if
>>>     >     > > >> > >> >> >>> kafka
>>>     >     > > >> > >> >> >>> > > is being used to "trunk" something else.
>>>     >     > > >> > >> >> >>> > > D. multiple topics would make it harder
>> to write
>>>     >     > > portable
>>>     >     > > >> > >> consumer
>>>     >     > > >> > >> >> >>> code.
>>>     >     > > >> > >> >> >>> > > partition remapping would mess with
>> locality of
>>>     >     > > consumption
>>>     >     > > >> > >> >> >>> guarantees.
>>>     >     > > >> > >> >> >>> > > E+F. a use case I see for
>> lineage/metadata is
>>>     >     > > >> > >> billing/chargeback.
>>>     >     > > >> > >> >> for
>>>     >     > > >> > >> >> >>> > that
>>>     >     > > >> > >> >> >>> > > use case it is not enough to simply
>> record the
>>>     > point
>>>     >     > of
>>>     >     > > >> > origin,
>>>     >     > > >> > >> but
>>>     >     > > >> > >> >> >>> every
>>>     >     > > >> > >> >> >>> > > replication stop (think mirror maker)
>> must also
>>>     > add a
>>>     >     > > >> record
>>>     >     > > >> > to
>>>     >     > > >> > >> >> form a
>>>     >     > > >> > >> >> >>> > > "transit log".
>>>     >     > > >> > >> >> >>> > >
>>>     >     > > >> > >> >> >>> > > as for stream processing on top of
>> kafka - i
>>>     > know
>>>     >     > samza
>>>     >     > > >> has a
>>>     >     > > >> > >> >> metadata
>>>     >     > > >> > >> >> >>> > map
>>>     >     > > >> > >> >> >>> > > which they carry around in addition to
>> user
>>>     > values.
>>>     >     > > headers
>>>     >     > > >> > are
>>>     >     > > >> > >> the
>>>     >     > > >> > >> >> >>> > perfect
>>>     >     > > >> > >> >> >>> > > fit for these things.
>>>     >     > > >> > >> >> >>> > >
>>>     >     > > >> > >> >> >>> > >
>>>     >     > > >> > >> >> >>> > >
>>>     >     > > >> > >> >> >>> > > On Wed, Nov 30, 2016 at 6:50 PM, Jun
>> Rao <
>>>     >     > > [email protected]
>>>     >     > > >> >
>>>     >     > > >> > >> wrote:
>>>     >     > > >> > >> >> >>> > >
>>>     >     > > >> > >> >> >>> > >> Hi, Michael,
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> In order to answer the first two
>> questions, it
>>>     > would
>>>     >     > be
>>>     >     > > >> > helpful
>>>     >     > > >> > >> >> if we
>>>     >     > > >> > >> >> >>> > could
>>>     >     > > >> > >> >> >>> > >> identify 1 or 2 strong use cases for
>> headers
>>>     > in the
>>>     >     > > space
>>>     >     > > >> > for
>>>     >     > > >> > >> >> >>> > third-party
>>>     >     > > >> > >> >> >>> > >> vendors. For use cases within an
>> organization,
>>>     > one
>>>     >     > > could
>>>     >     > > >> > always
>>>     >     > > >> > >> >> use
>>>     >     > > >> > >> >> >>> > other
>>>     >     > > >> > >> >> >>> > >> approaches such as company-wise
>> containers to
>>>     > get
>>>     >     > > around
>>>     >     > > >> w/o
>>>     >     > > >> > >> >> >>> headers. I
>>>     >     > > >> > >> >> >>> > >> went through the use cases in the KIP
>> and in
>>>     > Radai's
>>>     >     > > wiki
>>>     >     > > >> (
>>>     >     > > >> > >> >> >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers
>>>     >     > > >> > >> >> >>> > >> ).
>>>     >     > > >> > >> >> >>> > >> The following are the ones that I
>>>     > understand and
>>>     >     > > >> could
>>>     >     > > >> > be
>>>     >     > > >> > >> in
>>>     >     > > >> > >> >> the
>>>     >     > > >> > >> >> >>> > >> third-party use case category.
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> A. content-type
>>>     >     > > >> > >> >> >>> > >> It seems that in general, content-type
>> should
>>>     > be set
>>>     >     > at
>>>     >     > > >> the
>>>     >     > > >> > >> topic
>>>     >     > > >> > >> >> >>> level.
>>>     >     > > >> > >> >> >>> > >> Not sure if mixing messages with
>> different
>>>     > content
>>>     >     > > types
>>>     >     > > >> > >> should be
>>>     >     > > >> > >> >> >>> > >> encouraged.
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> B. schema id
>>>     >     > > >> > >> >> >>> > >> Since the value is mostly useless
>> without
>>>     > schema id,
>>>     >     > it
>>>     >     > > >> > seems
>>>     >     > > >> > >> that
>>>     >     > > >> > >> >> >>> > storing
>>>     >     > > >> > >> >> >>> > >> the schema id together with serialized
>> bytes
>>>     > in the
>>>     >     > > value
>>>     >     > > >> is
>>>     >     > > >> > >> >> better?
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> C. per message encryption
>>>     >     > > >> > >> >> >>> > >> One drawback of this approach is that
>> this
>>>     >     > > significantly
>>>     >     > > >> > reduces
>>>     >     > > >> > >> >> the
>>>     >     > > >> > >> >> >>> > >> effectiveness of compression, which
>> happens on
>>>     > a set
>>>     >     > of
>>>     >     > > >> > >> serialized
>>>     >     > > >> > >> >> >>> > >> messages. An alternative is to enable
>> SSL for
>>>     > wire
>>>     >     > > >> > encryption
>>>     >     > > >> > >> and
>>>     >     > > >> > >> >> >>> rely
>>>     >     > > >> > >> >> >>> > on
>>>     >     > > >> > >> >> >>> > >> the storage system (e.g. LUKS) for at
>> rest
>>>     >     > encryption.
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> D. cluster ID for mirroring across
>> Kafka
>>>     > clusters
>>>     >     > > >> > >> >> >>> > >> This is actually interesting. Today,
>> to avoid
>>>     >     > > introducing
>>>     >     > > >> > >> cycles
>>>     >     > > >> > >> >> when
>>>     >     > > >> > >> >> >>> > doing
>>>     >     > > >> > >> >> >>> > >> mirroring across data centers, one
>> would
>>>     > either have
>>>     >     > to
>>>     >     > > >> set
>>>     >     > > >> > up
>>>     >     > > >> > >> two
>>>     >     > > >> > >> >> >>> Kafka
>>>     >     > > >> > >> >> >>> > >> clusters (a local and an aggregate)
>> per data
>>>     > center
>>>     >     > or
>>>     >     > > >> > rename
>>>     >     > > >> > >> >> topics.
>>>     >     > > >> > >> >> >>> > >> Neither is ideal. With headers, the
>> producer
>>>     > could
>>>     >     > tag
>>>     >     > > >> each
>>>     >     > > >> > >> >> message
>>>     >     > > >> > >> >> >>> with
>>>     >     > > >> > >> >> >>> > >> the producing cluster ID in the header.
>>>     > MirrorMaker
>>>     >     > > could
>>>     >     > > >> > then
>>>     >     > > >> > >> >> avoid
>>>     >     > > >> > >> >> >>> > >> mirroring messages to a cluster if
>> they are
>>>     > tagged
>>>     >     > with
>>>     >     > > >> the
>>>     >     > > >> > >> same
>>>     >     > > >> > >> >> >>> cluster
>>>     >     > > >> > >> >> >>> > >> id.
>>>     >     > > >> > >> >> >>> > >>
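The cycle-avoidance idea in D can be sketched in a few lines. This is only an illustration under assumptions: the header key name `origin-cluster-id`, the UTF-8 encoding, and the record shape (a dict of headers plus a value) are invented here and are not part of KIP-82 or of MirrorMaker.

```python
from typing import Dict, Tuple

# Hypothetical header key; the thread does not fix a name or an encoding.
CLUSTER_ID_HEADER = "origin-cluster-id"

# A record is modelled as (headers, value) purely for this sketch.
Record = Tuple[Dict[str, bytes], bytes]


def tag_with_origin(record: Record, cluster_id: str) -> Record:
    """Tag a record with the cluster it was first produced to.

    An existing tag is kept, so the original producing cluster
    survives any number of mirroring hops.
    """
    headers, value = record
    if CLUSTER_ID_HEADER in headers:
        return record
    new_headers = dict(headers)
    new_headers[CLUSTER_ID_HEADER] = cluster_id.encode("utf-8")
    return (new_headers, value)


def should_mirror(record: Record, target_cluster_id: str) -> bool:
    """Return False when the record originated in the target cluster."""
    headers, _ = record
    origin = headers.get(CLUSTER_ID_HEADER)
    # Untagged records are mirrored; tagged ones only if they came from elsewhere.
    return origin is None or origin.decode("utf-8") != target_cluster_id
```

A mirroring process would call `should_mirror` once per record before forwarding it, which is the filtering cost that the hierarchical-topic alternative below tries to avoid.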
>>>     >     > > >> > >> >> >>> > >> However, an alternative approach is to
>>>     > introduce sth
>>>     >     > > like
>>>     >     > > >> > >> >> >>> hierarchical
>>>     >     > > >> > >> >> >>> > >> topic and store messages from different
>>>     > clusters in
>>>     >     > > >> > different
>>>     >     > > >> > >> >> >>> partitions
>>>     >     > > >> > >> >> >>> > >> under the same topic. This approach
>> avoids
>>>     > filtering
>>>     >     > > out
>>>     >     > > >> > >> unneeded
>>>     >     > > >> > >> >> >>> data
>>>     >     > > >> > >> >> >>> > and
>>>     >     > > >> > >> >> >>> > >> makes offset preserving easier to
>> support. It
>>>     > may
>>>     >     > make
>>>     >     > > >> > >> compaction
>>>     >     > > >> > >> >> >>> > trickier
>>>     >     > > >> > >> >> >>> > >> though since the same key may show up
>> in
>>>     > different
>>>     >     > > >> > partitions.
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> E. record-level lineage
>>>     >     > > >> > >> >> >>> > >> For example, a source connector could
>> store in
>>>     > the
>>>     >     > > message
>>>     >     > > >> > the
>>>     >     > > >> > >> >> >>> metadata
>>>     >     > > >> > >> >> >>> > >> (e.g. UUID) of the source record.
>> Similarly,
>>>     > if a
>>>     >     > > stream
>>>     >     > > >> job
>>>     >     > > >> > >> >> >>> transforms
>>>     >     > > >> > >> >> >>> > >> messages from topic A to topic B, the
>> library
>>>     > could
>>>     >     > > >> include
>>>     >     > > >> > the
>>>     >     > > >> > >> >> >>> source
>>>     >     > > >> > >> >> >>> > >> message offset in each of the
>> transformed
>>>     > messages in
>>>     >     > > the
>>>     >     > > >> > >> header.
>>>     >     > > >> > >> >> Not
>>>     >     > > >> > >> >> >>> > sure
> 
>>>     >     > > >> > >> >> >>> > >> how widely useful record-level lineage
>> is
>>>     > though
>>>     >     > since
>>>     >     > > the
>>>     >     > > >> > >> >> overhead
>>>     >     > > >> > >> >> >>> > could
>>>     >     > > >> > >> >> >>> > >> be significant.
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> F. auditing metadata
>>>     >     > > >> > >> >> >>> > >> We could put things like
>> clientId/host/user in
>>>     > the
>>>     >     > > header
>>>     >     > > >> in
>>>     >     > > >> > >> each
>>>     >     > > >> > >> >> >>> > message
>>>     >     > > >> > >> >> >>> > >> for auditing. These metadata are
>> really at the
>>>     >     > producer
>>>     >     > > >> > level
>>>     >     > > >> > >> >> though.
>>>     >     > > >> > >> >> >>> > So, a
>>>     >     > > >> > >> >> >>> > >> more efficient way is to only include a
>>>     > "producerId"
>>>     >     > > per
>>>     >     > > >> > >> message
>>>     >     > > >> > >> >> and
>>>     >     > > >> > >> >> >>> > send
>>>     >     > > >> > >> >> >>> > >> the producerId -> metadata mapping
>>>     > independently.
>>>     >     > > KIP-98
>>>     >     > > >> is
>>>     >     > > >> > >> >> actually
>>>     >     > > >> > >> >> >>> > >> proposing including such a producerId
>> natively
>>>     > in the
>>>     >     > > >> > message.
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> So, overall, I am not sure that I am fully
>>>     > convinced of
>>>     >     > > the
>>>     >     > > >> > strong
>>>     >     > > >> > >> >> >>> > third-party
>>>     >     > > >> > >> >> >>> > >> use cases of headers yet. Perhaps we
>> could
>>>     > discuss a
>>>     >     > > bit
>>>     >     > > >> > more
>>>     >     > > >> > >> to
>>>     >     > > >> > >> >> make
>>>     >     > > >> > >> >> >>> > one
>>>     >     > > >> > >> >> >>> > >> or two really convincing use cases.
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> Another orthogonal question is
>> whether header
>>>     > should
>>>     >     > > be
>>>     >     > > >> > >> exposed
>>>     >     > > >> > >> >> in
>>>     >     > > >> > >> >> >>> > stream
>>>     >     > > >> > >> >> >>> > >> processing systems such as Kafka Streams,
>> Samza,
>>>     > and
>>>     >     > Spark
>>>     >     > > >> > >> streaming.
>>>     >     > > >> > >> >> >>> > >> Currently, those systems just deal with
>>>     > key/value
>>>     >     > > pairs.
>>>     >     > > >> > >> Should we
>>>     >     > > >> > >> >> >>> > expose a
>>>     >     > > >> > >> >> >>> > >> third thing header there too or
>> somehow map
>>>     > header to
>>>     >     > > key
>>>     >     > > >> or
>>>     >     > > >> > >> >> value?
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> Thanks,
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> Jun
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> On Tue, Nov 29, 2016 at 3:35 AM,
>> Michael
>>>     > Pearce <
>>>     >     > > >> > >> >> >>> [email protected]>
>>>     >     > > >> > >> >> >>> > >> wrote:
>>>     >     > > >> > >> >> >>> > >>
>>>     >     > > >> > >> >> >>> > >> > I assume, that after a period of a
>> week,
>>>     > that there
>>>     >     > > are
>>>     >     > > >> no
>>>     >     > > >> > >> >> concerns
>>>     >     > > >> > >> >> >>> now
>>>     >     > > >> > >> >> >>> > >> > with points 1, and 2 and now we have
>>>     > agreement that
>>>     >     > > >> > headers
>>>     >     > > >> > >> are
>>>     >     > > >> > >> >> >>> useful
>>>     >     > > >> > >> >> >>> > >> and
>>>     >     > > >> > >> >> >>> > >> > needed in Kafka. As such if put to a
>> KIP
>>>     > vote, this
>>>     >     > > >> > wouldn’t
>>>     >     > > >> > >> be
>>>     >     > > >> > >> >> a
>>>     >     > > >> > >> >> >>> > reason
>>>     >     > > >> > >> >> >>> > >> to
>>>     >     > > >> > >> >> >>> > >> > reject.
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> > @
>>>     >     > > >> > >> >> >>> > >> > Ignacio on point 4).
>>>     >     > > >> > >> >> >>> > >> > I think for purpose of getting this
>> KIP
>>>     > moving past
>>>     >     > > >> this,
>>>     >     > > >> > we
>>>     >     > > >> > >> can
>>>     >     > > >> > >> >> >>> state
>>>     >     > > >> > >> >> >>> > >> the
>>>     >     > > >> > >> >> >>> > >> > key will be a 4-byte space that
>> will be
>>>     >     > > naturally
>>>     >     > > >> > >> >> interpreted
>>>     >     > > >> > >> >> >>> as
>>>     >     > > >> > >> >> >>> > an
>>>     >     > > >> > >> >> >>> > >> > Int32 (if namespacing is later
>> wanted you can
>>>     >     > easily
>>>     >     > > >> split
>>>     >     > > >> > >> this
>>>     >     > > >> > >> >> >>> into
>>>     >     > > >> > >> >> >>> > two
>>>     >     > > >> > >> >> >>> > >> > int16 spaces), from the wire protocol
>>>     >     > implementation
>>>     >     > > >> this
>>>     >     > > >> > >> makes
>>>     >     > > >> > >> >> no
>>>     >     > > >> > >> >> >>> > >> > difference I don’t believe. Is this
>>>     > reasonable to
>>>     >     > > all?
>>>     >     > > >> > >> >> >>> > >> >
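A minimal sketch of why the split makes no difference on the wire: the same four big-endian bytes can be read as one Int32 or as two 16-bit halves. The function names and the choice of unsigned halves are assumptions for illustration; the thread does not specify signedness or byte order.

```python
import struct
from typing import Tuple


def split_key(key: int) -> Tuple[int, int]:
    """Read a signed 32-bit header key as (namespace, id), two unsigned 16-bit halves."""
    raw = struct.pack(">i", key)       # the 4 bytes that would go on the wire
    return struct.unpack(">HH", raw)   # same bytes, viewed as two 16-bit fields


def join_key(namespace: int, ident: int) -> int:
    """Inverse of split_key: rebuild the signed 32-bit key from its two halves."""
    raw = struct.pack(">HH", namespace, ident)
    return struct.unpack(">i", raw)[0]
```

Because both views share one 4-byte encoding, a layer above the protocol could adopt namespacing later without any change to the message format.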
>>>     >     > > >> > >> >> >>> > >> > On 5) as per point 4 therefore happy
>> we keep
>>>     > with 32
>>>     >     > > >> bits.
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> > On 18/11/2016, 20:34, "[email protected] on
>>>     >     > > >> > >> >> >>> > >> > behalf of Ignacio Solis" <[email protected] on
>>>     >     > > >> > >> >> >>> > >> > behalf of [email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     Summary:
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     3) Yes - Header value as byte[]
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     4a) Int,Int - No
>>>     >     > > >> > >> >> >>> > >> >     4b) Int - Yes
>>>     >     > > >> > >> >> >>> > >> >     4c) String - Reluctant maybe
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     5) I believe the header system should take
>>>     >     > > >> > >> >> >>> > >> >     a single int.  I think 32 bits is a good
>>>     >     > > >> > >> >> >>> > >> >     size; if you want to interpret this as two
>>>     >     > > >> > >> >> >>> > >> >     16-bit numbers in the layer above, go right
>>>     >     > > >> > >> >> >>> > >> >     ahead.  If somebody wants to argue for 16
>>>     >     > > >> > >> >> >>> > >> >     bits or 64 bits of header key space, I would
>>>     >     > > >> > >> >> >>> > >> >     listen.
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     Discussion:
>>>     >     > > >> > >> >> >>> > >> >     Dividing the key space into sub_key_1 and
>>>     >     > > >> > >> >> >>> > >> >     sub_key_2 makes no sense to me at this layer.
>>>     >     > > >> > >> >> >>> > >> >     Are we going to start providing APIs to get
>>>     >     > > >> > >> >> >>> > >> >     all the sub_key_1s? Or all the sub_key_2s?
>>>     >     > > >> > >> >> >>> > >> >     If there are no distinguishing functions
>>>     >     > > >> > >> >> >>> > >> >     that are applied to each one, then they
>>>     >     > > >> > >> >> >>> > >> >     should be a single value. At this layer all
>>>     >     > > >> > >> >> >>> > >> >     we're doing is equality.
>>>     >     > > >> > >> >> >>> > >> >     If the above layer wants to interpret this
>>>     >     > > >> > >> >> >>> > >> >     as 2, 3 or more values, that's a different
>>>     >     > > >> > >> >> >>> > >> >     question.  I personally think it's all one
>>>     >     > > >> > >> >> >>> > >> >     keyspace that is getting assigned using some
>>>     >     > > >> > >> >> >>> > >> >     structure, but if you want to sub-assign
>>>     >     > > >> > >> >> >>> > >> >     parts of it then that's fine.
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     The same discussion applies to strings.  If
>>>     >     > > >> > >> >> >>> > >> >     somebody argued for strings, would we be
>>>     >     > > >> > >> >> >>> > >> >     arguing to divide the strings with dots
>>>     >     > > >> > >> >> >>> > >> >     ('.') as a requirement?
>>>     >     > > >> > >> >> >>> > >> >     Would we want them to give us the different
>>>     >     > > >> > >> >> >>> > >> >     name segments separately?
>>>     >     > > >> > >> >> >>> > >> >     Would we be performing any actions on this
>>>     >     > > >> > >> >> >>> > >> >     key other than matching?
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     Nacho
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     On Fri, Nov 18, 2016 at 9:30 AM, Michael
>>>     >     > > >> > >> >> >>> > >> >     Pearce <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >
>>>     >     > > >> > >> >> >>> > >> >     > #jay #jun any concerns on 1 and 2 still?
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     > @all
>>>     >     > > >> > >> >> >>> > >> >     > To get this moving along a bit more, I'd also
>>>     >     > > >> > >> >> >>> > >> >     > like to get clarity on the last points below:
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     > 3) I believe we're all roughly happy with the
>>>     >     > > >> > >> >> >>> > >> >     > header value being a byte[]?
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     > 4) I believe consensus has been for a
>>>     >     > > >> > >> >> >>> > >> >     > namespace-based int approach {int,int} for the
>>>     >     > > >> > >> >> >>> > >> >     > key. Any objections if this is what we go
>>>     >     > > >> > >> >> >>> > >> >     > with?
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     > 5) If the assumption in (4) is correct and we
>>>     >     > > >> > >> >> >>> > >> >     > have {int,int} keys, should both ints be int16
>>>     >     > > >> > >> >> >>> > >> >     > or int32?
>>>     >     > > >> > >> >> >>> > >> >     > I'm for them being int16 (2 bytes each), as
>>>     >     > > >> > >> >> >>> > >> >     > combined that is a space of 4 bytes as per the
>>>     >     > > >> > >> >> >>> > >> >     > original, gives plenty of combinations for the
>>>     >     > > >> > >> >> >>> > >> >     > foreseeable future, and keeps the overhead
>>>     >     > > >> > >> >> >>> > >> >     > small.
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     > Do we see any benefit in another KIP call to
>>>     >     > > >> > >> >> >>> > >> >     > discuss these at all?
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     > Cheers
>>>     >     > > >> > >> >> >>> > >> >     > Mike
>>>     >     > > >> > >> >> >>> > >> >     > ________________________________________
>>>     >     > > >> > >> >> >>> > >> >     > From: K Burstev <[email protected]>
>>>     >     > > >> > >> >> >>> > >> >     > Sent: Friday, November 18, 2016 7:07:07 AM
>>>     >     > > >> > >> >> >>> > >> >     > To: [email protected]
>>>     >     > > >> > >> >> >>> > >> >     > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     > For what it is worth, I also agree. As a user:
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     >  1) Yes - Headers are worthwhile
>>>     >     > > >> > >> >> >>> > >> >     >  2) Yes - Headers should be a top level option
>>>     >     > > >> > >> >> >>> > >> >     >
>>>     >     > > >> > >> >> >>> > >> >     > 14.11.2016, 21:15, "Ignacio Solis" <[email protected]>:
>>>     >     > > >> > >> >> >>> > >> >     > > 1) Yes - Headers are worthwhile
>>>     >     > > >> > >> >> >>> > >> >     > > 2) Yes - Headers should be a top level option
>>>     >     > > >> > >> >> >>> > >> >     > >
>>>     >     > > >> > >> >> >>> > >> >     > > On Mon, Nov 14, 2016 at 9:16 AM, Michael Pearce
>>>     >     > > >> > >> >> >>> > >> >     > > <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  Hi Roger,
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  The KIP details/examples cover the original
>>>     >     > > >> > >> >> >>> > >> >     > >>  proposal for key spacing, not the new
>>>     >     > > >> > >> >> >>> > >> >     > >>  namespace idea mentioned in the discussion.
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  We will need to update the KIP when we get
>>>     >     > > >> > >> >> >>> > >> >     > >>  agreement that this is a better approach
>>>     >     > > >> > >> >> >>> > >> >     > >>  (which seems to be the case, if I have
>>>     >     > > >> > >> >> >>> > >> >     > >>  understood the general feeling in the
>>>     >     > > >> > >> >> >>> > >> >     > >>  conversation).
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Re the variable ints: at a very early stage
>>>     >     > > >> > >> >> >>> > >> >     > >>  we did think about this. I think the added
>>>     >     > > >> > >> >> >>> > >> >     > >>  complexity isn't worth the saving. If we want
>>>     >     > > >> > >> >> >>> > >> >     > >>  to reduce overhead and size, I'd rather go
>>>     >     > > >> > >> >> >>> > >> >     > >>  with int16 (2 bytes) keys, as that keeps it
>>>     >     > > >> > >> >> >>> > >> >     > >>  simple.
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  On the note of no headers: as per the KIP, we
>>>     >     > > >> > >> >> >>> > >> >     > >>  use an attribute bit to denote whether
>>>     >     > > >> > >> >> >>> > >> >     > >>  headers are present or not, which as such
>>>     >     > > >> > >> >> >>> > >> >     > >>  currently gives zero overhead if headers are
>>>     >     > > >> > >> >> >>> > >> >     > >>  not used.
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  I think, as radai mentions, it would be good
>>>     >     > > >> > >> >> >>> > >> >     > >>  first to get clarity on whether we now have
>>>     >     > > >> > >> >> >>> > >> >     > >>  general consensus that (1) headers are
>>>     >     > > >> > >> >> >>> > >> >     > >>  worthwhile and useful, and (2) we want them
>>>     >     > > >> > >> >> >>> > >> >     > >>  as a top level entity.
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Just to state the obvious, I believe (1)
>>>     >     > > >> > >> >> >>> > >> >     > >>  headers are worthwhile and (2) agree they
>>>     >     > > >> > >> >> >>> > >> >     > >>  should be a top level entity.
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Cheers
>>>     >     > > >> > >> >> >>> > >> >     > >>  Mike
>>>     >     > > >> > >> >> >>> > >> >     > >>  ________________________________________
>>>     >     > > >> > >> >> >>> > >> >     > >>  From: Roger Hoover <[email protected]>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Sent: Wednesday, November 9, 2016 9:10:47 PM
>>>     >     > > >> > >> >> >>> > >> >     > >>  To: [email protected]
>>>     >     > > >> > >> >> >>> > >> >     > >>  Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Sorry for going a little into the weeds, but
>>>     >     > > >> > >> >> >>> > >> >     > >>  thanks for the replies regarding varint.
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Agreed that a prefix and {int, int} can be
>>>     >     > > >> > >> >> >>> > >> >     > >>  the same. It doesn't look like that's what
>>>     >     > > >> > >> >> >>> > >> >     > >>  the KIP is saying in the "Open" section. The
>>>     >     > > >> > >> >> >>> > >> >     > >>  example shows 2100001 for New Relic and
>>>     >     > > >> > >> >> >>> > >> >     > >>  210002 for App Dynamics, implying that the
>>>     >     > > >> > >> >> >>> > >> >     > >>  New Relic organization will have only a
>>>     >     > > >> > >> >> >>> > >> >     > >>  single header id to work with. Or is 2100001
>>>     >     > > >> > >> >> >>> > >> >     > >>  a prefix? The main point of a namespace or
>>>     >     > > >> > >> >> >>> > >> >     > >>  prefix is to reduce the overhead of config
>>>     >     > > >> > >> >> >>> > >> >     > >>  mapping or registration, depending on how
>>>     >     > > >> > >> >> >>> > >> >     > >>  namespaces/prefixes are managed.
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Would love to hear more feedback on the
>>>     >     > > >> > >> >> >>> > >> >     > >>  higher-level questions though...
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Cheers,
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  Roger
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  On Wed, Nov 9, 2016 at 11:38 AM, radai
>>>     >     > > >> > >> >> >>> > >> >     > >>  <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > I think this discussion is getting a bit into
>>>     >     > > >> > >> >> >>> > >> >     > >>  > the weeds on technical implementation details.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > I'd like to step back a minute and try to
>>>     >     > > >> > >> >> >>> > >> >     > >>  > establish where we are in the larger picture:
>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > (re-wording nacho's last paragraph)
>>>     >     > > >> > >> >> >>> > >> >     > >>  > 1. are we all in agreement that headers are a
>>>     >     > > >> > >> >> >>> > >> >     > >>  > worthwhile and useful addition to have? this
>>>     >     > > >> > >> >> >>> > >> >     > >>  > was contested early on
>>>     >     > > >> > >> >> >>> > >> >     > >>  > 2. are we all in agreement on headers as top
>>>     >     > > >> > >> >> >>> > >> >     > >>  > level entity vs headers squirreled-away in V?
>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > are there still concerns around these 2
>>>     >     > > >> > >> >> >>> > >> >     > >>  > points (#jay? #jun?)?
>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > (and now back to our normal programming ...)
>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > varints are nice. having said that, it's
>>>     >     > > >> > >> >> >>> > >> >     > >>  > adding complexity (see
>>>     >     > > >> > >> >> >>> > >> >     > >>  > https://github.com/addthis/stream-lib/blob/master/src/main/java/com/clearspring/analytics/util/Varint.java
>>>     >     > > >> > >> >> >>> > >> >     > >>  > as 1st google result) and would require anyone
>>>     >     > > >> > >> >> >>> > >> >     > >>  > writing other clients (C? Python? Go? Bash?
>>>     >     > > >> > >> >> >>> > >> >     > >>  > ;-) ) to get/implement the same, and for
>>>     >     > > >> > >> >> >>> > >> >     > >>  > relatively little gain (int vs string is an
>>>     >     > > >> > >> >> >>> > >> >     > >>  > order of magnitude, this isn't).
>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > int namespacing vs {int, int} namespacing are
>>>     >     > > >> > >> >> >>> > >> >     > >>  > basically the same thing - you're just
>>>     >     > > >> > >> >> >>> > >> >     > >>  > namespacing an int64 and giving people whole
>>>     >     > > >> > >> >> >>> > >> >     > >>  > 2^32 ranges at a time. the part i like about
>>>     >     > > >> > >> >> >>> > >> >     > >>  > this is letting people have a large swath of
>>>     >     > > >> > >> >> >>> > >> >     > >>  > numbers with one registration so they don't
>>>     >     > > >> > >> >> >>> > >> >     > >>  > have to come back for every single
>>>     >     > > >> > >> >> >>> > >> >     > >>  > plugin/header they want to "reserve".
>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > On Wed, Nov 9, 2016 at 11:01 AM, Roger Hoover
>>>     >     > > >> > >> >> >>> > >> >     > >>  > <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > Since some of the debate has been about
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > overhead + performance, I'm wondering if we
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > have considered a varint encoding
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > (https://developers.google.com/protocol-buffers/docs/encoding#varints)
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > for the header length field (int32 in the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > proposal) and for header ids? If you don't use
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > headers, the overhead would be a single byte,
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > and each header id < 128 would also need only
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > a single byte?
>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > On Wed, Nov 9, 2016 at 6:43 AM, radai
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > @magnus - and very dangerous (you're
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > essentially downloading and executing
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > arbitrary code off the internet on your
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > servers ... bad idea without a sandbox, even
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > with)
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > as for it being a purely administrative task -
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > i disagree.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > i wish it were, really, because then my
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > earlier point on the complexity of the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > remapping process would be invalid, but at
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > linkedin, for example, we (the team i'm in)
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > run kafka as a service. we don't really know
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > what our users (developing applications that
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > use kafka) are up to at any given moment. it
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > is very possible (given the existence of
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > headers and a corresponding plugin ecosystem)
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > for some application to "equip" their
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > producers and consumers with the required
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > plugin without us knowing. i don't mean to
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > imply that's bad, i just want to make the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > point that it's not as simple as keeping it
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > in sync across a large-enough organization.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > On Wed, Nov 9, 2016 at 6:17 AM, Magnus Edenhill
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > I think there is a piece missing in the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > Strings discussion, where pro-Stringers
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > reason that by providing unique string
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > identifiers for each header everything will
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > just magically work for all parts of the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > stream pipeline.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > But the strings don't mean anything by
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > themselves, and while we could probably
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > envision some auto plugin loader that
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > downloads, compiles, links and runs plugins
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > on-demand as soon as they're seen by a
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > consumer, I don't really see a use-case for
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > something so dynamic (and fragile) in
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > practice.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > In the real world an application will be
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > configured with a set of plugins to either
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > add (producer) or read (consumer) headers.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > This is an administrative task based on what
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > features a client needs/provides and results
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > in some sort of configuration to enable and
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > configure the desired plugins.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > Since this needs to be kept somewhat in sync across an organisation
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > (there is no point in having producers add headers no consumers will
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > read, and vice versa), the added complexity of assigning an id
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > namespace for each plugin as it is being configured should be
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > tolerable.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > /Magnus
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > 2016-11-09 13:06 GMT+01:00 Michael Pearce <[email protected]>:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Just following/catching up on what seems to be an active night :)
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > @Radai sorry if it may seem obvious but what does MD stand for?
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > My take on String vs Int:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > I will state first I am pro Int (16 or 32).
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > I do though playing devils advocate see a big plus with the argument
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > of String keys, this is around integrating into an existing
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > eco-system.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > As many other systems use String based headers (Flume, JMS) it makes
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > it much easier for these to be incorporated/integrated into.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > How with Int based headers could we provide a way/guidance to make
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > this integration simple / easy with transition flows over to kafka?
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * tough luck buddy you're on your own
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * simply hash the string into int code and hope for no collisions
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >   (how to convert back though?)
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * http2 style as mentioned by nacho.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > cheers,
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Mike
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > ________________________________________
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > From: radai <[email protected]>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Sent: Wednesday, November 9, 2016 8:12 AM
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > To: [email protected]
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > thinking about it some more, the best way to transmit the header
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > remapping data to consumers would be to put it in the MD response
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > payload, so maybe it should be discussed now.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > On Wed, Nov 9, 2016 at 12:09 AM, radai <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > im not opposed to the idea of namespace mapping. all im saying is
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > that its not part of the "mvp" and, since it requires no wire format
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > change, can always be added later.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > also, its not as simple as just configuring MM to do the transform:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > lets say i've implemented large message support as {666,1} and on
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > some mirror target cluster its been remapped to {999,1}. the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > consumer plugin code would also need to be told to look for the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > large message "part X of Y" header under {999,1}. doable, but tricky.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > On Tue, Nov 8, 2016 at 10:29 PM, Gwen Shapira <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> While you can do whatever you want with a namespace and your code,
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> what I'd expect is for each app to make namespaces configurable...
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> So if I accidentally used 666 for my HR department, and still want
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> to run RadaiApp, I can config "namespace=42" for RadaiApp and
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> everything will look normal.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> This means you only need to sync usage inside your own organization.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> Still hard, but somewhat easier than syncing with the entire world.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> On Tue, Nov 8, 2016 at 10:07 PM, radai <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > and we can start with {namespace, id} and no re-mapping support and
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > always add it later on if/when collisions actually happen (i dont
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > think they'd be a problem).
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > every interested party (so orgs or individuals) could then register
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > a prefix (0 = reserved, 1 = confluent ... 666 = me :-) ) and do
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > whatever with the 2nd ID - so once linkedin registers, say 3, then
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > linkedin devs are free to use {3, *} with a reasonable expectation
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > not to collide with anything else. further partitioning of that *
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > becomes linkedin's problem, but the "upstream registration" of a
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > namespace only has to happen once.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > On Tue, Nov 8, 2016 at 9:03 PM, James Cheng <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > On Nov 8, 2016, at 5:54 PM, Gwen Shapira <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Thank you so much for this clear and fair summary of the arguments.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > I'm in favor of ints. Not a deal-breaker, but in favor.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Even more in favor of Magnus's decentralized suggestion with Roger's
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > tweak: add a namespace for headers. This will allow each app to just
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > use whatever IDs it wants internally, and then let the admin
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > deploying the app figure out an available namespace ID for the app
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > to live in.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > So io.confluent.schema-registry can be namespace 0x01 on my
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > deployment and 0x57 on yours, and the poor guys developing the app
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > don't need to worry about that.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> Gwen, if I understand your example right, an application deployer
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> might decide to use 0x01 in one deployment, and that means that once
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> the message is written into the broker, it will be saved on the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> broker with that specific namespace (0x01).
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> If you were to mirror that message into another cluster, the 0x01
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> would accompany the message, right? What if the deployers of the
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> same app in the other cluster use 0x57? They won't understand each
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> other?
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> I'm not sure that's an avoidable problem. I think it simply means
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> that in order to share data, you have to also have a shared (agreed
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> upon) understanding of what the namespaces mean. Which I think makes
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> sense, because the alternate (sharing *nothing* at all) would mean
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> that there would be no way to understand each other.
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> -James
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Gwen
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > On Tue, Nov 8, 2016 at 4:23 PM, radai <[email protected]> wrote:
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> +1 for sean's document. it covers pretty much all the trade-offs
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> and provides concrete figures to argue about :-)
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> (nit-picking - used the same xkcd twice, also trove has been
>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> superceded
>>>     >     > > >> > >> >
>>>     >     > > >> >
>>>     >     > > >> >
>>>     >     > > >> >
>>>     >     > > >> > --
>>>     >     > > >> > Gwen Shapira
>>>     >     > > >> > Product Manager | Confluent
>>>     >     > > >> > 650.450.2760 | @gwenshap
>>>     >     > > >> > Follow us: Twitter | blog
>>>     >     > > >> >
>>>     >     > > >>
>>>     >     > > >>
>>>     >     > > >>
>>>     >     > > >> --
>>>     >     > > >> *Todd Palino*
>>>     >     > > >> Staff Site Reliability Engineer
>>>     >     > > >> Data Infrastructure Streaming
>>>     >     > > >>
>>>     >     > > >>
>>>     >     > > >>
>>>     >     > > >> linkedin.com/in/toddpalino
>>>     >     > > >>
>>>     >     > >
>>>     >     > >
>>>     >     > >
>>>     >     > > --
>>>     >     > > Gwen Shapira
>>>     >     > > Product Manager | Confluent
>>>     >     > > 650.450.2760 | @gwenshap
>>>     >     > > Follow us: Twitter | blog
>>>     >     > >
>>>     >     >
>>>     >
>>>     >
>>>     > The information contained in this email is strictly confidential and for
>>>     > the use of the addressee only, unless otherwise indicated. If you are not
>>>     > the intended recipient, please do not read, copy, use or disclose to others
>>>     > this message or any attachment. Please also notify the sender by replying
>>>     > to this email or by telephone (+44(020 7896 0011) and then delete the email
>>>     > and any copies of it. Opinions, conclusion (etc) that do not relate to the
>>>     > official business of this company shall be understood as neither given nor
>>>     > endorsed by it. IG is a trading name of IG Markets Limited (a company
>>>     > registered in England and Wales, company number 04008957) and IG Index
>>>     > Limited (a company registered in England and Wales, company number
>>>     > 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>>>     > London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>>>     > Index Limited (register number 114059) are authorised and regulated by the
>>>     > Financial Conduct Authority.
>>>     >
>>>
>>>
>>>
>>
>>
>
