Hi On 1, How can you guarantee two separate implemented clients would add the headers in the same order we are not specifying an order at the protocol level (nor should we) with regards to keyA being ordered before keyB? We shouldn’t be expecting keyA to be always set before keyB.
On 2, I believe we have changed the naming based on feedback from Jason already, e.g. we don’t have “get” method that inferred O(1) performance, like wise nor “put” but we have an “append” On 3, in the KafkaProducer, I think we have mutability already, the value for time is changed if it is null, at the point of send: “ long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp(); “ As such the timestamp is already mutable, so what’s the difference here, we already have some mixed semantics. On timestamp. e.g. currently if I send to records with timestamp not set, the wire binary sent the value for the timestamp would be different, as such we have mutation for the same record. On 4, I think we should not expect not 1 or 2 headers, but infact 10’s of headers. This is the concern on immutable headers, whilst the append self-reference works nicely, what if someone needs to remove a header? Trying to get this moving: If we really wanted Immutable Headers and essentially you guys wont give +1 for it without. Whats the feeling for adding methods to ProducerRecord that does the boiler plate code or creating a new ProducerRecord with the altered new headers (appended or removed) inside. E.g. ProducerRecord { ProducerRecord append(Iterable<Headers> headersToAppend){ return new ProducerRecord(key, value, headers.append(headersToAppend), ….) } ProducerRecord remove(Iterable<Headers> headersToAppend){ return new ProducerRecord(key, value, headers.remove(headersToAppend), ….) } } Were the headers methods actually returns new objects, and the producer records methods create a new producer record with all the current values, but with the new modified headers. Then interceptors / code return this new object? Cheers Mike On 24/02/2017, 16:02, "isma...@gmail.com on behalf of Ismael Juma" <isma...@gmail.com on behalf of ism...@juma.me.uk> wrote: Hi Michael, Did you mean that you were happy to compromise to keep it mutable or immutable? You wrote the former, but it sounded from the sentence that it could have been a typo. So, my thoughts on this is that there are a few things to take into account: 1. Semantics 2. Simplicity of use (the common operations should be easy to do) 3. If it's easy to reason about and safe (immutability helps with this) 4. Efficiency (both memory and CPU usage) Regarding 1, I think it would be good to be very clear about the guarantees that we are providing. It seems that we are saying that keys are unordered, but what about the case where there are multiple values for the same key? It seems that for some use cases (e.g. lineage), it may be useful to add values to the same key while preserving the order. Regarding 2, I agree that it's useful to have methods in `Headers` for the very common use cases although we have to be careful with the naming to avoid giving the wrong impression. Also, when it comes to `Map.Entry`, I think I'd prefer a `toMap` method that simply gives the user a Map if that's what they want (it makes it clear that there's a conversion happening). Regarding 3, the concern I have if we make the headers mutable is that it seems to introduce some inconsistencies and potential edge cases. At the moment, it's safe to keep a ProducerRecord and resend it, for example. If the record is mutated by an interceptor, then this can lead to weird behaviour. Also, it seems inconsistent that one has to create a new ProducerRecord to modify the record timestamp, but that one has to mutate the record to add headers. It seems like one should either embrace immutability or mutability, but mixing both is not ideal. Regarding 4, for the cases where there are a small number of headers and/or 1 or 2 interceptors, it doesn't seem difficult to come up with reasonable implementations for mutable and immutable cases that do a good enough job. However, if the number of headers and interceptors is large, then care is needed for the immutable case to avoid unnecessary copying. A simple option that adds little memory overhead if we have Header instances is to simply add a self-reference to the previous Header in the linked list. Ismael On Thu, Feb 23, 2017 at 1:09 AM, Michael Pearce <michael.pea...@ig.com> wrote: > Im happy to compromise to keep it mutable but move to an append style api. > (as in guava interables concat) > > class Headers { > Headers append(Iterable<Header> headers); > } > > > I don’t think we’d want prepend, this would give the idea of guaranteed > ordering, when in actual fact we don’t provide that guarantee (.e.g one > client can put headerA, then headerB, but another could put headerB then > headerA, this shouldn’t cause issues), Also what if we changed to a hashmap > for the internal implementation, its just a bucket of entries no ordering. > I think we just need to provide an api to add/append headers. > > This ok? If so ill update KIP to record this. > > Cheers > Mike > > On 23/02/2017, 00:37, "Jason Gustafson" <ja...@confluent.io> wrote: > > The point about usability is fair. It's also reasonable to expect that > common use cases such as appending headers should be done efficiently. > > Perhaps we could compromise with something like this? > > class Headers { > Headers append(Iterable<Header> headers); > Headers prepend(Iterable<Header> headers); > } > > That retains ease of use while still giving ourselves some flexibility > in > the implementation. > > -Jason > > > On Wed, Feb 22, 2017 at 3:03 PM, Michael Pearce <michael.pea...@ig.com > > > wrote: > > > I wasn’t referring to the headers needing to be copied, im meaning > the > > fact we’d be forcing a new producer record to be created, with all > the > > contents copied. > > > > i.e what will happen is utility method will be created or end up > being > > used, which does this, and returns the new ProducerRecord instance. > > > > ProducerRecord addHeader(ProducerRecord record, Header header){ > > Return New ProducerRecord(record.key, record.value, > record.timestamp….., > > record.headers.concat(header)) > > } > > > > To me this seems ugly, but will be inevitable if we don’t make adding > > headers to existing records a simple clean method call. > > > > > > > > On 22/02/2017, 22:57, "Michael Pearce" <michael.pea...@ig.com> > wrote: > > > > Lazy init can achieve/avoid that. > > > > Re the concat, why don’t we implement that inside the Headers > rather > > than causing everyone to implement this as adding headers in > interceptors > > will be a dominant use case. We want a user friendly API. Having as > a user > > having to code this instead of having the headers handle this for me > seems > > redundant. > > > > On 22/02/2017, 22:34, "Jason Gustafson" <ja...@confluent.io> > wrote: > > > > I thought the argument was against creating the extra objects > > unnecessarily > > (i.e. if they were not accessed). And note that making the > Headers > > immutable doesn't necessarily mean that they need to be > copied: > > you can do > > a trick like Guava's Iterables.concat to add additional > headers > > without > > changing the underlying collections. > > > > -Jason > > > > On Wed, Feb 22, 2017 at 2:22 PM, Michael Pearce < > > michael.pea...@ig.com> > > wrote: > > > > > If the argument for not having a map holding the key, value > > pairs is due > > > to garbage creation of HashMap entry's, forcing the > creation of > > a whole new > > > producer record to simply add a head, surely is creating > a-lot > > more? > > > ________________________________________ > > > From: Jason Gustafson <ja...@confluent.io> > > > Sent: Wednesday, February 22, 2017 10:09 PM > > > To: dev@kafka.apache.org > > > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers > > > > > > The current producer interceptor API is this: > > > > > > ProducerRecord<K, V> onSend(ProducerRecord<K, V> record); > > > > > > So adding a header means creating a new ProducerRecord > with a > > new header > > > added to the current headers and returning it. Would that > not > > work? > > > > > > -Jason > > > > > > On Wed, Feb 22, 2017 at 1:45 PM, Michael Pearce < > > michael.pea...@ig.com> > > > wrote: > > > > > > > So how would you have this work if not mutable where > > interceptors would > > > > add headers? > > > > > > > > Sent using OWA for iPhone > > > > ________________________________________ > > > > From: Jason Gustafson <ja...@confluent.io> > > > > Sent: Wednesday, February 22, 2017 8:42:27 PM > > > > To: dev@kafka.apache.org > > > > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers > > > > > > > > I think the point on the mutability of Headers is worth > > discussing a > > > little > > > > more. As far as I can tell, once the ProducerRecord (or > > ConsumerRecord) > > > is > > > > constructed, there should be no need to further change > the > > headers. Is > > > that > > > > correct? If so, then why not enforce that that is the > case > > through the > > > API? > > > > One problem with mutability it that it constrains the > > implementation of > > > > Headers. For example, if we were backing with a byte > slice, > > would we > > > recopy > > > > the bytes if a header is added or would we maintain a > satellite > > > collection > > > > of added records. Seems not great either way. If we > really > > think > > > mutability > > > > is needed, perhaps we could add a method to headers to > convert > > it to a > > > > mutable type (e.g. a Headers.toMap)? > > > > > > > > I'm also with Ismael about exposing Headers.get(). I > thought > > it might > > > make > > > > sense to have a method like this instead: > > > > > > > > Iterable<Header> findMatching(Pattern pattern); > > > > > > > > This makes the (potential) need to scan the headers > clear in > > the API. I'd > > > > also be fine exposing no getter at all. In general, Ï > think > > it's good to > > > > start with a minimalistic API and work from there since > it's > > always > > > easier > > > > to add methods than remove them. > > > > > > > > -Jason > > > > > > > > On Wed, Feb 22, 2017 at 9:16 AM, Michael Pearce < > > michael.pea...@ig.com> > > > > wrote: > > > > > > > > > > > > > > Hi Ismael > > > > > > > > > > On point 1, > > > > > > > > > > Sure makes sense will update shortly. > > > > > > > > > > On point 2, > > > > > > > > > > Setter/getter typical to properties/headers api’s > > traditionally are map > > > > > styled interfaces and what I believe is most expected > styled > > thus the > > > > Key, > > > > > Value setter. > > > > > Also it would mean rather than an interface, we would > be > > making our > > > > > internal header impl object we have for the array, > exposed. > > E.g. if we > > > > had > > > > > a Map really this would be Map.Entry interface, this > is the > > same > > > reasons > > > > on > > > > > the map interface I cannot actually make the > underlying Node > > object > > > > that’s > > > > > the implementation for HashMap, so that internals can > be > > changed. > > > > > > > > > > > > > > > On point 3, > > > > > > > > > > I think it people do expect it to be performant, thus > why > > originally > > > > > concern I raised with using an array for to me is an > early > > memory > > > > > optimisation. I think the user experience of > > properties/headers is on a > > > > > get/set model. This is why its important we have > > encapsulated logic > > > that > > > > > then allows us to change this to a map, if this > becomes and > > issue, and > > > > the > > > > > memory overhead of hashmap is less so. > > > > > > > > > > > > > > > > > > > > > > > > > On 22/02/2017, 15:56, "isma...@gmail.com on behalf of > > Ismael Juma" < > > > > > isma...@gmail.com on behalf of ism...@juma.me.uk> > wrote: > > > > > > > > > > Hi all, > > > > > > > > > > Great to see the progress that has been achieved > on this > > one. :) A > > > > few > > > > > comments regarding the APIs (I'm still reviewing > the > > message format > > > > > changes): > > > > > > > > > > 1. Nit: `getHeaders` in `ProducerRecord` and > > `ConsumerRecord` > > > should > > > > be > > > > > named `headers` (we avoid the `get` prefix in > Kafka) > > > > > > > > > > 2. The `Headers` class is mutable (there's an `add` > > method). Does > > > it > > > > > need > > > > > to be? If so, it would be good to explain why. > Related > > to that, we > > > > > should > > > > > also explain the thinking around thread-safety. If > we > > keep the > > > `add` > > > > > method, it may make sense for it to take a `Header` > > (that way we > > > can > > > > > add > > > > > things to `Header` without changing the interface). > > > > > > > > > > 3. Do we need the `Headers.get()` method? People > usually > > assume > > > that > > > > > `get` > > > > > would be efficient, but depending on the > implementation > > (the > > > current > > > > > proposal states that an array would be used), it > may not > > be. If we > > > > > expect > > > > > the number of headers to be small, it doesn't > matter > > though. > > > > > > > > > > Ismael > > > > > > > > > > On Tue, Feb 21, 2017 at 6:38 PM, Michael Pearce < > > > > michael.pea...@ig.com > > > > > > > > > > > wrote: > > > > > > > > > > > Hi Jason, > > > > > > > > > > > > Have converted the interface/api bullets into > > interface code > > > > > snippets. > > > > > > > > > > > > Agreed implementation won’t take too long. We > have > > early versions > > > > > already. > > > > > > Maybe a week before you think about merging I > would > > assume it > > > would > > > > > be more > > > > > > stabilised? I was thinking then we could fork > from > > your confluent > > > > > branch, > > > > > > making and then holding KIP-82 changes in a patch > > file, that we > > > can > > > > > then > > > > > > re-fork from apache once KIP98 final is merged, > and > > apply patch > > > > with > > > > > last > > > > > > minute changes. > > > > > > > > > > > > Cheers > > > > > > Mike > > > > > > > > > > > > > > > > > > On 22/02/2017, 00:56, "Jason Gustafson" < > > ja...@confluent.io> > > > > wrote: > > > > > > > > > > > > Hey Michael, > > > > > > > > > > > > Awesome. I have a minor request. The APIs are > > currently > > > > > documented as a > > > > > > wiki list. Would you mind adding a code > snippet > > instead? > > > It's a > > > > > bit > > > > > > easier > > > > > > to process. > > > > > > > > > > > > How will be best to manage this, as we will > > obviously build > > > off > > > > > your > > > > > > KIP’s > > > > > > > protocol changes, to avoid a merge hell, > should > > we branch > > > > from > > > > > your > > > > > > branch > > > > > > > in the confluent repo or is it worth > having a > > KIP-98 > > > special > > > > > branch > > > > > > in the > > > > > > > apache git, that we can branch/fork from? > > > > > > > > > > > > > > > > > > I was thinking about this also. Ideally we'd > like > > to get the > > > > > changes > > > > > > in as > > > > > > close together as possible since we only > want one > > magic bump > > > > and > > > > > some > > > > > > users > > > > > > deploy trunk. The level of effort to change > the > > format for > > > > > headers > > > > > > seems > > > > > > not too high. Do you agree? My guess is that > the > > KIP-98 > > > message > > > > > format > > > > > > patch will take 2-3 weeks to review before we > > merge to trunk, > > > > so > > > > > you > > > > > > could > > > > > > hold off implementing until that patch has > somewhat > > > stabilized. > > > > > That > > > > > > would > > > > > > save some potential rebase pain. > > > > > > > > > > > > -Jason > > > > > > > > > > > > > > > > > > The information contained in this email is > strictly > > confidential > > > > and > > > > > for > > > > > > the use of the addressee only, unless otherwise > > indicated. If you > > > > > are not > > > > > > the intended recipient, please do not read, > copy, use > > or disclose > > > > to > > > > > others > > > > > > this message or any attachment. Please also > notify the > > sender by > > > > > replying > > > > > > to this email or by telephone (+44(020 7896 0011) > and > > then delete > > > > the > > > > > > email and any copies of it. Opinions, conclusion > (etc) > > that do > > > not > > > > > relate > > > > > > to the official business of this company shall be > > understood as > > > > > neither > > > > > > given nor endorsed by it. IG is a trading name > of IG > > Markets > > > > Limited > > > > > (a > > > > > > company registered in England and Wales, company > > number 04008957) > > > > > and IG > > > > > > Index Limited (a company registered in England > and > > Wales, company > > > > > number > > > > > > 01190902). Registered address at Cannon Bridge > House, > > 25 Dowgate > > > > > Hill, > > > > > > London EC4R 2YA. Both IG Markets Limited > (register > > number 195355) > > > > > and IG > > > > > > Index Limited (register number 114059) are > authorised > > and > > > regulated > > > > > by the > > > > > > Financial Conduct Authority. > > > > > > > > > > > > > > > > > > > > > The information contained in this email is strictly > > confidential and > > > for > > > > > the use of the addressee only, unless otherwise > indicated. > > If you are > > > not > > > > > the intended recipient, please do not read, copy, use > or > > disclose to > > > > others > > > > > this message or any attachment. Please also notify the > > sender by > > > replying > > > > > to this email or by telephone (+44(020 7896 0011) and > then > > delete the > > > > email > > > > > and any copies of it. Opinions, conclusion (etc) that > do not > > relate to > > > > the > > > > > official business of this company shall be understood > as > > neither given > > > > nor > > > > > endorsed by it. IG is a trading name of IG Markets > Limited > > (a company > > > > > registered in England and Wales, company number > 04008957) > > and IG Index > > > > > Limited (a company registered in England and Wales, > company > > number > > > > > 01190902). Registered address at Cannon Bridge House, > 25 > > Dowgate Hill, > > > > > London EC4R 2YA. Both IG Markets Limited (register > number > > 195355) and > > > IG > > > > > Index Limited (register number 114059) are authorised > and > > regulated by > > > > the > > > > > Financial Conduct Authority. > > > > > > > > > The information contained in this email is strictly > > confidential and for > > > > the use of the addressee only, unless otherwise > indicated. If > > you are not > > > > the intended recipient, please do not read, copy, use or > > disclose to > > > others > > > > this message or any attachment. Please also notify the > sender > > by replying > > > > to this email or by telephone (+44(020 7896 0011) and > then > > delete the > > > email > > > > and any copies of it. Opinions, conclusion (etc) that do > not > > relate to > > > the > > > > official business of this company shall be understood as > > neither given > > > nor > > > > endorsed by it. IG is a trading name of IG Markets > Limited (a > > company > > > > registered in England and Wales, company number > 04008957) and > > IG Index > > > > Limited (a company registered in England and Wales, > company > > number > > > > 01190902). Registered address at Cannon Bridge House, 25 > > Dowgate Hill, > > > > London EC4R 2YA. Both IG Markets Limited (register number > > 195355) and IG > > > > Index Limited (register number 114059) are authorised and > > regulated by > > > the > > > > Financial Conduct Authority. > > > > > > > The information contained in this email is strictly > confidential > > and for > > > the use of the addressee only, unless otherwise indicated. > If > > you are not > > > the intended recipient, please do not read, copy, use or > > disclose to others > > > this message or any attachment. Please also notify the > sender by > > replying > > > to this email or by telephone (+44(020 7896 0011) and then > > delete the email > > > and any copies of it. Opinions, conclusion (etc) that do > not > > relate to the > > > official business of this company shall be understood as > neither > > given nor > > > endorsed by it. IG is a trading name of IG Markets Limited > (a > > company > > > registered in England and Wales, company number 04008957) > and IG > > Index > > > Limited (a company registered in England and Wales, company > > number > > > 01190902). Registered address at Cannon Bridge House, 25 > Dowgate > > Hill, > > > London EC4R 2YA. Both IG Markets Limited (register number > > 195355) and IG > > > Index Limited (register number 114059) are authorised and > > regulated by the > > > Financial Conduct Authority. > > > > > > > > > > > > > The information contained in this email is strictly confidential and > for > > the use of the addressee only, unless otherwise indicated. If you > are not > > the intended recipient, please do not read, copy, use or disclose to > others > > this message or any attachment. Please also notify the sender by > replying > > to this email or by telephone (+44(020 7896 0011) and then delete > the email > > and any copies of it. Opinions, conclusion (etc) that do not relate > to the > > official business of this company shall be understood as neither > given nor > > endorsed by it. IG is a trading name of IG Markets Limited (a company > > registered in England and Wales, company number 04008957) and IG > Index > > Limited (a company registered in England and Wales, company number > > 01190902). Registered address at Cannon Bridge House, 25 Dowgate > Hill, > > London EC4R 2YA. Both IG Markets Limited (register number 195355) > and IG > > Index Limited (register number 114059) are authorised and regulated > by the > > Financial Conduct Authority. > > > > > The information contained in this email is strictly confidential and for > the use of the addressee only, unless otherwise indicated. If you are not > the intended recipient, please do not read, copy, use or disclose to others > this message or any attachment. Please also notify the sender by replying > to this email or by telephone (+44(020 7896 0011) and then delete the email > and any copies of it. Opinions, conclusion (etc) that do not relate to the > official business of this company shall be understood as neither given nor > endorsed by it. IG is a trading name of IG Markets Limited (a company > registered in England and Wales, company number 04008957) and IG Index > Limited (a company registered in England and Wales, company number > 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill, > London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG > Index Limited (register number 114059) are authorised and regulated by the > Financial Conduct Authority. > The information contained in this email is strictly confidential and for the use of the addressee only, unless otherwise indicated. If you are not the intended recipient, please do not read, copy, use or disclose to others this message or any attachment. Please also notify the sender by replying to this email or by telephone (+44(020 7896 0011) and then delete the email and any copies of it. Opinions, conclusion (etc) that do not relate to the official business of this company shall be understood as neither given nor endorsed by it. IG is a trading name of IG Markets Limited (a company registered in England and Wales, company number 04008957) and IG Index Limited (a company registered in England and Wales, company number 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill, London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG Index Limited (register number 114059) are authorised and regulated by the Financial Conduct Authority.