Hi Ewen,

On the point of JMS: the predefined/standardised JMS and JMSX headers have
predefined types, so these can be serialised/deserialised accordingly.
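
To make that concrete, here is a rough sketch of such a lookup. The class name
JmsHeaderTypes and the typeOf helper are made up for illustration and are not
part of the KIP or of Connect; the types listed follow the getters and JMSX
properties defined by the JMS specification, and the list is not exhaustive.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: maps well-known JMS header
// names to the Java types the JMS specification defines for them.
final class JmsHeaderTypes {
    static final Map<String, Class<?>> PREDEFINED;

    static {
        Map<String, Class<?>> m = new LinkedHashMap<>();
        m.put("JMSCorrelationID", String.class);
        m.put("JMSMessageID", String.class);
        m.put("JMSType", String.class);
        m.put("JMSTimestamp", long.class);
        m.put("JMSExpiration", long.class);
        m.put("JMSPriority", int.class);
        m.put("JMSDeliveryMode", int.class);
        m.put("JMSRedelivered", boolean.class);
        m.put("JMSXGroupID", String.class);
        m.put("JMSXGroupSeq", int.class);
        m.put("JMSXDeliveryCount", int.class);
        PREDEFINED = Collections.unmodifiableMap(m);
    }

    private JmsHeaderTypes() {
    }

    // Returns the predefined type, or String for any header not listed above
    // (the String fallback is an assumption of this sketch).
    static Class<?> typeOf(String headerName) {
        return PREDEFINED.getOrDefault(headerName, String.class);
    }
}
```

A converter could consult typeOf(...) before choosing how to
serialise/deserialise a given header value.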

Custom JMS headers, agreed, could be a bit more difficult, but on the 80/20
rule I would agree they're mostly string values, and since you can hold bytes
as a string anyhow, defaulting to that wouldn't cause any issue.

But I think we may easily be able to do one better.

Obviously you can override/configure the headers converter, but could we
supply a default converter that takes a config file with a key-to-type mapping?

That would allow people to define/declare a header key with its expected type
in some property file, supporting String, byte[] and primitives. Undefined
headers would just default to either String or byte[].

We could also predefine known headers like the JMS ones mentioned above.

E.g.

AwesomeHeader1=boolean
AwesomeHeader2=long
JMSCorrelationID=String
JMSXGroupID=String
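
A minimal sketch of how a default converter might use such a mapping. The
class name TypedHeaderDecoder is hypothetical, and the wire format is purely
an assumption for illustration (1 byte for boolean, big-endian fixed-width
bytes for numbers, UTF-8 for String); the KIP does not define either.
Undeclared keys fall back to String here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical sketch, not part of KIP-145 or the Connect API.
final class TypedHeaderDecoder {
    private final Properties keyToType;

    TypedHeaderDecoder(Properties keyToType) {
        this.keyToType = keyToType;
    }

    // Decodes raw header bytes according to the type declared for the key.
    // Keys with no declared type fall back to String.
    Object decode(String key, byte[] value) {
        if (value == null) {
            return null;
        }
        // trim() because Properties.load keeps trailing whitespace in values.
        String type = keyToType.getProperty(key, "String").trim();
        switch (type) {
            case "boolean":
                if (value.length != 1) {
                    throw new IllegalArgumentException(
                            "boolean header must be 1 byte: " + key);
                }
                return value[0] != 0;
            case "int":
                // Assumed encoding: 4 bytes, big-endian.
                return ByteBuffer.wrap(value).getInt();
            case "long":
                // Assumed encoding: 8 bytes, big-endian.
                return ByteBuffer.wrap(value).getLong();
            case "double":
                return ByteBuffer.wrap(value).getDouble();
            case "byte[]":
                return value;
            case "String":
                return new String(value, StandardCharsets.UTF_8);
            default:
                throw new IllegalArgumentException(
                        "Unsupported type '" + type + "' for header " + key);
        }
    }
}
```

The Properties instance could be filled from a file like the one above with
Properties.load(...). A too-short numeric value surfaces as a
BufferUnderflowException, which a real converter would probably want to wrap
in something more descriptive.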


What do you think?


Cheers
Mike

> On 2 May 2017, at 18:45, Ewen Cheslack-Postava <e...@confluent.io> wrote:
> 
> A couple of thoughts:
> 
> First, agreed that we definitely want to expose header functionality. Thank
> you Mike for starting the conversation! Even if Connect doesn't do anything
> special with it, there's value in being able to access/set headers.
> 
> On motivation -- I think there are much broader use cases. When thinking
> about exposing headers, I'd actually use Replicator as only a minor
> supporting case. The reason is that it is a very uncommon case where there
> is zero impedance mismatch between the source and sink of the data since
> they are both Kafka. This means you don't need to think much about data
> formats/serialization. I think the JMS use case is a better example since
> JMS headers and Kafka headers don't quite match up. Here's a quick list of
> use cases I can think of off the top of my head:
> 
> 1. Include headers from other systems that support them: JMS (or really any
> MQ), HTTP
> 2. Other connector-specific headers. For example, from JDBC maybe the table
> the data comes from is a header; for a CDC connector you might include the
> binlog offset as a header.
> 3. Interceptor/SMT-style use cases for annotating things like provenance of
> data:
> 3a. Generically w/ user-supplied data like data center, host, app ID, etc.
> 3b. Kafka Connect framework level info, such as the connector/task
> generating the data
> 
> On deviation from Connect's model -- to be honest, KIP-82 also deviates
> quite substantially from how Kafka handles data already, so we may struggle
> a bit to reconcile the two. (In particular, headers specify some structure
> and enforce strings specifically for header keys, but then require you to
> do serialization of header values yourself...).
> 
> I think the use cases I mentioned above may also need different approaches
> to how the data in headers are handled. As Gwen mentions, if we expose the
> headers to Connectors, they need to have some idea of the format and the
> reason for byte[] values in KIP-82 is to leave that decision up to the
> organization using them. But without knowing the format, connectors can't
> really do anything with them -- if a source connector assumes a format,
> they may generate data incompatible with the format used by the rest of the
> organization. On the other hand, I have a feeling most people will just use
> <String, String> headers, so allowing connectors to embed arbitrarily
> complex data may not work out well in practice. Or maybe we leave it
> flexible, most people default to using StringConverter for the serializer
> and Connectors will end up defaulting to that just for compatibility...
> 
> I'm not sure I have a real proposal yet, but I do think understanding the
> impact of using a Converter for headers would be useful, and we might want
> to think about how this KIP would fit in with transformations (or if that
> is something that can be deferred, handled separately from the existing
> transformations, etc).
> 
> -Ewen
> 
> On Mon, May 1, 2017 at 11:52 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
> 
>> Hi Gwen,
>> 
>> The intent here was to allow tools that perform a similar role to
>> MirrorMaker, replicating messages from one cluster to another. E.g. like
>> MirrorMaker, they should just be taking and transferring the headers as is.
>> 
>> We don't actually use this inside our company, so not exposing this isn't
>> an issue for us. I just believe there are companies like Confluent who have
>> tools like Replicator that do.
>> 
>> And as good citizens I think we should complete the work and expose the
>> headers the same as in the record, to at least allow them to replicate the
>> messages as is. Note Steph seems to want it.
>> 
>> Cheers
>> Mike
>> 
>> ________________________________________
>> From: Gwen Shapira <g...@confluent.io>
>> Sent: Monday, May 1, 2017 2:36:34 PM
>> To: dev@kafka.apache.org
>> Subject: Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect
>> 
>> Hi,
>> 
>> I'm excited to see the community expanding Connect in this direction!
>> Headers + Transforms == Fun message routing.
>> 
>> I like how clean the proposal is, but I'm concerned that it kinda deviates
>> from how Connect handles data elsewhere.
>> Unlike Kafka, Connect doesn't look at all data as byte arrays; we have
>> converters that take data in specific formats (JSON, Avro) and turn it
>> into Connect data types (defined in the Data API). I think it will be more
>> consistent for connector developers to also get headers as some kind of
>> structured or semi-structured data (and to expand the converters to handle
>> header conversions as well).
>> This will allow for Connect's separation of concerns - Connector developers
>> don't worry about data formats (because they get the internal connect
>> objects) and Converters do all the data format work.
>> 
>> Another thing, in my experience, APIs work better if they are put into use
>> almost immediately - so difficulties in using the APIs are immediately
>> surfaced. Are you planning any connectors that will use this feature (not
>> necessarily in Kafka, just in general)? Or perhaps we can think of a way to
>> expand Kafka's file connectors so they'll use headers somehow (can't think
>> of anything, but maybe?).
>> 
>> Gwen
>> 
>> On Sat, Apr 29, 2017 at 12:12 AM, Michael Pearce <michael.pea...@ig.com>
>> wrote:
>> 
>>> Hi All,
>>> 
>>> Now that KIP-82 is committed, I would like to discuss extending the work
>>> to expose it in Kafka Connect, its primary focus being that connectors
>>> that do similar tasks to MirrorMaker, either Kafka->Kafka or JMS->Kafka,
>>> would be able to replicate the headers.
>>> It would be ideal but not mandatory for this to go into the 0.11 release
>>> so it is available on day one of headers being available.
>>> 
>>> Please find the KIP here:
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect
>>> 
>>> Please find an initial implementation as a PR here:
>>> https://github.com/apache/kafka/pull/2942
>>> 
>>> Kind Regards
>>> Mike
>>> The information contained in this email is strictly confidential and for
>>> the use of the addressee only, unless otherwise indicated. If you are not
>>> the intended recipient, please do not read, copy, use or disclose to
>>> others this message or any attachment. Please also notify the sender by
>>> replying to this email or by telephone (+44(020 7896 0011) and then delete
>>> the email and any copies of it. Opinions, conclusion (etc) that do not
>>> relate to the official business of this company shall be understood as
>>> neither given nor endorsed by it. IG is a trading name of IG Markets
>>> Limited (a company registered in England and Wales, company number
>>> 04008957) and IG Index Limited (a company registered in England and Wales,
>>> company number 01190902). Registered address at Cannon Bridge House, 25
>>> Dowgate Hill, London EC4R 2YA. Both IG Markets Limited (register number
>>> 195355) and IG Index Limited (register number 114059) are authorised and
>>> regulated by the Financial Conduct Authority.
>>> 
>> 
>> 
>> 
>> --
>> *Gwen Shapira*
>> Product Manager | Confluent
>> 650.450.2760 | @gwenshap
>> Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
>> <http://www.confluent.io/blog>
>> 
