As others have laid out, I see strong reasons for a common message
metadata structure for the Kafka ecosystem.  In particular, I've seen that
even within a single organization, infrastructure teams often own the
message metadata while application teams own the application-level data
format.  Allowing metadata and content to have different structure and
evolve separately is very helpful for this.  Also, I think there's a lot of
value to having a common metadata structure shared across the Kafka
ecosystem so that tools which leverage metadata can more easily be shared
across organizations and integrated together.

The question is, where does the metadata structure belong?  Here's my take:

We change the Kafka wire and on-disk format from a (key, value) model to
a (key, metadata, value) model where all three are byte arrays from the
broker's point of view.  The primary reason for this is that it provides a
backward-compatible migration path.  Producers can start populating
metadata fields before all consumers understand the metadata structure.
For people who already have custom envelope structures, they can populate
their existing structure and the new structure for a while as they make the
transition.
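
To make the shape concrete, here's a minimal sketch of what a
(key, metadata, value) record could look like on the client side.  The
class name and fields are hypothetical, not a proposed Kafka API; the
point is just that metadata is a third opaque byte array that old
consumers can ignore:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a (key, metadata, value) record. All three
// fields are opaque byte arrays from the broker's point of view; only
// clients assign them meaning.
public class TripletRecord {
    public final byte[] key;
    public final byte[] metadata; // opaque to the broker, ignorable by old consumers
    public final byte[] value;

    public TripletRecord(byte[] key, byte[] metadata, byte[] value) {
        this.key = key;
        this.metadata = metadata;
        this.value = value;
    }

    public static void main(String[] args) {
        // An old consumer reads value exactly as before, never touching metadata.
        TripletRecord r = new TripletRecord(
            "order-42".getBytes(StandardCharsets.UTF_8),
            "trace-id=abc123".getBytes(StandardCharsets.UTF_8),
            "{\"amount\": 10}".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(r.value, StandardCharsets.UTF_8));
    }
}
```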

We could stop there and let clients plug in a KeySerializer,
MetadataSerializer, and ValueSerializer, but I think it would also be
useful to have a default MetadataSerializer that implements a key-value
model similar to AMQP or HTTP headers.  Or we could go further and prescribe a
Map<String, byte[]> or Map<String, String> data model for headers in the
clients (while still allowing custom serialization of the header data
model).
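
As an illustration of what a default serializer for that header model
might do, here's a rough sketch that round-trips a Map<String, byte[]>
through a simple length-prefixed encoding.  The class name and wire
layout are mine, chosen for clarity, not a proposal for the actual
on-wire format:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical default metadata serializer: encodes a Map<String, byte[]>
// header model as a count followed by length-prefixed key/value entries.
public class MapMetadataSerializer {

    public static byte[] serialize(Map<String, byte[]> headers) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(headers.size());
        for (Map.Entry<String, byte[]> e : headers.entrySet()) {
            byte[] k = e.getKey().getBytes(StandardCharsets.UTF_8);
            out.writeInt(k.length);
            out.write(k);
            out.writeInt(e.getValue().length);
            out.write(e.getValue());
        }
        return bos.toByteArray();
    }

    public static Map<String, byte[]> deserialize(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int n = in.readInt();
        Map<String, byte[]> headers = new LinkedHashMap<>(); // preserve insertion order
        for (int i = 0; i < n; i++) {
            byte[] k = new byte[in.readInt()];
            in.readFully(k);
            byte[] v = new byte[in.readInt()];
            in.readFully(v);
            headers.put(new String(k, StandardCharsets.UTF_8), v);
        }
        return headers;
    }
}
```

Note that keeping the header values as byte[] (rather than String) lets
infrastructure teams carry binary metadata like trace contexts without
forcing an encoding on application teams.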

I think this would address Radai's concerns:
1. Existing client code would not need to be updated to know about the
container.
2. Middleware-friendly clients would have a standard header data model to
work with.
3. A KIP is required both because of the broker changes and because of the
client API changes.

Cheers,

Roger


On Wed, Nov 2, 2016 at 4:38 PM, radai <radai.rosenbl...@gmail.com> wrote:

> my biggest issues with a "standard" wrapper format:
>
> 1. _ALL_ client _CODE_ (as opposed to kafka lib version) must be updated to
> know about the container, because any old naive code trying to directly
> deserialize its own payload would keel over and die (it needs to know to
> deserialize a container, and then dig in there for its payload).
> 2. in order to write middleware-friendly clients that utilize such a
> container one would basically have to write their own producer/consumer API
> on top of the open source kafka one.
> 3. if you were going to go with a wrapper format you really dont need to
> bother with a kip (just open source your own client stack from #2 above so
> others could stop re-inventing it)
>
> On Wed, Nov 2, 2016 at 4:25 PM, James Cheng <wushuja...@gmail.com> wrote:
>
> > How exactly would this work? Or maybe that's out of scope for this email.
>
