Hi Noam, see my answers in-line below.

> Would an option in the spec "includeHeaders" make sense?
> Then the user could indicate they wish to include headers as fields.
>

I don't think that would be enough. The header name is a string, but the
header value is a byte array, so we would still need to be able to define a
spec for each individual header.
In that sense a header is no different from the key, except that there can
be multiple headers, and each one has a name.
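
For reference, this is roughly what that looks like on the consumer side;
the header iteration is standard Kafka client API, while the class/method
names and the blanket UTF-8 decoding are just an illustrative assumption:

import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.header.Header;

// Illustrative sketch only: each header has a String key but an opaque
// byte[] value, so decoding each one needs its own "spec" (charset, JSON,
// etc.), just like the key.
public final class HeaderDump {
    static void dumpHeaders(ConsumerRecord<byte[], byte[]> record) {
        for (Header header : record.headers()) {
            String name = header.key();   // the name is always a String
            byte[] raw = header.value();  // the value is just bytes (may be null)
            // Assuming UTF-8 only works for headers we know are encoded that way.
            String decoded = raw == null ? null : new String(raw, StandardCharsets.UTF_8);
            System.out.printf("%s -> %s%n", name, decoded);
        }
    }
}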

Now, in many cases there will be a whole range of headers that all use the
same spec, or that are simple UTF-8-encoded strings.
In that case we might want an easy way to apply one spec to multiple
headers, or pre-defined specs for common encodings.
There are standards out there such as CloudEvents, which already leverage
Kafka headers, so it might be useful to think about how we would want to
express those in Druid (see
https://github.com/cloudevents/spec/blob/v1.0.1/kafka-protocol-binding.md).
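
To make that concrete, here is a small sketch of reading CloudEvents
context attributes out of Kafka headers per the binding linked above
(attributes travel in headers prefixed with "ce_" as UTF-8 strings); the
class and method names are made up for illustration:

import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.header.Header;

// Sketch: a single "UTF-8 string" spec applied to every "ce_*" header would
// cover CloudEvents context attributes (ce_id, ce_source, ce_type, ...).
public final class CloudEventsHeaders {
    static Map<String, String> contextAttributes(ConsumerRecord<byte[], byte[]> record) {
        Map<String, String> attributes = new HashMap<>();
        for (Header header : record.headers()) {
            if (header.key().startsWith("ce_") && header.value() != null) {
                attributes.put(
                    header.key().substring("ce_".length()),
                    new String(header.value(), StandardCharsets.UTF_8));
            }
        }
        return attributes;
    }
}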

> The part that might be confusing in this setup is that the input specs
> existing today have repetition in them, like timestampSpec, which makes
> sense only in the root spec, but that could simply be ignored if provided
> (not very friendly, but again, overcoming it will require a bigger
> refactoring).
>

Agree, but it might be worth digging a little deeper here. Keep in mind
that Kafka records carry their own timestamp, so it might be useful to let
users choose to use the record timestamp when needed.
This is especially useful when consuming the output of Kafka Streams
applications, which by default only set the Kafka record timestamp and
don't include it in the record value or key.
Ideally, we wouldn't want to have to make drastic API changes to support
that in the future.
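
Just to illustrate where that timestamp lives (the helper name here is made
up; the consumer API calls are real):

import java.time.Instant;
import org.apache.kafka.clients.consumer.ConsumerRecord;

public final class RecordTimestamps {
    // For a Kafka Streams output record the event time is often only here,
    // not in the value, which is why a timestampSpec that can point at the
    // Kafka record timestamp would be handy.
    static Instant eventTime(ConsumerRecord<byte[], byte[]> record) {
        // timestamp() is epoch millis; timestampType() distinguishes
        // producer CreateTime from broker LogAppendTime.
        return Instant.ofEpochMilli(record.timestamp());
    }
}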
