Hi Noam, see my answers in-line below.

> Would an option in the spec "includeHeaders" make sense? Then the user
> could indicate they wish to include headers as fields.
I don't think that would be enough. The header name is a string, but the header value is a byte array, so we would still need to be able to define a spec for each individual header. In that sense a header is no different from the key, except that there are multiple headers and each has a name. Now, in many cases there will be a whole range of headers that all use the same spec, or that are simply UTF-8 encoded strings. In that case we might want an easy way to apply one spec to multiple headers, or pre-defined specs for common encodings. There are also standards such as CloudEvents that already leverage Kafka headers, so it might be useful to think about how we would want to express those in Druid (see https://github.com/cloudevents/spec/blob/v1.0.1/kafka-protocol-binding.md). I've put a rough sketch of what I have in mind at the bottom of this message.

> The part that might be confusing in this setup is that the input specs
> existing today have repetition in them, like timestampSpec, which makes
> sense only in the root spec, but that could simply be ignored if provided
> (not very friendly, but again, overcoming it will require a bigger
> refactoring).

Agree, but it might be worth digging a little deeper here. Keep in mind that Kafka records define their own timestamp, so it might be useful to allow users to specify that the record timestamp should be used when needed. This is especially useful when consuming the output of Kafka Streams applications, which by default only set the Kafka record timestamp and don't include it in the record value or key. Ideally, we wouldn't want to have to make drastic API changes to support that in the future.
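For example (purely a sketch, not a proposal for final naming): if the parser exposed the Kafka record timestamp under some reserved column name, say "kafka.timestamp" (a placeholder, nothing like it exists today), the existing timestampSpec could simply point at it, and nothing else in the spec would have to change:

    "timestampSpec": {
      "column": "kafka.timestamp",
      "format": "millis"
    }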
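And here is the kind of shape I was imagining for the header question above. This is only a sketch to illustrate the idea; none of the field names ("headerFormat", "default", "headers", "binaryHex") exist today, and the encodings shown are just assumptions:

    "inputFormat": {
      "type": "kafka",
      "valueFormat": { "type": "json" },
      "keyFormat": { "type": "string", "encoding": "UTF-8" },
      "headerFormat": {
        "default": { "type": "string", "encoding": "UTF-8" },
        "headers": {
          "ce_id":         { "type": "string", "encoding": "UTF-8" },
          "ce_time":       { "type": "string", "encoding": "UTF-8" },
          "trace-span-id": { "type": "binaryHex" }
        }
      }
    }

The "default" entry would cover the common case where most headers are plain UTF-8 strings, while individual headers could still override it. Since the CloudEvents Kafka binding puts its attributes in "ce_"-prefixed headers as UTF-8 strings, a pre-defined spec for that convention could cover them without listing each header explicitly.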