[jira] [Created] (KAFKA-5632) Message headers not supported by Kafka Streams

CJ Woolard (JIRA) Mon, 24 Jul 2017 08:19:25 -0700

CJ Woolard created KAFKA-5632:
---------------------------------

             Summary: Message headers not supported by Kafka Streams
                 Key: KAFKA-5632
                 URL: https://issues.apache.org/jira/browse/KAFKA-5632
             Project: Kafka
          Issue Type: Bug
          Components: consumer
    Affects Versions: 0.11.0.0
            Reporter: CJ Woolard
            Priority: Minor

The new message headers functionality introduced in Kafka 0.11.0.0
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers)
do not appear to be respected by Kafka Streams, specifically message headers
set on input topics to a Kafka Streams topology do not get propagated to the
corresponding output topics of the topology.

It appears that it's at least partially due to the SourceNodeRecordDeserializer
not properly respecting message headers here:

https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/SourceNodeRecordDeserializer.java#L60

where it isn't using the new ConsumerRecord constructor which supports headers:

https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecord.java#L122

For additional background here is the line before which we noticed that we
still have the message headers, and after which we no longer have them:

https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordQueue.java#L93

In terms of a potential solution there are a few different scenarios to
consider:
1. A stream processor with one input and one output, i.e. 1-to-1, (A
map/transformation for example). This is the simplest case, and one proposal
would be to directly propagate any message headers from input to output.
2. A stream processor with one input and many outputs, i.e. 1-to-many, (A
flatmap step for example).
3. A stream processor with multiple inputs per output, i.e. many-to-1, (A join
step for example).
One proposal for supporting all possible scenarios would be to expose overloads
in the Kafka Streams DSL methods to allow the user the ability to specify logic
for handling of message headers.

For additional background the use case is similar to a distributed tracing use
case, where the following previous work may be useful for aiding in design
discussions:
Dapper
(https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf)

or
Zipkin (https://github.com/openzipkin/zipkin)

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Created] (KAFKA-5632) Message headers not supported by Kafka Streams

Reply via email to