[
https://issues.apache.org/jira/browse/KAFKA-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989989#comment-15989989
]
Matthias J. Sax commented on KAFKA-4785:
----------------------------------------
IMHO, KAFKA-4219 is quite different:
This JIRA is a "bug guard" -- currently, if a custom TS extractor is provided,
users need to make sure that it returns the embedded record timestamp for
internal topics -- with this JIRA, we do this automatically and thus prevent
potential bugs in custom TS extractors. Thus, it's about _reading_.
KAFKA-4219, however, is about a "new feature" that allows assigning timestamps
differently on _write_. Not sure if this would be some API or config, or just a
change to how Streams assigns timestamps internally.
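To illustrate the _reading_ side, here is a minimal, self-contained sketch of the guard. It does not use the Kafka Streams API: the `Rec` class, the `extract` helper, and the idea of detecting internal topics by a `-repartition` name suffix are all assumptions made for this example, not the actual implementation.

```java
import java.util.function.ToLongFunction;

public class ExtractorGuardSketch {

    // Hypothetical stand-in for a consumed record: source topic, the timestamp
    // stored in the record metadata, and a timestamp embedded in the payload.
    static final class Rec {
        final String topic;
        final long metadataTimestamp;
        final long payloadTimestamp;

        Rec(String topic, long metadataTimestamp, long payloadTimestamp) {
            this.topic = topic;
            this.metadataTimestamp = metadataTimestamp;
            this.payloadTimestamp = payloadTimestamp;
        }
    }

    // Assumption for this sketch: internal repartitioning topics are
    // recognized by their name suffix.
    static boolean isInternalRepartitionTopic(String topic) {
        return topic.endsWith("-repartition");
    }

    // The guard: for internal topics, ignore the user-configured extractor and
    // return the metadata timestamp that was set on write.
    static long extract(Rec rec, ToLongFunction<Rec> configuredExtractor) {
        if (isInternalRepartitionTopic(rec.topic)) {
            return rec.metadataTimestamp;
        }
        return configuredExtractor.applyAsLong(rec);
    }

    public static void main(String[] args) {
        // A custom extractor that reads the timestamp from the payload.
        ToLongFunction<Rec> custom = r -> r.payloadTimestamp;

        Rec input = new Rec("orders", 100L, 42L);
        Rec internal = new Rec("my-app-agg-repartition", 100L, 42L);

        System.out.println(extract(input, custom));    // custom extractor applies
        System.out.println(extract(internal, custom)); // metadata timestamp wins
    }
}
```

With the guard in place, a custom extractor that only understands the payload format of the input topics can no longer return a wrong timestamp for records read from internal topics.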
> Records from internal repartitioning topics should always use
> RecordMetadataTimestampExtractor
> ----------------------------------------------------------------------------------------------
>
> Key: KAFKA-4785
> URL: https://issues.apache.org/jira/browse/KAFKA-4785
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.0
> Reporter: Matthias J. Sax
> Assignee: Jeyhun Karimov
>
> Users can specify what timestamp extractor should be used to decode the
> timestamp of input topic records. As long as RecordMetadataTimestamp or
> WallclockTime is used, this is fine.
> However, for custom timestamp extractors it might be invalid to apply this
> custom extractor to records received from internal repartitioning topics. The
> reason is that Streams sets the current "stream time" as record metadata
> timestamp explicitly before writing to intermediate repartitioning topics
> because this timestamp should be used by downstream subtopologies. A custom
> timestamp extractor might return something different, breaking this assumption.
> Thus, for reading data from intermediate repartitioning topics, the configured
> timestamp extractor should not be used, but the record's metadata timestamp
> should be extracted as record timestamp.
> To get the same behavior for intermediate user topics (i.e., those used
> in {{through()}}), we can leverage KAFKA-4144 and internally set an extractor
> for those "intermediate sources" that returns the record's metadata timestamp
> in order to overwrite the global extractor from {{StreamsConfig}} (i.e., set
> {{FailOnInvalidTimestampExtractor}}).