[
https://issues.apache.org/jira/browse/KAFKA-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018071#comment-16018071
]
ASF GitHub Bot commented on KAFKA-4785:
---------------------------------------
GitHub user jeyhunkarimov opened a pull request:
https://github.com/apache/kafka/pull/3106
KAFKA-4785: Records from internal repartitioning topics should always use
RecordMetadataTimestampExtractor
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jeyhunkarimov/kafka KAFKA-4785
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/3106.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3106
----
commit 98278baae63638efc2b750b1d3f12e936d845f18
Author: Jeyhun Karimov <[email protected]>
Date: 2017-05-19T21:39:50Z
Records from internal repartitioning topics should always use
RecordMetadataTimestampExtractor
----
> Records from internal repartitioning topics should always use
> RecordMetadataTimestampExtractor
> ----------------------------------------------------------------------------------------------
>
> Key: KAFKA-4785
> URL: https://issues.apache.org/jira/browse/KAFKA-4785
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.0
> Reporter: Matthias J. Sax
> Assignee: Jeyhun Karimov
>
> Users can specify what timestamp extractor should be used to decode the
> timestamp of input topic records. As long as RecordMetadataTimestamp or
> WallclockTime is use this is fine.
> However, for custom timestamp extractors it might be invalid to apply this
> custom extractor to records received from internal repartitioning topics. The
> reason is that Streams sets the current "stream time" as record metadata
> timestamp explicitly before writing to intermediate repartitioning topics
> because this timestamp should be use by downstream subtopologies. A custom
> timestamp extractor might return something different breaking this assumption.
> Thus, for reading data from intermediate repartitioning topic, the configured
> timestamp extractor should not be used, but the record's metadata timestamp
> should be extracted as record timestamp.
> In order to leverage the same behavior for intermediate user topic (ie, used
> in {{through()}}) we can leverage KAFKA-4144 and internally set an extractor
> for those "intermediate sources" that returns the record's metadata timestamp
> in order to overwrite the global extractor from {{StreamsConfig}} (ie, set
> {{FailOnInvalidTimestampExtractor}}).
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)