[ https://issues.apache.org/jira/browse/FLINK-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139005#comment-15139005 ]
Shikhar Bhushan commented on FLINK-3375: ---------------------------------------- This would be super useful for me, as I currently have to unnecessarily use a parallelism of 30 since there are 30 partitions, when even parallelism=1 would suffice and works more efficiently. It would be great if the existing {{TimestampExtractor}} interface can be supported, or any other interface to allow for watermark to be determined in a different way than simply ascending -- in my case, the timestamps on a partition should be mostly ascending but the messages are produced from different machines so I need to account for small inconsistencies in their system clocks. Currently using this extractor: https://gist.github.com/shikhar/2d9306e2ebd8ca89728c > Allow Watermark Generation in the Kafka Source > ---------------------------------------------- > > Key: FLINK-3375 > URL: https://issues.apache.org/jira/browse/FLINK-3375 > Project: Flink > Issue Type: Improvement > Components: Kafka Connector > Affects Versions: 1.0.0 > Reporter: Stephan Ewen > Fix For: 1.0.0 > > > It is a common case that event timestamps are ascending inside one Kafka > Partition. Ascending timestamps are easy for users, because they are handles > by ascending timestamp extraction. > If the Kafka source has multiple partitions per source task, then the records > become out of order before timestamps can be extracted and watermarks can be > generated. > If we make the FlinkKafkaConsumer an event time source function, it can > generate watermarks itself. It would internally implement the same logic as > the regular operators that merge streams, keeping track of event time > progress per partition and generating watermarks based on the current > guaranteed event time progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)