[ 
https://issues.apache.org/jira/browse/FLINK-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139005#comment-15139005
 ] 

Shikhar Bhushan commented on FLINK-3375:
----------------------------------------

This would be super useful for me, as I currently have to unnecessarily use a 
parallelism of 30 since there are 30 partitions, when even parallelism=1 would 
suffice and works more efficiently.

It would be great if the existing {{TimestampExtractor}} interface can be 
supported, or any other interface to allow for watermark to be determined in a 
different way than simply ascending -- in my case, the timestamps on a 
partition should be mostly ascending but the messages are produced from 
different machines so I need to account for small inconsistencies in their 
system clocks. Currently using this extractor: 
https://gist.github.com/shikhar/2d9306e2ebd8ca89728c

> Allow Watermark Generation in the Kafka Source
> ----------------------------------------------
>
>                 Key: FLINK-3375
>                 URL: https://issues.apache.org/jira/browse/FLINK-3375
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kafka Connector
>    Affects Versions: 1.0.0
>            Reporter: Stephan Ewen
>             Fix For: 1.0.0
>
>
> It is a common case that event timestamps are ascending inside one Kafka 
> Partition. Ascending timestamps are easy for users, because they are handles 
> by ascending timestamp extraction.
> If the Kafka source has multiple partitions per source task, then the records 
> become out of order before timestamps can be extracted and watermarks can be 
> generated.
> If we make the FlinkKafkaConsumer an event time source function, it can 
> generate watermarks itself. It would internally implement the same logic as 
> the regular operators that merge streams, keeping track of event time 
> progress per partition and generating watermarks based on the current 
> guaranteed event time progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to