Re: How to understand watermark creation for Kafka partitions

2019-12-13 Thread vino yang
Hi Alex,

>> But why is it also said that watermarks are created for each Kafka topic partition?

IMO, the official documentation explains the reason. I have copied the relevant part here:

When using Apache Kafka as a data source, each Kafka partition may have a
simple event time pattern (ascending timestamps or bounded out-of-orderness).
However, when consuming streams from Kafka, multiple partitions often get
consumed in parallel, interleaving the events from the partitions and
destroying the per-partition patterns (this is inherent in how Kafka’s
consumer clients work).

In that case, you can use Flink’s Kafka-partition-aware watermark
generation. Using that feature, watermarks are generated inside the Kafka
consumer, per Kafka partition, and the per-partition watermarks are merged
in the same way as watermarks are merged on stream shuffles.
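For illustration, here is a minimal sketch (Flink 1.9-era DataStream API) of what the partition-aware setup looks like. The key point is to call assignTimestampsAndWatermarks on the FlinkKafkaConsumer itself rather than on the DataStream returned by addSource. The topic name, Kafka properties, and timestamp parsing below are only assumptions:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PartitionAwareWatermarkJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumption
        props.setProperty("group.id", "watermark-demo");          // assumption

        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props);

        // Assign the watermark generator on the consumer, not on the stream:
        // the extractor then runs inside the Kafka source for every partition,
        // and the source emits the minimum of the per-partition watermarks.
        consumer.assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(5)) {
                    @Override
                    public long extractTimestamp(String record) {
                        // Assumption: records look like "eventTimeMillis,payload".
                        return Long.parseLong(record.split(",")[0]);
                    }
                });

        DataStream<String> stream = env.addSource(consumer);
        stream.print();

        env.execute("Kafka-partition-aware watermarks");
    }
}

If the watermark generator is instead assigned on the DataStream after addSource, watermarks are generated per parallel source subtask over the already-interleaved events, which may be why you observe what looks like global watermark generation.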


>> As I tested, watermarks are still created globally, even when I run my job
in parallel and assign watermarks on the Kafka consumer.


Did you follow the official example? Can you share your program?


Best,

Vino

qq <471237...@qq.com> wrote on Friday, December 13, 2019 at 9:57 AM:

> Hi all,
>
>   I am confused about watermarks for Kafka partitions. As I understand it,
> watermarks are created at the data stream level. But why is it also said that
> watermarks are created for each Kafka topic partition? As I tested, watermarks
> are still created globally, even when I run my job in parallel and assign
> watermarks on the Kafka consumer. Thanks.
>
> The text below is copied from the Flink documentation.
>
>
> you can use Flink’s Kafka-partition-aware watermark generation. Using that
> feature, watermarks are generated inside the Kafka consumer, per Kafka
> partition, and the per-partition watermarks are merged in the same way as
> watermarks are merged on stream shuffles.
>
> For example, if event timestamps are strictly ascending per Kafka
> partition, generating per-partition watermarks with the ascending
> timestamps watermark generator will result in perfect overall watermarks.
>
> The illustrations below show how to use the per-Kafka-partition watermark
> generation, and how watermarks propagate through the streaming dataflow in
> that case.
>
>
>
> Thanks
> Alex Fu
>


How to understand watermark creation for Kafka partitions

2019-12-12 Thread qq
Hi all,

  I am confused about watermarks for Kafka partitions. As I understand it,
watermarks are created at the data stream level. But why is it also said that
watermarks are created for each Kafka topic partition? As I tested, watermarks
are still created globally, even when I run my job in parallel and assign
watermarks on the Kafka consumer. Thanks.

The text below is copied from the Flink documentation.


you can use Flink’s Kafka-partition-aware watermark generation. Using that 
feature, watermarks are generated inside the Kafka consumer, per Kafka 
partition, and the per-partition watermarks are merged in the same way as 
watermarks are merged on stream shuffles.

For example, if event timestamps are strictly ascending per Kafka partition,
generating per-partition watermarks with the ascending timestamps watermark
generator will result in perfect overall watermarks.

The illustrations below show how to use the per-Kafka-partition watermark 
generation, and how watermarks propagate through the streaming dataflow in that 
case.
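For reference, a minimal sketch of the per-partition ascending-timestamps assignment described above (not my actual job; the topic name, Kafka properties, and record format are only placeholders):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class AscendingPerPartitionWatermarks {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "test");                    // placeholder

        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props);

        // Per-partition, ascending-timestamp watermarks: the extractor runs
        // inside the Kafka source for each partition, and the per-partition
        // watermarks are merged (minimum) before being emitted downstream.
        consumer.assignTimestampsAndWatermarks(
                new AscendingTimestampExtractor<String>() {
                    @Override
                    public long extractAscendingTimestamp(String record) {
                        // Placeholder: records look like "eventTimeMillis,payload".
                        return Long.parseLong(record.split(",")[0]);
                    }
                });

        env.addSource(consumer).print();
        env.execute("per-partition ascending watermarks");
    }
}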




Thanks 
Alex Fu