Kafka topic partition skewness causes watermark not being emitted

2017-01-12 Thread tao xiao
Hi team, I have a topic with 2 partitions in Kafka. I produced all data to partition 0 and no data to partition 1. I created a Flink job with parallelism set to 1 that consumes that topic and counts the events with a session event window (5-second gap). It turned out that the session event window was ne

Re: Kafka topic partition skewness causes watermark not being emitted

2017-01-13 Thread Tzu-Li (Gordon) Tai
Hi, This is expected behaviour due to how the per-partition watermarks are designed in the Kafka consumer, but I think it’s probably a good idea to handle idle partitions also when the Kafka consumer itself emits watermarks. I’ve filed a JIRA issue for this: https://issues.apache.org/jira/brows

Re: Kafka topic partition skewness causes watermark not being emitted

2017-01-14 Thread tao xiao
The case I described was an experiment only, but data skew would happen in production. The current implementation blocks watermark emission to downstream operators until all partitions move forward, which has a great impact on latency. It may be a good idea to expose an API to users to decide what th
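
The blocking behaviour described above can be illustrated with a small stand-alone model. This is only a sketch of the "minimum over all partitions" rule, not Flink's actual consumer code; the class name MinWatermarkTracker and its methods are hypothetical, and the event timestamp is used directly as the partition watermark for simplicity.

```python
from typing import Iterable, Optional

NO_WATERMARK = float("-inf")  # a partition that has not seen any event yet


class MinWatermarkTracker:
    """Tracks one watermark per partition and emits the minimum over all of them."""

    def __init__(self, partitions: Iterable[int]):
        self.per_partition = {p: NO_WATERMARK for p in partitions}
        self.emitted = NO_WATERMARK

    def on_event(self, partition: int, timestamp: int) -> Optional[int]:
        # Advance this partition's watermark; it never moves backwards.
        self.per_partition[partition] = max(self.per_partition[partition], timestamp)
        candidate = min(self.per_partition.values())
        if candidate > self.emitted:
            self.emitted = candidate
            return candidate  # a new watermark would be sent downstream
        return None  # nothing emitted: some partition is still behind


tracker = MinWatermarkTracker(partitions=[0, 1])

# All data goes to partition 0; partition 1 stays empty.
results = [tracker.on_event(0, ts) for ts in (1000, 2000, 9000)]
print(results)  # [None, None, None]

# A single event on partition 1 unblocks the watermark.
unblocked = tracker.on_event(1, 1500)
print(unblocked)  # 1500
```

In this model, no amount of data on partition 0 closes a session window, because the overall watermark is pinned to the empty partition.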

Re: Kafka topic partition skewness causes watermark not being emitted

2018-04-16 Thread Juho Autio
A possible workaround while waiting for FLINK-5479, if someone is hitting the same problem: we chose to send "heartbeat" messages periodically to all topics & partitions found on our Kafka. We do that through the service that normally writes to our Kafka. This way every partition always has some ~r
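
A rough sketch of the heartbeat idea, under stated assumptions: the Record type, the is_heartbeat marker, and both helper functions are hypothetical and exist only for illustration. A real setup would send the heartbeats through a Kafka producer and discover the topic and partition layout from the cluster.

```python
from typing import Dict, List, NamedTuple


class Record(NamedTuple):
    topic: str
    partition: int
    timestamp: int
    is_heartbeat: bool  # hypothetical marker so consumers can drop heartbeats


def build_heartbeats(partitions_by_topic: Dict[str, List[int]], now_ms: int) -> List[Record]:
    """Create one heartbeat record for every partition of every topic."""
    return [
        Record(topic, partition, now_ms, True)
        for topic, partitions in partitions_by_topic.items()
        for partition in partitions
    ]


def drop_heartbeats(records: List[Record]) -> List[Record]:
    """Consumer-side filter: heartbeats only advance time and carry no data."""
    return [r for r in records if not r.is_heartbeat]


# Hypothetical layout; partition 1 of "events" receives no real data.
layout = {"events": [0, 1]}
stream = [Record("events", 0, 1000, False)] + build_heartbeats(layout, now_ms=6000)

# Latest timestamp seen per (topic, partition); the minimum bounds the watermark.
latest = {}
for r in stream:
    key = (r.topic, r.partition)
    latest[key] = max(latest.get(key, r.timestamp), r.timestamp)

min_timestamp = min(latest.values())
real_records = drop_heartbeats(stream)
print(min_timestamp)      # 6000
print(len(real_records))  # 1
```

The filter is assumed to run after timestamps and watermarks have been assigned, so the heartbeats still advance event time for their partition before they are discarded.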

Re: Kafka topic partition skewness causes watermark not being emitted

2017-12-12 Thread gerardg
I'm also affected by this behavior. There are no updates in FLINK-5479, but did you manage to find a way to work around this? Thanks, Gerard

Re: Kafka topic partition skewness causes watermark not being emitted

2017-12-13 Thread Tzu-Li (Gordon) Tai
Hi, I've just elevated FLINK-5479 to BLOCKER for 1.5. Unfortunately, AFAIK there is no easy workaround solution for this issue yet in the releases so far. The min watermark logic that controls per-partition watermark emission is hidden inside the consumer, making it hard to work around it. One p
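
For illustration, one way the min watermark logic could treat idle partitions is to exclude any partition that has been silent for longer than a timeout. This is a sketch under assumptions, not the FLINK-5479 fix or any existing Flink API; IdleAwareTracker, idle_timeout_ms, and start_ms are hypothetical names.

```python
from typing import Dict, Iterable, Optional


class IdleAwareTracker:
    """Min-watermark tracking that ignores partitions idle longer than a timeout."""

    def __init__(self, partitions: Iterable[int], idle_timeout_ms: int, start_ms: int = 0):
        self.idle_timeout_ms = idle_timeout_ms
        self.watermarks: Dict[int, Optional[int]] = {p: None for p in partitions}
        # Wall-clock time of the last event per partition (start time if none yet).
        self.last_activity: Dict[int, int] = {p: start_ms for p in self.watermarks}

    def on_event(self, partition: int, event_ts: int, now_ms: int) -> None:
        current = self.watermarks[partition]
        self.watermarks[partition] = event_ts if current is None else max(current, event_ts)
        self.last_activity[partition] = now_ms

    def current_watermark(self, now_ms: int) -> Optional[int]:
        active = [p for p, seen in self.last_activity.items()
                  if now_ms - seen <= self.idle_timeout_ms]
        if not active:
            return None  # every partition is idle
        if any(self.watermarks[p] is None for p in active):
            return None  # an active partition has produced nothing yet: keep waiting
        return min(self.watermarks[p] for p in active)


tracker = IdleAwareTracker(partitions=[0, 1], idle_timeout_ms=5000)

tracker.on_event(0, event_ts=1000, now_ms=1000)
before_timeout = tracker.current_watermark(now_ms=2000)
print(before_timeout)  # None: partition 1 is empty but not yet considered idle

tracker.on_event(0, event_ts=7000, now_ms=7000)
after_timeout = tracker.current_watermark(now_ms=7000)
print(after_timeout)  # 7000: partition 1 has been idle for more than 5000 ms
```

A real implementation would also need to guarantee that the emitted watermark never decreases when an idle partition resumes with older timestamps; this sketch does not handle that case.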

Re: Kafka topic partition skewness causes watermark not being emitted

2017-12-13 Thread Gerard Garcia
Thanks Gordon. Don't worry, I'll be careful not to have empty partitions until the next release. I'll also keep an eye on FLINK-5479, and if at some point I see that there is a fix and the issue bothers us too much, I'll try to apply the patch myself to the latest stable release. Gerard