Hi Theo,
Could you share what your event-time windowed query looks like?
For example, how do you define the watermark?
Since you read records from 10 partitions, the records may well arrive at the
window operator out of order.
Is it possible that the watermark has already passed the end of a window while
some records belonging to that window have yet to arrive?

If that's the case, those late records are dropped by default, so the set of
records used to calculate the result may well differ from run to run, which
would lead to non-deterministic results.
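
To illustrate what I mean, here is a minimal sketch in plain Python. It is not
Flink code and not how Flink is implemented, only a simplified model that
assumes a tumbling window, a watermark equal to the largest timestamp seen so
far minus a fixed delay, and late records being dropped. The names
windowed_counts, window_size and max_out_of_orderness, as well as the
timestamps, are made up for the example.

```python
from collections import defaultdict


def windowed_counts(arrival_order, window_size=10, max_out_of_orderness=0):
    """Count records per tumbling event-time window in a simplified model.

    arrival_order: event timestamps in the order the window operator sees them.
    The watermark is the largest timestamp seen so far minus
    max_out_of_orderness. A record is dropped as late when the watermark has
    already reached the end of the window it belongs to.
    """
    counts = defaultdict(int)
    watermark = float("-inf")
    for ts in arrival_order:
        window_start = ts - ts % window_size
        window_end = window_start + window_size
        if window_end <= watermark:
            continue  # window already closed: the record is late and dropped
        counts[window_start] += 1
        watermark = max(watermark, ts - max_out_of_orderness)
    return dict(counts)


# Hypothetical data: the same eight records from two partitions,
# p0 = [1, 5, 12, 18] and p1 = [3, 7, 14, 22], in two arrival orders.
interleaved = [1, 3, 5, 7, 12, 14, 18, 22]   # partitions read evenly
skewed = [1, 5, 12, 18, 3, 7, 14, 22]        # p0 read well ahead of p1

print(windowed_counts(interleaved))  # {0: 4, 10: 3, 20: 1}
print(windowed_counts(skewed))       # {0: 2, 10: 3, 20: 1}
print(windowed_counts(skewed, max_out_of_orderness=20))  # {0: 4, 10: 3, 20: 1}
```

In this model, the same input yields different counts depending only on the
order in which the partitions happen to be read, and a larger watermark delay
makes the result stable again. Whether this matches your case depends on how
your watermark is defined.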

Best regards,
Yuxia

----- Original Message -----
From: "Theodor Wübker" <theo.wueb...@inside-m2m.de>
To: "User" <user@flink.apache.org>
Sent: Sunday, February 12, 2023, 4:25:45 PM
Subject: Non-Determinism in Table-API with Kafka and Event Time

Hey everyone,

I am currently experiencing non-determinism in my Table API program and (as a
relatively inexperienced Flink and Kafka user) I can’t really explain to myself
why it happens. I have a topic with 10 partitions and a bit of data on it.
I run a simple SELECT * query on it that moves some attributes around
and writes everything to another topic with 10 partitions. Then, on this topic,
I run an event-time windowed query. This is where I see the non-determinism:
the results of the windowed query differ with every execution.
I thought this might be because the SELECT query wrote the data to the
partitioned topic without keys, so I tried again with the same key I used
for the original topic. That resulted in the exact same topic structure. Now
when I run the event-time windowed query, I get incorrect results (too few
result entries).

I have already read a lot of the docs on this and can’t seem to figure it out.
I would much appreciate it if someone could shed a bit of light on this. Is
there anything in particular I should be aware of when reading partitioned
topics and running an event-time query on them? Thanks :)


Best,
Theo
