Hi all,

It's true that the problem can be handled by caching records in state.
However, there is an alternative using `watermark alignment` with Flink
1.15+ [1] which does the desired synchronization that you described while
reducing the size of state from the former approach.

To use this with two topics of different speeds, you would need to define
two Kafka sources, each corresponding to a topic. This limitation is
documented in [1]. This limitation is resolved in Flink 1.17 by split level
(partition level in the case of Kafka) watermark alignment, so one Kafka
source reading various topics can align on the partitions of the different
topics.

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment-_beta_

Best,
Mason

On Mon, Feb 27, 2023 at 8:11 AM Alexis Sarda-Espinosa <
sarda.espin...@gmail.com> wrote:

> Hello,
>
> I had this question myself and I've seen it a few times, the answer is
> always the same, there's currently no official way to handle it without
> state.
>
> Regards,
> Alexis.
>
> On Mon, 27 Feb 2023, 14:09 Remigiusz Janeczek, <capi...@gmail.com> wrote:
>
>> Hi,
>>
>> How to handle a case where one of the Kafka topics used for interval join
>> is slower than the other? (Or a case where one topic lags behind)
>> Is there a way to stop consuming from the fast topic and wait for the
>> slow one to catch up? I want to avoid running out of memory (or keeping a
>> very large state) and I don't want to discard any data from the fast topic
>> until a watermark from the slow topic allows that.
>>
>> Best Regards
>>
>

Reply via email to