Hi Mason,

Very interesting. Is it possible to apply both types of alignment, i.e.,
to account for watermark skew across splits within one source and also
across different sources?

Regards,
Alexis.

On Tue, 28 Feb 2023, 05:26 Mason Chen, <mas.chen6...@gmail.com> wrote:

> Hi all,
>
> It's true that the problem can be handled by caching records in state.
> However, there is an alternative: `watermark alignment`, available since
> Flink 1.15 [1], which performs the synchronization you described while
> keeping state smaller than the caching approach does.
>
> To use this with two topics of different speeds, you need to define two
> Kafka sources, one per topic. This limitation is documented in [1]. It is
> resolved in Flink 1.17 by split-level (partition-level in the case of
> Kafka) watermark alignment, so a single Kafka source reading several
> topics can align across the partitions of the different topics.
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment-_beta_
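
For readers skimming the thread, here is a minimal sketch of the rule that
watermark alignment applies, written in plain Java with no Flink dependency.
It is a simplified model, not Flink's implementation: the class name
`AlignmentSketch`, the method `sourcesToPause`, and the topic names are
hypothetical. In Flink itself the feature is enabled on each source's
`WatermarkStrategy` via `withWatermarkAlignment(group, maxAllowedDrift,
updateInterval)`, as described in [1].

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Simplified model of group-wide watermark alignment; not Flink's actual code. */
final class AlignmentSketch {

    /**
     * Returns the sources whose current watermark is more than maxDriftMs
     * ahead of the minimum watermark in the alignment group. These are the
     * sources that would stop reading until the slowest one catches up.
     * Assumes maxDriftMs is non-negative.
     */
    static Set<String> sourcesToPause(Map<String, Long> watermarks, long maxDriftMs) {
        Set<String> paused = new TreeSet<>();
        if (watermarks.isEmpty()) {
            return paused;
        }
        long min = Collections.min(watermarks.values());
        // Saturate instead of overflowing when min is close to Long.MAX_VALUE.
        long limit = (min > Long.MAX_VALUE - maxDriftMs) ? Long.MAX_VALUE : min + maxDriftMs;
        for (Map.Entry<String, Long> entry : watermarks.entrySet()) {
            if (entry.getValue() > limit) {
                paused.add(entry.getKey());
            }
        }
        return paused;
    }

    public static void main(String[] args) {
        // Hypothetical watermarks in epoch milliseconds for two Kafka sources.
        Map<String, Long> watermarks = Map.of("fast-topic", 10_000L, "slow-topic", 2_000L);
        // With a 5 s allowed drift, only the fast source is ahead of the limit.
        System.out.println(sourcesToPause(watermarks, 5_000L));
    }
}
```

In this model the fast source is paused rather than buffered, which is why
the approach keeps join state bounded by the allowed drift instead of by the
lag between the two topics.
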
>
> Best,
> Mason
>
> On Mon, Feb 27, 2023 at 8:11 AM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> Hello,
>>
>> I had this question myself and I've seen it asked a few times. The
>> answer is always the same: there is currently no official way to handle
>> it without state.
>>
>> Regards,
>> Alexis.
>>
>> On Mon, 27 Feb 2023, 14:09 Remigiusz Janeczek, <capi...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> How can I handle a case where one of the Kafka topics used for an
>>> interval join is slower than the other (or where one topic lags behind)?
>>> Is there a way to stop consuming from the fast topic and wait for the
>>> slow one to catch up? I want to avoid running out of memory (or keeping a
>>> very large state), and I don't want to discard any data from the fast
>>> topic until a watermark from the slow topic allows it.
>>>
>>> Best Regards
>>>
>>
