Will keys that are outdated disappear?

It is in fact a kind of session window that can start at any time.
New keys will appear constantly.






On Mon, Feb 14, 2022, 15:57 Francesco Guardiani <france...@ververica.com>
wrote:

> Hi,
>
> - bounded out of orderness: This means you have a stream where events can
> arrive late by at most a certain amount of time, compared to the "newest"
> event received so far. For example, with a bounded out of orderness of 5
> minutes, you tell Flink that your stream can receive an event with time 1PM
> today, and immediately after that still receive another one with time
> 1PM - 5 minutes, and Flink should consider it. But if you instead receive
> one with time 1PM - 6 minutes, Flink will consider it "late" and drop it.
> This is essentially how Flink avoids retaining your events indefinitely
> (see the sketch after these two points).
> - with idleness: Because the watermark generator needs new records to come
> in before it can advance the watermark, if your stream goes idle, no new
> watermark is produced, which means the records received after the last
> watermark will not be processed.
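>
> As a minimal sketch of how these two options look in the DataStream API
> (the names here are placeholders: a MyEvent type with a getHandlingTime()
> epoch-millis getter, and a DataStream<MyEvent> called kafkaStream read from
> Kafka):
>
> import java.time.Duration;
> import org.apache.flink.api.common.eventtime.WatermarkStrategy;
> import org.apache.flink.streaming.api.datastream.DataStream;
>
> WatermarkStrategy<MyEvent> strategy = WatermarkStrategy
>     // accept events that are at most 5 minutes behind the newest one seen
>     .<MyEvent>forBoundedOutOfOrderness(Duration.ofMinutes(5))
>     // use handlingTime as the event timestamp
>     .withTimestampAssigner((event, recordTs) -> event.getHandlingTime())
>     // mark the stream idle after 1 minute without records, so a stale
>     // source does not hold back the watermark downstream
>     .withIdleness(Duration.ofMinutes(1));
>
> DataStream<MyEvent> withTimestamps =
>     kafkaStream.assignTimestampsAndWatermarks(strategy);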
>
> Reading your requirement, my understanding is that your input stream, that
> is InputTable, needs a bounded out of orderness of 5 minutes. Whether you
> also need idleness depends on whether your load can go stale at some point
> in time: if your stream can be idle for long periods (say nothing is
> produced for a couple of days), then you should also set an idleness
> timeout, after which the stream is marked idle so the watermark can still
> advance.
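>
> Since you want to query InputTable with the Table API, the same 5 minute
> bound can also be declared directly in the table definition via a WATERMARK
> clause. A rough sketch (the column names and connector options below are
> assumptions based on your description):
>
> // assumed: a StreamTableEnvironment called tableEnv
> tableEnv.executeSql(
>     "CREATE TABLE InputTable (" +
>     "  transactionId STRING," +
>     "  originalEvent STRING," +
>     "  handlingTime TIMESTAMP(3)," +
>     // watermark = max(handlingTime) seen so far minus 5 minutes
>     "  WATERMARK FOR handlingTime AS handlingTime - INTERVAL '5' MINUTE" +
>     ") WITH (" +
>     "  'connector' = 'kafka'," +
>     "  'topic' = 'transactions'," +
>     "  'properties.bootstrap.servers' = 'localhost:9092'," +
>     "  'format' = 'json'" +
>     ")");
>
> For idleness on the SQL side, the table.exec.source.idle-timeout option
> plays the same role as withIdleness in the DataStream API.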
>
>
> On Fri, Feb 11, 2022 at 2:53 PM HG <hanspeter.sl...@gmail.com> wrote:
>
>> Hi,
>>
>> I am getting a headache when thinking about watermarks and timestamps.
>> My application reads events from Kafka (they are in JSON format) as a
>> DataStream.
>> Events can be keyed by a transactionId and have an event timestamp
>> (handlingTime).
>>
>> All events belonging to a single transactionId will arrive within a window
>> of a couple of minutes (say max 5 minutes).
>> As soon as those 5 minutes have passed, it should calculate the timestamp
>> differences between the ordered events, add that elapsed time to every
>> event and emit them to the sink.
>>
>> I basically want to use the Table API to do
>> "SELECT transactionId, handlingTime, handlingTime - lag(handlingTime)
>> over (partition by transactionId order by handlingTime) as elapsedTime,
>> originalEvent FROM InputTable"
>>
>> After the result of this query has been pushed to the sink, all data for
>> this transactionId can be discarded.
>>
>> What kind of watermark do I need to use?
>> - bounded out of orderness?
>> - with idleness?
>> - ...
>>
>> Late events can be ignored. They will rarely happen.
>>
>> Regards Hans-Peter
