Hi Martijn:
Thanks for the reference.
My understanding was that if we define a watermark, any event whose event time
is earlier than the watermark (event_time - 30 seconds in my example) will be
dropped automatically.
My first question is whether the downstream view (ALL_EVENTS), which selects
events from the table, will receive late events. If late events are dropped at
the table level, do we still need the second predicate
(ts > CURRENT_WATERMARK(ts)) to filter out late events at the view level?
If the table does not drop late events, will all downstream views etc. need
to add this check (ts > CURRENT_WATERMARK(ts))?
I am still not clear on whether downstream views need to check for late events
with this predicate, or whether they will never receive late events.
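To make the question concrete, the view-level check I am asking about would
look roughly like the sketch below. It reuses MYEVENTS and event_time from my
original message (the predicate in the docs uses ts); the view name
ON_TIME_EVENTS is hypothetical and only for illustration.

```sql
-- Sketch only: ON_TIME_EVENTS is a hypothetical view name.
-- Keeps a row if no watermark is available yet (NULL) or if the row's
-- event time is later than the current watermark.
CREATE TEMPORARY VIEW `ON_TIME_EVENTS` AS
SELECT *
FROM MYEVENTS
WHERE CURRENT_WATERMARK(event_time) IS NULL
   OR event_time > CURRENT_WATERMARK(event_time);
```

What I would like to know is whether a view like this behaves any differently
from the plain SELECT * view.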
Thanks again for your time.
On Friday, February 11, 2022, 01:55:09 PM EST, Martijn Visser
<[email protected]> wrote:
Hi,
There's a Flink SQL Cookbook recipe on CURRENT_WATERMARK; I think it would
cover your questions [1].
Best regards,
Martijn
[1]
https://github.com/ververica/flink-sql-cookbook/blob/main/other-builtin-functions/03_current_watermark/03_current_watermark.md
On Fri, 11 Feb 2022 at 16:45, M Singh <[email protected]> wrote:
Hi:
The Flink docs
(https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/)
indicate that CURRENT_WATERMARK(rowtime) can return NULL:
Note that this function can return NULL, and you may have to consider this
case. For example, if you want to filter out late data you can use:
    WHERE
      CURRENT_WATERMARK(ts) IS NULL
      OR ts > CURRENT_WATERMARK(ts)
I have the following questions for the case where the table is defined with a
watermark, e.g.:
    CREATE TABLE `MYEVENTS` (
      `name` STRING,
      `event_time` TIMESTAMP_LTZ(3),
      ...
      WATERMARK FOR event_time AS event_time - INTERVAL '30' SECONDS
    ) WITH (...)
1. If we define the watermark as above, will late events still be propagated
to a view or table that selects from the MYEVENTS table, e.g.:
CREATE TEMPORARY VIEW `ALL_EVENTS` AS SELECT * FROM MYEVENTS;
2. Can CURRENT_WATERMARK(event_time) still return NULL? If so, under what
conditions does it return NULL?
Thanks