Lee Y S created FLINK-29048:
-------------------------------
Summary: WatermarksWithIdleness does not work with FLIP-27 sources
Key: FLINK-29048
URL: https://issues.apache.org/jira/browse/FLINK-29048
Project: Flink
Issue Type: Bug
Components: API / DataStream
Affects Versions: 1.15.1
Reporter: Lee Y S
In org.apache.flink.streaming.api.operators.SourceOperator, there are separate
instances of WatermarksWithIdleness created for each split output and the main
output. There is multiplexing of watermarks between split outputs but no
multiplexing between split output and main output in
org.apache.flink.streaming.api.operators.source.ProgressiveTimestampsAndWatermarks.
For a source such as org.apache.flink.connector.kafka.source.KafkaSource,
{color:#353833}there is only output from splits and no output from main. Hence
the main output will (after an initial timeout) be marked as idle.{color}
{color:#353833} {color}
{color:#353833}The implementation of {color}WatermarksWithIdleness is such that
once an output is idle, it will periodically re-mark the output as idle. Since
there is no multiplexing between split outputs and main output, the idle marks
coming from main output will repeatedly set the output to idle even though
there are events from the splits. Result is that the entire source is
repeatedly marked as idle.
One solution i can think of is to multiplex split and main output in
org.apache.flink.streaming.api.operators.source.ProgressiveTimestampsAndWatermarks
but I am not sure if there are other considerations.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)