[ 
https://issues.apache.org/jira/browse/FLINK-39238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-39238:
-----------------------------------
    Labels: pull-request-available  (was: )

> Support watermark alignment in flink-connector-kafka in dynamic Kafka source 
> mode
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-39238
>                 URL: https://issues.apache.org/jira/browse/FLINK-39238
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Kafka
>            Reporter: Xin Gao
>            Priority: Critical
>              Labels: pull-request-available
>
> Flink source ingestion is with watermark alignment config like 
> `scan.watermark.alignment.max-drift`. The operator would then control
>  * checkWatermarkAlignment for whole reader
>  * checkSplitWatermarkAlignment for the split reader
> In dynamic Kafka source mode, the split reader pause/resume is missed 
> (`pauseOrResumeSplits` not implemented) and the exceptions might raise once 
> such config enabled.
> This feature is important for workflows like ingesting data from Kafka to 
> Data Lake. Missing the alignment could create far more number of small files 
> once the ingestion progress diverges and runs our of controls.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to