[ https://issues.apache.org/jira/browse/FLINK-23426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383706#comment-17383706 ]

Jark Wu edited comment on FLINK-23426 at 7/20/21, 2:41 AM:
-----------------------------------------------------------

This was planned in FLINK-18825. I prefer to have a materialize operator that materializes the changelog stream into an insert-only stream, as the source is bounded. I think this is a straightforward and simple way.

Regarding "no missing UPDATE_AFTER", I think all the CDC formats and connectors we support have "complete" CDC logs, e.g. debezium, canal, maxwell, mysql-cdc: UPDATE_BEFORE and UPDATE_AFTER are always contained in a single UPDATE event.


was (Author: jark):
This was planned in FLINK-18825. I prefer to have a materialize operator that materializes the changelog stream into an insert-only stream, as the source is bounded.

Regarding "no missing UPDATE_AFTER", I think all the CDC formats and connectors we support have "complete" CDC logs, e.g. debezium, canal, maxwell, mysql-cdc: UPDATE_BEFORE and UPDATE_AFTER are always contained in a single UPDATE event.

> Support changelog processing in batch mode
> ------------------------------------------
>
>                 Key: FLINK-23426
>                 URL: https://issues.apache.org/jira/browse/FLINK-23426
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: Timo Walther
>            Priority: Major
>
> The DataStream API can execute arbitrary DataStream programs when running in
> batch mode. However, this is not the case for the Table API batch mode. E.g.
> a source with non-insert-only changes is not supported and updates/deletes
> cannot be emitted.
> In theory, we could make this work by running the "stream mode" of the
> planner (CDC transformations) on top of the "batch mode" of DataStream API
> (specialized state backend, sorted inputs). It is up for discussion if and
> how we expose such functionality.
> If we don't allow enabling incremental updates, we can also add a special
> batch operator that materializes the incoming changes for a batch pipeline.
> However, it would require "complete" CDC logs (i.e. no missing UPDATE_AFTER).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
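
For illustration only, the materialize step proposed in the comment above could be sketched roughly as follows. This is not Flink code and not the operator planned in FLINK-18825. It is a minimal stand-alone Python sketch that assumes the bounded changelog is available as a list of (kind, row) pairs, with kinds written as the short strings "+I", "-U", "+U" and "-D". The function name materialize and the sample rows are hypothetical.

```python
from collections import Counter

# Row kinds written as short strings (assumed notation for this sketch).
INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE = "+I", "-U", "+U", "-D"


def materialize(changelog):
    """Collapse a bounded changelog into the insert-only rows that remain.

    Hypothetical helper: rows are tracked as a multiset, so no primary key
    is needed, but every retraction must match a previously added row.
    """
    counts = Counter()
    for kind, row in changelog:
        if kind in (INSERT, UPDATE_AFTER):
            # Additions: the row becomes (one more time) part of the result.
            counts[row] += 1
        elif kind in (UPDATE_BEFORE, DELETE):
            # Retractions: remove one earlier occurrence of exactly this row.
            if counts[row] <= 0:
                raise ValueError(f"incomplete changelog: nothing to retract for {row!r}")
            counts[row] -= 1
        else:
            raise ValueError(f"unknown row kind: {kind!r}")
    # Emit each surviving row as often as it is still present, in sorted order.
    return sorted(counts.elements())


# Made-up sample data: one update and one delete on top of two inserts.
changelog = [
    (INSERT, ("alice", 10)),
    (INSERT, ("bob", 5)),
    (UPDATE_BEFORE, ("alice", 10)),
    (UPDATE_AFTER, ("alice", 12)),
    (DELETE, ("bob", 5)),
]
result = materialize(changelog)
print(result)
```

Because the sketch keeps no key, it relies on UPDATE_BEFORE to know which old row to retract, which is one way to read the "complete" CDC log requirement. Note that it cannot detect a missing UPDATE_AFTER: a log that retracts a row and never re-adds it would simply lose that row.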