cansakiroglu opened a new pull request, #4436:
URL: https://github.com/apache/flink-cdc/pull/4436

   ## SUMMARY
   
   Add the `scan.newly-added-table.enabled` YAML option to the Postgres Pipeline 
connector. The underlying `SnapshotSplitAssigner.captureNewlyAddedTables()` 
mechanism and the `PostgresSourceBuilder.scanNewlyAddedTableEnabled()` builder 
method already exist in the postgres-cdc source; this PR adds the missing 
YAML-side wiring.
   
   Mirrors the same option already exposed by the MySQL Pipeline connector 
(MySqlDataSourceOptions.SCAN_NEWLY_ADDED_TABLE_ENABLED).
   
   Default is `false`, so the change is a no-op for existing pipelines. When set 
to `true`, restoring from a savepoint discovers tables that match the source 
`tables:` pattern but were not part of the captured set at savepoint time. This 
enables DMS-style "add a new table without re-snapshotting existing tables" 
workflows.
   
   ## JIRA
   
   [FLINK-34806](https://issues.apache.org/jira/browse/FLINK-34806)
   *"[Feature][Postgres] Support automatically identify newly added tables"*
   
   ## What changes
   
   Adds the `scan.newly-added-table.enabled` YAML option to the Postgres 
Pipeline connector.
   
   No behaviour change unless the user opts in (`defaultValue(false)`).
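
   As a rough illustration of the opt-in, a pipeline definition might enable the 
option as in the fragment below. This is a sketch rather than a tested 
configuration: the host, credentials, table pattern, and slot name are 
placeholders, the other keys are assumed to follow the Postgres Pipeline 
connector's existing options, and the `sink` and `pipeline` blocks are omitted.

```yaml
source:
  type: postgres
  hostname: localhost            # placeholder
  port: 5432
  username: postgres             # placeholder
  password: postgres             # placeholder
  tables: app_db.public.\.*      # placeholder pattern; new tables must match it to be picked up
  slot.name: flink_cdc_slot      # placeholder
  decoding.plugin.name: pgoutput
  # Option wired up by this PR. Omitting it keeps the default (false).
  scan.newly-added-table.enabled: true
```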
   
   ## Why
   
   The MySQL Pipeline connector exposes the same option via 
`MySqlDataSourceOptions.SCAN_NEWLY_ADDED_TABLE_ENABLED` and reads it in 
`MySqlDataSourceFactory`. This PR brings the Postgres Pipeline connector to 
parity.
   
   ## When this matters
   
   Adding a new table to a long-running pipeline. Today the only way to capture 
a newly created PG table on an already-running Pipeline job is to cancel the 
job and re-snapshot every captured table from scratch. With this option set to 
`true`, when the job is restored from a savepoint the source compares the table 
set recorded in the savepoint against PG's current table set, picks up newly 
matching tables, snapshots only the new ones, and resumes the existing captured 
tables from their saved WAL offsets. No re-snapshot of existing tables, no 
source-side load spike.
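
   The comparison described above can be pictured with the minimal, 
self-contained sketch below. It is not the actual `SnapshotSplitAssigner` code: 
the class name, method name, and table identifiers are hypothetical, and the 
`tables:` pattern is modelled as a plain Java regex for illustration only.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of flink-cdc.
class NewlyAddedTables {

    /**
     * Returns the tables that match the pattern now but were not in the
     * captured set at savepoint time. Only these would need a snapshot;
     * already-captured tables would resume from their saved offsets.
     */
    static Set<String> newlyAdded(
            Set<String> capturedAtSavepoint,
            List<String> currentTables,
            Pattern tablesPattern) {
        Set<String> added = new LinkedHashSet<>();
        for (String table : currentTables) {
            boolean matches = tablesPattern.matcher(table).matches();
            if (matches && !capturedAtSavepoint.contains(table)) {
                added.add(table);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Set<String> captured = Set.of("public.orders");
        List<String> current =
                List.of("public.orders", "public.customers", "audit.log");
        Pattern pattern = Pattern.compile("public\\..*");

        // public.orders is already captured; audit.log does not match the pattern.
        System.out.println(newlyAdded(captured, current, pattern));
        // prints [public.customers]
    }
}
```

   With the option left at `false`, no such comparison is made on restore and 
the captured set stays as it was at savepoint time.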
   
   ## Default
   
   `false` — preserves current behaviour for existing pipelines. Opt-in only.
   

