JNSimba opened a new pull request, #4435: URL: https://github.com/apache/flink-cdc/pull/4435
## What is the purpose of the change

During the snapshot phase, the PostgreSQL connector reads column structure via JDBC `DatabaseMetaData#getColumns(catalog, schemaPattern, tableNamePattern, columnNamePattern)`. Per the JDBC spec, **both** `schemaPattern` and `tableNamePattern` are LIKE patterns, where `_` matches any single character and `%` matches any sequence of characters. Both characters can appear in PostgreSQL identifiers (`_` in any identifier, `%` in quoted ones), so `getColumns` can return columns from other schemas/tables that were never meant to match.

An exact filter on the **table name** already exists, but the **schema name was never validated**. When two schemas have names that are wildcard matches of each other (e.g. `sch_test` and `schxtest`, where `_` matches `x`) and both contain a same-named table, capturing `sch_test.<table>` also pulls in the look-alike schema's columns. The table-name filter cannot tell them apart, so the snapshot fails with `IllegalStateException: Duplicate key Optional.empty` (or, when the columns differ, silently merges columns from the wrong schema).

## Brief change log

- `PostgresConnection#doReadTableColumn`: also compare the result-set `TABLE_SCHEM` (column 2) against `TableId.schema()`, in addition to the existing `TABLE_NAME` check. The schema check is skipped when the requested schema is `null`, so columns are not dropped when no schema is specified. This is a pure after-the-fact filter; it does not change the metadata query and does not affect normal (non-wildcard) schemas/tables.
- `SimilarTableNamesITCase` / `similar_names.sql`: add a cross-schema case with two schemas (`sch_test` / `schxtest`) that are wildcard matches of each other, each holding a same-named table, verifying that only the target schema's snapshot and incremental data are captured.

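
For reviewers who want to see the failure mode in isolation, here is a minimal, database-free sketch of the behavior described above. It is not the code in this PR: `SchemaFilterSketch`, `ColumnRow`, `getColumnsLike`, `exactFilter`, and the table name `orders` are hypothetical names chosen for illustration, and the LIKE emulation ignores escape characters. It only shows why a schema name passed as a LIKE pattern matches a look-alike schema, and how an exact comparison on the returned schema (skipped when the requested schema is `null`) removes the extra rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: every name in this class is hypothetical, not taken from the PR. */
final class SchemaFilterSketch {

    /** One row of a getColumns-style result: TABLE_SCHEM, TABLE_NAME, COLUMN_NAME. */
    static final class ColumnRow {
        final String schema;
        final String table;
        final String column;

        ColumnRow(String schema, String table, String column) {
            this.schema = schema;
            this.table = table;
            this.column = column;
        }
    }

    /** Translates a LIKE pattern to a regex: '_' is any one char, '%' is any sequence. */
    static Pattern likeToRegex(String like) {
        StringBuilder sb = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '_') {
                sb.append('.');
            } else if (c == '%') {
                sb.append(".*");
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    /** Simulates the metadata lookup: schema and table arguments are LIKE patterns. */
    static List<ColumnRow> getColumnsLike(
            List<ColumnRow> catalog, String schemaPattern, String tablePattern) {
        Pattern schema = likeToRegex(schemaPattern);
        Pattern table = likeToRegex(tablePattern);
        List<ColumnRow> out = new ArrayList<>();
        for (ColumnRow row : catalog) {
            if (schema.matcher(row.schema).matches() && table.matcher(row.table).matches()) {
                out.add(row);
            }
        }
        return out;
    }

    /** After-the-fact exact filter; the schema check is skipped when schema is null. */
    static List<ColumnRow> exactFilter(List<ColumnRow> rows, String schema, String table) {
        List<ColumnRow> out = new ArrayList<>();
        for (ColumnRow row : rows) {
            boolean schemaOk = schema == null || schema.equals(row.schema);
            if (schemaOk && table.equals(row.table)) {
                out.add(row);
            }
        }
        return out;
    }

    /** Two look-alike schemas plus an unrelated one, each with the same table. */
    static List<ColumnRow> sampleCatalog() {
        return Arrays.asList(
                new ColumnRow("sch_test", "orders", "id"),
                new ColumnRow("schxtest", "orders", "id"),
                new ColumnRow("other", "orders", "id"));
    }

    public static void main(String[] args) {
        List<ColumnRow> raw = getColumnsLike(sampleCatalog(), "sch_test", "orders");
        System.out.println("raw rows: " + raw.size()); // prints "raw rows: 2"
        List<ColumnRow> filtered = exactFilter(raw, "sch_test", "orders");
        System.out.println("after exact filter: " + filtered.size()); // prints "after exact filter: 1"
    }
}
```

In this sketch, the pattern lookup returns the `id` column twice, once per look-alike schema, which corresponds to the duplicate rows behind the reported failure. Filtering on the table name alone keeps both rows, while the additional exact schema comparison keeps only the `sch_test` row.
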
## Verifying this change

This change added tests and can be verified as follows:

- Added `SimilarTableNamesITCase#testReadTableWithSimilarSchemaNameUnderscore`, which fails during the snapshot with `IllegalStateException: Duplicate key Optional.empty` without the fix and passes with it.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: no
- The connector code base: yes (postgres-cdc snapshot column reading)

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
