JNSimba opened a new pull request, #4452:
URL: https://github.com/apache/flink-cdc/pull/4452
## What is the purpose of the change
The Postgres incremental source uses the Debezium connector config's snapshot
fetch size (`connectorConfig.getSnapshotFetchSize()`, default `10240`) for
the
snapshot split read, instead of the Flink CDC option
`scan.snapshot.fetch.size`
(default `1024`). As a result:
1. `scan.snapshot.fetch.size` has no effect for the Postgres source.
2. When a snapshot chunk's row count is `<=` the fetch size, the JDBC
server-side cursor returns the whole chunk in a single batch, loading
every
row of the chunk into memory at once. On wide tables (many columns / large
rows) this can exhaust the heap and OOM.
The MySQL source already reads `sourceConfig.getFetchSize()` for its snapshot
read. This PR makes the Postgres snapshot read task do the same, so that
`scan.snapshot.fetch.size` is honored consistently across connectors.
## Brief change log
- `PostgresSnapshotSplitReadTask` now takes the `PostgresSourceConfig` and
uses
`sourceConfig.getFetchSize()` for the snapshot select statement, instead of
`connectorConfig.getSnapshotFetchSize()`.
## Verifying this change
This change is covered by existing Postgres source tests; the snapshot fetch
size now follows the `scan.snapshot.fetch.size` option.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]