JNSimba opened a new pull request, #64511: URL: https://github.com/apache/doris/pull/64511
### What problem does this PR solve?

The cdc_client builds Debezium's `ChangeEventQueue` with only a count-based bound (`max.queue.size=8192`), while the byte bound (`max.queue.size.in.bytes`) defaults to `0` (disabled). With wide rows (e.g. ~2MB each), the in-memory queue can grow to `2MB * 8192 ≈ 16GB` and OOM the process.

Both the PostgreSQL and MySQL paths build the queue from `getMaxQueueSizeInBytes()`, so a single property covers both, and it applies to both the snapshot and streaming phases.

### What this PR does

Set a heap-adaptive byte cap on the queue buffer in `ConfigUtil.getDefaultDebeziumProps()`, which is shared by the Postgres and MySQL source readers:

- Default cap is `clamp(heap/16, 64MB, 256MB)`: heap 1G -> 64MB, 2G -> 128MB, >= 4G -> 256MB.
- The cap is intentionally conservative because a single cdc_client JVM can run many queues concurrently (one per split, across multiple jobs), and the real batching/backpressure happens downstream in the sink rather than in this queue.
- Escape hatch: `-Dcdc.max.queue.size.in.bytes=<bytes>` overrides the adaptive value (absolute bytes; `<= 0` disables the byte bound).

Narrow tables are unaffected: 8192 rows stay well under 64MB, so the count bound is reached first and behavior is unchanged.

### Release note

None
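For reviewers who want to check the numbers, the cap rule described in this PR can be sketched roughly as below. This is an illustration only, not the code in `ConfigUtil`: the class name `QueueByteCapSketch` and the methods `adaptiveCap` and `resolveCap` are hypothetical, and it assumes that "heap" means `Runtime.getRuntime().maxMemory()` and that a non-positive override is passed to Debezium as `0`. Only the property name, the `heap/16` divisor, and the 64MB/256MB bounds come from the description.

```java
public class QueueByteCapSketch {
    static final long MB = 1024L * 1024L;
    static final long MIN_CAP = 64 * MB;   // lower clamp bound from the PR description
    static final long MAX_CAP = 256 * MB;  // upper clamp bound from the PR description

    // System property name taken from the PR description.
    static final String OVERRIDE_PROP = "cdc.max.queue.size.in.bytes";

    // clamp(heap/16, 64MB, 256MB)
    static long adaptiveCap(long maxHeapBytes) {
        return Math.max(MIN_CAP, Math.min(MAX_CAP, maxHeapBytes / 16));
    }

    // Hypothetical resolution order: an explicit override wins, otherwise the
    // adaptive value is used. A value <= 0 is mapped to 0 (byte bound disabled);
    // that mapping is an assumption of this sketch. A non-numeric override
    // throws NumberFormatException here; the real code may handle it differently.
    static long resolveCap(long maxHeapBytes, String override) {
        if (override != null && !override.trim().isEmpty()) {
            long requested = Long.parseLong(override.trim());
            return requested <= 0 ? 0L : requested;
        }
        return adaptiveCap(maxHeapBytes);
    }

    public static void main(String[] args) {
        // Assumption: the JVM max heap is what "heap" refers to.
        long heap = Runtime.getRuntime().maxMemory();
        long cap = resolveCap(heap, System.getProperty(OVERRIDE_PROP));
        System.out.println("max.queue.size.in.bytes=" + cap);
    }
}
```

Running it with `-Dcdc.max.queue.size.in.bytes=0` would print a cap of `0`, which corresponds to the pre-PR behavior where only the count bound applies.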
