[ 
https://issues.apache.org/jira/browse/FLINK-39169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18061660#comment-18061660
 ] 

Luca Occhipinti edited comment on FLINK-39169 at 3/3/26 9:02 AM:
-----------------------------------------------------------------

Slowing down snapshot reads or using rate limiting can help reduce load on the 
writer, but in Aurora/RDS the snapshot still hits the writer instance. 
The behavior I’m proposing (offloading snapshot reads to reader replicas) 
would be optional and Aurora/RDS-specific, so it wouldn’t affect other MySQL 
deployments. 
This would complement rate limiting rather than replace it, providing a safer 
and more efficient way to handle large snapshots in these environments.
Happy to share a draft implementation of this if helpful.


was (Author: JIRAUSER310333):
Thanks for the clarification. Slowing down snapshot reads or using rate 
limiting can help reduce load on the writer, but in Aurora/RDS the snapshot 
still hits the writer instance. 
The behavior I’m proposing — offloading snapshot reads to reader replicas — 
would be optional and Aurora/RDS-specific, so it wouldn’t affect other MySQL 
deployments. 
This would complement rate limiting rather than replace it, providing a safer 
and more efficient way to handle large snapshots in these environments.
Happy to share a draft implementation of this if helpful.

> [mysql-connector] Use reader instances to run snapshots
> -------------------------------------------------------
>
>                 Key: FLINK-39169
>                 URL: https://issues.apache.org/jira/browse/FLINK-39169
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>            Reporter: Luca Occhipinti
>            Priority: Major
>              Labels: mysql-cdc-connector
>
> When running MySQL CDC in snapshot or initial mode (both streaming and batch) 
> in cloud environments like AWS Aurora/RDS, the connector must connect to the 
> primary/writer database instance to retrieve the binlog position, and it then 
> keeps running the snapshot queries on that same instance. 
> This creates unnecessary load on the primary/writer instance when performing 
> large snapshot reads, which can impact production workloads.
> These environments usually provide read replicas specifically designed to 
> offload read traffic.
> However, the current implementation cannot leverage these replicas for 
> snapshot data reading.
> The proposal is to use the writer instance to get the binlog position, use 
> the reader replica to run the snapshot queries, and, if running in streaming 
> mode, keep using the writer to track binlog changes.
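
The routing described in the proposal can be sketched roughly as follows. This
is only an illustration of the idea, not the Flink CDC API: the Phase enum, the
EndpointRouter class, and the host names are hypothetical, and the sketch
assumes that leaving the reader host unset keeps today's behavior (everything
goes to the writer). It does not deal with replication lag between writer and
reader, which a real implementation would likely need to handle.

```java
import java.util.Objects;

/** Hypothetical phases of a MySQL CDC job; the names are illustrative only. */
enum Phase {
    BINLOG_OFFSET_LOOKUP, // reading the current binlog position
    SNAPSHOT_READ,        // running the snapshot queries
    BINLOG_STREAMING      // tailing binlog changes (streaming mode)
}

/** Hypothetical helper that picks the host each phase should connect to. */
final class EndpointRouter {
    private final String writerHost;
    private final String readerHost; // null means the optional behavior is off

    EndpointRouter(String writerHost, String readerHost) {
        this.writerHost = Objects.requireNonNull(writerHost, "writerHost");
        this.readerHost = readerHost;
    }

    /** Only snapshot reads are offloaded; everything else stays on the writer. */
    String hostFor(Phase phase) {
        if (phase == Phase.SNAPSHOT_READ && readerHost != null) {
            return readerHost;
        }
        return writerHost;
    }
}
```

With a reader host configured, only SNAPSHOT_READ resolves to the reader; with
none configured, every phase resolves to the writer, so other MySQL deployments
would be unaffected.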



