Dear Flink CDC Community,
My name is LW and I am a user of Flink CDC. I am currently using Flink CDC with 
a YAML configuration file to synchronize data from MySQL to Apache Paimon, and 
I am very impressed with its capabilities.
I am writing to propose a new feature that I believe would be a valuable 
addition. Currently, when synchronizing multiple tables defined in a single 
YAML file, it appears there is no way to reset the synchronization progress for 
a single, specific table without affecting the progress of the other tables.
For instance, if one of the target tables in Paimon needs to be re-initialized 
due to data corruption or business requirements, the only approach I am aware 
of is to restart the entire pipeline. This causes all other tables to 
resynchronize from the beginning as well, which is inefficient and disruptive 
to the overall data flow.
Therefore, I would like to request the addition of a new feature that allows 
users to reset the progress for a specified table (or a selection of tables) 
and have it re-read from the beginning, while the synchronization for all other 
tables in the same job continues unaffected from their last recorded checkpoint.
This could potentially be implemented via a command-line interface, a REST API 
call to the running job, or through a dynamic configuration update.
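To make the requested behavior concrete, here is a minimal sketch in Python. 
It is only a toy model of the semantics I am asking for, not a description of 
how Flink CDC actually stores its state. The function reset_tables, the 
INITIAL marker, and the table names and positions are all made up for 
illustration.

```python
from typing import Dict, Iterable

# Hypothetical marker meaning "re-read this table from a fresh snapshot".
INITIAL = "INITIAL_SNAPSHOT"


def reset_tables(progress: Dict[str, str], tables: Iterable[str]) -> Dict[str, str]:
    """Return a copy of the per-table progress with only the selected tables reset."""
    selected = set(tables)
    unknown = selected - progress.keys()
    if unknown:
        # Refuse to silently ignore a table that is not part of the pipeline.
        raise KeyError(f"tables not in this pipeline: {sorted(unknown)}")
    # Selected tables go back to the start; every other table keeps its position.
    return {t: (INITIAL if t in selected else pos) for t, pos in progress.items()}


# Made-up progress as it might look at the last checkpoint.
checkpoint = {
    "app_db.orders": "binlog.000012:4521",
    "app_db.customers": "binlog.000012:4521",
}

after_reset = reset_tables(checkpoint, ["app_db.orders"])
print(after_reset)
```

In this model only app_db.orders is sent back to the start, while 
app_db.customers keeps the position recorded in the checkpoint.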
I believe this feature would greatly enhance the flexibility and manageability 
of Flink CDC in production environments where targeted data reprocessing is 
often necessary.
Thank you for your time and for developing this great tool. I look forward to 
hearing your thoughts on this proposal.
Best regards,
LW