saligia-tju opened a new pull request, #4122: URL: https://github.com/apache/flink-cdc/pull/4122
## Introducing AsyncScheduler with PartitionedDeserializationScheduler ### 🚀 Core Architecture **Intelligent Event Routing**: Leverages primary key-based partitioning to distribute data-change events across dedicated single-thread partition workers, ensuring optimal load distribution and thread safety. **Robust Queue Management**: Implements bounded per-partition queues using `ArrayBlockingQueue` with a strict blocking policy, guaranteeing zero data loss through our no-drop mechanism. **Ordering Guarantees**: The source thread's `drainRound` serves as the exclusive collector, maintaining per-key ordering integrity throughout the entire processing pipeline. **Reliable Checkpoint Recovery**: Binds binlog offset advancement to post-emit operations, ensuring bulletproof checkpoint consistency and seamless recovery capabilities. ### 🌐 Global Async Control Path **Comprehensive Event Support**: Seamlessly handles control events (DDL/Watermark/Heartbeat) through a sophisticated global async pathway. **Advanced Completion Management**: Utilizes sequence-to-batch mapping for intelligent out-of-order completion handling while maintaining strict in-order emission guarantees. ### ⚙️ Flexible Configuration Framework **Builder-Driven Design**: Offers comprehensive configuration control through intuitive builder patterns: - `parallelDeserializeEnabled` - Master switch (disabled by default) - `parallelDeserializePkWorkers` - Primary key worker scaling - `parallelDeserializeThreads` - Thread pool optimization - `parallelDeserializeQueueCapacity` - Queue size tuning **Backward Compatibility**: When disabled, the system gracefully falls back to the original single-thread execution path with zero scheduler overhead. ### 🛡️ Enhanced Stability & Performance **Concurrency Safety**: Resolves potential `ConcurrentModificationException` issues during emission by implementing snapshot-to-array conversion before iteration, eliminating iterator modification conflicts. **Code Excellence**: Features comprehensive Chinese comments, streamlined code paths, and removal of legacy JVM system properties for improved maintainability. ### 📋 Migration Guide **Activation Steps**: - Set `parallelDeserializeEnabled=true` - Configure optimal values for `pkWorkers`, `threads`, and `queue` parameters **Seamless Transition**: When disabled, behavior remains 100% identical to previous versions, ensuring zero-risk deployment. **Integration Requirements**: Ensure downstream systems respect per-key ordering constraints; all collection operations remain anchored to the source thread for consistency. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
