LvYanquan created FLINK-35802:
---------------------------------

             Summary: Deadlock may happen after adding new tables
                 Key: FLINK-35802
                 URL: https://issues.apache.org/jira/browse/FLINK-35802
             Project: Flink
          Issue Type: Bug
          Components: Flink CDC
    Affects Versions: cdc-3.1.0
            Reporter: LvYanquan
             Fix For: cdc-3.2.0
         Attachments: image-2024-07-10-13-44-49-972.png, image-2024-07-10-13-45-52-450.png, image-2024-07-10-13-47-07-190.png

Problem Description:

1. CDC originally consumed the full and incremental data of one table. The snapshot phase has ended and the job is in the binlog consumption phase.
2. Stop the job and restart it to add full and incremental data synchronization for a new table.
3. After the snapshot phase of the new table ends, the job fails to return to the binlog consumption phase.
4. Checking the thread that consumes the binlog reveals a deadlock. The relevant thread stacks are listed under ThreadDump.
5. The likely cause is that, after the Enumerator issues a BinlogSplitUpdateRequestEvent, both MysqlSplitReader and MySqlBinlogSplitReadTask attempt to close the binlogClient connection but fail to acquire the lock.
6. The lock is held by the consumer thread, which is blocked because the queue is full and is waiting for the data to be consumed. No consumer remains to drain the queue, so the result is a deadlock.

ThreadDump:

1. MysqlSplitReader.pollSplitRecords method
!image-2024-07-10-13-44-49-972.png!

2. MySqlStreamingChangeEventSource.execute method
!image-2024-07-10-13-45-52-450.png!

3. MySqlBinlogSplitReadTask.handleEvent method
!image-2024-07-10-13-47-07-190.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
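
The suspected pattern from points 5 and 6 can be reproduced in isolation. The sketch below is not Flink CDC code: the class DeadlockSketch, the method closerAcquiresLock, the queue capacity, and the timeout are all hypothetical, chosen only to illustrate a thread that holds a lock while blocked on a full bounded queue, and a second thread that needs the same lock to shut things down. It assumes the real lock and queue behave like a ReentrantLock and a bounded BlockingQueue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone illustration; none of these names exist in Flink CDC.
class DeadlockSketch {

    /**
     * Returns true if the "closer" (the calling thread) obtains the lock within
     * the timeout. With keepDraining == false nobody empties the queue, so the
     * event thread stays blocked inside put() while holding the lock.
     */
    static boolean closerAcquiresLock(boolean keepDraining) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1); // bounded queue
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch eventThreadHoldsLock = new CountDownLatch(1);

        // Stands in for the thread that handles binlog events and enqueues them.
        Thread eventThread = new Thread(() -> {
            lock.lock();
            try {
                eventThreadHoldsLock.countDown();
                queue.put("event-1"); // fills the queue
                queue.put("event-2"); // blocks until someone takes "event-1"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        eventThread.setDaemon(true);
        eventThread.start();
        eventThreadHoldsLock.await();

        // Optional downstream consumer that keeps the queue from staying full.
        Thread drainer = null;
        if (keepDraining) {
            drainer = new Thread(() -> {
                try {
                    while (true) {
                        queue.take();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            drainer.setDaemon(true);
            drainer.start();
        }

        // The "closer" needs the same lock before it can shut the connection down.
        boolean acquired = lock.tryLock(500, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.unlock();
        }

        // Clean up so the sketch terminates; a real deadlock has no such escape.
        eventThread.interrupt();
        if (drainer != null) {
            drainer.interrupt();
        }
        eventThread.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("closer got lock, no consumer:   " + closerAcquiresLock(false));
        System.out.println("closer got lock, with consumer: " + closerAcquiresLock(true));
    }
}
```

The timed tryLock is used here only so the sketch can report the stall instead of hanging. If the diagnosis in point 5 is right, the general remedies would be to avoid blocking on the queue while holding the lock, or to make the blocked put interruptible from the closing path. Which of these fits the actual fix is not stated in this report.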