LvYanquan created FLINK-35802:
---------------------------------
Summary: Deadlock may happen after adding new tables
Key: FLINK-35802
URL: https://issues.apache.org/jira/browse/FLINK-35802
Project: Flink
Issue Type: Bug
Components: Flink CDC
Affects Versions: cdc-3.1.0
Reporter: LvYanquan
Fix For: cdc-3.2.0
Attachments: image-2024-07-10-13-44-49-972.png,
image-2024-07-10-13-45-52-450.png, image-2024-07-10-13-47-07-190.png
Problem Description:
1. The CDC job originally synchronized both the full (snapshot) and incremental
(binlog) data of one table. Its snapshot phase has ended, and the job is now in
the binlog consumption phase.
2. Stop the job and add full and incremental data synchronization for a new
table.
3. After the snapshot phase of the new table ends, the job fails to return to
the binlog consumption phase.
4. Inspecting the thread that consumes the binlog reveals a deadlock; the
relevant thread stacks are shown in the thread dump below.
5. The likely cause is that after the Enumerator issues a
BinlogSplitUpdateRequestEvent, both MySqlSplitReader and
MySqlBinlogSplitReadTask try to close the binlogClient connection but fail to
acquire the lock.
6. The lock is held by the consumer thread, which is itself blocked because the
queue is full: it waits for consumers to drain the queue, but there are none,
so the job deadlocks.
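The suspected pattern in steps 5 and 6 can be sketched in isolation. The
following is a minimal illustration, not the Flink CDC code: the class name,
the lock, the queue, and the thread name are all hypothetical stand-ins, and a
timed tryLock is used only so that the sketch terminates instead of hanging.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in classes and names; this is not the connector's real code.
class BinlogDeadlockSketch {

    /** Returns true if the "closer" could not acquire the lock within the timeout. */
    static boolean closerBlocked(long timeoutMillis) throws InterruptedException {
        // Assumption: one lock guards both event handling and closing the connection.
        ReentrantLock eventLock = new ReentrantLock();
        // Assumption: events are handed over through a bounded queue.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.put("event-0"); // the queue is already full

        Thread eventThread = new Thread(() -> {
            eventLock.lock();
            try {
                // Blocks while holding the lock: the queue is full and nobody drains it.
                queue.put("event-1");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                eventLock.unlock();
            }
        }, "binlog-event-thread");
        eventThread.setDaemon(true);
        eventThread.start();

        // Wait until the event thread holds the lock.
        while (!eventLock.isLocked()) {
            Thread.sleep(5);
        }

        // The only would-be consumer first tries to take the lock in order to
        // close the connection, so it never gets around to draining the queue.
        boolean acquired = eventLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            eventLock.unlock();
        }

        // Clean up so the sketch exits; a real deadlock has no such escape hatch.
        eventThread.interrupt();
        eventThread.join(1000);
        return !acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("closer blocked: " + closerBlocked(200));
    }
}
```

With an untimed lock() in place of tryLock, both threads would wait on each
other indefinitely, which matches the symptom described above.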
Thread dump:
1. MySqlSplitReader.pollSplitRecords method
!image-2024-07-10-13-44-49-972.png!
2. MySqlStreamingChangeEventSource.execute method
!image-2024-07-10-13-45-52-450.png!
3. MySqlBinlogSplitReadTask.handleEvent method
!image-2024-07-10-13-47-07-190.png!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)