[jira] [Created] (FLINK-34889) Flink CDC may occur binlog can't be find when dynamic added table repeatedly

Flink CDC Issue Import (Jira) Wed, 20 Mar 2024 05:13:39 -0700

Flink CDC Issue Import created FLINK-34889:
----------------------------------------------


             Summary: Flink CDC may occur binlog can't be find when dynamic 
added table repeatedly
                 Key: FLINK-34889
                 URL: https://issues.apache.org/jira/browse/FLINK-34889
             Project: Flink
          Issue Type: Bug
          Components: Flink CDC
            Reporter: Flink CDC Issue Import


We hit a strange problem in Flink CDC: when tables are repeatedly added to a 
Flink CDC job, the job may fail and report that a very old GTID cannot be 
found. We dug into the source code and found the cause, described below:

1. When CDC switches from the full phase to the incremental phase, the binlog 
reader needs to pull the ending offsets of all chunks, and it takes the minimum 
of these offsets as the starting offset of the incremental phase. The ending 
offset of each chunk is stored in the JM.

2. When we add tables repeatedly, each time we need to suspend the job, alter 
the config, and then resume from the latest checkpoint.

3. Normally, once a table has been added, we pull the ending offset of each 
chunk. The pull process passes a size between the JM and the TM, which means 
that when there are 100 tables in the JM and we have processed 80, we need to 
process the 81st to pull the next offset.

4. The problem is that the order of the splits in the JM and the TM is not the 
same. The JM orders the splits by table name (such as a:0, a:1, b:0, b:1). When 
a table is added, we need to pull the ending offsets of the newly added table, 
but because the JM orders the splits by table name, the newly added table may 
land in the middle of the list, so we may get the ending offset of a very old 
split.
Screenshot: https://github.com/apache/flink-cdc/assets/5321584/f8383f59-82d9-4d97-bad7-1aea54c6ac81
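
The four steps above can be illustrated with a small, simplified model. This is 
only a sketch of the mechanism as described in this report, not the actual 
Flink CDC code: the names FinishedSplit, jm_ordered, pull_by_size and 
starting_offset are hypothetical, and plain integers stand in for binlog 
offsets / GTIDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinishedSplit:
    """Hypothetical stand-in for a finished snapshot chunk."""
    split_id: str       # "<table>:<chunk index>", e.g. "a:0"
    ending_offset: int  # integer stand-in for a binlog offset / GTID


def jm_ordered(splits):
    """JM side (assumed): finished splits ordered by table name, then chunk."""
    def key(s):
        table, chunk = s.split_id.split(":")
        return (table, int(chunk))
    return sorted(splits, key=key)


def pull_by_size(jm_splits, received_count):
    """TM side (assumed): request only the entries past the count already held."""
    return jm_splits[received_count:]


def starting_offset(splits):
    """Incremental phase starts at the minimum ending offset of the chunks."""
    return min(s.ending_offset for s in splits)


# Tables a and c were captured long ago, so their ending offsets are old.
old_splits = [
    FinishedSplit("a:0", 100),
    FinishedSplit("a:1", 110),
    FinishedSplit("c:0", 120),
    FinishedSplit("c:1", 130),
]
tm_received = len(jm_ordered(old_splits))  # the TM already holds 4 entries

# Table b is added after a suspend / resume; its offsets are recent.
new_splits = [FinishedSplit("b:0", 900), FinishedSplit("b:1", 910)]
jm_after_add = jm_ordered(old_splits + new_splits)

# The TM asks for "everything after the first 4", but b was sorted into the
# middle of the JM list, so the tail holds the old c splits instead of b.
pulled = pull_by_size(jm_after_add, tm_received)
pulled_ids = [s.split_id for s in pulled]
wrong_start = starting_offset(pulled)
intended_start = starting_offset(new_splits)
```

In this model the size-based pull returns the splits of table c rather than 
those of the newly added table b, so the starting offset is derived from old 
splits (120) instead of the new ones (900). If the binlog at that old position 
has already been purged, the job fails in the way described above.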


---------------- Imported from GitHub ----------------
Url: https://github.com/apache/flink-cdc/issues/3141
Created by: [zlzhang0122|https://github.com/zlzhang0122]
Labels: 
Created at: Wed Mar 13 21:17:17 CST 2024
State: open




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
