我在阅读flink-connector-mysql-cdc项目源码过程中遇到一个不清楚的地方,还请大佬指点,谢谢!
MySqlSplitReader类有一段代码如下,注释“(1) Reads binlog split firstly and then read snapshot split”这一句话我不理解。 为什么要先读binlog split再读snapshot split?为保证记录的时序性,不是应该先读全量的snapshot split再读增量的binlog split么? private MySqlRecords pollSplitRecords() throws InterruptedException { Iterator<SourceRecords> dataIt; if (currentReader == null) { // (1) Reads binlog split firstly and then read snapshot split if (binlogSplits.size() > 0) { // the binlog split may come from: // (a) the initial binlog split // (b) added back binlog-split in newly added table process MySqlSplit nextSplit = binlogSplits.poll(); currentSplitId = nextSplit.splitId(); currentReader = getBinlogSplitReader(); currentReader.submitSplit(nextSplit); } else if (snapshotSplits.size() > 0) { MySqlSplit nextSplit = snapshotSplits.poll(); currentSplitId = nextSplit.splitId(); currentReader = getSnapshotSplitReader(); currentReader.submitSplit(nextSplit); } else { LOG.info("No available split to read."); } dataIt = currentReader.pollSplitRecords(); return dataIt == null ? finishedSplit() : forRecords(dataIt); } else if (currentReader instanceof SnapshotSplitReader) { .... } ... }