dzmxcyr opened a new issue, #63269: URL: https://github.com/apache/doris/issues/63269
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

- Source Doris: 2.1.11-x64
- Target Doris: 2.1.11-arm64
- CCR: ccr-syncer-3.0.6-rc05-arm64

### What's Wrong?

After a database-level CCR replication task is created, CCR completes the full synchronization normally and then enters the incremental synchronization phase, where everything works as expected. However, when the upstream Doris FE master role switches to another node, CCR triggers a fullsync and pulls all data again from scratch. Because the data volume is large, the resynchronization takes a long time and has a significant impact on the production environment.

The CCR log shows:

[2026-05-14 09:33:51.786] WARN call [:0] error: GetBinlog error: remote or network error: get connection error: dial tcp :0: connection has been closed by peer, req: TGetBinlogRequest({Cluster: User:0x40001a8378 Passwd:0x40001a8388 Db:0x40001a83a8 Table: TableId: UserIp: Token: PrevCommitSeq:0x400082e928 NumAcquired:0x400082e930}): [rpc] remote or network error: get connection error: dial tcp :0: connection has been closed by peer, try next addr job=CCR_PROD_ZHBB line=rpc/fe.go:259

...

[2026-05-14 09:33:52.149] WARN job sync failed, job: CCR_PROD_DW, err: [meta] index ids is empty

...

[2026-05-14 09:33:53.597] INFO fullsync status: create snapshot with prefix ccrs_CCR_PROD_DW_1778668141 job=CCR_PROD_DW line=ccr/job.go:973

[2026-05-14 09:33:53.694] INFO fullsync status: create snapshot ccrs_CCR_PROD_DW_1778668141_1778722433 job=CCR_PROD_DW line=ccr/job.go:1019

[2026-05-14 09:33:53.694] INFO create snapshot PROD_DW.ccrs_CCR_PROD_DW_1778668141_1778722433, backup snapshot sql: BACKUP SNAPSHOT PROD_DW.ccrs_CCR_PROD_DW_1778668141_1778722433 TO __keep_on_local__ PROPERTIES ("type" = "full") job=CCR_PROD_DW line=base/spec.go:771

### What You Expected?

CCR keeps running normally, continuing incremental synchronization, after the Doris FE master role fails over to another node.

### How to Reproduce?
While database-level CCR synchronization is running against an upstream cluster that receives continuous writes to a large number of tables, take the FE master node down so that a switchover occurs. CCR then triggers a fullsync again.

### Anything Else?

_No response_

### Are you willing to submit PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
