Re: [PR] [fix](streamingjob) Persist cdc_stream TVF offset across FE checkpoint [doris]

via GitHub Sun, 10 May 2026 23:11:29 -0700


github-actions[bot] commented on code in PR #62902:
URL: https://github.com/apache/doris/pull/62902#discussion_r3216669687



##########
regression-test/suites/job_p0/streaming_job/cdc/tvf/test_streaming_job_cdc_stream_postgres_checkpoint_restart_fe.groovy:
##########
@@ -0,0 +1,298 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+import org.apache.doris.regression.suite.ClusterOptions
+import org.awaitility.Awaitility
+
+import static java.util.concurrent.TimeUnit.SECONDS
+
+/**
+ * Test FE checkpoint-restart recovery of a cdc_stream TVF streaming job for PostgreSQL.
+ *
+ * Counterpart of test_streaming_job_cdc_stream_postgres_restart_fe but with the pre-checkpoint
+ * journal GC'd before each restart, so recovery cannot rely on EditLog txn replay and must
+ * fall back to image-persisted bop/chw written via getPersistInfo (inherited from parent) and
+ * read back by JdbcTvfSourceOffsetProvider.restoreFromPersistInfo.
+ *
+ * edit_log_roll_num=1 finalizes journals on every write so the next checkpoint cycle picks
+ * them up; sleeping >= 90s after each commit ensures a checkpoint runs (interval defaults
+ * to 60s) and the pre-checkpoint journal is then eligible for GC.
+ *
+ * Two checkpoint-restart scenarios are covered in sequence:
+ *
+ * Restart 1 — mid-snapshot:
+ *   snapshot_split_size=1 splits 5 pre-existing rows (A1-E1) into 5 separate tasks.
+ *   After the first task succeeds and a checkpoint runs, FE is restarted. This exercises
+ *   replayIfNeed() with currentOffset == null but chunkHighWatermarkMap restored from image:
+ *   remainingSplits is rebuilt from the meta table with chw remap so already-finished splits
+ *   are not re-processed.
+ *
+ * Restart 2 — binlog phase:
+ *   After the full snapshot completes and F1/G1 are consumed via binlog, a checkpoint runs
+ *   and FE is restarted. This exercises the binlog recovery path where currentOffset == null
+ *   but binlogOffsetPersist is restored from image: a BinlogSplit is rebuilt from bop so
+ *   the job resumes from the last committed binlog position rather than the initial one.
+ *   H1/I1 are then inserted to verify the job continues reading binlog correctly.
+ */
+suite("test_streaming_job_cdc_stream_postgres_checkpoint_restart_fe",
+        "docker,p0,external,pg,external_docker,external_docker_pg,nondatalake") {
+    def jobName = "test_streaming_job_cdc_stream_pg_ckpt_restart_fe"
+    def options = new ClusterOptions()
+    options.setFeNum(1)
+    options.cloudMode = null
+    // Roll the journal on every write so the checkpoint thread has finalized journals to
+    // include; without this, a small steady-state EditLog stays in the active segment and
+    // never reaches the checkpoint image, defeating the test.
+    options.feConfigs += [
+        'edit_log_roll_num=1'
+    ]
+
+    docker(options) {
+        def currentDb = (sql "select database()")[0][0]
+        def dorisTable = "test_streaming_job_cdc_stream_pg_ckpt_restart_fe_tbl"
+        def pgDB = "postgres"
+        def pgSchema = "cdc_test"
+        def pgUser = "postgres"
+        def pgPassword = "123456"
+        def pgTable = "test_streaming_job_cdc_stream_pg_ckpt_restart_fe_src"
+
+        sql """DROP JOB IF EXISTS where jobname = '${jobName}'"""
+        sql """drop table if exists ${currentDb}.${dorisTable} force"""
+
+        sql """
+            CREATE TABLE IF NOT EXISTS ${currentDb}.${dorisTable} (
+                `name` varchar(200) NULL,
+                `age`  int NULL
+            ) ENGINE=OLAP
+            DUPLICATE KEY(`name`)
+            DISTRIBUTED BY HASH(`name`) BUCKETS AUTO
+            PROPERTIES ("replication_allocation" = "tag.location.default: 1")
+        """
+
+        String enabled = context.config.otherConfigs.get("enableJdbcTest")
+        if (enabled != null && enabled.equalsIgnoreCase("true")) {
+            String pg_port = context.config.otherConfigs.get("pg_14_port")
+            String externalEnvIp = context.config.otherConfigs.get("externalEnvIp")
+            String s3_endpoint = getS3Endpoint()
+            String bucket = getS3BucketName()
+            String driver_url = "https://${bucket}.${s3_endpoint}/regression/jdbc_driver/postgresql-42.5.0.jar";
+
+            // ── Phase 1: prepare source table with pre-existing snapshot rows ───────────
+            connect("${pgUser}", "${pgPassword}", "jdbc:postgresql://${externalEnvIp}:${pg_port}/${pgDB}") {
+                sql """DROP TABLE IF EXISTS ${pgDB}.${pgSchema}.${pgTable}"""
+                sql """CREATE TABLE ${pgDB}.${pgSchema}.${pgTable} (
+                          "name" varchar(200) PRIMARY KEY,
+                          "age"  int2
+                      )"""
+                sql """INSERT INTO ${pgDB}.${pgSchema}.${pgTable} (name, age) VALUES ('A1', 1)"""
+                sql """INSERT INTO ${pgDB}.${pgSchema}.${pgTable} (name, age) VALUES ('B1', 2)"""
+                sql """INSERT INTO ${pgDB}.${pgSchema}.${pgTable} (name, age) VALUES ('C1', 3)"""
+                sql """INSERT INTO ${pgDB}.${pgSchema}.${pgTable} (name, age) VALUES ('D1', 4)"""
+                sql """INSERT INTO ${pgDB}.${pgSchema}.${pgTable} (name, age) VALUES ('E1', 5)"""
+            }
+
+            // ── Phase 2: create streaming job (offset=initial, split_size=1 → 5 tasks) ─
+            sql """
+                CREATE JOB ${jobName}
+                ON STREAMING DO INSERT INTO ${currentDb}.${dorisTable} (name, age)
+                SELECT name, age FROM cdc_stream(
+                    "type"                = "postgres",
+                    "jdbc_url"            = "jdbc:postgresql://${externalEnvIp}:${pg_port}/${pgDB}",
+                    "driver_url"          = "${driver_url}",
+                    "driver_class"        = "org.postgresql.Driver",
+                    "user"                = "${pgUser}",
+                    "password"            = "${pgPassword}",
+                    "database"            = "${pgDB}",
+                    "schema"              = "${pgSchema}",
+                    "table"               = "${pgTable}",
+                    "offset"              = "initial",
+                    "snapshot_split_size" = "1"
+                )
+            """
+
+            // ── Phase 3: wait for the first snapshot task to succeed, then trigger checkpoint ──
+            try {
+                Awaitility.await().atMost(300, SECONDS).pollInterval(2, SECONDS).until({
+                    def cnt = sql """select SucceedTaskCount from jobs("type"="insert")
+                                     where Name='${jobName}' and ExecuteType='STREAMING'"""
+                    log.info("SucceedTaskCount before first ckpt restart: " + cnt)
+                    cnt.size() == 1 && (cnt.get(0).get(0) as int) >= 1
+                })
+            } catch (Exception ex) {
+                log.info("job: " + (sql """select * from jobs("type"="insert") where Name='${jobName}'"""))
+                log.info("tasks: " + (sql """select * from tasks("type"="insert") where JobName='${jobName}'"""))
+                throw ex
+            }
+
+            def jobInfoBeforeRestart = sql """
+                select status, currentOffset, loadStatistic
+                from jobs("type"="insert") where Name='${jobName}'
+            """
+            log.info("job info before first ckpt restart: " + jobInfoBeforeRestart)
+            assert jobInfoBeforeRestart.get(0).get(0) == "RUNNING" :
+                    "Job should be RUNNING before first restart, got: ${jobInfoBeforeRestart.get(0).get(0)}"
+            def scannedRowsBeforeFirstRestart = parseJson(jobInfoBeforeRestart.get(0).get(2) as String).scannedRows as long
+            log.info("scannedRows before first ckpt restart: " + scannedRowsBeforeFirstRestart)
+
+            // Wait >= 90s so the checkpoint thread (60s interval) runs at least once after our
+            // last commit, then GC's the pre-checkpoint journal. Subsequent FE restart can no
+            // longer recover via journal txn replay and must rely on image-persisted chw.
+            log.info("Waiting 90s for checkpoint to run before first FE restart...")
+            sleep(90000)
+
+            // ── Phase 4: restart FE (mid-snapshot, post-checkpoint) ──────────────────
+            cluster.restartFrontends()
+            sleep(60000)
+            context.reconnectFe()
+
+            // ── Phase 5: verify job recovers and finishedSplits are not re-processed ───
+            try {

Review Comment:
   This wait does not freeze the streaming job after the first snapshot task
   succeeds. With only five 1-row snapshot splits, the scheduler can keep
   committing the remaining snapshot splits and transition to binlog during
   these 90 seconds, so the first restart is often no longer a mid-snapshot
   restart. In that case the test can pass through the `binlogOffsetPersist`
   path and never prove the new `chunkHighWatermarkMap` image restore path
   (`currentOffset == null`, no bop, chw restored) that this scenario is
   intended to cover. Please make the test hold the job in a partial-snapshot
   state until after the checkpoint image is created, or assert immediately
   before restart that `currentOffset` is still a snapshot offset and that
   fewer than all snapshot rows/splits have completed.
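
A minimal sketch of the second option (asserting right before the restart), written as a standalone Groovy closure. The name `isMidSnapshot` and the threshold argument are hypothetical and not part of the PR. It assumes `scannedRows` from `loadStatistic` counts one row per finished snapshot split, which matches this suite's `snapshot_split_size=1` setup but is not confirmed by the source. It does not inspect `currentOffset`, since the snapshot offset format is not shown here.

```groovy
// Hypothetical helper (not in the PR): true only while the snapshot is
// partially complete, i.e. at least one row scanned but not all of them.
// Assumption: scannedRows grows by one per finished 1-row snapshot split.
def isMidSnapshot = { long scannedRows, long totalSnapshotRows ->
    scannedRows >= 1 && scannedRows < totalSnapshotRows
}

// Intended use inside the suite, re-querying after sleep(90000) and
// immediately before cluster.restartFrontends():
//   def info = sql """select status, currentOffset, loadStatistic
//                     from jobs("type"="insert") where Name='${jobName}'"""
//   def scanned = parseJson(info.get(0).get(2) as String).scannedRows as long
//   assert isMidSnapshot(scanned, 5L) :
//           "expected a partial snapshot before restart, scannedRows=${scanned}"

// Standalone checks of the predicate itself.
assert isMidSnapshot(1L, 5L)    // one of five splits done: still mid-snapshot
assert !isMidSnapshot(5L, 5L)   // all five done: snapshot already finished
```

With such a guard the test would fail loudly whenever the job outruns the 90-second wait, instead of silently passing through the binlog recovery path. It does not by itself hold the job in a partial-snapshot state, so the first option may still be needed to make the scenario deterministic.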



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

