This is an automated email from the ASF dual-hosted git repository.
hello-stephen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new 4f8c14451a7 [fix](regression) handle cumulative delete-version compaction wait (#64945)
4f8c14451a7 is described below
commit 4f8c14451a73354f52270f0c94bda5b9c5063da0
Author: shuke <[email protected]>
AuthorDate: Wed Jul 1 21:17:58 2026 +0800
[fix](regression) handle cumulative delete-version compaction wait (#64945)
# [fix](regression) handle cumulative delete-version compaction wait
## Summary
Fix `trigger_and_wait_compaction` so that, when Cloud cumulative
compaction meets a delete version, the helper no longer waits until the
300s timeout after valid progress has already happened.
When cumulative compaction meets a delete version, the BE can return
`[E-2010] cumulative compaction meet delete version`, advance the
cumulative point, and let base compaction handle the rowsets. In that
path the cumulative success/failure timestamps may not change, so the
old helper kept polling even after base compaction had completed and
`run_status` was `false`.
This patch treats the combination of `E-2010`, an advanced cumulative
point, and a changed base success time as a completed cumulative
delete-version path, while still waiting when `run_status=true`. If
`E-2010` advances the cumulative point but the base success time has
not changed yet, the helper keeps waiting even if the cumulative
failure timestamp changed.
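The rule above can be sketched as a stand-alone Groovy predicate. This is a minimal illustration, not the patch itself: the closure name `cumulativeWaitFinished`, the sample status values, and the assumption that `run_status=true` always takes precedence are hypothetical; only the status keys and the `E-2010` check come from the diff below.

```groovy
// Sketch only: closure names and sample values are hypothetical, not from the patch.
def toLongOrNull = { value ->
    try {
        return value == null ? null : value.toString().trim().toLong()
    } catch (NumberFormatException ignored) {
        return null
    }
}

def cumulativeWaitFinished = { Map oldStatus, Map newStatus, boolean runStatus ->
    // Assumption: a compaction that is still running always keeps the helper waiting.
    if (runStatus) {
        return false
    }
    def oldPoint = toLongOrNull(oldStatus["cumulative point"])
    def newPoint = toLongOrNull(newStatus["cumulative point"])
    def lastStatus = (newStatus["last cumulative status"] ?: "").toString().toLowerCase()
    // E-2010 plus an advanced cumulative point means the work was handed to base compaction.
    def handedOff = lastStatus.contains("e-2010") &&
            oldPoint != null && newPoint != null && newPoint > oldPoint
    def baseSuccessChanged =
            oldStatus["last base success time"] != newStatus["last base success time"]
    def cumulativeTimestampChanged =
            oldStatus["last cumulative success time"] != newStatus["last cumulative success time"] ||
            oldStatus["last cumulative failure time"] != newStatus["last cumulative failure time"]
    // After a handoff only base success counts; otherwise any cumulative timestamp change does.
    return (handedOff && baseSuccessChanged) || (!handedOff && cumulativeTimestampChanged)
}

// Hypothetical tablet status snapshots.
def statusBefore = [
    "cumulative point": "2",
    "last cumulative status": "[OK]",
    "last cumulative success time": "2026-07-01 21:00:00.000",
    "last cumulative failure time": "1970-01-01 08:00:00.000",
    "last base success time": "1970-01-01 08:00:00.000"
]
def statusHandoff = statusBefore + [
    "cumulative point": "12",
    "last cumulative status": "[E-2010] cumulative compaction meet delete version",
    "last cumulative failure time": "2026-07-01 21:00:05.000"
]
def statusDone = statusHandoff + ["last base success time": "2026-07-01 21:00:06.000"]

println "handoff only: ${cumulativeWaitFinished(statusBefore, statusHandoff, false)}"
println "handoff + base success: ${cumulativeWaitFinished(statusBefore, statusDone, false)}"
```

Keeping the handoff flag separate from the completion flag is what lets a changed cumulative failure timestamp be ignored until base compaction reports success.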
## Root Cause
The case `compaction/test_compacation_with_delete.groovy` creates
alternating data and delete rowsets, then calls
`trigger_and_wait_compaction(tableName, "cumulative")`.
In Cloud mode, compaction can legitimately proceed as follows:
1. cumulative compaction meets delete version and returns `E-2010`
2. cumulative point advances
3. base compaction handles the delete-version rowsets
The helper only watched cumulative success/failure timestamp changes. In
the failing log, base compaction completed in 448 ms, but the helper
waited for 5 minutes because the cumulative timestamps did not change.
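To make the failure mode concrete, the old check can be reproduced in isolation. The sketch below is illustrative only: `oldCheckStillRunning` and the sample status values are hypothetical, and the closure mirrors just the removed `success_time_unchanged && failure_time_unchanged` expression from the diff.

```groovy
// Sketch only: the closure name and sample values are hypothetical.
def oldCheckStillRunning = { Map oldStatus, Map newStatus, String compactionType ->
    // Build plain String keys so lookups match the String keys in the maps.
    def successKey = "last " + compactionType + " success time"
    def failureKey = "last " + compactionType + " failure time"
    def successTimeUnchanged = oldStatus[successKey] == newStatus[successKey]
    def failureTimeUnchanged = oldStatus[failureKey] == newStatus[failureKey]
    // The old helper treated "no timestamp changed" as "still running".
    return successTimeUnchanged && failureTimeUnchanged
}

// Hypothetical snapshots for the delete-version path: the cumulative point
// advances and base compaction succeeds, but cumulative timestamps stay put.
def before = [
    "cumulative point": "2",
    "last cumulative success time": "2026-07-01 21:00:00.000",
    "last cumulative failure time": "1970-01-01 08:00:00.000",
    "last base success time": "1970-01-01 08:00:00.000"
]
def after = before + [
    "cumulative point": "12",
    "last cumulative status": "[E-2010] cumulative compaction meet delete version",
    "last base success time": "2026-07-01 21:00:06.000"
]

println "old cumulative check still waiting: ${oldCheckStillRunning(before, after, 'cumulative')}"
```

With these inputs the old check keeps reporting a running cumulative compaction on every poll, which matches the timeout described above.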
## Validation
- `git diff --check`
- `git diff --check origin/master..HEAD`
- Local Groovy condition simulation:
  - `E-2010 + cumulative point advanced + base success time changed + run_status=false`
    exits wait
  - `E-2010 + cumulative point advanced` keeps waiting if base success
    time has not changed
  - `E-2010 + cumulative point advanced + cumulative failure time changed`
    still keeps waiting if base success time has not changed
  - `run_status=true` keeps waiting even if the
    delete-version/base-success condition is met
  - normal cumulative success timestamp change still exits wait when
    there is no delete-version handoff
Cloud P0 rerun is still needed for final validation.
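The local condition simulation listed above is not included in the commit; one way it might look is a boolean-only truth table like the following. The closure `exitsWait` and its flag names are hypothetical, and the precedence of `run_status=true` is an assumption based on the summary rather than on code shown in the diff.

```groovy
// Sketch only: exitsWait and its flag names are hypothetical.
def exitsWait = { Map flags ->
    // Every flag defaults to false unless the case sets it.
    def f = [e2010: false, pointAdvanced: false, baseSuccessChanged: false,
             cumulativeSuccessChanged: false, cumulativeFailureChanged: false,
             runStatus: false] + flags
    // Assumption: a running compaction always keeps the helper waiting.
    if (f.runStatus) {
        return false
    }
    def handedOff = f.e2010 && f.pointAdvanced
    def timestampChanged = f.cumulativeSuccessChanged || f.cumulativeFailureChanged
    return (handedOff && f.baseSuccessChanged) || (!handedOff && timestampChanged)
}

// The five cases from the validation list, in the same order.
assert exitsWait([e2010: true, pointAdvanced: true, baseSuccessChanged: true])
assert !exitsWait([e2010: true, pointAdvanced: true])
assert !exitsWait([e2010: true, pointAdvanced: true, cumulativeFailureChanged: true])
assert !exitsWait([e2010: true, pointAdvanced: true, baseSuccessChanged: true, runStatus: true])
assert exitsWait([cumulativeSuccessChanged: true])
```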
---
regression-test/plugins/plugin_compaction.groovy | 29 +++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/regression-test/plugins/plugin_compaction.groovy b/regression-test/plugins/plugin_compaction.groovy
index e57523416b0..d8ee5951222 100644
--- a/regression-test/plugins/plugin_compaction.groovy
+++ b/regression-test/plugins/plugin_compaction.groovy
@@ -128,6 +128,16 @@ Suite.metaClass.trigger_and_wait_compaction = { String table_name, String compac
// 3. wait all compaction finished
def running = triggered_tablets.size() > 0
+ def toLongOrNull = { value ->
+ if (value == null) {
+ return null
+ }
+ try {
+ return value.toString().trim().toLong()
+ } catch (Throwable ignored) {
+ return null
+ }
+ }
Awaitility.await().atMost(timeout_seconds, TimeUnit.SECONDS).pollInterval(1, TimeUnit.SECONDS).until(() -> {
for (tablet in triggered_tablets) {
def be_host = backendId_to_backendIP["${tablet.BackendId}"]
@@ -146,9 +156,26 @@ Suite.metaClass.trigger_and_wait_compaction = { String table_name, String compac
def tabletStatus = parseJson(stdout.trim())
def oldStatus = be_tablet_compaction_status.get("${be_host}-${tablet.TabletId}")
// last compaction success/failure time isn't updated, indicates compaction is not started(so we treat it as running and wait)
+ def handedOffToBaseCompactionAfterDeleteVersion = false
+ def completedByBaseCompactionAfterDeleteVersion = false
+ if (compaction_type == "cumulative") {
+ def oldCumulativePoint = toLongOrNull(oldStatus["cumulative point"])
+ def newCumulativePoint = toLongOrNull(tabletStatus["cumulative point"])
+ def lastCumulativeStatus = "${tabletStatus["last cumulative status"]}".toLowerCase()
+ def baseSuccessTimeChanged = oldStatus["last base success time"] != tabletStatus["last base success time"]
+ // E-2010 advances the cumulative point and lets base compaction handle delete-version rowsets.
+ handedOffToBaseCompactionAfterDeleteVersion = lastCumulativeStatus.contains("e-2010") &&
+ oldCumulativePoint != null && newCumulativePoint != null &&
+ newCumulativePoint > oldCumulativePoint
+ completedByBaseCompactionAfterDeleteVersion =
+ handedOffToBaseCompactionAfterDeleteVersion && baseSuccessTimeChanged
+ }
def success_time_unchanged = (oldStatus["last ${compaction_type} success time"] == tabletStatus["last ${compaction_type} success time"])
def failure_time_unchanged = (oldStatus["last ${compaction_type} failure time"] == tabletStatus["last ${compaction_type} failure time"])
- running = running || (success_time_unchanged && failure_time_unchanged)
+ def currentCompactionTimestampChanged = !success_time_unchanged || !failure_time_unchanged
+ def compactionFinished = completedByBaseCompactionAfterDeleteVersion ||
+ (!handedOffToBaseCompactionAfterDeleteVersion && currentCompactionTimestampChanged)
+ running = running || !compactionFinished
if (running) {
logger.info("compaction is still running, be host: ${be_host}, tablet id: ${tablet.TabletId}, run status: ${compactionStatus.run_status}, old status: ${oldStatus}, new status: ${tabletStatus}")
return false