This is an automated email from the ASF dual-hosted git repository.
liaoxin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new 0f27e8ac1f1 [fix](cloud) Skip empty rowsets before accessor lookup in
batch delete_rowset_data (#60919)
0f27e8ac1f1 is described below
commit 0f27e8ac1f15ad41f906c8d3587bae7082dc56ae
Author: Xin Liao <[email protected]>
AuthorDate: Tue Mar 10 11:21:18 2026 +0800
[fix](cloud) Skip empty rowsets before accessor lookup in batch
delete_rowset_data (#60919)
## Proposed changes
In batch `delete_rowset_data`, empty rowsets (e.g. the output of a base
compaction over empty rowsets) have `num_segments=0` and no `resource_id`
set. The lookup `accessor_map_.find("")` therefore fails and the function
sets `ret=-1`, which causes the caller to skip `txn_remove` for the entire
batch. The recycle KV keys are never cleaned up, so the same rowsets are
scanned again in every recycle round.
The fix moves the `num_segments <= 0` check before the `accessor_map_`
lookup so these empty rowsets are safely skipped without poisoning the
batch return value.
## Problem summary
- Empty rowsets from base compaction have `resource_id=""`,
`rowset_meta_size=181`, `num_segments=0`
- `accessor_map_.find("")` fails, sets `ret = -1`, `txn_remove` skipped
for entire batch
- Normal rowsets in the same batch: their object storage data is already
deleted, but their recycle KV keys are not cleaned up
- The next recycle round re-scans the same rowsets, forming an infinite loop
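
The effect of the reordering can be illustrated with a minimal,
self-contained sketch. It is not the recycler code: `RowsetSketch`,
`AccessorSketch`, `AccessorMap` and both function names are hypothetical
stand-ins, and the actual deletion and metrics updates are omitted. It only
models how the batch return value depends on the order of the two checks.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins for the real recycler types.
struct RowsetSketch {
    std::string resource_id; // empty for rowsets that never wrote any data
    int64_t num_segments;    // 0 for empty rowsets
};

struct AccessorSketch {};

using AccessorMap = std::unordered_map<std::string, AccessorSketch>;

// Old order: the accessor lookup runs first, so an empty rowset whose
// resource_id is "" fails the lookup and marks the whole batch as failed.
int delete_batch_lookup_first(const std::vector<RowsetSketch>& batch,
                              const AccessorMap& accessors) {
    int ret = 0;
    for (const auto& rs : batch) {
        auto it = accessors.find(rs.resource_id);
        if (it == accessors.end()) {
            ret = -1; // poisons the return value for the entire batch
            continue;
        }
        if (rs.num_segments <= 0) {
            continue; // nothing to delete
        }
        // ... segment files would be deleted through it->second here ...
    }
    return ret;
}

// New order: empty rowsets are skipped before the lookup, so a missing
// resource_id on an empty rowset no longer affects the return value.
int delete_batch_skip_empty_first(const std::vector<RowsetSketch>& batch,
                                  const AccessorMap& accessors) {
    int ret = 0;
    for (const auto& rs : batch) {
        if (rs.num_segments <= 0) {
            continue; // nothing to delete, no accessor needed
        }
        auto it = accessors.find(rs.resource_id);
        if (it == accessors.end()) {
            ret = -1; // still an error for rowsets that do have data
            continue;
        }
        // ... segment files would be deleted through it->second here ...
    }
    return ret;
}
```

In this sketch, a batch holding one normal rowset and one empty rowset
with `resource_id=""` makes the first function return `-1` and the second
return `0`. A rowset that has segments but an unknown `resource_id` is
still reported as a failure by both.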
---
cloud/src/recycler/recycler.cpp | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/cloud/src/recycler/recycler.cpp b/cloud/src/recycler/recycler.cpp
index 1d68877f851..23068ab66ff 100644
--- a/cloud/src/recycler/recycler.cpp
+++ b/cloud/src/recycler/recycler.cpp
@@ -3563,6 +3563,18 @@ int InstanceRecycler::delete_rowset_data(
}
}
+ int64_t num_segments = rs.num_segments();
+ // Check num_segments before accessor lookup, because empty rowsets
+ // (e.g. base compaction output of empty rowsets) may have no resource_id
+ // set. Skipping them early avoids a spurious "no such resource id" error
+ // that marks the entire batch as failed and prevents txn_remove from
+ // cleaning up recycle KV keys.
+ if (num_segments <= 0) {
+ metrics_context.total_recycled_num++;
+ metrics_context.total_recycled_data_size += rs.total_disk_size();
+ continue;
+ }
+
auto it = accessor_map_.find(rs.resource_id());
// possible if the accessor is not initilized correctly
if (it == accessor_map_.end()) [[unlikely]] {
@@ -3585,12 +3597,6 @@ int InstanceRecycler::delete_rowset_data(
ret = -1;
continue;
}
- int64_t num_segments = rs.num_segments();
- if (num_segments <= 0) {
- metrics_context.total_recycled_num++;
- metrics_context.total_recycled_data_size += rs.total_disk_size();
- continue;
- }
// Process delete bitmap - check if it's stored in packed file
bool delete_bitmap_is_packed = false;