github-actions[bot] commented on code in PR #63310:
URL: https://github.com/apache/doris/pull/63310#discussion_r3425103907


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CatalogRecycleBin.java:
##########
@@ -67,6 +69,14 @@ public class CatalogRecycleBin extends MasterDaemon 
implements Writable {
 
     private final ReentrantReadWriteLock lock = new 
ReentrantReadWriteLock(true);
 
+    // Injectable clock source, defaults to system clock
+    private LongSupplier clock = System::currentTimeMillis;

Review Comment:
   `CatalogRecycleBin.write()` still serializes `this` with 
`GsonUtils.GSON.toJson(this)` and `readFieldsWithGson()` immediately calls 
`GsonUtils.GSON.fromJson(..., CatalogRecycleBin.class)` on that JSON. Because 
this new test hook is not `transient` or otherwise excluded, the image JSON now 
contains a `LongSupplier` implementation. On image load Gson has to deserialize 
that interface field even though the returned object is ignored, which can fail 
and at minimum persists a non-deterministic test-only clock implementation. 
Please keep the injected clock out of the persisted image.
   
   ```suggestion
       private transient LongSupplier clock = System::currentTimeMillis;
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CatalogRecycleBin.java:
##########
@@ -1393,13 +1403,14 @@ public void addTabletToInvertedIndex() {
 
     @Override
     protected void runAfterCatalogReady() {
-        long currentTimeMs = System.currentTimeMillis();
         // should follow the partition/table/db order
         // in case of partition(table) is still in recycle bin but table(db) 
is missing
+        // Each erase method gets its own currentTimeMs to avoid using a stale 
timestamp,
+        // since previous erase operations may take significant time.
         int keepNum = Config.max_same_name_catalog_trash_num;
-        erasePartition(currentTimeMs, keepNum);
-        eraseTable(currentTimeMs, keepNum);
-        eraseDatabase(currentTimeMs, keepNum);
+        erasePartition(clock.getAsLong(), keepNum);
+        eraseTable(clock.getAsLong(), keepNum);
+        eraseDatabase(clock.getAsLong(), keepNum);

Review Comment:
   This fresh timestamp can remove a parent after its child phase already 
skipped the child, which violates the invariant in the comment above. Concrete 
drop-database case: tables are recycled before the db in 
`InternalCatalog.unprotectDropDb()`/`recycleDatabase()`. If `eraseTable()` is 
called at `tableRecycleTime + expire - 1` it leaves the table in `idToTable`; 
if that phase then takes long enough, this later 
`eraseDatabase(clock.getAsLong(), ...)` can see the db as expired and the 
normal expired branch only removes `idToDatabase`/logs `OP_ERASE_DB` (it does 
not call `eraseAllTables`). After replay or image rebuild the table remains in 
the recycle bin with its database already erased, exactly what the 
partition/table/db ordering is supposed to prevent. The same boundary exists 
between partition and table. Please either use a consistent cutoff for 
dependent child/parent phases, or make parent erasure check/erase remaining 
children before logging the parent erase.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to