wombatu-kun commented on PR #18995:
URL: https://github.com/apache/hudi/pull/18995#issuecomment-4701501486

   @voonhous I dug into the master failure (run 27489081724). #18995 removed only 
one of several asynchronous sources of these spans, so the race stayed open.
   
   The tests assert the exact per-query filesystem-span multiset, but Trino 
resets the span exporter at the start of each `executeWithPlan`. Any span a 
background thread emits after the synchronous query returns therefore lands in 
the next test's measurement window, which is the symmetric off-by-N you see. 
Besides the stats refresh, the metadata table is also read by the background 
split-loader, partition-listing, and index-support pools, and the Alluxio 
variant additionally depends on async cache warmth (a read is recorded as 
`Alluxio.readCached` or `readExternal` depending on whether an earlier cache 
write had finished). Every observed flake delta was a `METADATA_TABLE` op (plus 
`Alluxio.readCached` on the Alluxio test).
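
   A minimal, deterministic sketch of that window shift, in case it helps to see it without threads. Everything here is hypothetical (the `Exporter` class, `runQuery`, and the span labels are illustrative stand-ins, not Trino's or Hudi's actual test classes); the only thing taken from the description above is that the exporter is reset at the start of each query and that a late background span is recorded after that reset.

```java
import java.util.ArrayList;
import java.util.List;

public final class SpanWindowSketch {
    // Hypothetical stand-in for the test span exporter (not the real Trino class).
    static final class Exporter {
        private final List<String> spans = new ArrayList<>();

        void reset() {
            spans.clear();
        }

        void export(String span) {
            spans.add(span);
        }

        List<String> snapshot() {
            return List.copyOf(spans);
        }
    }

    // Models one measured query: reset the exporter, then record whatever
    // background work from the PREVIOUS query finishes late, then record this
    // query's own synchronous span, and return what the test would assert on.
    static List<String> runQuery(Exporter exporter, String foregroundSpan, List<String> lateSpans) {
        exporter.reset();
        for (String late : lateSpans) {
            exporter.export(late);
        }
        exporter.export(foregroundSpan);
        return exporter.snapshot();
    }

    public static void main(String[] args) {
        Exporter exporter = new Exporter();
        // Query 1 started a background METADATA_TABLE read that has not finished
        // by the time the query returns, so its window is missing that span.
        List<String> first = runQuery(exporter, "query1 DATA", List.of());
        // The late span is exported after query 2's reset, so it is counted there.
        List<String> second = runQuery(exporter, "query2 DATA", List.of("query1 METADATA_TABLE"));
        System.out.println("window 1: " + first);
        System.out.println("window 2: " + second);
    }
}
```

   In this model the first window is one span short and the second has one extra, which is the symmetric pattern described above.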
   
   Fix in #19004: stop asserting the racy quantities and keep only the 
synchronous foreground reads, i.e. filter out `METADATA_TABLE` ops in all three 
classes and all `Alluxio.*` ops in the Alluxio class. 
`hudi.table-statistics-enabled=false` stays, because the stats executor also 
reads `index.json` and the `hoodie.properties` files on a background thread, 
and those files are part of the surviving asserted set (dropping the flag 
reproduced an off-by-one on exactly those files). Verified locally: the build 
passes with zero checkstyle violations, and the three classes ran green 20 
times in a row.
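
   Roughly, the filtering idea looks like the sketch below. This is not the code from #19004: the `keepForegroundOps` helper, the "span name, space, file type" key format, and the sample counts are all assumptions made for illustration. It only shows the rule stated above: drop `METADATA_TABLE` ops everywhere, and drop `Alluxio.*` ops in the Alluxio class.

```java
import java.util.Map;
import java.util.TreeMap;

public final class SpanFilterSketch {
    // Keeps only the ops that are safe to assert on. The key format
    // ("<span name> <file type>") is a hypothetical encoding for this sketch.
    static Map<String, Integer> keepForegroundOps(Map<String, Integer> observed, boolean alluxioVariant) {
        Map<String, Integer> kept = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : observed.entrySet()) {
            String op = entry.getKey();
            if (op.endsWith(" METADATA_TABLE")) {
                continue; // emitted by background pools, so the count is racy
            }
            if (alluxioVariant && op.startsWith("Alluxio.")) {
                continue; // depends on whether an async cache write finished
            }
            kept.put(op, entry.getValue());
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up counts, only to exercise the filter.
        Map<String, Integer> observed = Map.of(
                "InputFile.newStream METADATA_TABLE", 3,
                "InputFile.newStream HOODIE_PROPERTIES", 1,
                "InputFile.newStream INDEX_DEFINITION", 1,
                "Alluxio.readCached DATA", 2);
        System.out.println(keepForegroundOps(observed, false));
        System.out.println(keepForegroundOps(observed, true));
    }
}
```

   With the flag set to `false` only the `METADATA_TABLE` entry is dropped; with `true` the `Alluxio.*` entry goes as well, leaving just the synchronous reads.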
   

