Re: [PR] [improvement](fe) Preload external table metadata before internal table lock in mixed queries [doris]

via GitHub Wed, 03 Jun 2026 01:54:08 -0700


morningman commented on PR #64035:
URL: https://github.com/apache/doris/pull/64035#issuecomment-4610649668


   ## 1. Avoid warming *latest* schema/partitions for tables referenced only 
via non-latest relations
   
   For a table referenced only by non-latest relations (time travel / branch / 
tag, i.e. `hasNonLatestRelation == true`):
   
   - The latest-snapshot preload is correctly skipped.
   - But `getBaseSchema()` (latest schema) and 
`initSelectedPartitions(getSnapshot(table))` still run. Here 
`getSnapshot(table)` returns `Optional.empty()`, and for Iceberg 
`IcebergUtils.getSnapshotCacheValue(empty, table)` falls through to 
`getLatestSnapshotCacheValue(...)` — so it warms the **latest** version's 
schema/partitions, while the query actually targets a specific historical 
snapshot/branch/tag.
   
   This is **not a correctness bug**: the `initSelectedPartitions` return value 
is discarded in preload (it only warms caches), and the real `LogicalFileScan` 
recomputes with the correct snapshot. But it is **wasted work on exactly the 
slow metadata path this PR is trying to optimize**, adding needless latency to 
time-travel-style mixed queries. Consider gating the partition preload (and 
possibly the whole preload) on the latest-only condition:
   
   ```java
   boolean preloadPartition = table.supportInternalPartitionPruned() && 
preloadLatestSnapshot;
   ```
   
   ## 2. Mark the regression suite `nonConcurrent`, and reconsider the 
`Suite.groovy` relaxation
   
   `test_jdbc_preload_profile.groovy` declares only the `external` group. Per 
`RegressionTest.getGroupExecType`, that maps to **NORMAL (concurrent)** 
execution, yet the test sets a **global** debug point via 
`enableDebugPointForAllFEs("PluginDrivenExternalTable.initSchema.sleep", ...)`. 
While that debug point is active, any concurrently running external suite that 
triggers a JDBC `initSchema` cache miss is forced to sleep 10s — a recipe for 
**cross-test flakiness / timeouts**.
   
   The cleaner fix is to mark the suite `nonConcurrent`:
   
   ```groovy
   suite("test_jdbc_preload_profile", "external,nonConcurrent") {
   ```
   
   `getGroupExecType` checks `nonConcurrent` first and returns `SINGLE`, so the 
suite runs isolated on a single-threaded executor **and** satisfies the 
*original* debug-point guard (`SINGLE` was already allowed). In other words, 
once `nonConcurrent` is added, this PR's change to `Suite.groovy` is **no 
longer necessary**.
   
   The `Suite.groovy` change (allowing debug points in the `external` group) 
**permanently weakens a guardrail**: future authors could then use a global 
debug point inside a *concurrently executed* external suite and silently 
introduce cross-test interference. Prefer the `nonConcurrent` approach and 
consider reverting the framework change. **Why it matters**: a debug point is 
process-global mutable state; suites that use one must run in isolation, 
otherwise you get hard-to-diagnose CI flakiness.
   
   ## 3. Clarify profile time attribution to avoid double counting
   
   `NEREIDS_LOCK_TABLE_TIME = nereidsLockTableFinishTime − parseSqlFinishTime`, 
and `nereidsLockTableFinishTime` is set *after* `lock()`. Since preload runs 
inside that window, the preload duration is counted **both** in the new 
`Nereids Preload External Metadata Time` counter **and** in the parent `Nereids 
Lock Table Time`. Both are rendered at the same indentation level (siblings), 
so a user summing the sub-stages will double-count preload. (Note: `collect()` 
time was already inside this window pre-PR; this change just adds preload to 
it.)
   
   Consider either documenting that "Lock Table Time" already includes Collect 
+ Preload + Lock (so the sub-stages shouldn't be naively summed), or narrowing 
the `Nereids Lock Table Time` computation to cover only the actual locking. Low 
priority — observability only.
   
   ## 4. (Minor) Use `TimeUtils.getElapsedTimeMs` instead of manual subtraction
   
   `NereidsPlanner` and `StatementContext` compute elapsed time as 
`TimeUtils.getStartTimeMs() - startTime` in several places. `TimeUtils` already 
provides `getElapsedTimeMs(long startTime)`, which reads more clearly. 
(`getStartTimeMs()` is a slightly misleading name for "read the current 
instant," though its implementation is just `System.currentTimeMillis()`, so 
the result is correct.) Purely a readability cleanup.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [improvement](fe) Preload external table metadata before internal table lock in mixed queries [doris]

Reply via email to