linliu-code opened a new issue, #18861: URL: https://github.com/apache/hudi/issues/18861
### Describe the problem

During compaction and clustering **plan scheduling**, the planner collects eligible file slices one partition at a time: each partition is processed as an independent task that issues its own file listing. On metadata-table (MDT) backed tables, each of those listings is a separate read against the MDT `files` partition.

For tables with a large number of partitions this becomes **O(N) independent metadata reads**, serialized across the available executor cores and dominated by per-read I/O latency despite negligible CPU per task. As a result, plan-generation latency grows roughly linearly with partition count and can dominate the scheduling phase for partition-heavy MoR/streaming tables.

### Why it's avoidable

The metadata table already supports fetching files for many partitions in a **single batched read**, and the file-system view already exposes a partition pre-load entry point; the plan generators simply don't use them. There's also a **latent** case where the filesystem-backed metadata lists partitions sequentially on the driver.

### Proposal

Pre-load all required partitions in one batched metadata read before building the plan, so plan generation issues a single read instead of N. Gate this on metadata-table availability so non-MDT tables keep today's fully-distributed listing path, and parallelize the sequential filesystem-backed listing. The produced plan is unchanged.

### Impact

Lower, partition-count-independent plan-scheduling latency on a hot path exercised by every MoR/streaming deployment.
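
To make the read-count argument concrete, here is a minimal toy model of the two listing strategies. It is only a sketch and does not use Hudi's actual classes: `ToyMetadataTable`, `list_files`, `list_files_batched`, `collect_per_partition`, and `collect_with_preload` are hypothetical names invented for illustration, and each "read" is just a counter increment standing in for one round trip to the MDT `files` partition.

```python
from typing import Dict, List


class ToyMetadataTable:
    """Hypothetical stand-in for the MDT `files` partition; it only counts reads."""

    def __init__(self, files_by_partition: Dict[str, List[str]]):
        self._files = files_by_partition
        self.reads = 0  # number of simulated metadata round trips

    def list_files(self, partition: str) -> List[str]:
        # One simulated metadata read per partition.
        self.reads += 1
        return list(self._files.get(partition, []))

    def list_files_batched(self, partitions: List[str]) -> Dict[str, List[str]]:
        # One simulated metadata read for the whole set of partitions.
        self.reads += 1
        return {p: list(self._files.get(p, [])) for p in partitions}


def collect_per_partition(mdt: ToyMetadataTable, partitions: List[str]) -> Dict[str, List[str]]:
    # Models the current behaviour: N partitions, N independent listings.
    return {p: mdt.list_files(p) for p in partitions}


def collect_with_preload(
    mdt: ToyMetadataTable, partitions: List[str], mdt_available: bool = True
) -> Dict[str, List[str]]:
    # Models the proposal: a single batched pre-load, gated on MDT availability.
    # Without an MDT, fall back to the per-partition listing path.
    if not mdt_available:
        return collect_per_partition(mdt, partitions)
    return mdt.list_files_batched(partitions)


# Ten made-up partitions with three made-up files each.
partitions = [f"dt=2024-01-{d:02d}" for d in range(1, 11)]
table = {p: [f"{p}/file-{i}.parquet" for i in range(3)] for p in partitions}

baseline = ToyMetadataTable(table)
plan_a = collect_per_partition(baseline, partitions)

batched = ToyMetadataTable(table)
plan_b = collect_with_preload(batched, partitions)

fallback = ToyMetadataTable(table)
plan_c = collect_with_preload(fallback, partitions, mdt_available=False)

print(baseline.reads, batched.reads, fallback.reads)
```

In this model the per-partition path performs one read per partition while the pre-load path performs one read in total, and all three calls return the same file listing, which mirrors the claim that the produced plan is unchanged. Real latency also depends on executor parallelism and MDT read cost, which the toy counter does not capture.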
