acvictor opened a new pull request, #11459: URL: https://github.com/apache/incubator-gluten/pull/11459
## What changes are proposed in this pull request?

This PR fixes an issue where the numFiles driver-side metric was not being populated when using Gluten/Velox for file scans in Spark 4.0.

- Added a sendDriverMetrics() call after the if/else block in dynamicallySelectedPartitions to match the spark35 shim behavior.
- Changed getPartitionArray() to use dynamicallySelectedPartitions.filePartitionIterator instead of listing files directly, ensuring the metrics initialization chain is properly triggered.

The numFiles metric (and other driver-side metrics such as filesSize and numPartitions) always returned 0 in Gluten's Spark 4.0 shim because:

- sendDriverMetrics() was never called. When there were no dynamic partition filters, dynamicallySelectedPartitions returned selectedPartitions directly without calling sendDriverMetrics() to post the metrics to Spark's metrics system.
- getPartitionArray() bypassed the metrics initialization chain. It called relation.location.listFiles() directly instead of using dynamicallySelectedPartitions, so the selectedPartitions lazy val (where setFilesNumAndSizeMetric is called) was never evaluated.

## How was this patch tested?

UT
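
For readers unfamiliar with the lazy initialization chain described in this PR, here is a minimal, self-contained sketch of the failure mode and the fix. It is not Gluten or Spark code: `ScanSketch`, `posted`, `partitionArrayBypassing`, and `partitionArrayViaChain` are hypothetical names, files are modeled as `(path, size)` pairs, and the dynamic-filter branch is a placeholder. Only the shape of the chain (a lazy val that stages metrics, and a post step after the if/else) follows the description.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a file scan node; not the real Gluten/Spark class.
final class ScanSketch(files: Seq[(String, Long)], hasDynamicFilters: Boolean) {
  // Metrics staged on the driver but not yet posted.
  private val driverMetrics = mutable.Map.empty[String, Long]

  // Stand-in for Spark's metrics system: what a UI or test would observe.
  val posted: mutable.Map[String, Long] = mutable.Map.empty[String, Long]

  // Evaluated at most once. Staging numFiles/filesSize is a side effect of
  // listing, so the metrics stay empty if this lazy val is never touched.
  lazy val selectedPartitions: Seq[(String, Long)] = {
    driverMetrics("numFiles") = files.size.toLong
    driverMetrics("filesSize") = files.map(_._2).sum
    files
  }

  private def sendDriverMetrics(): Unit = posted ++= driverMetrics

  lazy val dynamicallySelectedPartitions: Seq[(String, Long)] = {
    val result =
      if (hasDynamicFilters) selectedPartitions.filter(_._2 > 0L) // placeholder filter
      else selectedPartitions
    // Placed after the if/else so that both branches post the metrics.
    sendDriverMetrics()
    result
  }

  // Pre-fix shape: lists files directly and never touches the lazy vals.
  def partitionArrayBypassing(): Seq[(String, Long)] = files

  // Post-fix shape: goes through the chain, which stages and posts the metrics.
  def partitionArrayViaChain(): Seq[(String, Long)] = dynamicallySelectedPartitions
}
```

In this sketch, calling `partitionArrayBypassing()` returns the files but leaves `posted` empty, while `partitionArrayViaChain()` forces both lazy vals and populates `numFiles` and `filesSize`.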
