puneetdixit200 opened a new pull request, #18901: URL: https://github.com/apache/hudi/pull/18901
### Describe the issue this Pull Request addresses

Closes #18758.

### Summary and Changelog

Adds Lance support for metadata column statistics so Lance base files can contribute column ranges to the metadata table.

Changes:

- Implement `LanceUtils.readColumnStatsFromMetadata` by reading projected Lance columns and collecting `HoodieColumnRangeMetadata` with the existing metadata utility.
- Route `.lance` base files through the column-range metadata reader.
- Enable column stats and partition stats metadata partitions for Lance tables now that Lance column ranges can be produced.
- Add focused regression coverage for Lance column range metadata and Lance metadata partition enablement.

No code was copied.

### Impact

Lance base files can now populate metadata column stats. Partition stats can also be enabled for partitioned Lance tables because they aggregate per-file column stats. No public API, storage format, or config key changes are introduced.

### Risk Level

medium

This changes metadata-table behavior for Lance tables by enabling existing stats indexes for that file format. Verification covers the Lance stats reader path and the metadata partition enablement gate.

### Documentation Update

none

No new config is added; this enables existing column/partition stats behavior for Lance.
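
For reviewers less familiar with column range metadata, here is a minimal sketch of the idea behind the change: compute a range per column for each file, then combine the per-file ranges into a partition-level range. The class `ColumnRangeSketch`, its `Range` type, and the `collect` and `merge` methods are hypothetical names used for illustration only. They are not Hudi's `HoodieColumnRangeMetadata` or `LanceUtils` APIs, and the fields tracked here (min, max, null count, value count) are an assumption about what a column range carries.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical types, not the Hudi implementation. */
public class ColumnRangeSketch {

  /** Hypothetical range for one column of one file (or one partition after merging). */
  public static final class Range<T extends Comparable<T>> {
    public final T min;           // null when every value seen was null
    public final T max;           // null when every value seen was null
    public final long nullCount;  // number of null values seen
    public final long valueCount; // total values seen, nulls included

    public Range(T min, T max, long nullCount, long valueCount) {
      this.min = min;
      this.max = max;
      this.nullCount = nullCount;
      this.valueCount = valueCount;
    }
  }

  /** Scans the values of a single projected column and collects its range. */
  public static <T extends Comparable<T>> Range<T> collect(List<T> values) {
    T min = null;
    T max = null;
    long nulls = 0;
    for (T v : values) {
      if (v == null) {
        nulls++;
        continue;
      }
      if (min == null || v.compareTo(min) < 0) {
        min = v;
      }
      if (max == null || v.compareTo(max) > 0) {
        max = v;
      }
    }
    return new Range<>(min, max, nulls, values.size());
  }

  /** Combines two ranges for the same column, e.g. per-file ranges into a partition range. */
  public static <T extends Comparable<T>> Range<T> merge(Range<T> a, Range<T> b) {
    T min = a.min == null ? b.min
        : b.min == null ? a.min
        : a.min.compareTo(b.min) <= 0 ? a.min : b.min;
    T max = a.max == null ? b.max
        : b.max == null ? a.max
        : a.max.compareTo(b.max) >= 0 ? a.max : b.max;
    return new Range<>(min, max, a.nullCount + b.nullCount, a.valueCount + b.valueCount);
  }

  public static void main(String[] args) {
    // Arrays.asList is used because it permits null elements (List.of does not).
    Range<Integer> file1 = collect(Arrays.asList(5, null, 12, 7));
    Range<Integer> file2 = collect(Arrays.asList(3, 9));
    Range<Integer> partition = merge(file1, file2);
    System.out.println(partition.min + ".." + partition.max
        + " nulls=" + partition.nullCount + " values=" + partition.valueCount);
  }
}
```

The `merge` step is why partition stats depend on file-level stats in this sketch: a partition range can only be formed once every base file in the partition can produce its own range.
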
### Contributor's checklist

- [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Enough context is provided in the sections above
- [x] Adequate tests were added if applicable

Local verification:

- `mvn test -q -Punit-tests -pl hudi-spark-datasource/hudi-spark -am -DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TestHoodieSparkLanceReader#testReadColumnStatsFromMetadata -DwildcardSuites=abc -Dspark3.5 -Dlance.skip.tests=false -Dmaven.repo.local=/private/tmp/hudi-18758-m2`
- `mvn test -q -Punit-tests -pl hudi-common -am -DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TestMetadataPartitionType#testColumnAndPartitionStatsEnabledForLanceTables -DwildcardSuites=abc -Dmaven.repo.local=/private/tmp/hudi-18758-m2`
- `git diff --check`

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
