puneetdixit200 opened a new pull request, #18901:
URL: https://github.com/apache/hudi/pull/18901

   ### Describe the issue this Pull Request addresses
   
   Closes #18758.
   
   ### Summary and Changelog
   
   Adds Lance support for metadata column statistics so Lance base files can 
contribute column ranges to the metadata table.
   
   Changes:
   - Implement `LanceUtils.readColumnStatsFromMetadata` by reading projected 
Lance columns and collecting `HoodieColumnRangeMetadata` with the existing 
metadata utility.
   - Route `.lance` base files through the column-range metadata reader.
   - Enable column stats and partition stats metadata partitions for Lance 
tables now that Lance column ranges can be produced.
   - Add focused regression coverage for Lance column range metadata and Lance 
metadata partition enablement.
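
As a rough illustration of what collecting a column range from projected values involves, the sketch below computes min, max, null count, and value count for one column of one file. It is not the code in this PR: `ColumnRangeSketch`, `ColumnRange`, and `collectRange` are hypothetical names, the values are passed in as a plain list rather than read from a Lance file, and the record is only a stand-in for `HoodieColumnRangeMetadata`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and shapes are assumptions, not Hudi's API.
public class ColumnRangeSketch {

    // Stand-in for a per-file column range (not HoodieColumnRangeMetadata).
    record ColumnRange<T extends Comparable<T>>(
            String filePath, String columnName,
            T min, T max, long nullCount, long valueCount) {}

    // Single pass over the projected column values of one file.
    static <T extends Comparable<T>> ColumnRange<T> collectRange(
            String filePath, String columnName, List<T> values) {
        T min = null;
        T max = null;
        long nulls = 0;
        for (T v : values) {
            if (v == null) {
                nulls++;            // nulls are counted but never become min/max
                continue;
            }
            if (min == null || v.compareTo(min) < 0) {
                min = v;
            }
            if (max == null || v.compareTo(max) > 0) {
                max = v;
            }
        }
        // valueCount here includes nulls; that choice is an assumption of this sketch.
        return new ColumnRange<>(filePath, columnName, min, max, nulls, values.size());
    }

    public static void main(String[] args) {
        ColumnRange<Integer> range =
                collectRange("f1.lance", "price", Arrays.asList(7, null, 3, 12));
        System.out.println(range);
    }
}
```

Projecting only the columns that need stats keeps the read narrow; a column with no non-null values ends up with a null min and max in this sketch.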
   
   No code was copied.
   
   ### Impact
   
   Lance base files can now populate metadata column stats. Partition stats can 
also be enabled for partitioned Lance tables because they aggregate per-file 
column stats. No public API, storage format, or config key changes are 
introduced.
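
To make the "aggregate per-file column stats" point concrete, here is a minimal sketch of rolling file-level ranges up to a partition-level range. It assumes aggregation means taking the smallest min, the largest max, and summing the counts; `PartitionStatsSketch`, `Range`, `merge`, and `aggregate` are hypothetical names and do not reflect Hudi's actual classes.

```java
import java.util.List;

// Hypothetical sketch; the merge rule below is an assumption, not Hudi's implementation.
public class PartitionStatsSketch {

    // Per-file range for one column; min/max are null when the file has only nulls.
    record Range(String columnName, Long min, Long max, long nullCount, long valueCount) {}

    static Long lower(Long x, Long y) {
        if (x == null) return y;
        if (y == null) return x;
        return Math.min(x, y);
    }

    static Long higher(Long x, Long y) {
        if (x == null) return y;
        if (y == null) return x;
        return Math.max(x, y);
    }

    // Combine two ranges of the same column: widen the bounds, add the counts.
    static Range merge(Range a, Range b) {
        return new Range(
                a.columnName(),
                lower(a.min(), b.min()),
                higher(a.max(), b.max()),
                a.nullCount() + b.nullCount(),
                a.valueCount() + b.valueCount());
    }

    // Fold all file-level ranges of one partition into a single range.
    static Range aggregate(List<Range> fileRanges) {
        return fileRanges.stream()
                .reduce(PartitionStatsSketch::merge)
                .orElseThrow();
    }

    public static void main(String[] args) {
        Range partition = aggregate(List.of(
                new Range("price", 3L, 12L, 1, 4),
                new Range("price", 5L, 40L, 0, 10)));
        System.out.println(partition);
    }
}
```

Under this assumption, partition stats need nothing beyond the per-file ranges, which is why they can only be enabled once a file format can produce those ranges.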
   
   ### Risk Level
   
   medium
   
   This changes metadata-table behavior for Lance tables by enabling existing 
stats indexes for that file format. Verification covers the Lance stats reader 
path and the metadata partition enablement gate.
   
   ### Documentation Update
   
   none
   
   No new config is added; this enables existing column/partition stats 
behavior for Lance.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   - [x] Adequate tests were added if applicable
   
   Local verification:
   
   - `mvn test -q -Punit-tests -pl hudi-spark-datasource/hudi-spark -am -DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TestHoodieSparkLanceReader#testReadColumnStatsFromMetadata -DwildcardSuites=abc -Dspark3.5 -Dlance.skip.tests=false -Dmaven.repo.local=/private/tmp/hudi-18758-m2`
   - `mvn test -q -Punit-tests -pl hudi-common -am -DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TestMetadataPartitionType#testColumnAndPartitionStatsEnabledForLanceTables -DwildcardSuites=abc -Dmaven.repo.local=/private/tmp/hudi-18758-m2`
   - `git diff --check`
   

