[I] Make sure HoodieTableMetadata API allows to read both in-memory and on-cluster [hudi]

via GitHub Sat, 29 Nov 2025 22:19:28 -0800


hudi-bot opened a new issue, #15690:
URL: https://github.com/apache/hudi/issues/15690


   Currently, most HoodieTableMetadata APIs (getColumnStats, getBloomFilters, 
etc.) only load data from the metadata table (MT) into memory. This shifts the 
burden onto the caller, who must split requests into chunks that fit in memory. 
For example, when reading Bloom filters from the MT, the Bloom Index (the 
caller) has to make sure it reads no more than 256 filters at a time to limit 
its memory footprint.
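
To illustrate the burden described above, here is a minimal sketch of the kind of batching a caller has to do today. The class and method names (`BloomFilterBatching`, `partition`) are hypothetical and are not part of Hudi; only the cap of 256 comes from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not a Hudi class: caller-side batching of lookup keys. */
final class BloomFilterBatching {
    // Per-request cap mentioned in the issue text.
    static final int MAX_FILTERS_PER_READ = 256;

    /** Splits keys into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, keys.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(keys.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch would then be passed to a separate in-memory lookup, which is the chunking logic the issue proposes to take off the caller.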
   
    
   
   Instead, the HoodieTableMetadata API should:
    1. Be rebased to rely on HoodieData
    2. Give the caller a lever to choose whether the MT is read in-memory or 
on-cluster
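
One possible shape for such a lever is sketched below. This is an illustration only, not the design adopted in Hudi: `ReadMode`, `DataHandle`, `ListDataHandle`, and `MetadataReaderSketch` are hypothetical names, and `DataHandle` is a simplified stand-in for an engine-agnostic collection such as HoodieData. In this sketch both modes are backed by a local list; a real on-cluster variant would wrap a distributed dataset instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical lever: where the metadata table (MT) lookup should run. */
enum ReadMode { IN_MEMORY, ON_CLUSTER }

/** Hypothetical stand-in for an engine-agnostic collection such as HoodieData. */
interface DataHandle<T> {
    <R> DataHandle<R> map(Function<T, R> fn);
    List<T> collectAsList();
    ReadMode mode();
}

/** List-backed handle used here for both modes to keep the sketch runnable. */
final class ListDataHandle<T> implements DataHandle<T> {
    private final List<T> items;
    private final ReadMode mode;

    ListDataHandle(List<T> items, ReadMode mode) {
        this.items = new ArrayList<>(items);
        this.mode = mode;
    }

    @Override
    public <R> DataHandle<R> map(Function<T, R> fn) {
        List<R> mapped = new ArrayList<>();
        for (T item : items) {
            mapped.add(fn.apply(item));
        }
        return new ListDataHandle<R>(mapped, mode);
    }

    @Override
    public List<T> collectAsList() {
        return new ArrayList<>(items);
    }

    @Override
    public ReadMode mode() {
        return mode;
    }
}

/** Hypothetical reader: the caller passes all keys at once plus the lever. */
final class MetadataReaderSketch {
    static DataHandle<String> readBloomFilters(List<String> fileNames, ReadMode mode) {
        // Placeholder lookup: a real reader would fetch filters from the MT.
        return new ListDataHandle<String>(fileNames, mode)
            .map(name -> "bloom-filter:" + name);
    }
}
```

With an API of this shape, the caller would no longer need to split its keys into batches; it would state the execution mode and receive a handle it can transform further or collect.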
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-5556
   - Type: Improvement
   - Epic: https://issues.apache.org/jira/browse/HUDI-1292
   - Fix version(s):
     - 1.1.0


