[I] Fetch schema from log file while computing col stats [hudi]

via GitHub Sun, 30 Nov 2025 05:37:33 -0800


hudi-bot opened a new issue, #17301:
URL: https://github.com/apache/hudi/issues/17301


   We currently use the table schema resolver to fetch the writer schema for 
reading log files while computing column stats. Ref: https://github.com/apache/hudi/pull/12105 
   
    
   
   Let's follow up to see if we can fetch the schema directly from the log 
file rather than going through the table schema resolver. 
   
    
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-8445
   - Type: Sub-task
   - Parent: https://issues.apache.org/jira/browse/HUDI-8727
   - Fix version(s):
     - 1.1.0
   
   
   ---
   
   
   ## Comments
   
   28/Oct/24 17:16, yihua: When there is schema evolution, the log blocks and 
files can be written by different commits using different writer schemas. 
Strictly speaking, the column stats of a log file should contain only the 
columns that exist in that file. Using the table schema here may introduce 
column stats entries for columns that do not exist in the log file, which 
makes the stats hard to interpret.
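
To illustrate the concern above, here is a minimal, self-contained sketch. It does not use any Hudi API: `column_ranges`, `log_records`, `writer_schema`, and `table_schema` are hypothetical names, and records are modeled as plain dictionaries. It only shows how computing stats against a wider table schema can produce an entry for a column the log file never contained.

```python
from typing import Any, Dict, List, Optional, Tuple

Record = Dict[str, Any]
# (min, max, null_count) for one column; a simplified stand-in for column stats.
ColumnRange = Tuple[Optional[Any], Optional[Any], int]


def column_ranges(records: List[Record], schema_fields: List[str]) -> Dict[str, ColumnRange]:
    """Compute (min, max, null_count) for every field named in schema_fields."""
    stats: Dict[str, ColumnRange] = {}
    for field in schema_fields:
        values = [record.get(field) for record in records]
        non_null = [value for value in values if value is not None]
        stats[field] = (
            min(non_null) if non_null else None,
            max(non_null) if non_null else None,
            len(values) - len(non_null),
        )
    return stats


# Hypothetical log file written before column "c" was added to the table.
log_records = [{"a": 1, "b": "x"}, {"a": 5, "b": "m"}]
writer_schema = ["a", "b"]        # assumed schema stored with the log file
table_schema = ["a", "b", "c"]    # assumed latest (evolved) table schema

stats_from_writer_schema = column_ranges(log_records, writer_schema)
stats_from_table_schema = column_ranges(log_records, table_schema)
```

With the writer schema, only `a` and `b` get entries. With the table schema, `c` also gets an all-null entry even though the file never stored that column, which is the extra entry the comment describes.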


