hudi-bot opened a new issue, #17301: URL: https://github.com/apache/hudi/issues/17301
We are using the table schema resolver to fetch the writer schema when reading log files while computing column stats. Ref: https://github.com/apache/hudi/pull/12105

Let's follow up to see if we can fetch the schema directly from the log file rather than using the table schema resolver.

## JIRA info

- Link: https://issues.apache.org/jira/browse/HUDI-8445
- Type: Sub-task
- Parent: https://issues.apache.org/jira/browse/HUDI-8727
- Fix version(s):
  - 1.1.0

---

## Comments

28/Oct/24 17:16;yihua;The log blocks and files can be written by different commits using different writer schemas when there is schema evolution. Strictly speaking, the column stats of a log file should only contain the columns that exist in that file. Using the table schema in this case may introduce additional column stats entries for columns that do not exist in the log file, making the stats hard to understand.
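To make the concern in the comment concrete, here is a minimal, self-contained sketch. It does not use any Hudi API: schemas are modeled as plain column-name lists, records as maps, and `ColStatsSketch`, `ColumnStats`, and `computeStats` are hypothetical names chosen for illustration. It assumes a log file written before a `discount` column was added to the table, and shows that iterating over the table schema yields a stats entry for `discount` (all nulls, no min/max), while iterating over the file's own writer schema does not.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these types or methods are Hudi APIs.
final class ColStatsSketch {

    // Hypothetical per-column stats holder: min, max, and null count.
    static final class ColumnStats {
        final Integer min;
        final Integer max;
        final long nullCount;

        ColumnStats(Integer min, Integer max, long nullCount) {
            this.min = min;
            this.max = max;
            this.nullCount = nullCount;
        }
    }

    // Computes stats for every column named in schemaColumns.
    // A column missing from a record is counted as null for that record.
    static Map<String, ColumnStats> computeStats(List<String> schemaColumns,
                                                 List<Map<String, Integer>> records) {
        Map<String, ColumnStats> stats = new LinkedHashMap<>();
        for (String column : schemaColumns) {
            Integer min = null;
            Integer max = null;
            long nulls = 0;
            for (Map<String, Integer> record : records) {
                Integer value = record.get(column); // null when the column is absent
                if (value == null) {
                    nulls++;
                    continue;
                }
                if (min == null || value < min) {
                    min = value;
                }
                if (max == null || value > max) {
                    max = value;
                }
            }
            stats.put(column, new ColumnStats(min, max, nulls));
        }
        return stats;
    }

    public static void main(String[] args) {
        // Assumed scenario: the log file predates the "discount" column.
        List<String> writerSchema = List.of("id", "price");
        List<String> tableSchema = List.of("id", "price", "discount");
        List<Map<String, Integer>> records = List.of(
                Map.of("id", 1, "price", 10),
                Map.of("id", 2, "price", 30));

        System.out.println("table schema columns : "
                + computeStats(tableSchema, records).keySet());
        System.out.println("writer schema columns: "
                + computeStats(writerSchema, records).keySet());
    }
}
```

In this toy model the extra `discount` entry carries no min or max and a null count equal to the record count, which is the kind of entry the comment describes as hard to understand. Whether the real column stats path behaves exactly this way depends on the Hudi code referenced in the PR above.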
