hudi-bot opened a new issue, #15169:
URL: https://github.com/apache/hudi/issues/15169

   Currently, even when meta fields are not populated, we still insert 
empty-string columns to adhere to the expected schema.
   
   This has a non-trivial overhead of ~20% (relative to just writing the dataset as 
is), since Spark has to essentially "re-write" each original row with the new 
fields prepended.
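
A minimal sketch of where the cost comes from, in plain Python rather than Spark or Hudi code. The helper names (`prepend_empty_meta_fields`, `pass_through`) are hypothetical, and the five column names are assumed to match Hudi's standard meta columns. The point is only that prepending fields forces a new row to be built for every record, while the pass-through path could reuse the input as is.

```python
# Illustrative stand-in for the row rewrite; not the actual Hudi/Spark code path.
# Assumed meta column names (believed to be Hudi's standard meta columns).
HOODIE_META_FIELDS = [
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
]


def prepend_empty_meta_fields(schema, row):
    """Build a new schema and a new row with empty-string meta columns in front.

    Every record is copied into a wider row, which is the per-row work
    the issue attributes the overhead to.
    """
    new_schema = HOODIE_META_FIELDS + list(schema)
    new_row = ("",) * len(HOODIE_META_FIELDS) + tuple(row)
    return new_schema, new_row


def pass_through(schema, row):
    """Hypothetical alternative: keep the schema and row unchanged."""
    return list(schema), tuple(row)


schema = ["id", "name"]
row = (1, "a")

wide_schema, wide_row = prepend_empty_meta_fields(schema, row)
same_schema, same_row = pass_through(schema, row)
```

Whether the write path can actually skip the prepended columns depends on every reader and writer tolerating a schema without them, which is what the investigation would need to establish.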
   
   We should investigate whether it's feasible to avoid adding the empty-string 
columns completely when meta fields are disabled.
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-4036
   - Type: Task
   - Epic: https://issues.apache.org/jira/browse/HUDI-3249


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

