sbernauer commented on pull request #2012: URL: https://github.com/apache/hudi/pull/2012#issuecomment-834085407
Hi all,

we sadly haven't been able to do schema evolution for 10 months now (https://github.com/apache/hudi/issues/1845) and have had to rely on ugly workarounds, so many thanks for working together to find a solution!

We tested this patch in our test systems and everything worked fine. When we rolled it out to production, however, we noticed that memory consumption increased several-fold. This caused our executors to spill to disk and crash. I would therefore like to highlight the comment by @sathyaprakashg:

> @n3nash I am working on fixing the build issue and will have that fix pushed soon. I would like to point out that with this new approach we are storing the writer schema as part of the payload, which means the size of the dataframe increases to store the same schema information with each record. Any suggestion on optimizing this?

Regards,
Sebastian
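To illustrate the overhead being discussed: when every record payload carries its own copy of the writer schema string, memory grows with record count rather than with the number of distinct schemas. A common mitigation is to store only a compact schema fingerprint per record and resolve the full schema through a shared cache. The sketch below is purely illustrative (the class name `SchemaInterner` and the FNV-1a fingerprint are my own assumptions, not part of this patch; Avro's `SchemaNormalization` fingerprints would be the natural choice in Hudi):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: records keep an 8-byte fingerprint instead of a full
// schema string; each distinct schema is held in memory exactly once.
public class SchemaInterner {
    private static final Map<Long, String> CACHE = new ConcurrentHashMap<>();

    // Register a schema once; returns the compact fingerprint to store per record.
    public static long intern(String schemaJson) {
        long fp = fingerprint(schemaJson);
        CACHE.putIfAbsent(fp, schemaJson);
        return fp;
    }

    // Look the full schema back up when a record needs to be deserialized.
    public static String resolve(long fingerprint) {
        return CACHE.get(fingerprint);
    }

    // Toy FNV-1a hash standing in for a real schema fingerprint.
    static long fingerprint(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x100000001b3L;
        }
        return h;
    }

    public static void main(String[] args) {
        String schema = "{\"type\":\"record\",\"name\":\"r\",\"fields\":[]}";
        long fp1 = intern(schema);
        long fp2 = intern(new String(schema)); // distinct String object, same schema
        System.out.println(fp1 == fp2);
        System.out.println(schema.equals(resolve(fp1)));
        System.out.println(CACHE.size());
    }
}
```

With this layout, N records with one shared schema cost N longs plus one schema string, instead of N schema strings.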