sbernauer commented on pull request #2012: URL: https://github.com/apache/hudi/pull/2012#issuecomment-834085407
Hi all,

we sadly haven't been able to do schema evolution for 10 months now (https://github.com/apache/hudi/issues/1845) and have had to rely on ugly workarounds, so many thanks for working together to find a solution!

We tested this patch in our test systems and everything worked fine. When we rolled it out to production, however, we noticed that memory consumption increased several-fold. This caused our executors to spill to disk and crash. I would therefore like to highlight the comment by @sathyaprakashg:

> @n3nash I am working on fixing the build issue and will have that fix pushed soon. I would like to point out that with this new approach we are storing the writer schema as part of the payload, which means the size of the dataframe increases to store the same schema information with each record. Any suggestion on optimizing this?

Regards,
Sebastian
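To illustrate the overhead being discussed: when every record payload carries its own copy of the writer schema string, memory grows with record count rather than with the number of distinct schemas. A common mitigation is to store only a compact schema fingerprint per record and resolve the full schema through a shared cache. The sketch below is purely illustrative (the class name `SchemaInterner` and the FNV-1a fingerprint are my own assumptions, not part of this patch; Avro's `SchemaNormalization` fingerprints would be the natural choice in Hudi):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: records keep an 8-byte fingerprint instead of a full
// schema string; each distinct schema is held in memory exactly once.
public class SchemaInterner {
    private static final Map<Long, String> CACHE = new ConcurrentHashMap<>();

    // Register a schema once; returns the compact fingerprint to store per record.
    public static long intern(String schemaJson) {
        long fp = fingerprint(schemaJson);
        CACHE.putIfAbsent(fp, schemaJson);
        return fp;
    }

    // Look the full schema back up when a record needs to be deserialized.
    public static String resolve(long fingerprint) {
        return CACHE.get(fingerprint);
    }

    // Toy FNV-1a hash standing in for a real schema fingerprint.
    static long fingerprint(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x100000001b3L;
        }
        return h;
    }

    public static void main(String[] args) {
        String schema = "{\"type\":\"record\",\"name\":\"r\",\"fields\":[]}";
        long fp1 = intern(schema);
        long fp2 = intern(new String(schema)); // distinct String object, same schema
        System.out.println(fp1 == fp2);
        System.out.println(schema.equals(resolve(fp1)));
        System.out.println(CACHE.size());
    }
}
```

With this layout, N records with one shared schema cost N longs plus one schema string, instead of N schema strings.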