nsivabalan commented on pull request #4660:
URL: https://github.com/apache/hudi/pull/4660#issuecomment-1023768290


   Here are the points to consider before we can make 
DefaultHoodieRecordPayload the default payload.
   As of now, it's easier on the write path, because users can set the payload 
properties in the write config and the whole write path can rely on them. But we have 
quite a few read paths, such as RealtimeCompactedRecordReader, 
RealtimeUnmergedRecordReader, and HoodieMergeOnReadRDD, which call into the 
getInsertValue() or combineAndGetUpdateValue() method on the record payload. 
   So we need to serialize these properties in the table config and reuse 
them in these flows. The preCombine field is already part of the table 
config, so we can set hoodie.payload.ordering.field based on that. 
But "hoodie.payload.event.time.field" is not serialized to the table 
config as of today. 
   
   So, if we really want to leverage event time, we might have to add the field 
to the table config. If not, the event time may not get updated in some of the 
read flows. 
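   
   To make the idea concrete, here is a rough sketch of how a read path could 
rebuild the payload properties from the table config. It uses plain 
java.util.Properties rather than any Hudi class. The two payload keys are the 
ones named above; "hoodie.table.precombine.field" is my assumption for the 
table config key holding the preCombine field, and 
"hoodie.table.event.time.field" is purely hypothetical, since no such key is 
serialized today. 

```java
import java.util.Properties;

public class PayloadPropsSketch {
    // Payload property keys named in this comment.
    static final String ORDERING_FIELD_KEY = "hoodie.payload.ordering.field";
    static final String EVENT_TIME_FIELD_KEY = "hoodie.payload.event.time.field";

    // Assumption: table config key under which the preCombine field is stored.
    static final String TABLE_PRECOMBINE_KEY = "hoodie.table.precombine.field";
    // Hypothetical: this key does not exist in the table config today.
    static final String TABLE_EVENT_TIME_KEY = "hoodie.table.event.time.field";

    // Derive the payload properties a reader would need from the table config.
    static Properties payloadProps(Properties tableConfig) {
        Properties props = new Properties();
        String preCombine = tableConfig.getProperty(TABLE_PRECOMBINE_KEY);
        if (preCombine != null) {
            props.setProperty(ORDERING_FIELD_KEY, preCombine);
        }
        String eventTime = tableConfig.getProperty(TABLE_EVENT_TIME_KEY);
        if (eventTime != null) {
            // Only reachable once the event time field is added to the table config.
            props.setProperty(EVENT_TIME_FIELD_KEY, eventTime);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties tableConfig = new Properties();
        tableConfig.setProperty(TABLE_PRECOMBINE_KEY, "ts");
        // The event time key is absent, mirroring the current state of the table config.
        System.out.println(payloadProps(tableConfig));
    }
}
```

   With only the preCombine field present, the ordering field gets set and the 
event time field is left out, which is the gap described above. 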
   
   This patch is closely connected to 
https://github.com/apache/hudi/pull/4681. In this patch, we can see all the 
places where we might need to initialize the payload properties that have to be 
passed into DefaultHoodieRecordPayload method calls. 
   
   Wanted to hear your thoughts on this. 
   @xushiyan @codope @yihua 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


