nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-811855554


   Nishith and I discussed this. Here is our proposal. 
   Let's rely on Deltastreamer.Config.checkpoint to pass in any type of 
checkpoint. 
   We can add another config called "checkpoint.type", which would default to 
"string" for all existing checkpoints. For the checkpoint this PR is concerned 
with, we would set this new config to "timestamp". 
   
   With this, it is up to each source to parse and interpret the checkpoint 
value, and DeltaSync does not need to deal with different checkpoint formats. 
   
   Given this, DeltaSync readFromSource() should not need any changes in 
this diff. 
   KafkaOffsetGen should have the logic to parse the different checkpoint 
values, based on two configs (deltastreamer.config.checkpoint and 
checkpoint.type). 
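
   To make the idea concrete, here is a rough sketch of how a source could 
branch on the two values. Everything in it is hypothetical and not existing 
Hudi code: the class name, the enum, and the helper are made up for 
illustration, and it assumes a "timestamp" checkpoint is an epoch value in 
milliseconds, which this proposal does not pin down. 

```java
import java.util.Locale;

// Hypothetical sketch, not part of Hudi: shows a source interpreting a raw
// checkpoint string according to a separate "checkpoint.type" value.
final class CheckpointTypeSketch {

  // Hypothetical enum mirroring the proposed config values "string" and "timestamp".
  enum CheckpointType { STRING, TIMESTAMP }

  // Hypothetical holder for the interpreted checkpoint.
  static final class ParsedCheckpoint {
    final CheckpointType type;
    final String raw;        // checkpoint exactly as passed in
    final long timestampMs;  // parsed value for TIMESTAMP, otherwise -1

    ParsedCheckpoint(CheckpointType type, String raw, long timestampMs) {
      this.type = type;
      this.raw = raw;
      this.timestampMs = timestampMs;
    }
  }

  // A missing or blank type falls back to STRING, matching the proposed default.
  // Unknown types and non-numeric timestamps throw IllegalArgumentException.
  static ParsedCheckpoint parse(String checkpoint, String checkpointType) {
    CheckpointType type = (checkpointType == null || checkpointType.trim().isEmpty())
        ? CheckpointType.STRING
        : CheckpointType.valueOf(checkpointType.trim().toUpperCase(Locale.ROOT));
    switch (type) {
      case TIMESTAMP:
        // Assumption: the timestamp is epoch milliseconds.
        return new ParsedCheckpoint(type, checkpoint, Long.parseLong(checkpoint.trim()));
      default:
        // The source keeps parsing the string in its own format, as it does today.
        return new ParsedCheckpoint(type, checkpoint, -1L);
    }
  }
}
```

   In this sketch the caller only forwards the two config values. All 
interpretation stays in the source-side helper, so DeltaSync never has to 
know which checkpoint format is in use. 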
   
   This also keeps source-specific checkpointing logic within the 
source-specific class instead of leaking it into DeltaSync, which should be 
agnostic to the different Sources. 
   
   @liujinhui1994: Let me know what you think. Happy to chat more on this. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

