nsivabalan commented on pull request #2438: URL: https://github.com/apache/hudi/pull/2438#issuecomment-811855554
Nishith and I discussed this, and here is our proposal. Let's rely on Deltastreamer.Config.checkpoint to pass in any type of checkpoint. We can add another config called "checkpoint.type", which defaults to "string" for all existing checkpoints. For the checkpoint this PR is interested in, we would set the new config to "timestamp". With this, it's up to each source to parse and interpret the checkpoint value, and DeltaSync does not need to deal with different checkpoint formats. That means DeltaSync readFromSource() should not have any changes in this diff. KafkaOffsetGen should have the logic to parse the different checkpoint values, based on two configs (deltastreamer.config.checkpoint and checkpoint.type). This also keeps source-specific checkpointing logic inside the source-specific class instead of leaking it into DeltaSync, which should be agnostic to the different sources.

@liujinhui1994: Let me know what you think. Happy to chat more on this.
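
To make the proposal concrete, here is a rough sketch of how a source could branch on the two values. Everything in it is hypothetical: the class and method names (CheckpointSketch, resolveType, parseOffsets, describe) are made up for illustration and are not existing Hudi code, the "topic,partition:offset,..." layout for string checkpoints is an assumption, and the timestamp is assumed to be epoch milliseconds. A real implementation inside KafkaOffsetGen would resolve the timestamp to actual Kafka offsets rather than return a description.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these names exist in Hudi.
final class CheckpointSketch {

    // Mirrors the proposed "checkpoint.type" config values.
    enum CheckpointType { STRING, TIMESTAMP }

    // Unset or blank config falls back to STRING, matching the proposed default.
    static CheckpointType resolveType(String rawType) {
        if (rawType == null || rawType.trim().isEmpty()) {
            return CheckpointType.STRING;
        }
        return CheckpointType.valueOf(rawType.trim().toUpperCase(Locale.ROOT));
    }

    // Assumed string layout: "topic,partition:offset,partition:offset,...".
    // The first element (the topic name) is skipped.
    static Map<Integer, Long> parseOffsets(String checkpoint) {
        String[] parts = checkpoint.split(",");
        Map<Integer, Long> offsets = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            String[] partitionAndOffset = parts[i].split(":");
            offsets.put(Integer.parseInt(partitionAndOffset[0].trim()),
                        Long.parseLong(partitionAndOffset[1].trim()));
        }
        return offsets;
    }

    // The source, not DeltaSync, decides how to interpret the checkpoint value.
    static String describe(String checkpoint, String rawType) {
        switch (resolveType(rawType)) {
            case TIMESTAMP:
                // Assumed to be epoch milliseconds; a real source would look up
                // the matching offsets per partition at this point.
                return "resume from timestamp " + Long.parseLong(checkpoint.trim());
            default:
                return "resume from offsets " + parseOffsets(checkpoint);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("events,0:100,1:250", null));
        System.out.println(describe("1617235200000", "timestamp"));
    }
}
```

The point of the sketch is only that the branching lives in the source-side class: DeltaSync would hand over the raw checkpoint string untouched and never look at checkpoint.type itself.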