vinothchandar edited a comment on issue #1377: [HUDI-663] Fix HoodieDeltaStreamer offset not handled correctly
URL: https://github.com/apache/incubator-hudi/pull/1377#issuecomment-596357138

>>then start the delta streamer, hudi will store the empty checkpoint.

Re-reading this again. Is this the right behavior? I think there are a few cases now handled in delta-streamer that have made life a bit complicated. The reason for writing such an empty checkpoint could be that we want to write checkpoints even for empty commits, since the source could have read data but the transformer could have filtered all of it out. I think the right fix could be to checkpoint the actual fromOffsets instead of an empty checkpoint.

>>the second commit will use the last checkpoint {}, which means the fromoffset is 0. but the previous messages may be removed because of kafka retention mechanism.

And this is because we enter `checkupValidOffsets`, right? I'd appreciate it if we took into consideration how checkpoints are handled in a general, source-agnostic way and also fixed this issue.
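To make the suggestion concrete, here is a rough sketch of what "checkpoint the actual fromOffsets" could look like. It is only an illustration and not the delta-streamer code: the class `CheckpointSketch`, both method names, and the `topic,partition:offset` string layout are assumptions made up for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hudi: picks the offsets to persist after a batch.
class CheckpointSketch {

    // Render offsets as "topic,partition:offset,..." (layout assumed for this sketch).
    static String formatOffsets(String topic, Map<Integer, Long> offsets) {
        String parts = new TreeMap<Integer, Long>(offsets).entrySet().stream()
            .map(e -> e.getKey() + ":" + e.getValue())
            .collect(Collectors.joining(","));
        return parts.isEmpty() ? topic : topic + "," + parts;
    }

    // If the batch produced no end offsets, fall back to the offsets we started
    // from, so the next run resumes there instead of from an empty checkpoint.
    static String checkpointAfterBatch(String topic,
                                       Map<Integer, Long> fromOffsets,
                                       Map<Integer, Long> toOffsets) {
        boolean nothingRead = toOffsets == null || toOffsets.isEmpty();
        return formatOffsets(topic, nothingRead ? fromOffsets : toOffsets);
    }

    public static void main(String[] args) {
        Map<Integer, Long> from = new HashMap<>();
        from.put(0, 120L);
        from.put(1, 95L);

        // Empty batch: the starting offsets are checkpointed.
        System.out.println(checkpointAfterBatch("events", from, Collections.emptyMap()));

        // Non-empty batch: the end offsets are checkpointed, even if a
        // transformer later filters out every record that was read.
        Map<Integer, Long> to = new HashMap<>(from);
        to.put(0, 150L);
        System.out.println(checkpointAfterBatch("events", from, to));
    }
}
```

With something along these lines, an empty commit would still carry a usable position, so the next run would not have to start from offset 0 and trip over messages already removed by retention.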
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services