vinothchandar edited a comment on issue #1377: [HUDI-663] Fix 
HoodieDeltaStreamer offset not handled correctly
URL: https://github.com/apache/incubator-hudi/pull/1377#issuecomment-596357138
 
 
   >>then start the delta streamer, hudi will store the empty checkpoint.
   Re-reading this: is this the right behavior? I think there are a few 
cases now handled in the delta streamer that have made life a bit complicated. 
   
   The reason for writing such an empty checkpoint could be that we want to 
write checkpoints even for empty commits, since the source could have read 
data that the transformer then filtered out entirely. 
   I think the right fix could be to checkpoint the actual fromOffsets instead 
of an empty checkpoint. 
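   
   To make that suggestion concrete, here is a rough sketch of the idea, 
written in Python only for brevity. The function name and the 
partition-to-offset dict are hypothetical and are not Hudi's actual API; the 
sketch only assumes that a batch has a set of starting offsets and, if 
anything was read, a set of ending offsets. 

```python
from typing import Dict

# Hypothetical representation: Kafka partition id -> offset.
Offsets = Dict[int, int]


def checkpoint_to_commit(from_offsets: Offsets, to_offsets: Offsets) -> Offsets:
    """Pick the checkpoint to store with a commit so that it is never empty.

    If the batch read nothing (to_offsets is empty), persist the offsets we
    started from instead of an empty checkpoint. Otherwise persist the offsets
    we read up to, even if the transformer filtered out every record.
    """
    if not to_offsets:
        # Empty batch: remember where we were instead of writing {}.
        return dict(from_offsets)
    return dict(to_offsets)
```

   With this shape, an empty commit still records a usable position, so the 
next run does not have to fall back to offset 0. 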
   
   >>the second commit will use the last checkpoint {}, which means the 
fromoffset is 0.
   but the previous messages may be removed because of kafka retention 
mechanism.
   
   And this is because we enter `checkupValidOffsets`, right? 
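   
   For illustration, the failure mode in the quote can be sketched as 
follows. This is not a description of what `checkupValidOffsets` does; the 
helper name and the dict shapes are assumptions made up for the example. It 
shows why an empty checkpoint, read as offset 0, can point below the range 
Kafka still retains, and one possible way to clamp it. 

```python
from typing import Dict

# Hypothetical representation: Kafka partition id -> offset.
Offsets = Dict[int, int]


def resolve_from_offsets(checkpoint: Offsets, earliest: Offsets) -> Offsets:
    """Choose a starting offset per partition.

    A partition missing from the checkpoint is treated as offset 0, which is
    what an empty checkpoint {} amounts to. The result is clamped to the
    earliest offset still retained, so a deleted range is never requested.
    """
    return {
        partition: max(checkpoint.get(partition, 0), low)
        for partition, low in earliest.items()
    }
```

   Here an empty checkpoint with earliest retained offsets `{0: 100}` resolves 
to `{0: 100}` rather than to the already-deleted offset 0. 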
   
   I'd appreciate it if we considered how checkpoints are handled in a 
general, source-agnostic way while also fixing this issue. 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
