garyli1019 commented on issue #1377: [HUDI-663] Fix HoodieDeltaStreamer offset not handled correctly URL: https://github.com/apache/incubator-hudi/pull/1377#issuecomment-600344505 discussed with @lamber-ken offline, summary: - We found the empty checkpoint is actually not a bug. https://github.com/apache/incubator-hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java#L331 We need to think about how to handle it. Should we handle the empty checkpoint or we set a special marker to identify this case? This question should be addressed in this PR. - There are three places that will possibly reset the checkpoint https://github.com/apache/incubator-hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java#L260 https://github.com/apache/incubator-hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java#L186 https://github.com/apache/incubator-hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java#L222 We need somehow simplify this. This could be a separate task from this PR.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services