Dippatel98 commented on code in PR #32456:
URL: https://github.com/apache/beam/pull/32456#discussion_r1761444351
##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/ReadFromKafkaDoFn.java:
##########
@@ -461,6 +461,11 @@ public ProcessContinuation processElement(
return ProcessContinuation.resume();
}
for (ConsumerRecord<byte[], byte[]> rawRecord : rawRecords) {
+ // If the Kafka consumer returns a record with an offset that is already processed
+ // the record can be safely skipped.
+ if (rawRecord.offset() < startOffset) {
+ continue;
Review Comment:
Yeah, that makes sense. But I am not very familiar with
`ProcessContinuation.resume()`. From the docs, it seems that the DoFn will
begin processing the same element again. Would that not cause an infinite loop,
since the element has already been processed?
If processing restarts from the beginning of the DoFn and we seek again,
we might get a different result.
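
To make the question concrete, here is a minimal plain-Java sketch (no Beam
dependency) of how I understand the interaction. It assumes that a resume
re-invokes processing for the same element but with the residual (unclaimed)
part of the restriction, so already claimed offsets are not repeated. The names
`ResumeSketch`, `OffsetRange`, and `processOnce` are hypothetical and are not
the actual Beam or `ReadFromKafkaDoFn` API.

```java
import java.util.ArrayList;
import java.util.List;

public class ResumeSketch {
  /** Hypothetical stand-in for an offset-range restriction: [from, to). */
  static final class OffsetRange {
    final long from;
    final long to;

    OffsetRange(long from, long to) {
      this.from = from;
      this.to = to;
    }
  }

  /**
   * Models one processing call: skips offsets below the restriction start,
   * "claims" the remaining offsets inside the restriction, and returns the
   * residual restriction that a later call would resume from.
   */
  static OffsetRange processOnce(
      OffsetRange restriction, long[] polledOffsets, List<Long> output) {
    long next = restriction.from;
    for (long offset : polledOffsets) {
      if (offset < next) {
        continue; // already processed, safe to skip
      }
      if (offset >= restriction.to) {
        break; // outside the restriction, nothing more to claim
      }
      output.add(offset);
      next = offset + 1;
    }
    return new OffsetRange(next, restriction.to);
  }

  public static void main(String[] args) {
    List<Long> output = new ArrayList<>();
    OffsetRange restriction = new OffsetRange(5, 10);
    // First poll returns stale offsets 3 and 4 before the expected ones.
    restriction = processOnce(restriction, new long[] {3, 4, 5, 6}, output);
    // "Resume": same element again, but with the residual restriction.
    restriction = processOnce(restriction, new long[] {6, 7, 8, 9}, output);
    System.out.println(
        output + " residual=[" + restriction.from + ", " + restriction.to + ")");
  }
}
```

In this model the second call starts at offset 7 rather than 5, so each call
makes progress as long as at least one offset is claimed. If a call claims
nothing, the residual equals the input restriction and the next call simply
polls again.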