Hey Deepak, I have observed this too. See point "a" in "Other quirks I found:" in this thread [1].
[1] https://lists.apache.org/thread/ksd4nfjmzmp97hs2zgn2mfpf8fsy0myw On Tue, Jun 7, 2022 at 2:13 PM Chamikara Jayalath <chamik...@google.com> wrote: > > > On Tue, Jun 7, 2022 at 11:06 AM Deepak Nagaraj <deepak.naga...@primer.ai> > wrote: > >> Hi Cham, >> >> On Mon, Jun 6, 2022 at 7:18 PM Chamikara Jayalath <chamik...@google.com> >> wrote: >> >>> >>> >>> On Mon, Jun 6, 2022 at 1:08 PM Ahmet Altay <al...@google.com> wrote: >>> >>>> >>>> >>>> On Mon, Jun 6, 2022 at 10:22 AM Chamikara Jayalath < >>>> chamik...@google.com> wrote: >>>> >>>>> BTW I think we have already document this behavior of KafkaIO here: >>>>> https://github.com/apache/beam/blob/4b623313707df8a3c3846412f54edf2e3c947374/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L274 >>>>> >>>>> After the first checkpoint, the readers should start reading from the >>>>> last saved offsets, so it seems like the behavior you are observing is >>>>> primarily regarding picking the starting offset of the reader. I'm not >>>>> sure >>>>> if we need to do changes to make this more safer for Flink pipeline >>>>> restarts. >>>>> >>>> >>>> Would it make sense to make this a warning? It is easy to miss this >>>> note in the javadocs. >>>> >>> >>> Makes sense to me. Deepak, will you be able to send a PR for this and/or >>> create a Jira ? >>> >>> >> Thanks -- sure, did you mean Github Issues? I've seen my past tickets get >> migrated to Github now. >> > > Yeah, a Github issue. Thanks. > > >> >> Deepak >> >>