Re: possible data loss with Kafka I/O

2022-06-07 Thread Deepak Nagaraj
On Tue, Jun 7, 2022 at 11:21 AM Cristian Constantinescu wrote: > > Hey Deepak, > > I have observed this too. See point "a" in "Other quirks I found:" in this > thread [1]. > > [1] https://lists.apache.org/thread/ksd4nfjmzmp97hs2zgn2mfpf8fsy0myw > Yes! This is exactly what we saw as well. Thanks,

Re: possible data loss with Kafka I/O

2022-06-07 Thread Cristian Constantinescu
Hey Deepak, I have observed this too. See point "a" in "Other quirks I found:" in this thread [1]. [1] https://lists.apache.org/thread/ksd4nfjmzmp97hs2zgn2mfpf8fsy0myw On Tue, Jun 7, 2022 at 2:13 PM Chamikara Jayalath wrote: > > > On Tue, Jun 7, 2022 at 11:06 AM Deepak Nagaraj > wrote: > >> H

Re: possible data loss with Kafka I/O

2022-06-07 Thread Deepak Nagaraj
Hi Cham, On Mon, Jun 6, 2022 at 7:18 PM Chamikara Jayalath wrote: > > > On Mon, Jun 6, 2022 at 1:08 PM Ahmet Altay wrote: > >> >> >> On Mon, Jun 6, 2022 at 10:22 AM Chamikara Jayalath >> wrote: >> >>> BTW I think we have already document this behavior of KafkaIO here: >>> https://github.com/ap

Re: possible data loss with Kafka I/O

2022-06-04 Thread Deepak Nagaraj
On Sat, Jun 4, 2022 at 3:35 PM Chamikara Jayalath wrote: > On Sat, Jun 4, 2022 at 1:55 PM Reuven Lax wrote: > >> Cham do you know if the Flunk runner uses the sdf version or the old >> version? >> > > I think that depends on whether the experiment "use_deprecated_read" was > specified or not. If

Re: possible data loss with Kafka I/O

2022-06-04 Thread Deepak Nagaraj
Hi Cham, Thank you for your response. One thing I didn't mention earlier is: all of this is with Beam's Flink runner. On Sat, Jun 4, 2022 at 9:55 AM Chamikara Jayalath wrote: > >> >> * Kafka consumer config: enable.auto.commit set to true (default), >> auto.offset.reset set to latest (default)

possible data loss with Kafka I/O

2022-06-03 Thread Deepak Nagaraj
Hi Beam team, We have seen data loss with Beam pipelines under the following condition. i.e., Beam thinks it has processed data when in reality it has not: * Kafka consumer config: enable.auto.commit set to true (default), auto.offset.reset set to latest (default) * ReadFromKafka(): commit_of