Re: KafkaIO not committing offsets when using withMaxReadTime

2019-01-18 Thread Raghu Angadi
On Thu, Jan 10, 2019 at 7:57 AM Alexey Romanenko wrote:
> Don’t you think that we could have some race condition there since, according to the initial issue description, sometimes the offset was committed and sometimes not?

Yeah, there is a timing issue. 'finalizeCheckpoint()' does not wait until …
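The timing issue Raghu describes can be illustrated with a toy model (plain Java, not Beam code; all names here are invented for illustration): if the bounded wrapper never finalizes the last checkpoint it took from the unbounded reader, the offset commit that finalization would perform never happens.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the issue discussed in this thread: a "bounded wrapper"
// reads a fixed number of records from an unbounded reader, but unless
// it finalizes the last checkpoint, the commit is silently lost.
public class LostCommit {
  static List<Long> committed = new ArrayList<>();

  // A "checkpoint mark": finalizing it commits the offset it captured.
  static Runnable checkpoint(long offset) {
    return () -> committed.add(offset);
  }

  public static List<Long> runBoundedRead(int maxRecords, boolean finalizeAtEnd) {
    committed.clear();
    long offset = 0;
    Runnable lastCheckpoint = null;
    for (int i = 0; i < maxRecords; i++) {
      offset++;                              // "read" one record
      lastCheckpoint = checkpoint(offset);   // take a checkpoint mark
    }
    // The bounded wrapper stops here; if it never calls finalize on the
    // last checkpoint, nothing is ever committed.
    if (finalizeAtEnd && lastCheckpoint != null) {
      lastCheckpoint.run();
    }
    return committed;
  }

  public static void main(String[] args) {
    System.out.println("without finalize: " + runBoundedRead(5, false)); // prints: without finalize: []
    System.out.println("with finalize:    " + runBoundedRead(5, true));  // prints: with finalize:    [5]
  }
}
```

Whether the commit lands then depends entirely on whether (and when) the runner finalizes the checkpoint, which matches the "sometimes committed, sometimes not" behavior reported in the thread.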

Re: KafkaIO not committing offsets when using withMaxReadTime

2019-01-18 Thread Alexey Romanenko
Hi Jozef, I’m not aware of anyone working on this. In the meantime, I created a Jira for it: https://issues.apache.org/jira/browse/BEAM-6466 Feel free to contribute if you wish.
> On 17 Jan 2019, at 09:10, Jozef Vilcek wrote:
> Hello, w…

Re: KafkaIO not committing offsets when using withMaxReadTime

2019-01-17 Thread Jozef Vilcek
Hello, was there any progress on this, or a JIRA I can follow? I could use bounded processing over KafkaIO too. Thanks, Jozef
On Thu, Jan 10, 2019 at 4:57 PM Alexey Romanenko wrote:
> Don’t you think that we could have some race condition there since, according to the initial issue description, some…

Re: KafkaIO not committing offsets when using withMaxReadTime

2019-01-10 Thread Alexey Romanenko
Don’t you think that we could have some race condition there since, according to the initial issue description, sometimes the offset was committed and sometimes not?
> On 9 Jan 2019, at 19:48, Raghu Angadi wrote:
> Oh, the generic bounded source wrapper over an unbounded source does not seem to c…

Re: KafkaIO not committing offsets when using withMaxReadTime

2019-01-09 Thread Raghu Angadi
Oh, the generic bounded source wrapper over an unbounded source does not seem to call finalize when it is done with a split. I think it should. Could you file a bug for the wrapper? Meanwhile, this check could be added to the sanity checks in KafkaIO.Read.expand().
On Wed, Jan 9, 2019 at 10:37 AM And…

Re: KafkaIO not committing offsets when using withMaxReadTime

2019-01-09 Thread André Missaglia
Hi Juan, after researching a bit, I found this issue, which has been open since 2017: https://issues.apache.org/jira/browse/BEAM-2185 I guess KafkaIO isn't intended to provide a bounded source. Maybe I should write my own code that fetches messages from Kafka, even if it means giving up on some process…

Re: KafkaIO not committing offsets when using withMaxReadTime

2019-01-09 Thread Juan Carlos Garcia
Just so you can have a look at where this happens: https://github.com/apache/beam/blob/dffe2c1a2bd95f78869b266d3e1ea3f8ad8c323d/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java#L584
Cheers
On Wed, Jan 9, 2019 at 5:09 PM Juan Carlos Garcia wrote:
> I also expe…

Re: KafkaIO not committing offsets when using withMaxReadTime

2019-01-09 Thread Juan Carlos Garcia
I also experience the same. As per the documentation, **withMaxReadTime** and **withMaxNumRecords** are mainly intended for demo purposes, so I guess it is beyond the scope of the current KafkaIO to behave as a bounded source with offset management, or something is just missing in the current implementation (Waterma…

KafkaIO not committing offsets when using withMaxReadTime

2019-01-09 Thread André Missaglia
Hello everyone, I need to do some batch processing that uses messages in a Kafka topic. So I tried the "withMaxReadTime" KafkaIO setting:
---
val properties = new Properties()
properties.setProperty("bootstrap.servers", "...")
properties.setProperty("group.id", "mygroup")
properties.setProperty("…
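The snippet above is cut off in the archive. As a rough sketch of the configuration under discussion (not the poster's actual code), a bounded KafkaIO read with checkpoint-based offset commits might look like the following in Beam's Java API, circa the 2.x releases discussed in this thread; the bootstrap servers, topic, and group id are placeholders:

```java
// Sketch only: Beam's Java KafkaIO with a bounded read window.
// withMaxReadTime makes the source bounded; commitOffsetsInFinalize
// asks KafkaIO to commit offsets when checkpoints are finalized —
// which, per this thread, may never happen under the bounded wrapper.
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.joda.time.Duration;
import java.util.Collections;

public class BoundedKafkaRead {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    p.apply(KafkaIO.<String, String>read()
        .withBootstrapServers("...")                       // placeholder
        .withTopic("my-topic")                             // placeholder
        .withKeyDeserializer(StringDeserializer.class)
        .withValueDeserializer(StringDeserializer.class)
        .updateConsumerProperties(
            Collections.singletonMap("group.id", "mygroup"))
        .commitOffsetsInFinalize()        // commit via checkpoint finalization
        .withMaxReadTime(Duration.standardMinutes(5))      // bounded read
        .withoutMetadata());
    p.run().waitUntilFinish();
  }
}
```

This is a pipeline configuration fragment and needs a running Kafka broker plus the Beam and kafka-clients dependencies to execute; it is shown only to make concrete which settings the thread is debating.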