Hi Beam Community,

While preparing a few Dataflow streaming pipelines with Kafka as a source, I 
have come across an issue. Some of the topics I am reading from have very low 
throughput, but I still hope to utilise the withStartReadTime option to control 
the starting offset at startup.

The issue I am facing is the hard failure that arises when withStartReadTime is 
set and there is no data to consume at or after that time, as documented here 
[1]. Draining is blocked while this hard error is occurring, and the error 
triggers false alerts in the monitoring we use to detect failing jobs. Reading 
from multiple topics is also problematic, as the job will not read from any 
topic as long as any one of them is producing this error.

I would like to understand why this has been made such a hard error when it 
feels like a situation pipelines can easily be in, and whether there would be 
any possibility of reducing it to a softer error that allows features such as 
draining and multiple topics on these jobs.
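
To make the request concrete, here is a rough sketch in plain Java of the two 
behaviours I am comparing. It is only a toy model and not Beam or Kafka code: 
the class StartOffsetSketch, its methods, and the TreeMap standing in for a 
partition (record timestamp in ms mapped to offset, one record per timestamp) 
are all hypothetical names I made up for illustration.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.OptionalLong;
import java.util.TreeMap;

/** Toy model only: a partition is a map of record timestamp (ms) to offset. */
final class StartOffsetSketch {

    /** Offset of the first record with timestamp >= startReadTimeMs, or empty if there is none. */
    static OptionalLong offsetForTime(NavigableMap<Long, Long> partition, long startReadTimeMs) {
        Map.Entry<Long, Long> entry = partition.ceilingEntry(startReadTimeMs);
        return entry == null ? OptionalLong.empty() : OptionalLong.of(entry.getValue());
    }

    /** Strict behaviour (what I am seeing): fail when no record exists at or after the start time. */
    static long resolveStrict(NavigableMap<Long, Long> partition, long startReadTimeMs) {
        return offsetForTime(partition, startReadTimeMs)
            .orElseThrow(() -> new IllegalStateException(
                "no record at or after timestamp " + startReadTimeMs));
    }

    /** Softer behaviour (what I am asking about): fall back to the end offset and wait for new data. */
    static long resolveLenient(
            NavigableMap<Long, Long> partition, long startReadTimeMs, long endOffset) {
        return offsetForTime(partition, startReadTimeMs).orElse(endOffset);
    }
}
```

For a quiet partition holding records at timestamps 1000 and 2000 (offsets 0 
and 1), a start time of 5000 makes resolveStrict throw, while resolveLenient 
returns the end offset it was given, so the reader would simply wait for the 
next record instead of failing the whole job.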

Thanks for any help understanding this issue,

Dan

[1] 
https://beam.apache.org/releases/javadoc/2.27.0/org/apache/beam/sdk/io/kafka/KafkaIO.Read.html#withStartReadTime-org.joda.time.Instant-
