[ 
https://issues.apache.org/jira/browse/BEAM-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755490#comment-16755490
 ] 

Raghu Angadi commented on BEAM-2185:
------------------------------------

You are correct about the issue with 'BoundedReadFromUnboundedSource'.

Maybe there should be a big warning that withMaxRecords() is meant mainly for 
testing and debugging. There is no 'Finalize()' construct in Beam batch. 
commitOffsetsInFinalize() was never meant for batch usage. As mentioned in 
BEAM-6466, KafkaIO can improve its implementation of commitOffsetsInFinalize() 
to work better in this context.

> KafkaIO bounded source
> ----------------------
>
>                 Key: BEAM-2185
>                 URL: https://issues.apache.org/jira/browse/BEAM-2185
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-java-kafka
>            Reporter: Raghu Angadi
>            Priority: Major
>
> KafkaIO could be a useful source for batch applications as well. It could 
> implement a bounded source. The primary question is how the bounds are 
> specified.
> One option: the source specifies a time period (say 9am-10am), and KafkaIO 
> fetches the appropriate start and end offsets based on the time index in 
> Kafka. This would suit many batch applications that are launched on a schedule.
> Another option is to always read till the end and commit the offsets to 
> Kafka. Handling failures and multiple runs of a task might be complicated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
