Hi Bill,

The batch part of the Flink Runner supports reading from a finite collection, but I'm assuming you're working with the streaming execution of the Flink Runner. We haven't implemented support for that in the Runner yet, but Flink itself natively supports reading from finite sources, so it looks fairly easy to implement. I've filed a JIRA issue and would like to look into this later today.

I've recently added a test case (UnboundedSourceITCase) which throws a custom exception when the end of the output has been reached. Admittedly, that is not a very nice approach, but it works. All the more reason to support finite sources in streaming mode.

Best,
Max

On Wed, Mar 30, 2016 at 2:43 AM, William McCarthy <[email protected]> wrote:
> Hi,
>
> I want to get access to a bounded PCollection in Beam on the Flink runner.
> Ideally, I’d pull from HDFS. Is that supported? If not, what would be the
> best way for me to get something bounded, for testing purposes?
>
> Thanks,
>
> Bill
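
For illustration, the exception-based workaround mentioned above can be sketched in plain Java roughly as follows. This is not the code of UnboundedSourceITCase and does not use the Beam or Flink APIs. The names BoundedTestSketch, SuccessException, runUnbounded, execute and collectFirst are all hypothetical, and the wrapping of the exception by the "job" is an assumption about how a runner might surface failures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BoundedTestSketch {

    /** Hypothetical marker exception: the test has seen all the output it expects. */
    static class SuccessException extends RuntimeException {
        SuccessException() {
            super("expected output reached");
        }
    }

    /** Stand-in for an unbounded source: emits 0, 1, 2, ... until the sink throws. */
    static void runUnbounded(Consumer<Long> sink) {
        for (long i = 0; ; i++) {
            sink.accept(i);
        }
    }

    /** Stand-in for job execution; assumes failures are rethrown wrapped in another exception. */
    static void execute(Consumer<Long> sink) {
        try {
            runUnbounded(sink);
        } catch (RuntimeException e) {
            throw new RuntimeException("job failed", e);
        }
    }

    /** Walks the cause chain looking for the marker exception. */
    static boolean causedBySuccess(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SuccessException) {
                return true;
            }
        }
        return false;
    }

    /** Runs the endless "job" and stops it once `expected` (a positive count) elements arrived. */
    static List<Long> collectFirst(int expected) {
        List<Long> seen = new ArrayList<>();
        try {
            execute(value -> {
                seen.add(value);
                if (seen.size() >= expected) {
                    throw new SuccessException(); // abort the otherwise endless job
                }
            });
        } catch (RuntimeException e) {
            if (!causedBySuccess(e)) {
                throw e; // a genuine failure, not our termination signal
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(collectFirst(3)); // prints [0, 1, 2]
    }
}
```

The drawback is visible in the sketch: a successful run is indistinguishable from a failure unless the test inspects the cause chain, which is why proper support for finite sources in streaming mode would be the cleaner fix.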
