Hi Shen,

This is a very nice suggestion. Currently there is no way to do this,
probably because nobody thought of this before, but here's a few thoughts
anyway.

- Both the Iterable and its Iterator will need to be Serializable, because
an UnboundedSource must be able to checkpoint and resume, to provide fault
tolerance in case the worker reading from it crashes. Do your iterables
satisfy this constraint?
- Reading will, of course, be sequential rather than parallel; processing
can still be parallelized, though. I suppose that's fine for your use case.
- Once you have that - wrapping an UnboundedSource will be possible and an
interesting exercise. And, I believe, wrapping it with a splittable DoFn
http://s.apache.org/splittable-do-fn will be much easier, though SDF
support is yet inconsistent between runners (Direct works, Flink works,
Apex and Dataflow in review). It'd actually be a good test case of the ease
of use of the API.

On Sat, Apr 29, 2017 at 12:50 PM Shen Li <cs.she...@gmail.com> wrote:

> It seems that Create.of(Iterable) can only create a BoundedSource. Is there
> a convenient way to read from an unbounded Iterable object without writing
> application code to wrap it into an UnboundedSource object?
>
>
> Thanks,
>
> Shen
>

Reply via email to