Thanks!

Shen

On Sat, Apr 29, 2017 at 4:08 PM, Eugene Kirpichov <
[email protected]> wrote:

> Hi Shen,
>
> This is a very nice suggestion. Currently there is no way to do this,
> probably because nobody thought of this before, but here are a few thoughts
> anyway.
>
> - Both the Iterable and its Iterator will need to be Serializable, because
> an UnboundedSource must be able to checkpoint and resume, to provide fault
> tolerance in case the worker reading from it crashes. Do your iterables
> satisfy this constraint?
> - Reading will, of course, be sequential rather than parallel; processing
> can still be parallelized, though. I suppose that's fine for your use case.
> - Once you have that, wrapping the Iterable in an UnboundedSource will be
> possible and an interesting exercise. And, I believe, wrapping it with a
> splittable DoFn http://s.apache.org/splittable-do-fn will be much easier,
> though SDF support is still inconsistent between runners (Direct works,
> Flink works, Apex and Dataflow in review). It'd actually be a good test case
> of the ease of use of the API.
>
> On Sat, Apr 29, 2017 at 12:50 PM Shen Li <[email protected]> wrote:
>
> > It seems that Create.of(Iterable) can only create a BoundedSource. Is there
> > a convenient way to read from an unbounded Iterable object without writing
> > application code to wrap it into an UnboundedSource object?
> >
> >
> > Thanks,
> >
> > Shen
> >
>

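For reference, here is a minimal sketch of what the Serializable constraint in
the quoted reply amounts to: an unbounded Iterable whose Iterator keeps all of
its position in serializable fields, so that a snapshot of the iterator can act
as a checkpoint. It uses only plain Java serialization and does not touch Beam.
Every name in it (ResumableIterableSketch, Counter, CounterIterator,
checkpoint, resume) is hypothetical and chosen for illustration; a real
UnboundedSource wrapper would need more than this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Iterator;

// Hypothetical names throughout; nothing here is part of the Beam API.
final class ResumableIterableSketch {

  /** An unbounded Iterable producing 0, step, 2*step, ... */
  static final class Counter implements Iterable<Long>, Serializable {
    private static final long serialVersionUID = 1L;
    private final long step;

    Counter(long step) {
      this.step = step;
    }

    @Override
    public CounterIterator iterator() {
      return new CounterIterator(0L, step);
    }
  }

  /** All iteration state lives in serializable fields, so a snapshot is a checkpoint. */
  static final class CounterIterator implements Iterator<Long>, Serializable {
    private static final long serialVersionUID = 1L;
    private long next;
    private final long step;

    CounterIterator(long next, long step) {
      this.next = next;
      this.step = step;
    }

    @Override
    public boolean hasNext() {
      return true; // the sequence never ends
    }

    @Override
    public Long next() {
      long value = next;
      next += step;
      return value;
    }
  }

  /** Serializes the iterator state into a byte array (the "checkpoint"). */
  static byte[] checkpoint(Serializable state) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(state);
    }
    return bytes.toByteArray();
  }

  /** Rebuilds an iterator from a checkpoint, as a restarted worker would. */
  @SuppressWarnings("unchecked")
  static <T> T resume(byte[] snapshot) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(snapshot))) {
      return (T) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    CounterIterator it = new Counter(10).iterator();
    it.next(); // 0
    it.next(); // 10
    byte[] snapshot = checkpoint(it); // position saved: next element is 20
    it.next(); // 20 is read, then suppose the worker crashes

    Iterator<Long> restored = resume(snapshot);
    System.out.println(restored.next()); // prints 20: reading resumes at the checkpoint
  }
}
```

An iterator backed by an open socket, file handle, or other non-serializable
resource would not fit this pattern directly; it would have to record a
position from which the underlying resource can be reopened.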