Thanks!

Shen

On Sat, Apr 29, 2017 at 4:08 PM, Eugene Kirpichov <[email protected]> wrote:

> Hi Shen,
>
> This is a very nice suggestion. Currently there is no way to do this,
> probably because nobody thought of this before, but here's a few thoughts
> anyway.
>
> - Both the Iterable and its Iterator will need to be Serializable, because
>   an UnboundedSource must be able to checkpoint and resume, to provide fault
>   tolerance in case the worker reading from it crashes. Do your iterables
>   satisfy this constraint?
> - Reading will, of course, be sequential rather than parallel; processing
>   can still be parallelized, though. I suppose that's fine for your use case.
> - Once you have that - wrapping an UnboundedSource will be possible and an
>   interesting exercise. And, I believe, wrapping it with a splittable DoFn
>   http://s.apache.org/splittable-do-fn will be much easier, though SDF
>   support is yet inconsistent between runners (Direct works, Flink works,
>   Apex and Dataflow in review). It'd actually be a good test case of the ease
>   of use of the API.
>
> On Sat, Apr 29, 2017 at 12:50 PM Shen Li <[email protected]> wrote:
>
> > It seems that Create.of(Iterable) can only create a BoundedSource. Is there
> > a convenient way to read from an unbounded Iterable object without writing
> > application code to wrap it into an UnboundedSource object?
> >
> > Thanks,
> >
> > Shen
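
To make the Serializable constraint from the first bullet concrete, here is a minimal sketch in plain Java, with no Beam dependency. The class `CheckpointableCounter` and its `checkpoint`/`resume` methods are hypothetical names invented for illustration, not part of the Beam API. It only shows the property an iterator would need before it could back an UnboundedSource: its position can be serialized and a fresh copy can continue from that point. It makes no claim about how a real UnboundedSource or checkpoint mark would be written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Iterator;

// Hypothetical example: an unbounded iterator whose entire state is one
// serializable field, so it can be snapshotted and resumed after a crash.
final class CheckpointableCounter implements Iterator<Long>, Serializable {
    private static final long serialVersionUID = 1L;

    // The next value to emit; this is all the state a checkpoint must capture.
    private long next;

    CheckpointableCounter(long start) {
        this.next = start;
    }

    @Override
    public boolean hasNext() {
        return true; // unbounded: there is always another element
    }

    @Override
    public Long next() {
        return next++;
    }

    // Serialize the iterator's current position to bytes (the "checkpoint").
    byte[] checkpoint() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        }
        return bytes.toByteArray();
    }

    // Rebuild an iterator from a snapshot, as a restarted worker would.
    static CheckpointableCounter resume(byte[] snapshot)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(snapshot))) {
            return (CheckpointableCounter) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CheckpointableCounter reader = new CheckpointableCounter(0);
        reader.next(); // emits 0
        reader.next(); // emits 1
        reader.next(); // emits 2
        byte[] snapshot = reader.checkpoint();

        // Simulate a worker crash: discard the reader, resume from the snapshot.
        CheckpointableCounter resumed = CheckpointableCounter.resume(snapshot);
        System.out.println(resumed.next()); // continues at 3
    }
}
```

An iterator that holds non-serializable state, such as an open socket or file handle, would fail at `writeObject` and would need some other way to record and restore its position.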
