FYI, I just sent a pull request that adds a bounded source to Beam for reading DistributedLog streams. I will send out pull requests for an unbounded source and a sink after that.
If you are interested and willing to help review it, the pull request is here: https://github.com/apache/incubator-beam/pull/1464

- KN

On Thu, Sep 1, 2016 at 9:41 PM, Sijie Guo <si...@apache.org> wrote:

> Wow. This sounds interesting. Looking forward to this. Let us know if you need
> any help.
>
> - Sijie
>
> On Wed, Aug 31, 2016 at 2:02 AM, Khurrum Nasim <khurrumnas...@gmail.com>
> wrote:
>
> > Hello beam folks,
> >
> > We are evaluating a new solution to unify our streaming and batching data
> > pipeline, from storage, computing engine to programming model. The idea is
> > basically to implement the Kappa architecture, using DistributedLog as a
> > unified stream store for both streaming and batching, using Flink or Spark
> > (still debating) as the process engine, and using Beam as the programming
> > model.
> >
> > We'd like to contribute an IO connector to DistributedLog (both bounded
> > source/sink and unbounded source/sink).
> >
> > Are there any special instructions or best practices for adding a new IO
> > connector? Any suggestions are very appreciated.
> >
> > The jira is here: https://issues.apache.org/jira/browse/BEAM-607
> >
> > Also, /cc the distributed log team for any help.
> >
> > KN