Coming back from vacation; sorry for the delay.

I agree with Davor. While it's nice to have an `UnboundedFlinkSource`
<https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/UnboundedFlinkSource.java>
wrapper that can be used to convert a Flink-specific source into one that
can be used in Beam, I'd hope this is a temporary stop-gap until we have
Beam-API versions of these for all important connectors. If a
Flink-specific source has major advantages over one that can be implemented
in Beam, we should explore why and see whether we have gaps in our APIs.

(And similarly for Spark, and for Sinks, Transform libraries, and
everything else.)

Thanks!
Dan

On Thu, Apr 28, 2016 at 10:03 AM, Davor Bonaci <da...@google.com.invalid>
wrote:

> [ Moving over to the dev@ list ]
>
> I think we should be aiming a little higher than "trying out Beam" ;)
>
> Beam SDK currently has built-in IOs for Kafka, as well as for all important
> Google Cloud Platform services. Additionally, there are pull requests for
> Firebase and Cassandra. This is not bad, particularly taking into account
> that we have APIs for users to develop their own IO connectors. Of course,
> there's a long way to go, but there should *not* be any users that are
> blocked or scenarios that are impossible.
>
> In terms of runner support, the Cloud Dataflow runner supports all IOs,
> including any user-written ones. Other runners don't support them as
> extensively, but this is a high-priority item to address.
>
> In my mind, we should strive to address the following:
>
>    - Complete conversion of existing IOs to the Source / Sink API. ETA: a
>    week or two for full completion.
>    - Make sure Spark & Flink runners fully support Source / Sink API, and
>    that ties into the new Runner / Fn API discussion.
>    - Increase the set of built-in IOs. No ETA; iterative process over time.
>    There are 2 pending pull requests, others in development.
>
> I'm hopeful we can address all of these items in a relatively short period
> of time -- in a few months or so -- and likely before we can call any
> release "stable". (This is why the new Runner / Fn API discussions are so
> important.)
>
> In summary, in my mind, "long run" here means "< few months".
>
> ---------- Forwarded message ----------
> From: Maximilian Michels <m...@apache.org>
> Date: Thu, Apr 28, 2016 at 3:20 AM
> Subject: Re: How to read/write avro data using FlinkKafka Consumer/Producer
> (Beam Pipeline) ?
> To: u...@beam.incubator.apache.org
>
> On Wed, Apr 27, 2016 at 11:12 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
> > generally speaking, we have to check that all runners work fine with the
> provided IO. I don't think it's a good idea that the runners themselves
> implement any IO: they should use "out of the box" IO.
>
> In the long run, big yes, and I'd like to help make it possible!
> However, there is still a gap between what Beam and its Runners
> provide and what users want to do. For the time being, I think the
> solution we have is fine. It gives users the option to try out Beam
> with sources and sinks that they expect to be available in streaming
> systems.
>
