The KafkaIO is there:

https://github.com/apache/incubator-beam/tree/master/sdks/java/io/kafka

You have to use a SNAPSHOT/nightly build or build it yourself.

Regards
JB

On 04/28/2016 07:49 PM, amir bahmanyari wrote:
Thanks so much JB. I Googled KafkaIO & got no references. Which Beam release
supports KafkaIO? Please let me know when the examples are available... can't wait!
Have a wonderful day.

       From: Jean-Baptiste Onofré <j...@nanthrax.net>
  To: dev@beam.incubator.apache.org
  Sent: Thursday, April 28, 2016 10:40 AM
  Subject: Re: [DISCUSS] Beam IO &runners native IO

Hi Amir,

Now, we have a KafkaIO in Beam (both source and sink). I would start
with this one.

Actually, I'm preparing new pipeline examples showing the usage of that.

Regards
JB

On 04/28/2016 07:36 PM, amir bahmanyari wrote:
Hi JB, hope all is great. I am very new to "Beam" and trying to dive into it. I am having a
lot of trouble reading from a Kafka topic using the google.cloud.dataflow APIs &
FlinkPipelineRunner. Could you point me to some docs and/or forums where I can find a solution
to my problem, please? I really appreciate it.
In case you are curious, given the following line:
p.apply(Read.named("ReadFromKafka").from(UnboundedFlinkSource.of(kafkaConsumer)))
which connects to my Kafka consumer (and I confirm it at the server side), it fails
& reports: The transform ReadFromKafka [Read(UnboundedFlinkSource)] is currently
not supported.

I am not sure where/how I need to specify the "ReadFromKafka" part. I am sending
stream data from my laptop to Kafka on a Linux box. The default Kafka consumer reports
that it has received the data.
I really appreciate your help. I know this is not a conventional way to ask
such questions, but I have been spending a lot of time trying to figure it out. Have a
wonderful day JB.
Amir-
         From: Jean-Baptiste Onofré <j...@nanthrax.net>
   To: dev@beam.incubator.apache.org
   Sent: Thursday, April 28, 2016 5:41 AM
   Subject: [DISCUSS] Beam IO &runners native IO

Hi all,

Following the recent threads on the mailing list, I would like to start
a formal discussion around the IO.
As we can expect the first contributions in this area (I already have
some work in progress around this ;)), I think it's a fair discussion to
have.

Now, we have two kinds of IO: the ones "generic" to Beam, and the ones "local"
to the runners.

For example, let's take Kafka: we have the KafkaIO (in IO), and for
instance, we have the spark-streaming kafka connector (in Spark Runner).

Right now, we have two approaches for the user:
1. In the pipeline, we use KafkaIO from Beam: it's the preferred
approach for sure. However, the user may want to use the runner specific
IO for two reasons:
       * Beam doesn't provide the IO yet (for instance, the spark cassandra
connector is available whereas we don't yet have any CassandraIO (I'm
working on it anyway ;)))
       * The runner native IO is optimized or contains more features than the
Beam native IO
2. So, for the previous reasons, the user could want to use the native
runner IO. The drawback of this approach is that the pipeline will be
tied to a specific runner, which is completely against the Beam design.

I wonder if it wouldn't make sense to add a flag on the IO API (and
a related one on the Runner API) like .useNative().

For instance, the user would be able to do:

pipeline.apply(KafkaIO.read().withBootstrapServers("...").withTopics("...").useNative(true));

Then, if the runner has a "native" IO, it will use it. With
useNative(false) (the default), it won't use any runner native IO.
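
The selection rule described above could be sketched roughly as follows. This is
only an illustration with no Beam dependency: ReadSpec, Runner, resolve() and the
connector names are all hypothetical, and falling back to the Beam IO when
useNative(true) is set but the runner has no native IO is an assumption, since
the proposal does not spell out that case.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Beam.
public class UseNativeSketch {

    /** Stand-in for an IO read configuration carrying the proposed flag. */
    static final class ReadSpec {
        final String ioName;
        final boolean useNative;

        ReadSpec(String ioName, boolean useNative) {
            this.ioName = ioName;
            this.useNative = useNative;
        }
    }

    /** Stand-in for a runner that knows which native IOs it ships with. */
    static final class Runner {
        private final Map<String, String> nativeIos;

        Runner(Map<String, String> nativeIos) {
            this.nativeIos = nativeIos;
        }

        /** Returns the native connector only when requested AND available. */
        String resolve(ReadSpec spec) {
            if (spec.useNative) {
                String nativeIo = nativeIos.get(spec.ioName);
                if (nativeIo != null) {
                    return nativeIo;
                }
                // Assumption: no native IO on this runner, so fall back to Beam's IO.
            }
            return "beam:" + spec.ioName;
        }
    }

    public static void main(String[] args) {
        Runner withNative = new Runner(
                Collections.singletonMap("KafkaIO", "native:spark-streaming-kafka"));
        Runner withoutNative = new Runner(Collections.<String, String>emptyMap());

        // The same pipeline-side spec is handed to both runners unchanged.
        ReadSpec spec = new ReadSpec("KafkaIO", true);
        System.out.println(withNative.resolve(spec));
        System.out.println(withoutNative.resolve(spec));
        System.out.println(withNative.resolve(new ReadSpec("KafkaIO", false)));
    }
}
```

The pipeline code only ever sets the flag; which connector actually runs is
decided on the runner side, so the pipeline itself names no runner.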

The point there is for the configuration: assuming the Beam IO and the
runner IO can differ, it means that the "Beam IO" would have to populate
all runner specific IO configuration.
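
To make the configuration concern concrete, here is a minimal sketch of what
"populating the runner specific configuration" might look like. The class, the
method and the Beam-side option names ("bootstrapServers", "topics") are
hypothetical; "bootstrap.servers" is the usual Kafka client property, and the
rest of the mapping is purely illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: not part of Beam or of any runner.
public class NativeConfigSketch {

    // Illustrative mapping from Beam-side option names to native connector keys.
    private static final Map<String, String> KEY_MAPPING = new LinkedHashMap<>();
    static {
        KEY_MAPPING.put("bootstrapServers", "bootstrap.servers");
        KEY_MAPPING.put("topics", "topics");
    }

    /** Translates Beam-side options and rejects any option without a native equivalent. */
    static Map<String, String> toNativeConfig(Map<String, String> beamConfig) {
        Map<String, String> nativeConfig = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : beamConfig.entrySet()) {
            String nativeKey = KEY_MAPPING.get(entry.getKey());
            if (nativeKey == null) {
                throw new IllegalArgumentException(
                        "No native equivalent for option: " + entry.getKey());
            }
            nativeConfig.put(nativeKey, entry.getValue());
        }
        return nativeConfig;
    }

    public static void main(String[] args) {
        Map<String, String> beamConfig = new LinkedHashMap<>();
        beamConfig.put("bootstrapServers", "localhost:9092");
        beamConfig.put("topics", "events");
        System.out.println(toNativeConfig(beamConfig));
    }
}
```

Every option the two IOs do not share has to be either mapped or rejected,
which is the maintenance cost this paragraph points at.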

Of course, it's always possible to use a PTransform to wrap the runner
native IO, but we are back to the same concern: the pipeline will be
coupled to a specific runner.

The purpose of the useNative() flag is to "automatically" inform the
runner to use a specific IO if it has one: the pipeline stays decoupled
from the runners.

Thoughts ?

Thanks
Regards
JB



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
