Sorry JB.Where can I download a " SNAPSHOT/nightly build" jar?My apologies...I searched couldnt find it.
From: Jean-Baptiste Onofré <j...@nanthrax.net> To: dev@beam.incubator.apache.org Sent: Thursday, April 28, 2016 11:06 AM Subject: Re: [DISCUSS] Beam IO &runners native IO The KafkaIO is there: https://github.com/apache/incubator-beam/tree/master/sdks/java/io/kafka You have to use SNAPSHOT/nightly build or build yourself. Regards JB On 04/28/2016 07:49 PM, amir bahmanyari wrote: > Thanks so much JB.I Googled KafkaIO & got no references.Which Beam release > supports KafkaIO? Please let me know when the examples are available...cannt > wait !Have a wonderful day. > > From: Jean-Baptiste Onofré <j...@nanthrax.net> > To: dev@beam.incubator.apache.org > Sent: Thursday, April 28, 2016 10:40 AM > Subject: Re: [DISCUSS] Beam IO &runners native IO > > Hi Amir, > > Now, we have a KafkaIO in Beam (both source and sink). I would start > with this one. > > Actually, I'm preparing new pipeline examples showing the usage of that. > > Regards > JB > > On 04/28/2016 07:36 PM, amir bahmanyari wrote: >> Hi JB,Hope all is great.I am very new to "Beam". Trying to dive into it.Am >> having a lot of trouble to read from a Kafka topic using >> google.cloud.dataflow APis & FlinkPipelineRunner.Could you point me to some >> docs and/or forums where I can find a solution for my problem pls?I really >> appreciate it. >> In case you are curious, given the following >> line:p.apply(Read.named("ReadFromKafka").from(UnboundedFlinkSource.of(kafkaConsumer))). >> that connects to my kafka consumer (and I confirm it at the server side), it >> fails & reports:The transform ReadFromKafka [Read(UnboundedFlinkSource)] is >> currently not supported. >> >> I am not sure where/how I need to specify "ReadFromKafka" part.I am sending >> stream data from my laptop to kafka in a linux box.The default Kafka >> consumer reports thats its received. >> I really appreciate your help. I know this is not a conventional way to ask >> such questions but have been spending a lot of time to figure it out.Have a >> wonderful day JB. >> Amir- >> From: Jean-Baptiste Onofré <j...@nanthrax.net> >> To: dev@beam.incubator.apache.org >> Sent: Thursday, April 28, 2016 5:41 AM >> Subject: [DISCUSS] Beam IO &runners native IO >> >> Hi all, >> >> regarding the recent threads on the mailing list, I would like to start >> a format discussion around the IO. >> As we can expect the first contributions on this area (I already have >> some work in progress around this ;)), I think it's a fair discussion to >> have. >> >> Now, we have two kinds of IO: the one "generic" to Beam, the one "local" >> to the runners. >> >> For example, let's take Kafka: we have the KafkaIO (in IO), and for >> instance, we have the spark-streaming kafka connector (in Spark Runner). >> >> Right now, we have two approaches for the user: >> 1. In the pipeline, we use KafkaIO from Beam: it's the preferred >> approach for sure. However, the user may want to use the runner specific >> IO for two reasons: >> * Beam doesn't provide the IO yet (for instance, spark cassandra >> connector is available whereas we don't have yet any CassandraIO (I'm >> working on it anyway ;)) >> * The runner native IO is optimized or contain more features that the >> Beam native IO >> 2. So, for the previous reasons, the user could want to use the native >> runner IO. The drawback of this approach is that the pipeline will be >> tight to a specific runner, which is completely against the Beam design. >> >> I wonder if it wouldn't make sense to add flag on the IO API (and >> related on Runner API) like .useNative(). >> >> For instance, the user would be able to do: >> >> pipeline.apply(KafkaIO.read().withBootstrapServers("...").withTopics("...").useNative(true); >> >> then, if the runner has a "native" IO, it will use it, else, if >> useNative(false) (the default), it won't use any runner native IO. >> >> The point there is for the configuration: assuming the Beam IO and the >> runner IO can differ, it means that the "Beam IO" would have to populate >> all runner specific IO configuration. >> >> Of course, it's always possible to use a PTransform to wrap the runner >> native IO, but we are back on the same concern: the pipeline will be >> couple to a specific runner. >> >> The purpose of the useNative() flag is to "automatically" inform the >> runner to use a specific IO if it has one: the pipeline stays decoupled >> from the runners. >> >> Thoughts ? >> >> Thanks >> Regards >> JB >> > -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com