You can write one pipeline and simply replace the IO, for example:

To read from (text) files you can use:
*PCollection<String> lines =
p.apply(TextIO.Read.from("file://some/inputData.txt"));    *

and from Kafka (I'm adding a generic key here because Kafka messages are
keyed):
*PCollection<KV<K, String>> moreLines = p,apply(*
*    KafkaIO.<K, String>read()*
*        .withBootstrapServers("brokers.list")*
*        .withTopics("topic-list")*
*        .withKeyCoder(Coder<K>)*
*        .withValueCoder(StringUtf8Coder.of()));*

Now you can apply the same code to both PCollections, or (as you mentioned)
you can Flatten the together into one PCollection (after removing the keys
from Kafka-read PCollection) and apply the transformations you want.

You might find the IO section in the programming guide useful:
https://beam.apache.org/documentation/programming-guide/#io


On Wed, Feb 15, 2017 at 10:13 AM ankit beohar <ankitbeoha...@gmail.com>
wrote:

> Hi All,
>
> I have a use case where I have kafka and flat files so can I write one code
> and run for both or I have to create two different pipelines or use
> pipeline join in a one pipeline.
>
> Which one is better?
>
> Best Regards,
> ANKIT BEOHAR
>

Reply via email to