You can write one pipeline and simply swap the IO. For example, to read from (text) files you can use:

    PCollection<String> lines =
        p.apply(TextIO.Read.from("file://some/inputData.txt"));

and from Kafka (I'm adding a generic key type K here because Kafka messages are keyed):

    PCollection<KV<K, String>> moreLines = p.apply(
        KafkaIO.<K, String>read()
            .withBootstrapServers("brokers.list")   // placeholder: your broker list
            .withTopics("topic-list")               // placeholder: your list of topics
            .withKeyCoder(Coder<K>)                 // placeholder: a Coder<K> instance for your key type
            .withValueCoder(StringUtf8Coder.of())
            .withoutMetadata());                    // yields KV<K, String> rather than KafkaRecord<K, String>

Now you can apply the same code to both PCollections or, as you mentioned, Flatten them together into one PCollection (after removing the keys from the Kafka-read PCollection) and apply the transformations you want.

You might find the IO section of the programming guide useful:
https://beam.apache.org/documentation/programming-guide/#io

On Wed, Feb 15, 2017 at 10:13 AM ankit beohar <ankitbeoha...@gmail.com> wrote:

> Hi All,
>
> I have a use case where I have kafka and flat files so can I write one code
> and run for both or I have to create two different pipelines or use
> pipeline join in a one pipeline.
>
> Which one is better?
>
> Best Regards,
> ANKIT BEOHAR
>
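P.S. To make the Flatten option from my reply concrete, here is a minimal plain-Java sketch of the data flow. It is not Beam code: it runs with only the JDK, and ordinary lists and streams stand in for PCollections. The class and method names (OneLogicTwoSources, process, dropKeys, mergeAndProcess) and the upper-casing step are made up for illustration. The comments name the Beam transforms I would expect to use at each step.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OneLogicTwoSources {

    // The shared logic, written once against plain strings.
    // Upper-casing is only a stand-in for the real processing.
    // In Beam: the transform(s) you apply to a PCollection<String>.
    static List<String> process(Stream<String> lines) {
        return lines.map(String::toUpperCase).collect(Collectors.toList());
    }

    // Drop the keys so the keyed source has the same element type as the file source.
    // In Beam: moreLines.apply(Values.<String>create())
    static <K> Stream<String> dropKeys(List<Map.Entry<K, String>> keyed) {
        return keyed.stream().map(Map.Entry::getValue);
    }

    // Merge both sources and run the shared logic once over the result.
    // In Beam: PCollectionList.of(lines).and(values).apply(Flatten.<String>pCollections())
    static <K> List<String> mergeAndProcess(
            List<String> fileLines, List<Map.Entry<K, String>> kafkaRecords) {
        Stream<String> merged = Stream.concat(fileLines.stream(), dropKeys(kafkaRecords));
        return process(merged);
    }

    public static void main(String[] args) {
        // Stand-ins for the file-read and Kafka-read collections.
        List<String> fileLines = Arrays.asList("from file 1", "from file 2");
        List<Map.Entry<String, String>> kafkaRecords = new ArrayList<>();
        kafkaRecords.add(new SimpleEntry<>("k1", "from kafka 1"));

        System.out.println(mergeAndProcess(fileLines, kafkaRecords));
    }
}
```

One difference to keep in mind: the output order here comes from Stream.concat, whereas a flattened PCollection has no guaranteed element order.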