Hi,

I have a very simple DAG on my dataflow job. (KafkaIO->Filter->WriteGCS).
When I submit this Dataflow job per topic it has 4kps per instance
processing speed. However I want to consume two different topics in one DF
job. I used TupleTag. I created TupleTags per message type. Each topic has
different message types and also needs different filters. So my pipeline
turned to below DAG. Message Extractor is a very simple step checking
header of kafka messages and writing the correct TupleTag. However after
starting to use this new DAG, dataflow canprocess 2kps per instance.

                                                 |--->Filter1-->WriteGCS
KafkaIO->MessageExtractor-> |
                                                 |--->Filter2-->WriteGCS

Do you have any idea why my data process speed decreased ?

Thanks

Reply via email to