A naive question about DirectPipelineRunner: is it possible to execute DirectPipelineRunner with multiple threads or multiple instances (across machines), or is parallelism only supported by runners such as SparkPipelineRunner?
My requirement is to run a pipeline in parallel, either with multiple threads or across multiple machines, and I have just started investigating Apache Beam. The Google Dataflow documentation on pipeline options [1] mentions that numWorkers can be configured for the number of instances to use (I understand that Dataflow is still different from Apache Beam). However, searching the Apache Beam source on GitHub for the keyword 'numWorkers' doesn't turn up any related source snippet. So I am wondering whether the only way to execute a pipeline in parallel is to use SparkPipelineRunner/FlinkPipelineRunner (meaning I have to use Apache Beam + Spark/Flink) or to make use of Google Cloud Platform.

Thanks

[1] https://cloud.google.com/dataflow/pipelines/specifying-exec-params#setting-other-cloud-pipeline-options
