Chinmay, I would recommend to collapse ParDo sequences to the maximum extend possible using THREAD_LOCAL affinity. The Apex runner has a configuration file option that can be used to tune the chaining when needed ( https://issues.apache.org/jira/browse/BEAM-980). What you build into the pipeline translation should just be the default behavior.
Note that in cases where operators have multiple inputs, you may not be able to use THREAD_LOCAL and may have to use CONTAINER_LOCAL instead. Thanks, Thomas On Wed, Feb 22, 2017 at 10:59 PM, Chinmay Kolhatkar <chin...@apache.org> wrote: > Dear Community, > > I'm working on BEAM-831 to implement ParDo chaining for Apache Apex Runner. > > As suggested on Jira, chaining needs to be done using Stream locality of > Apache Apex engine. > > I got some links from Eugene Kirpichov on the Jira. I'm currently focusing > on producer-consumer fusion optimization. I'm unsure how much good it is to > do sibling fusion for Apex Runner as of now. > > For producer-consumer fusion, I am able to identify which stages are > ParDos. > Only thing that I'm not sure about is when to stop the merging of ParDos... > i.e. if the DAG is like ParDo A -> ParDo B -> ParDo C -> ParDo D. > Then at time it might be efficient to merge only B & C and not merge all of > them... > > How should this decision be made? Any reference for available for it? > > Please suggest. > > Thanks, > Chinmay. >