Hi Beamers! I'm enjoying using Apache Beam with Python.
I often find myself building this pattern:

input -> (key, calc1 result)
input -> (key, calc2 result)
input -> (key, calc3 result)
...
input -> (key, calcN result)

CoGroupByKey -> merge the results of calc1, calc2, calc3 ... calcN back into the original input dict (where the input is a dict representing a row).

Up until now I've been using CoGroupByKey, but I wonder if there's a better way, so that as soon as the full set of calcs for a given row is finished, that row can continue to be processed down-pipeline (a rough sketch of my current pipeline shape is at the end of this mail).

Given that:
- there's only one result for each calc for each row, and
- the calculations can be made in parallel,

waiting for the whole set to be realized at the merging CoGroupByKey seems like an unnecessary bottleneck. Is there a better pattern for splitting, processing in parallel, and merging results per-row?

Thanks,
Adam
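
P.S. In case it helps, here's a minimal, runnable sketch of the shape I mean. The calc functions, the "id"/"value" fields, and the merge step are just placeholders for illustration, not my real code:

import apache_beam as beam

# Placeholder calcs; in reality each one is its own (possibly expensive) step.
def calc1(row):
    return row["value"] * 2

def calc2(row):
    return row["value"] + 10

def merge(keyed):
    key, grouped = keyed
    merged = dict(grouped["input"][0])     # exactly one input dict per key
    merged["calc1"] = grouped["calc1"][0]  # exactly one result per calc per key
    merged["calc2"] = grouped["calc2"][0]
    return merged

with beam.Pipeline() as p:
    rows = p | beam.Create([{"id": 1, "value": 3}, {"id": 2, "value": 5}])

    # Fan out: key the original rows and each calc's result by the row key.
    keyed_input = rows | "KeyInput" >> beam.Map(lambda r: (r["id"], r))
    c1 = rows | "Calc1" >> beam.Map(lambda r: (r["id"], calc1(r)))
    c2 = rows | "Calc2" >> beam.Map(lambda r: (r["id"], calc2(r)))

    # Fan in: CoGroupByKey waits for every calc before the merge can run.
    (
        {"input": keyed_input, "calc1": c1, "calc2": c2}
        | beam.CoGroupByKey()
        | beam.Map(merge)
        | beam.Map(print)
    )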