Hey all,

Just curious whether this pattern comes up for others and whether people have worked out a good convention.
There are many combiners, and a lot of them have two forms: a global form (e.g. Count.Globally) and a per-key form (e.g. Count.PerKey). These are convenient, but we often run into the case where we GroupBy a set of data once and then want to perform a series of combines on it. Neither form works there, which calls for another form that operates on pre-grouped KVs.

Contrived example: maybe you have a PCollection of keyed numbers and you want to calculate some summary statistics on them. You could do:

```
keyed_means = (keyed_nums | Mean.PerKey())
keyed_counts = (keyed_nums | Count.PerKey())
... # other combines
```

But it'd feel more natural to pre-group the PCollection:

```
grouped_nums = keyed_nums | GBK()
keyed_means = (grouped_nums | Mean.PerGrouped())
keyed_counts = (grouped_nums | Count.PerGrouped())
```

But these "PerGrouped" variants don't currently exist. Does anyone else run into this pattern often? I might be missing an obvious pattern here.

--
Joey Tran | Staff Developer | AutoDesigner TL
*he/him*
Schrödinger, Inc. <https://schrodinger.com/>
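
P.S. To make the intended semantics concrete, here is a minimal plain-Python sketch (no Beam involved) of what a "PerGrouped" form would compute. The helper names `group_by_key` and `per_grouped` are hypothetical and only stand in for GBK and the proposed variants; this is an illustration of the idea, not a suggested implementation.

```python
from collections import defaultdict
from statistics import mean


def group_by_key(pairs):
    """Group (key, value) pairs into {key: [values]}, roughly what a GBK yields."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def per_grouped(combine, grouped):
    """Hypothetical 'PerGrouped' helper: apply `combine` to each key's values."""
    return {key: combine(values) for key, values in grouped.items()}


keyed_nums = [("a", 1), ("a", 3), ("b", 10)]

# Group once, then run several combines over the same grouping.
grouped_nums = group_by_key(keyed_nums)
keyed_means = per_grouped(mean, grouped_nums)
keyed_counts = per_grouped(len, grouped_nums)
```

The point of the sketch is only that the grouping happens once and each combine consumes the already-grouped values, instead of every combine regrouping the keyed input on its own.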
