Hey all,

Just curious if this pattern comes up for others and if people have worked
out a good convention.

There are many combiners, and a lot of them have two forms: a global form
(e.g. Count.Globally) and a per-key form (e.g. Count.PerKey). These are
convenient, but we often run into the case where we GroupBy a set of data
once and then want to perform a series of combines on it. Neither form
works in that case, which suggests a third form that operates on
pre-grouped KVs.

Contrived example: maybe you have a PCollection of keyed numbers and you
want to calculate some summary statistics on them. You could do:
```
keyed_means = (keyed_nums
 | Mean.PerKey())
keyed_counts = (keyed_nums
 | Count.PerKey())
... # other combines
```
But it'd feel more natural to pre-group the PCollection:
```
grouped_nums = keyed_nums | GBK()
keyed_means = (grouped_nums | Mean.PerGrouped())
keyed_counts = (grouped_nums | Count.PerGrouped())
```
But these "PerGrouped" variants don't currently exist. Does anyone
else run into this pattern often? I might be missing an obvious pattern
here.
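
To make the semantics I have in mind concrete, here's a rough plain-Python
sketch (no Beam involved). The names `group_by_key` and `per_grouped` are
hypothetical stand-ins I made up for illustration, not Beam APIs, and plain
dicts stand in for PCollections:
```python
from collections import defaultdict
from statistics import mean


def group_by_key(pairs):
    """Group (key, value) pairs once, roughly what a GBK would produce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def per_grouped(grouped, combine_fn):
    """Hypothetical 'PerGrouped' form: combine each key's already-grouped values."""
    return {key: combine_fn(values) for key, values in grouped.items()}


keyed_nums = [("a", 1), ("a", 3), ("b", 10)]
grouped_nums = group_by_key(keyed_nums)        # group once
keyed_means = per_grouped(grouped_nums, mean)  # reuse the same grouping
keyed_counts = per_grouped(grouped_nums, len)  # reuse it again
```
The point is just that the grouping happens once and each combine consumes
the same grouped input.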

-- 

Joey Tran | Staff Developer | AutoDesigner TL

*he/him*
