Hey Peter,
We might need some more details on what you're trying to do, but you can
add further parallelDo operations after the combineValues operation.
For example,
    PTable<K, V> myTable = ...;
    myTable.groupByKey()
        .combineValues(CombineFn/Aggregator to do the combine step)
        .parallelDo(DoFn to transform result of CombineFn to another format for output)
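is perfectly valid. The combine step runs on the map side and again in the
reducer, so it has to emit the same value type it reads; the trailing
parallelDo is where the output format can change.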
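
If it helps to see the data flow spelled out, here is a rough plain-Java
model of it. To be clear, this is not Crunch code: the class and method names
(CombineThenFormat, combine, combineByKey, format, run, pair) are made up for
illustration, and summing Longs is only an assumed stand-in for whatever your
combine step really does. The point it tries to show is that the combine
function maps values to the same value type, so it can run once per map-side
split and again over the partial results, and the format change happens in a
separate step afterwards.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the flow; none of these names are Crunch classes.
final class CombineThenFormat {

    // Hypothetical helper to build one (key, value) pair.
    static Map.Entry<String, Long> pair(String key, Long value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // The "combiner": folds the Long values of one key into a single Long.
    // Input and output types match, so it can be applied to partial groups
    // and then applied again to the partial results.
    static long combine(List<Long> values) {
        long sum = 0L;
        for (long v : values) {
            sum += v;
        }
        return sum;
    }

    // Groups one batch of pairs by key and applies combine() to each group.
    static Map<String, Long> combineByKey(List<Map.Entry<String, Long>> pairs) {
        Map<String, List<Long>> grouped = new TreeMap<>();
        for (Map.Entry<String, Long> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        Map<String, Long> combined = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : grouped.entrySet()) {
            combined.put(e.getKey(), combine(e.getValue()));
        }
        return combined;
    }

    // The follow-up step: changes the output format once combining is done.
    static String format(String key, long total) {
        return key + "\t" + total;
    }

    // Combines each split on its own (map side), combines the partial
    // results again (reduce side), then formats the final totals.
    static List<String> run(List<List<Map.Entry<String, Long>>> splits) {
        List<Map.Entry<String, Long>> partials = new ArrayList<>();
        for (List<Map.Entry<String, Long>> split : splits) {
            for (Map.Entry<String, Long> e : combineByKey(split).entrySet()) {
                partials.add(pair(e.getKey(), e.getValue()));
            }
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : combineByKey(partials).entrySet()) {
            lines.add(format(e.getKey(), e.getValue()));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> split1 =
            Arrays.asList(pair("a", 1L), pair("b", 2L), pair("a", 3L));
        List<Map.Entry<String, Long>> split2 = Arrays.asList(pair("a", 5L));
        for (String line : run(Arrays.asList(split1, split2))) {
            System.out.println(line);
        }
    }
}
```

In this model, format() plays the role of the DoFn passed to parallelDo and
combine() plays the role of the CombineFn/Aggregator, so the two pieces of
logic stay in separate classes or methods without needing a separate
combiner and reducer.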
J
On Tue, Dec 11, 2012 at 9:41 PM, Peter Knap <[email protected]> wrote:
> Hi guys,
>
> I started a small POC with Crunch as a replacement for the current Python
> implementation, and I ran into a problem with using combiners. How would
> one specify a combiner that is different from the reducer? I know that's
> not a typical case, but I want to have partial optimization on the map side,
> and at the same time the reducer's output format is different from the
> combiner's, so I need two distinct classes. From looking at the code I
> can't figure out how to do it. Any help would be greatly appreciated.
>
> Thanks,
> Piotr
>