So sorry about teasing you with the Scala. But the method is there in Java
too, I just checked.
On Fri, Sep 19, 2014 at 2:02 PM, Victor Tso-Guillen v...@paxata.com wrote:
It might not be the same as a real hadoop reducer, but I think it would
accomplish the same. Take a look at:
import
OK, so in Java (pardon the verbosity) I might write something like the code
below,
but I face the following issues:
1) I need to store all values in memory as I run combineByKey. If I could
return an RDD which consumed values, that would be great, but I don't know
how to do that.
2) In my version of
1. Actually, I disagree that combineByKey requires that all values be
held in memory for a key. Only groupByKey does that, whereas
reduceByKey, foldByKey, and the generic combineByKey do not necessarily
impose that requirement. If your combine logic really shrinks the result