The grouping is determined by the POJO's equals() method, which needs a hashCode() consistent with it. You can also call groupBy() to group by some function of the POJOs. For example, if you're grouping Doubles into nearly-equal bunches, you could group by their .intValue().
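To make the Doubles example concrete, here is a minimal sketch in plain Java, with no Spark dependency so that it runs on its own. The names GroupByKeyDemo, bucketKey and groupByBucket are made up for illustration. The assumption is that the same key function could be handed to JavaRDD.groupBy() in the same way.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupByKeyDemo {

    // Hypothetical key function: intValue() truncates toward zero,
    // so 1.1 and 1.9 both map to the key 1.
    static Integer bucketKey(Double d) {
        return d.intValue();
    }

    // Groups values by the derived key rather than by the values' own equals().
    // With Spark, the assumed equivalent on a JavaRDD<Double> would be:
    //   rdd.groupBy(d -> d.intValue())
    static Map<Integer, List<Double>> groupByBucket(List<Double> values) {
        return values.stream()
                .collect(Collectors.groupingBy(
                        GroupByKeyDemo::bucketKey, // derived grouping key
                        TreeMap::new,              // sorted keys for readable output
                        Collectors.toList()));
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(1.1, 1.9, 2.3, 2.4, 7.0);
        System.out.println(groupByBucket(values));
    }
}
```

One caveat with this approach: values on either side of a bucket boundary (say 1.99 and 2.01) end up in different groups even though they are close. groupBy() needs a single key per record, so a similarity measure only works here if it can be reduced to such a key.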
On Thu, Mar 26, 2015 at 8:47 PM, Mihran Shahinian <slowmih...@gmail.com> wrote:
> I would like to group records, but instead of grouping on exact key I want
> to be able to compute the similarity of keys on my own. Is there a
> recommended way of doing this?
>
> here is my starting point
>
> final JavaRDD< pojo > records = spark.parallelize(getListofPojos()).cache();
>
> class pojo {
>     String prop1
>     String prop2
> }
>
> during groupBy I would like to compute similarity between prop1 for each
> pojo.
>
> Much appreciated,
> Mihran