It would also be nice if there were a better example of joining two Datasets. I am looking at the documentation here: http://spark.apache.org/docs/latest/sql-programming-guide.html. It seems a little sparse. Is there a better documentation source?
Regards,

Bryan Jeffrey

On Tue, Jun 7, 2016 at 2:32 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:

> Hello.
>
> I am looking at the option of moving RDD based operations to Dataset based
> operations. We are calling 'reduceByKey' on some pair RDDs we have. What
> would the equivalent be in the Dataset interface - I do not see a simple
> reduceByKey replacement.
>
> Regards,
>
> Bryan Jeffrey