Hi Bryan,

What about groupBy [1] and agg [2]? What about UserDefinedAggregateFunction [3]?
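For a plain sum per key, a minimal sketch of the groupBy/agg route could look like the following. It assumes Spark 2.0 with a local SparkSession; the column names `key` and `value`, the sample data, and the helper `sumByKey` are made up for illustration and are not taken from your code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.sum

object ReduceByKeySketch {

  // Hypothetical helper: groups rows by "key" and sums "value" per group,
  // which is roughly what reduceByKey(_ + _) does on a pair RDD.
  def sumByKey(pairs: DataFrame): DataFrame =
    pairs.groupBy("key").agg(sum("value").as("total"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("reduceByKey-with-groupBy-agg")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for the pair RDD.
    val pairs = Seq(("a", 1), ("b", 2), ("a", 3)).toDF("key", "value")

    // One row per key with the summed value.
    sumByKey(pairs).show()

    spark.stop()
  }
}
```

On the RDD side the same computation would be `reduceByKey(_ + _)`. If the reduction is not covered by the built-in functions, an instance of a UserDefinedAggregateFunction can be applied to the column and passed to agg in place of `sum`, though I have not sketched that here.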

[1] https://home.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset@groupBy(col1:String,cols:String*):org.apache.spark.sql.RelationalGroupedDataset
[2] https://home.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/api/scala/index.html#org.apache.spark.sql.RelationalGroupedDataset
[3] https://home.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/api/scala/index.html#org.apache.spark.sql.expressions.UserDefinedAggregateFunction

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Tue, Jun 7, 2016 at 8:32 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:
> Hello.
>
> I am looking at the option of moving RDD based operations to Dataset based
> operations. We are calling 'reduceByKey' on some pair RDDs we have. What
> would the equivalent be in the Dataset interface - I do not see a simple
> reduceByKey replacement.
>
> Regards,
>
> Bryan Jeffrey