Structured Stream equivalent of reduceByKey

2017-10-25 Thread Piyush Mukati
Hi, we are migrating some jobs from Dstream to Structured Stream. Currently to handle aggregations we call map and reducebyKey on each RDD like rdd.map(event => (event._1, event)).reduceByKey((a, b) => merge(a, b)) The final output of each RDD is merged to the sink with support for aggregation at

Re: Structured Stream equivalent of reduceByKey

2017-10-26 Thread Michael Armbrust
- dev I think you should be able to write an Aggregator . You probably want to run in update mode if you are looking for it to output any group that has changed in the batch. On Wed, Oct 25, 201

Re: Structured Stream equivalent of reduceByKey

2017-10-26 Thread Piyush Mukati
Thanks, Michael I have explored Aggregator with update mode. The problem is it will give the overall aggregated value for the changed. while I only want the delta change in the group as the aggre

Re: Structured Stream equivalent of reduceByKey

2017-11-06 Thread Michael Armbrust
Hmmm, I see. You could output the delta using flatMapGroupsWithState