thanks, that's helpful

On Sun, Jan 29, 2017 at 12:54 PM, Anton Okolnychyi <
anton.okolnyc...@gmail.com> wrote:
> Hi,
>
> I recently extended the Spark SQL programming guide to cover user-defined
> aggregations, where I modified existing variables and returned them back in
> reduce and merge. This approach worked and was approved by people who know
> the context.
>
> Hope that helps.
>
> 2017-01-29 17:17 GMT+01:00 Koert Kuipers <ko...@tresata.com>:
>
>> anyone?
>> if not i will follow the trail and try to deduce it myself
>>
>> On Mon, Jan 23, 2017 at 2:31 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> looking at the docs for org.apache.spark.sql.expressions.Aggregator, it
>>> says for the reduce method: "For performance, the function may modify `b`
>>> and return it instead of constructing new object for b."
>>>
>>> it makes no such comment for the merge method.
>>>
>>> this is surprising to me, because i know that for
>>> PairRDDFunctions.aggregateByKey mutation is allowed in both seqOp and
>>> combOp (which are the equivalents of reduce and merge in Aggregator).
>>>
>>> is it safe to mutate b1 and return it in Aggregator.merge?
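For concreteness, here is a minimal sketch of the pattern the thread is discussing: an Aggregator that mutates its intermediate buffer and returns it in both reduce and merge, in the style of the programming guide's user-defined aggregation examples. The buffer class name `SumBuffer` and the object name `AvgAggregator` are made up for illustration; this assumes the Spark 2.x typed Aggregator API.

```scala
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.{Encoder, Encoders}

// Hypothetical mutable intermediate buffer (name is illustrative).
case class SumBuffer(var sum: Long, var count: Long)

// Computes an average over Long inputs, mutating the buffer in place
// in both reduce and merge rather than allocating new objects.
object AvgAggregator extends Aggregator[Long, SumBuffer, Double] {
  def zero: SumBuffer = SumBuffer(0L, 0L)

  // Mutating `b` and returning it is explicitly permitted by the
  // reduce docs ("the function may modify `b` and return it").
  def reduce(b: SumBuffer, a: Long): SumBuffer = {
    b.sum += a
    b.count += 1
    b
  }

  // The same in-place style applied to merge; whether this is
  // documented as safe is exactly the question in this thread,
  // though the programming guide's examples do it this way.
  def merge(b1: SumBuffer, b2: SumBuffer): SumBuffer = {
    b1.sum += b2.sum
    b1.count += b2.count
    b1
  }

  def finish(r: SumBuffer): Double = r.sum.toDouble / r.count

  def bufferEncoder: Encoder[SumBuffer] = Encoders.product[SumBuffer]
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}
```

The analogous pattern with PairRDDFunctions.aggregateByKey would mutate the accumulator in both seqOp and combOp, which the RDD docs do explicitly allow.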