Hi,

I recently extended the Spark SQL programming guide to cover user-defined
aggregations; in that example I modified the existing buffer objects and
returned them in both reduce and merge. This approach worked and was
approved by people who know the context.
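
To make the pattern concrete, here is a minimal plain-Scala sketch of the
mutate-and-return style in both reduce and merge. The names (AverageBuffer,
MyAverage) and the standalone harness are illustrative only, not taken from
the Spark guide or the Aggregator API itself:

```scala
// Mutable aggregation buffer: fields are vars so reduce/merge can
// update them in place instead of allocating a new buffer per element.
final class AverageBuffer(var sum: Double, var count: Long)

object MyAverage {
  def zero: AverageBuffer = new AverageBuffer(0.0, 0L)

  // reduce: fold one input value into the buffer, mutating it in place
  def reduce(b: AverageBuffer, x: Double): AverageBuffer = {
    b.sum += x
    b.count += 1
    b // return the mutated buffer rather than constructing a new one
  }

  // merge: combine two partial buffers, mutating and returning b1
  def merge(b1: AverageBuffer, b2: AverageBuffer): AverageBuffer = {
    b1.sum += b2.sum
    b1.count += b2.count
    b1
  }

  def finish(b: AverageBuffer): Double =
    if (b.count == 0) 0.0 else b.sum / b.count
}

// Simulate two partitions reduced separately, then merged:
val p1  = List(1.0, 2.0, 3.0).foldLeft(MyAverage.zero)(MyAverage.reduce)
val p2  = List(4.0, 5.0).foldLeft(MyAverage.zero)(MyAverage.reduce)
val avg = MyAverage.finish(MyAverage.merge(p1, p2))
// avg is 3.0 (sum 15.0 over count 5)
```

Note this is just a sketch of the buffer-reuse idea; a real Spark Aggregator
additionally needs the bufferEncoder and outputEncoder members.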

Hope that helps.

2017-01-29 17:17 GMT+01:00 Koert Kuipers <ko...@tresata.com>:

> anyone?
> if not i will follow the trail and try to deduce it myself
>
> On Mon, Jan 23, 2017 at 2:31 PM, Koert Kuipers <ko...@tresata.com> wrote:
>
>> looking at the docs for org.apache.spark.sql.expressions.Aggregator it
>> says for reduce method: "For performance, the function may modify `b` and
>> return it instead of constructing new object for b.".
>>
>> it makes no such comment for the merge method.
>>
>> this is surprising to me because i know that for
>> PairRDDFunctions.aggregateByKey mutation is allowed in both seqOp and
>> combOp (which are the equivalents of reduce and merge in Aggregator).
>>
>> is it safe to mutate b1 and return it in Aggregator.merge?
>>
>>
>
