Hi Wei,

When the action runs, all the transformations before it are executed, in
order, as part of that action's job. If any of these transformations
updates an accumulator, you won't get exactly-once semantics, because the
task running the transformation may be restarted elsewhere and apply its
update again.
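
To make this concrete, here is a minimal sketch in plain Python. It is not
the Spark API: Counter, double_and_count and run_task are hypothetical
names, and the sketch assumes the re-executed task's updates are counted,
which is the case the guide warns about. It only shows the mechanism:
nothing is counted while the map is merely defined, and every execution of
the map function counts again.

```python
class Counter:
    """Hypothetical stand-in for an accumulator (not Spark's class)."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


acc = Counter()


def double_and_count(x):
    # Side effect inside the "transformation", like acc.add(1) in a map.
    acc.add(1)
    return x * 2


def run_task(partition):
    # Assumption: each attempt re-applies the map function to every record.
    return [double_and_count(x) for x in partition]


partition = [1, 2, 3, 4]

# Defining the map lazily runs nothing, so the counter stays untouched.
lazy = (double_and_count(x) for x in partition)
count_before = acc.value

first = run_task(partition)      # first execution, triggered by the action
count_after_first = acc.value

second = run_task(partition)     # same task re-executed
count_after_rerun = acc.value

print(count_before, count_after_first, count_after_rerun)
```

The results of both runs are identical, but the counter ends up at twice
the number of records, which is why a count kept this way can overshoot.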

Best,
Burak

On Wed, Jun 24, 2015 at 2:25 PM, Wei Zhou <zhweisop...@gmail.com> wrote:

> Hi Burak,
>
> Thanks for your quick reply. I guess what confuses me is that, because of
> laziness, an accumulator won't be updated until an action is used, so a
> transformation such as a map won't update the accumulator by itself. How,
> then, would restarting the transformation end up updating the accumulator
> more than once?
>
> Best,
> Wei
>
> 2015-06-24 13:23 GMT-07:00 Burak Yavuz <brk...@gmail.com>:
>
>> Hi Wei,
>>
>> For example, when a straggler executor gets killed in the middle of a map
>> operation and its task is restarted on a different instance, the
>> accumulator will be updated more than once.
>>
>> Best,
>> Burak
>>
>> On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou <zhweisop...@gmail.com> wrote:
>>
>>> Quoting from the Spark Programming Guide:
>>>
>>> "For accumulator updates performed inside *actions only*, Spark
>>> guarantees that each task’s update to the accumulator will only be applied
>>> once, i.e. restarted tasks will not update the value. In transformations,
>>> users should be aware of that each task’s update may be applied more than
>>> once if tasks or job stages are re-executed."
>>>
>>> Can anyone give me a possible scenario in which an accumulator might be
>>> updated more than once during a transformation? Thanks.
>>>
>>> Regards,
>>> Wei
>>>
>>
>>
>
