Re: Understanding accumulator during transformations

2015-06-24 Thread Wei Zhou
Hi Burak, Thanks for your quick reply. I guess what confuses me is that, because of laziness, an accumulator won't be updated until an action is run, so a transformation such as a map won't update the accumulator by itself. Then how would a restarted transformation end up updating the accumulator more than
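
The laziness described above can be illustrated with a small toy model. This is not Spark code: Python's lazy built-in map stands in for an RDD transformation, a plain dict stands in for an accumulator, and the names counter and add_one_and_count are made up for the illustration.

```python
# Toy model only: Python's lazy built-in map stands in for an RDD
# transformation, and a plain dict stands in for a Spark accumulator.
counter = {"value": 0}  # hypothetical stand-in for an accumulator


def add_one_and_count(x):
    counter["value"] += 1  # side effect inside the "transformation"
    return x + 1


mapped = map(add_one_and_count, range(5))  # lazy: nothing has run yet
value_before_action = counter["value"]     # still 0 at this point

total = sum(mapped)                        # the "action" forces the map to run
value_after_action = counter["value"]      # now one update per element

print(value_before_action, value_after_action, total)  # 0 5 15
```

The counter only moves once something consumes the mapped values, which is the same shape as an accumulator inside a map staying untouched until an action runs.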

Re: Understanding accumulator during transformations

2015-06-24 Thread Wei Zhou
Hi Burak, That makes sense; it boils down to any action that happens after the transformations, then. Thanks for your answers. Best, Wei 2015-06-24 15:06 GMT-07:00 Burak Yavuz brk...@gmail.com: Hi Wei, During the action, all the transformations before it will occur in order leading up to the action.

Re: Understanding accumulator during transformations

2015-06-24 Thread Burak Yavuz
Hi Wei, For example, when a straggler executor gets killed in the middle of a map operation and its task is restarted on a different instance, the accumulator will be updated more than once. Best, Burak On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou zhweisop...@gmail.com wrote: Quoting from Spark

Re: Understanding accumulator during transformations

2015-06-24 Thread Burak Yavuz
Hi Wei, During the action, all the transformations before it will occur in order leading up to the action. If you have an accumulator in any of these transformations, then you won't get exactly-once semantics, because the transformation may be restarted elsewhere. Best, Burak On Wed, Jun 24,
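
A rough sketch of why re-running a transformation breaks exactly-once counting. This is a toy model, not Spark code: ToyAccumulator and run_map_task are hypothetical names, and the sketch simply assumes the first attempt's output is lost and the task runs again. How Spark's scheduler actually merges updates from task attempts is not modelled here.

```python
# Toy model only, not Spark code: a "map task" updates an accumulator once
# per record. Re-running the task repeats every one of those updates.
class ToyAccumulator:
    """Hypothetical stand-in for a Spark accumulator."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


def run_map_task(partition, acc):
    """Apply the map function to one partition, counting records in acc."""
    out = []
    for record in partition:
        acc.add(1)              # side effect inside the transformation
        out.append(record * 2)  # the actual map result
    return out


acc = ToyAccumulator()
partition = [10, 20, 30, 40]

first_attempt = run_map_task(partition, acc)
# Assumption for this sketch: the first attempt's output is lost, so the
# same task is executed again elsewhere.
second_attempt = run_map_task(partition, acc)

print(second_attempt, acc.value)  # [20, 40, 60, 80] 8
```

The map output is the same on both attempts, but the counter ends at 8 for 4 records, because the side effect ran once per attempt.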

Understanding accumulator during transformations

2015-06-24 Thread Wei Zhou
Quoting from the Spark Programming Guide: For accumulator updates performed inside *actions only*, Spark guarantees that each task’s update to the accumulator will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware of that each task’s update