Hi Spark group,

We haven't been able to find a clear description of how Spark's RDD
resiliency interacts with actions that have side effects.
If you call `rdd.foreach(someSideEffect)`, the side effect runs once for
each element in the RDD. If a partition goes down, the resiliency
mechanism rebuilds the data, but does Spark keep track of how far it got in
that partition's data, or does it start from the beginning again? In other
words, are foreach closures executed at-least-once or at-most-once?
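
To make the question concrete, here is a minimal sketch of the two
behaviours we are trying to tell apart. It is plain Python, not Spark, and
every name in it (`WorkerLost`, `run_partition`, `retry_from_start`,
`retry_from_checkpoint`) is made up for illustration. It only simulates a
failure part-way through one partition and says nothing about which
behaviour Spark actually implements.

```python
# Plain-Python simulation (no Spark involved); all names are hypothetical.

class WorkerLost(Exception):
    """Stands in for a worker dying part-way through a partition."""


def run_partition(partition, side_effect, fail_at=None, start=0):
    """Apply side_effect to partition[start:], failing before index fail_at."""
    for i in range(start, len(partition)):
        if fail_at is not None and i == fail_at:
            raise WorkerLost(i)
        side_effect(partition[i])


def retry_from_start(partition, side_effect, fail_at):
    """Candidate A: the retry replays the whole partition from element 0."""
    try:
        run_partition(partition, side_effect, fail_at=fail_at)
    except WorkerLost:
        run_partition(partition, side_effect)


def retry_from_checkpoint(partition, side_effect, fail_at):
    """Candidate B: the retry resumes at the element where the failure hit."""
    try:
        run_partition(partition, side_effect, fail_at=fail_at)
    except WorkerLost as lost:
        run_partition(partition, side_effect, start=lost.args[0])


partition = ["a", "b", "c", "d"]

seen_a = []
retry_from_start(partition, seen_a.append, fail_at=2)
print(seen_a)  # ['a', 'b', 'a', 'b', 'c', 'd']

seen_b = []
retry_from_checkpoint(partition, seen_b.append, fail_at=2)
print(seen_b)  # ['a', 'b', 'c', 'd']
```

Under candidate A the side effect fires twice for "a" and "b", so the
closure would have to be idempotent. Under candidate B each element is
seen exactly once in this scenario.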

thanks,
Michal
