Hi,

I have an RDD and a function that should be called exactly once on every item in it (say, it updates an external database). So far I have used either rdd.map(myFunction).count() or rdd.mapPartitions(iter => iter.map(myFunction)), but I am wondering whether myFunction is always actually called in both cases. In the first case, count() returns the same value whether or not myFunction is called for each element, so can I rely on count() evaluating the whole pipeline, including functions that cannot change the count?
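
To make the question concrete, here is a minimal sketch of the two variants. It makes several assumptions that are not part of my real job: a local master, a toy RDD of 100 integers, and a hypothetical stand-in for myFunction that only bumps an accumulator instead of touching a database. The names SideEffectCheck and callsDuringCount are made up for illustration. Since mapPartitions is a lazy transformation and does nothing by itself, the sketch assumes it is followed by count() as well.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SideEffectCheck {

  // Runs one of the two variants and returns (result of count(), observed calls).
  // The accumulator exists only to observe how often the function ran; updates
  // made inside transformations can be applied more than once if a task is retried.
  def callsDuringCount(sc: SparkContext, usePartitions: Boolean): (Long, Long) = {
    val rdd = sc.parallelize(1 to 100, numSlices = 4)
    val calls = sc.longAccumulator("myFunction calls")

    // Hypothetical stand-in for the real function that updates an external database.
    val myFunction: Int => Int = x => { calls.add(1); x }

    val mapped =
      if (usePartitions) rdd.mapPartitions(iter => iter.map(myFunction))
      else rdd.map(myFunction)

    // Nothing has executed yet at this point; count() is the action that starts the job.
    val n = mapped.count()
    (n, calls.value.longValue)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("side-effect-check").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val (n1, c1) = callsDuringCount(sc, usePartitions = false)
      println(s"map + count:           count = $n1, calls = $c1")
      val (n2, c2) = callsDuringCount(sc, usePartitions = true)
      println(s"mapPartitions + count: count = $n2, calls = $c2")
    } finally {
      sc.stop()
    }
  }
}
```

Comparing the printed call counts with the element count shows what happens on one particular Spark version, but not whether that behaviour is guaranteed, which is what I am really asking.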

Thanks,
Tobias