Hi
Does updateStateByKey pass elements to updateFunc (in Seq[V]) in the order in
which they appear in the RDD?
My guess is no, which means updateFunc needs to be commutative. Am I correct?
I've asked this question before but there were no takers.
Here are the Scaladocs for updateStateByKey:
/**
 * Return a new "state" DStream where the state for each key is updated by
 * applying the given function on the previous state of the key and the new
 * values of each key.
 * Hash partitioning is used to generate the RDDs with Spark's default number
 * of partitions.
 * @param updateFunc State update function. If `this` function returns None,
 *                   then corresponding state key-value pair will be
 *                   eliminated.
 * @tparam S State type
 */
def updateStateByKey[S: ClassTag](
    updateFunc: (Seq[V], Option[S]) => Option[S]
  ): DStream[(K, S)] = {
  updateStateByKey(updateFunc, defaultPartitioner())
}
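
To make the question concrete, here is a minimal sketch in plain Scala (no Spark dependency) of two functions with the same shape as updateFunc above, taking V = Int and S = Int. The names UpdateFuncOrderDemo, sumUpdate and lastUpdate are made up for illustration and are not part of Spark. If the order of Seq[V] is not guaranteed (my assumption; the docs above do not say either way), only the first kind would be safe:

```scala
// Hypothetical update functions with the (Seq[V], Option[S]) => Option[S]
// shape, using V = Int and S = Int. Plain Scala, so no Spark is needed.
object UpdateFuncOrderDemo {
  // Order-insensitive: a sum gives the same state for any ordering of values.
  val sumUpdate: (Seq[Int], Option[Int]) => Option[Int] =
    (values, state) => Some(state.getOrElse(0) + values.sum)

  // Order-sensitive: keeps the last value in the Seq, so the resulting state
  // depends on the order in which the values are passed in.
  val lastUpdate: (Seq[Int], Option[Int]) => Option[Int] =
    (values, state) => values.lastOption.orElse(state)

  def main(args: Array[String]): Unit = {
    val batch = Seq(1, 2, 3)
    val shuffled = Seq(3, 1, 2) // same values, different order

    // Same state regardless of ordering.
    println(sumUpdate(batch, Some(10)) == sumUpdate(shuffled, Some(10)))
    // Different state depending on ordering.
    println(lastUpdate(batch, Some(10)) == lastUpdate(shuffled, Some(10)))
  }
}
```

A function like sumUpdate could be passed to updateStateByKey without caring about ordering, whereas something like lastUpdate is exactly the case I am worried about.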