Has anyone else noticed that sometimes the same tuple calls update state 
function twice?
I have 2 tuples with the same key in 1 RDD part of DStream: RDD[ (a,1), (a,2) ]
When the update function is called the first time Seq[V] has data: 1, 2 which 
is correct: StateClass(3,2, ArrayBuffer(1, 2))
Then right away (in my output I see this) the same key is used and the function 
is called again but this time Seq is empty: StateClass(3,2, ArrayBuffer( ))

In the update function I also save Seq[V] to state so I can see it in the RDD. 
I also show a count and sum of the values.
StateClass(sum, count, Seq[V])

Why is the update function called with empty Seq[V] on the same key when all 
values for that key have been already taken care of in a previous update?

-Adrian

Reply via email to