you confirm?
Thanks again!
-A
-Original Message-
From: Sean Owen [mailto:so...@cloudera.com]
Sent: April-30-14 4:30 PM
To: user@spark.apache.org
Subject: Re: What is Seq[V] in updateStateByKey?
S is the previous count, if any. Seq[V] are potentially many new
counts. All of them have to be added together to keep an accurate
total. It's as if the count were 3, and I tell you I've just observed
2, 5, and 1 additional occurrences -- the new count is 3 + (2+5+1) not
1 + 1.
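A minimal sketch of the arithmetic described above, written as a plain Scala function with the same shape as the updateFunc argument, (Seq[V], Option[S]) => Option[S]. The names UpdateCountExample and updateCount are made up for illustration, and V and S are both assumed to be Int. It runs without Spark, so it only shows the update logic, not the streaming setup.

```scala
object UpdateCountExample {
  // Hypothetical update function: newCounts holds the values seen for one key
  // in the current batch, previous is the state kept from earlier batches.
  def updateCount(newCounts: Seq[Int], previous: Option[Int]): Option[Int] =
    // Add every new count to the previous total (0 if there is no state yet).
    Some(previous.getOrElse(0) + newCounts.sum)

  def main(args: Array[String]): Unit = {
    // The case from the message: previous count 3, then 2, 5 and 1 observed.
    println(updateCount(Seq(2, 5, 1), Some(3))) // Some(11), i.e. 3 + (2+5+1)
    // First batch for a key: there is no previous state.
    println(updateCount(Seq(2, 5, 1), None))    // Some(8)
  }
}
```

Assuming a DStream of (key, Int) pairs, a function like this could be passed as the updateFunc argument of updateStateByKey.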
I butted in
Yeah, I remember changing fold to sum in a few places, probably in
testsuites, but missed this example I guess.
On Wed, Apr 30, 2014 at 1:29 PM, Sean Owen so...@cloudera.com wrote:
S is the previous count, if any. Seq[V] are potentially many new
counts. All of them have to be added together
What is Seq[V] in updateStateByKey?
Does this store the collected tuples of the RDD in a collection?
Method signature:
def updateStateByKey[S: ClassTag] ( updateFunc: (Seq[V], Option[S]) =>
Option[S] ): DStream[(K, S)]
In my case I used Seq[Double] assuming a sequence of Doubles in the RDD
time slice.
On Tue, Apr 29, 2014 at 7:50 PM, Adrian Mocanu
amoc...@verticalscope.com wrote:
What is Seq[V] in updateStateByKey?
Does this store the collected tuples of the RDD in a collection?
Method signature:
def updateStateByKey[S: ClassTag] ( updateFunc: (Seq[V], Option[S]) =>
Option[S
at each time slice.