How do we reset the aggregated statistics to null?

Regards,
Sandeep Giri
+1 347 781 4573 (US) | +91-953-899-8962 (IN)
www.KnowBigData.com | Phone: +1-253-397-1945 (Office)
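(With the updateStateByKey approach from the quoted thread below, one way to "reset" an aggregate is to return None from the update function, which removes that key's state. A rough sketch of an update function that starts over when the calendar day changes; the (LocalDate, Long) state shape, the day-rollover check, and the name updateSinceMidnight are illustrative assumptions, not something settled in the thread.)

    import java.time.LocalDate

    // State kept per word: (day the count belongs to, running count since 0:00 of that day).
    // updateStateByKey removes a key's state when the update function returns None.
    def updateSinceMidnight(newValues: Seq[Int],
                            state: Option[(LocalDate, Long)]): Option[(LocalDate, Long)] = {
      val today = LocalDate.now()   // in practice, pin an explicit time zone
      val carried = state match {
        case Some((day, count)) if day == today => count   // same day: keep accumulating
        case _                                  => 0L      // new day or no prior state: start over
      }
      val total = carried + newValues.sum
      if (newValues.isEmpty && carried == 0L) None         // nothing left to keep: drop the key
      else Some((today, total))
    }

    // wordPairs: DStream[(String, Int)]
    // val totalsSinceMidnight = wordPairs.updateStateByKey(updateSinceMidnight _)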
On Fri, Oct 30, 2015 at 9:49 AM, Sandeep Giri <sand...@knowbigdata.com> wrote:

> Yes, updateStateByKey worked.
>
> There are some more complications, though.
>
> On Oct 30, 2015 8:27 AM, "skaarthik oss" <skaarthik....@gmail.com> wrote:
>
>> Did you consider the UpdateStateByKey operation?
>>
>> *From:* Sandeep Giri [mailto:sand...@knowbigdata.com]
>> *Sent:* Thursday, October 29, 2015 3:09 PM
>> *To:* user <user@spark.apache.org>; dev <d...@spark.apache.org>
>> *Subject:* Maintaining overall cumulative data in Spark Streaming
>>
>> Dear All,
>>
>> If a continuous stream of text is coming in and you have to keep
>> publishing the overall word count so far since 0:00 today, what would
>> you do?
>>
>> Publishing the results for a window is easy, but if we have to keep
>> aggregating the results, how do we go about it?
>>
>> I have tried keeping a StreamRDD with the aggregated count and doing a
>> fullOuterJoin, but it didn't work. It seems the StreamRDD gets reset.
>>
>> Kindly help.
>>
>> Regards,
>> Sandeep Giri
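(For reference, a minimal end-to-end running word count built on updateStateByKey, the operation the thread converged on. The socket source, 10-second batch interval, local master, and checkpoint path are placeholders chosen for the example, not details from the thread.)

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object CumulativeWordCount {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("CumulativeWordCount").setMaster("local[2]")
        val ssc = new StreamingContext(conf, Seconds(10))

        // updateStateByKey requires a checkpoint directory for the state.
        ssc.checkpoint("/tmp/streaming-checkpoint")   // placeholder path

        val lines = ssc.socketTextStream("localhost", 9999)   // placeholder source
        val pairs = lines.flatMap(_.split("\\s+")).map(word => (word, 1))

        // Carry the per-word total forward across batches.
        val totals = pairs.updateStateByKey[Long] { (newCounts: Seq[Int], state: Option[Long]) =>
          Some(state.getOrElse(0L) + newCounts.sum)
        }

        totals.print()   // publish the cumulative counts so far on every batch
        ssc.start()
        ssc.awaitTermination()
      }
    }

The point of the design is that updateStateByKey carries each key's total across batches, so every batch can publish the cumulative count so far instead of only the current window's count.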