Re: Maintaining overall cumulative data in Spark Streaming

2015-10-30 Thread Sandeep Giri
How do we reset the aggregated statistics to null? Regards, Sandeep Giri
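A minimal sketch of one way to do that reset, assuming the updateStateByKey approach discussed in this thread; shouldReset is a hypothetical application-supplied flag (e.g. driven by a timer or a control signal):

    // Returning None from an updateStateByKey update function removes the
    // key's state entirely, which effectively resets its aggregate to null.
    // `shouldReset` is a hypothetical predicate supplied by the application.
    def resettingUpdate(shouldReset: Boolean)(newValues: Seq[Int], state: Option[Int]): Option[Int] = {
      val total = newValues.sum + state.getOrElse(0)
      if (shouldReset) None   // state dropped; the next batch starts from zero
      else Some(total)
    }

The next batch that sees the key then starts from an empty state.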

RE: Maintaining overall cumulative data in Spark Streaming

2015-10-29 Thread Sandeep Giri
Yes, updateStateByKey worked, though there are some more complications. On Oct 30, 2015 8:27 AM, "skaarthik oss" wrote: > Did you consider the UpdateStateByKey operation?
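For the archive, a minimal sketch of the pattern referred to here, assuming an existing checkpointed StreamingContext and a DStream of per-batch (key, count) pairs; the names are placeholders:

    import org.apache.spark.streaming.dstream.DStream

    // ssc.checkpoint(...) must have been set: stateful operations require it.
    // Fold each batch's values into the running total carried for the key.
    def cumulative(perBatchCounts: DStream[(String, Int)]): DStream[(String, Int)] = {
      val updateTotal = (newValues: Seq[Int], runningTotal: Option[Int]) =>
        Some(newValues.sum + runningTotal.getOrElse(0))
      perBatchCounts.updateStateByKey[Int](updateTotal)
    }

Unlike the fullOuterJoin attempt below, the state here survives across batches because Spark checkpoints it.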

Maintaining overall cumulative data in Spark Streaming

2015-10-29 Thread Sandeep Giri
… a StreamRDD with an aggregated count, doing a fullOuterJoin on each batch, but it didn't work. It seems the StreamRDD gets reset. Kindly help. Regards, Sandeep Giri

Re: MongoDB and Spark

2015-09-11 Thread Sandeep Giri
I think it should be possible by loading each collection as an RDD and then doing a union on them. Regards, Sandeep Giri
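A rough sketch of that idea with the mongo-hadoop connector; the URIs, database, and collection names are made-up placeholders, and sc is an existing SparkContext:

    import org.apache.hadoop.conf.Configuration
    import org.apache.spark.SparkContext
    import org.bson.BSONObject
    import com.mongodb.hadoop.MongoInputFormat

    // Load one MongoDB collection as an RDD of (id, document) pairs.
    def collectionRDD(sc: SparkContext, uri: String) = {
      val conf = new Configuration()
      conf.set("mongo.input.uri", uri) // e.g. "mongodb://host:27017/db.collection"
      sc.newAPIHadoopRDD(conf, classOf[MongoInputFormat], classOf[Object], classOf[BSONObject])
    }

    // Union the per-collection RDDs into one logical dataset.
    val users  = collectionRDD(sc, "mongodb://localhost:27017/mydb.users")
    val orders = collectionRDD(sc, "mongodb://localhost:27017/mydb.orders")
    val all    = users.union(orders)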

Re: MongoDB and Spark

2015-09-11 Thread Sandeep Giri
Use map-reduce. On Fri, Sep 11, 2015, 14:32 Mishra, Abhishek wrote: > Hello, is there any way to query multiple collections from MongoDB using Spark and Java? I want to create only one Configuration object. Please help if anyone has something regarding this. Thank you
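On the "only one Configuration object" part: each mongo-hadoop RDD needs its own input URI, so one common workaround (sketched in Scala; the Java version is analogous) is a single base Configuration cloned per collection with Hadoop's copy constructor:

    import org.apache.hadoop.conf.Configuration

    // One shared base Configuration holding the common settings.
    val base = new Configuration()

    // Clone the base per collection; only the input URI differs.
    def forCollection(uri: String): Configuration = {
      val conf = new Configuration(base)
      conf.set("mongo.input.uri", uri)
      conf
    }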

Re:

2015-08-07 Thread Sandeep Giri
Looks good. Regards, Sandeep Giri

Re:

2015-08-05 Thread Sandeep Giri
So qualifying_function() does not get called after an element has been found? Regards, Sandeep Giri

[no subject]

2015-08-05 Thread Sandeep Giri
Yes, but in the take() approach we will be bringing the data to the driver, and it is no longer distributed. Also, take() takes only a count as its argument, which means that every time we would be transferring redundant elements. Regards, Sandeep Giri
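One way around both concerns is to count matches inside the partitions and ship only integers to the driver; a sketch (atLeastN is a made-up helper, and note the trade-off: it always scans the full dataset, giving up early termination):

    import org.apache.spark.rdd.RDD

    // Count matching elements per partition; only per-partition counts
    // travel to the driver, never the elements themselves.
    def atLeastN[T](data: RDD[T], qualifying_function: T => Boolean, n: Int): Boolean =
      data.mapPartitions(it => Iterator(it.count(qualifying_function))).sum() >= n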

Re: New Feature Request

2015-08-05 Thread Sandeep Giri
… be executed on the whole dataset even if the value was already found in the first element of the RDD:

- data.filter(qualifying_function).take(n).length >= n
- val contains1MatchingElement = !data.filter(qualifying_function).isEmpty()

Isn't it? Am I missing something? Regards, Sandeep Giri
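For reference, take(n) does not force a full pass: it scans partitions incrementally and stops once it has collected n results, so a filter-then-take helper does terminate early when matches appear near the front. A sketch of the requested function under that reading (exists here is not a Spark API, just the proposal):

    import org.apache.spark.rdd.RDD

    // True if at least n elements satisfy the predicate. filter is lazy and
    // take(n) scans partitions incrementally, so at most n matching elements
    // reach the driver and later partitions may never be read.
    def exists[T](data: RDD[T], qualifying_function: T => Boolean, n: Int): Boolean =
      data.filter(qualifying_function).take(n).length >= n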

New Feature Request

2015-07-31 Thread Sandeep Giri
exists(qualifying_function, n): … Regards, Sandeep Giri