Btw, here is a great article about accumulators and all their related
traps!
http://imranrashid.com/posts/Spark-Accumulators/ (I'm not the author)
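
For what it's worth, here is a minimal sketch of how a set accumulator might
look, assuming the Spark 1.x AccumulableParam API that was current at the time
of this thread (it was later deprecated in favour of AccumulatorV2). The names
StringSetParam, SetAccumulatorSketch and seenKeys are made up for
illustration. The main point: tasks can only add to the accumulator, so a
.contains check is only possible on the driver, after an action has run.

```scala
import org.apache.spark.{AccumulableParam, SparkConf, SparkContext}
import scala.collection.mutable

// Hypothetical helper: tells Spark how to add one key to a set and how to
// merge the partial sets coming back from different tasks.
object StringSetParam extends AccumulableParam[mutable.HashSet[String], String] {
  // Add a single key to a task-local partial set.
  def addAccumulator(acc: mutable.HashSet[String], key: String): mutable.HashSet[String] = {
    acc += key
    acc
  }

  // Merge two partial sets (set union).
  def addInPlace(a: mutable.HashSet[String], b: mutable.HashSet[String]): mutable.HashSet[String] = {
    a ++= b
    a
  }

  // Each task starts from an empty set.
  def zero(initial: mutable.HashSet[String]): mutable.HashSet[String] =
    mutable.HashSet.empty[String]
}

object SetAccumulatorSketch {
  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("set-accumulator-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    val seenKeys = sc.accumulable(mutable.HashSet.empty[String])(StringSetParam)

    // Inside tasks the accumulator is write-only: only += is allowed here.
    // Reading seenKeys.value inside this closure would throw.
    sc.parallelize(Seq("a", "b", "a", "c")).foreach(key => seenKeys += key)

    // Back on the driver, after the action has finished, the merged set
    // can be read and queried like any other HashSet.
    val keys = seenKeys.value
    println(keys.contains("a"))

    sc.stop()
  }
}
```

If the tasks themselves need to test membership against keys seen by other
tasks, an accumulator is probably the wrong tool, since each task only ever
sees its own partial set.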

On 16 March 2016 at 18:24, swetha kasireddy <swethakasire...@gmail.com>
wrote:

> OK, I did take a look at them. Once I have an accumulator for a
> HashSet, how can I check whether a particular key is already present in
> the HashSet accumulator? I don't see a .contains method there. My
> requirement is to keep accumulating keys in the HashSet across all the
> tasks on the various nodes, and to use it to check whether a key is
> already present in the HashSet.
>
> On Tue, Mar 15, 2016 at 9:56 PM, pppsunil <pppsu...@gmail.com> wrote:
>
>> Have you looked at using the Accumulable interface? Take a look at the
>> Spark documentation at
>> http://spark.apache.org/docs/latest/programming-guide.html#accumulators
>> It gives an example of how to use a vector type for an accumulator,
>> which might be very close to what you need.
>


-- 

*Adrien Mogenet*
Head of Backend/Infrastructure
adrien.moge...@contentsquare.com
http://www.contentsquare.com
50, avenue Montaigne - 75008 Paris
