Btw, here is a great article about accumulators and all their related traps! http://imranrashid.com/posts/Spark-Accumulators/ (I'm not the author)
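
To make the main trap concrete for this thread: as far as I understand the programming guide, tasks can only add to an accumulator, and only the driver can read its value, once an action has run. So a .contains check from inside a task is not something an accumulator is meant to support. Here is a toy model of that behaviour in plain Python. It is not Spark code; SetAccumulator, run_task and run_job are names I made up for illustration, and the "cluster" is just a loop.

```python
class SetAccumulator:
    """Toy stand-in for a set-valued accumulator (hypothetical, not Spark's API)."""

    def __init__(self):
        self._value = set()

    def add(self, key):
        # The only operation a task is supposed to use.
        self._value.add(key)

    def merge(self, other):
        # Combines a task's local copy into this one (driver side).
        self._value |= other._value

    @property
    def value(self):
        # Read-only snapshot; in this model only the driver should call it.
        return frozenset(self._value)


def run_task(partition):
    # Each task starts from an empty local copy and sees only its own keys.
    local = SetAccumulator()
    for key in partition:
        local.add(key)
    return local


def run_job(partitions):
    # The driver merges the per-task copies after the tasks have finished.
    driver_acc = SetAccumulator()
    for partition in partitions:
        driver_acc.merge(run_task(partition))
    return driver_acc


partitions = [["a", "b"], ["b", "c"], ["d"]]
seen = run_job(partitions)

# The membership check happens on the driver, on the merged value.
print("b" in seen.value)
print("z" in seen.value)
```

The point of the sketch is that a task's local copy never contains keys added by other tasks, so a check made inside a task would give incomplete answers. If the check really has to happen while the data is being processed, something other than an accumulator is probably needed.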

On 16 March 2016 at 18:24, swetha kasireddy <swethakasire...@gmail.com> wrote:

> OK. I did take a look at them. So once I have an accumulator for a
> HashSet, how can I check whether a particular key is already present in
> the HashSet accumulator? I don't see any .contains method there. My
> requirement is that I need to keep accumulating the keys in the HashSet
> across all the tasks on the various nodes and use it to check whether a
> key is already present in the HashSet.
>
> On Tue, Mar 15, 2016 at 9:56 PM, pppsunil <pppsu...@gmail.com> wrote:
>
>> Have you looked at using the Accumulable interface? Take a look at the
>> Spark documentation at
>> http://spark.apache.org/docs/latest/programming-guide.html#accumulators
>> It gives an example of how to use a vector type for an accumulator,
>> which might be very close to what you need.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-add-an-accumulator-for-a-Set-in-Spark-tp26510p26514.html

--
*Adrien Mogenet*
Head of Backend/Infrastructure
adrien.moge...@contentsquare.com
http://www.contentsquare.com
50, avenue Montaigne - 75008 Paris