Strange duplicates in data when scaling up

2014-10-17 Thread Jacob Maloney
The issue was solved by clearing the hashmap and hashset at the beginning of the call method.

From: Jacob Maloney [mailto:jmalo...@conversantmedia.com]
Sent: Thursday, October 16, 2014 5:09 PM
To: user@spark.apache.org
Subject: Strange duplicates in data when scaling up

I have a flatmap function
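For readers hitting the same symptom, here is a minimal sketch of the kind of fix described in the reply. It is not the original poster's code: the class name TokenEmitter, the field name seen, and the comma-splitting logic are all hypothetical, and the class deliberately avoids Spark imports so it runs on its own. It assumes the collection was held as a field on the function object, so that one instance handling many records would carry entries over from earlier calls unless the field is cleared first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a flatmap-style function object with a call method.
class TokenEmitter {
    // Assumption: the set is a field, so it outlives any single call
    // and is shared by every record this instance processes.
    private final Set<String> seen = new HashSet<>();

    public List<String> call(String line) {
        // The fix from the reply: reset per-record state at the start of call.
        // Without this line, entries from earlier records would be emitted again.
        seen.clear();

        seen.addAll(Arrays.asList(line.split(",")));

        // Return a copy so that later calls cannot mutate earlier results.
        return new ArrayList<>(seen);
    }
}
```

With the clear in place, calling call("a,b") and then call("c") on the same instance returns only "c" the second time. Declaring the set as a local variable inside call would avoid the shared state altogether.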

Strange duplicates in data when scaling up

2014-10-16 Thread Jacob Maloney
I have a flatmap function that shouldn't be able to emit duplicates, and yet it does. The output of my function is a HashSet, so the function itself cannot output duplicates, and yet I see many copies of keys emitted from it (in one case up to 62). The curious thing is I can't get this to happen

Issue with java spark broadcast

2014-10-10 Thread Jacob Maloney
[mailto:user-h...@spark.apache.org] Sent: Friday, October 10, 2014 4:02 PM To: Jacob Maloney Subject: FAQ for user@spark.apache.org Hi! This is the ezmlm program. I'm managing the user@spark.apache.org mailing list. FAQ - Frequently asked questions of the user@spark.apache.org list. None available yet