Can you elaborate? Broadcast will distribute the seed, which is only one number. But what construct do I use to "plant" the seed (call random.seed()) once on each worker?
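
One construct that might fit is mapPartitionsWithIndex, which runs a function once per partition (not strictly once per worker) and so gives a natural place to seed a generator before any records are processed. The following is only a sketch under assumptions: the helper name shuffle_records, the default seed of 42, and the sample data are hypothetical, and the partitions are simulated with plain lists so the snippet runs without a cluster. Reproducibility on a real job would also depend on the data being partitioned the same way on each run.

```python
import random


def shuffle_records(index, records, seed=42):
    """Shuffle every record (a list) in one partition with a partition-local RNG.

    `index` is the partition number, `records` is the partition's iterator.
    The name and the default seed are illustrative, not from the thread.
    """
    # Seeded once per partition, not once per record. A private Random
    # instance avoids touching the global `random` state on the worker.
    rng = random.Random(seed + index)
    for record in records:
        shuffled = list(record)  # copy so the input record is left untouched
        rng.shuffle(shuffled)
        yield shuffled


# With PySpark, assuming an existing RDD named `rdd` whose elements are lists:
#   result = rdd.mapPartitionsWithIndex(
#       lambda i, it: shuffle_records(i, it, seed=42))

# Local stand-in for two partitions, to check that two runs agree.
partitions = [[[1, 2, 3, 4], [5, 6, 7, 8]], [[9, 10, 11, 12]]]
run_a = [list(shuffle_records(i, p)) for i, p in enumerate(partitions)]
run_b = [list(shuffle_records(i, p)) for i, p in enumerate(partitions)]
print(run_a == run_b)
```

Adding the partition index to the seed is a design choice that keeps partitions from all producing the same sequence; dropping `+ index` would give every partition the identical seed, which is closer to the literal request.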
________________________________
From: ayan guha <guha.a...@gmail.com>
Sent: Tuesday, May 12, 2015 11:17 PM
To: Charles Hayden
Cc: user
Subject: Re: how to set random seed

Easiest way is to broadcast it.

On 13 May 2015 10:40, "Charles Hayden" <charles.hay...@atigeo.com> wrote:

In pySpark, I am writing a map with a lambda that calls random.shuffle. For testing, I want to be able to give it a seed, so that successive runs will produce the same shuffle. I am looking for a way to set this same random seed once on each worker. Is there any simple way to do it?