Hi everyone,

I'd like to ask how Spark (or, more generally, distributed computing engines) handles RNGs. At a high level, there are two approaches:

1. Use a single RNG on the driver, and have each worker request random numbers from that central RNG.
2. Use a separate RNG on each worker.

If the second approach is used, how does Spark seed the RNGs on the different workers to ensure the overall quality of the generated random numbers?

Best,

----

Ben Du

Personal Blog<http://www.legendu.net/> | GitHub<https://github.com/dclong/> | Bitbucket<https://bitbucket.org/dclong/> | Docker Hub<https://hub.docker.com/r/dclong/>