Is it explicitly specified anywhere that random number generators should always return a deterministic sequence of numbers given the same seed, or is that just a side effect of some hardware not having a better source of randomness, so a user-defined seed is used to kick off the generator's starting state?
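For reference, this is the determinism I mean, sketched with NumPy purely as a stand-in for any seeded generator:

import numpy as np

# Two generators constructed from the same seed replay the same
# sequence; this is a property of the PRNG algorithm itself, not of
# the hardware it runs on.
a = np.random.RandomState(42).uniform(size=3)
b = np.random.RandomState(42).uniform(size=3)
assert (a == b).all()

i.e. the same seed replays the same sequence regardless of where the generator runs.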
On Mon, Jan 8, 2018 at 9:27 AM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:

> Hello MXNet devs,
>
> I wanted to see what people thought about the following section of code,
> which I think has some subtle pros/cons:
> https://github.com/apache/incubator-mxnet/blob/d2a856a3a2abb4e72edc301b8b821f0b75f30722/src/resource.cc#L188
>
> Tobi (tdomhan) from sockeye pointed it out to me after he spent some time
> debugging non-determinism in his model training.
>
> This functionality is well documented here:
> https://mxnet.incubator.apache.org/api/python/ndarray.html#mxnet.random.seed
> but I don't think the current API meets all use cases, due to this section:
>
> "Random number generators in MXNet are device specific. Therefore, random
> numbers generated from two devices can be different even if they are
> seeded using the same seed."
>
> I'm guessing this is a feature that makes distributed training easier in
> MXNet: you wouldn't want to train the same model on each GPU. However, the
> downside is that if you run unit tests on a multi-GPU system, or in a
> training environment where you don't have control over which GPU you use,
> you can't count on deterministic behaviour to assert results against. I
> have a feeling there are non-unit-test use cases where you'd also want
> deterministic behaviour independent of which GPU your code happens to be
> scheduled on.
>
> How do others feel about this? Would it make sense to have some optional
> args in the seed call to turn off the seed-per-device functionality?
>
> -Kellen
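To make the device-specific behaviour concrete, here is a rough sketch of what I understand Tobi hit (assuming a machine with at least two GPUs, and using mx.nd.random.uniform with its ctx argument as per the docs linked above):

import mxnet as mx

mx.random.seed(128)
a = mx.nd.random.uniform(shape=(3,), ctx=mx.gpu(0))

mx.random.seed(128)
b = mx.nd.random.uniform(shape=(3,), ctx=mx.gpu(1))

# Same seed, different device: per the docs, each device keeps its own
# generator state, so 'a' and 'b' need not match. A unit test asserting
# against values produced on gpu(0) can fail when the job lands on gpu(1).
print(a.asnumpy())
print(b.asnumpy())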