On Sun, Jun 3, 2018 at 9:23 PM, Warren Weckesser <warren.weckes...@gmail.com> wrote:
>
> On Sun, Jun 3, 2018 at 11:20 PM, Ralf Gommers <ralf.gomm...@gmail.com>
> wrote:
>
>> On Sun, Jun 3, 2018 at 6:54 PM, <josef.p...@gmail.com> wrote:
>>
>>> On Sun, Jun 3, 2018 at 9:08 PM, Robert Kern <robert.k...@gmail.com>
>>> wrote:
>>>
>>>> On Sun, Jun 3, 2018 at 5:46 PM <josef.p...@gmail.com> wrote:
>>>>
>>>>> On Sun, Jun 3, 2018 at 8:21 PM, Robert Kern <robert.k...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>>> The list of ``StableRandom`` methods should be chosen to support
>>>>>>> unit tests:
>>>>>>>
>>>>>>> * ``.randint()``
>>>>>>> * ``.uniform()``
>>>>>>> * ``.normal()``
>>>>>>> * ``.standard_normal()``
>>>>>>> * ``.choice()``
>>>>>>> * ``.shuffle()``
>>>>>>> * ``.permutation()``
>>>>>>
>>>>>> https://github.com/numpy/numpy/pull/11229#discussion_r192604311
>>>>>> @bashtage writes:
>>>>>> > standard_gamma and standard_exponential are important enough to
>>>>>> > be included here IMO.
>>>>>>
>>>>>> "Importance" was not my criterion, only whether they are used in
>>>>>> unit test suites. This list was just off the top of my head for
>>>>>> methods that I think were actually used in test suites, so I'd be
>>>>>> happy to be shown live tests that use other methods. I'd like to
>>>>>> be a *little* conservative about what methods we put in here, but
>>>>>> we don't have to be *too* conservative, since we are explicitly
>>>>>> never going to be modifying these.
>>>>>
>>>>> That's one area where I thought the selection is too narrow.
>>>>> We should be able to get a stable stream from the uniform for some
>>>>> distributions.
>>>>>
>>>>> However, according to the Wikipedia description, Poisson doesn't
>>>>> look easy. I just wrote a unit test for statsmodels using Poisson
>>>>> random numbers with hard-coded numbers for the regression tests.
>>>>
>>>> I'd really rather people do this than use StableRandom; this is
>>>> best practice, as I see it, if your tests involve making precise
>>>> comparisons to expected results.
>>>
>>> I hardcoded the results, not the random data. So the unit tests rely
>>> on a reproducible stream of Poisson random numbers.
>>> I don't want to save 500 (100 or 1000) observations in a csv file
>>> for every variation of the unit test that I run.
>>
>> I agree, hardcoding numbers in every place where seeded random
>> numbers are now used is quite unrealistic.
>>
>> It may be worth having a look at the test suites for scipy,
>> statsmodels, scikit-learn, etc. and estimating how much work this NEP
>> causes those projects. If the devs of those packages are forced to do
>> large-scale migrations from RandomState to StableRandom, then why not
>> instead keep RandomState and just add a new API next to it?
>
> As a quick and imperfect test, I monkey-patched numpy so that a call
> to numpy.random.seed(m) actually uses m+1000 as the seed.
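For illustration, a minimal sketch of such a monkey-patch (hypothetical;
Warren's actual patch is not shown in the thread, and this version
assumes only integer seeds need shifting):

    import numpy as np

    _original_seed = np.random.seed

    def _shifted_seed(seed=None):
        # Leave unseeded calls alone; shift integer seeds by 1000.
        if seed is None:
            return _original_seed()
        return _original_seed(seed + 1000)

    np.random.seed = _shifted_seed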
> I ran the tests using the `runtests.py` script:
>
> *seed+1000, using 'python runtests.py -n' in the source directory:*
>
>     236 failed, 12881 passed, 1248 skipped, 585 deselected,
>     84 xfailed, 7 xpassed
>
> Most of the failures are in scipy.stats:
>
> *seed+1000, using 'python runtests.py -n -s stats' in the source
> directory:*
>
>     203 failed, 1034 passed, 4 skipped, 370 deselected, 4 xfailed,
>     1 xpassed
>
> Changing the amount added to the seed or running the tests using the
> function `scipy.test("full")` gives different (but similar magnitude)
> results:
>
> *seed+1000, using 'import scipy; scipy.test("full")' in an ipython
> shell:*
>
>     269 failed, 13359 passed, 1271 skipped, 134 xfailed, 8 xpassed
>
> *seed+1, using 'python runtests.py -n' in the source directory:*
>
>     305 failed, 12812 passed, 1248 skipped, 585 deselected,
>     84 xfailed, 7 xpassed
>
> I suspect many of the tests will be easy to update, so fixing 300 or
> so tests does not seem like a monumental task.

It's all not monumental, but it adds up quickly. In addition to
changing tests, one will also need compatibility code when supporting
multiple numpy versions (e.g. scipy would get a copy of StableRandom in
scipy/_lib/_numpy_compat.py). A quick count of just np.random.seed
occurrences with

    $ grep -roh --include \*.py np.random.seed . | wc -w

for some packages:

    numpy: 77
    scipy: 462
    matplotlib: 204
    statsmodels: 461
    pymc3: 36
    scikit-image: 63
    scikit-learn: 69
    keras: 46
    pytorch: 0
    tensorflow: 368
    astropy: 24

And note, these are *not* incorrect/broken usages; this is code that
works and has done so for years.

Conclusion: the current proposal will cause work for the vast majority
of libraries that depend on numpy. The total amount of that work will
certainly not be counted in person-days/weeks, and more likely in years
than months. So I'm not convinced yet that the current proposal is the
best way forward.

Ralf

> I haven't looked into why there are 585 deselected tests; maybe there
> are many more tests lurking there that will have to be updated.
>
> Warren

>> Ralf

>>>> StableRandom is intended as a crutch so that the pain of moving
>>>> existing unit tests away from the deprecated RandomState is less
>>>> onerous. I'd really rather people write better unit tests!
>>>>
>>>> In particular, I do not want to add any of the integer-domain
>>>> distributions (aside from shuffle/permutation/choice), as these are
>>>> the ones that have the platform-dependency issues with respect to
>>>> 32/64-bit `long` integers. They'd be unreliable for unit tests even
>>>> if we kept them stable over time.
>>>
>>>>> I'm not sure which other distributions are common enough and not
>>>>> easily reproducible by transformation. E.g. negative binomial can
>>>>> be reproduced by a gamma-Poisson mixture.
>>>>>
>>>>> On the other hand, normal can be easily recreated from
>>>>> standard_normal.
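For concreteness, a minimal sketch of the transformations Josef
mentions (hypothetical code, not from the thread; it uses RandomState's
gamma and poisson methods only to illustrate the identities):

    import numpy as np

    rng = np.random.RandomState(12345)
    n, p, size = 5, 0.3, 100000

    # Gamma-Poisson mixture: lam ~ Gamma(shape=n, scale=(1-p)/p) and
    # X | lam ~ Poisson(lam) gives X ~ NegativeBinomial(n, p).
    lam = rng.gamma(shape=n, scale=(1 - p) / p, size=size)
    x = rng.poisson(lam)
    print(x.mean(), n * (1 - p) / p)  # both approximately 11.67

    # Recreating normal from standard_normal is a one-liner:
    y = 2.0 + 0.5 * rng.standard_normal(size)
    print(y.mean(), y.std())  # approximately 2.0 and 0.5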
>>>> I was mostly motivated by making it a bit easier to mechanically
>>>> replace uses of randn(), which is probably even more common than
>>>> normal() and standard_normal() in unit tests.
>>>
>>>>> Would it be difficult to keep this list large, given that it
>>>>> should be frozen, low-maintenance code?
>>>>
>>>> I admit that I had in mind non-statistical unit tests. That is,
>>>> tests that didn't depend on the precise distribution of the inputs.
>>>
>>> The problem is that the unit tests in `stats` rely on precise inputs
>>> (up to some numerical noise).
>>> For example, p-values themselves are uniformly distributed if the
>>> hypothesis test works correctly. That means if I don't have control
>>> over the inputs, then my p-value could be anything in (0, 1). So
>>> either we need a real dataset, save all the random numbers in a
>>> file, or have a reproducible set of random numbers.
>>>
>>> 95% of the unit tests that I write are for statistics. A large
>>> fraction of them don't rely on the exact distribution, but do rely
>>> on random numbers that are "good enough".
>>> For example, when writing unit tests, every once in a while (or
>>> sometimes more often) I get a "bad" stream of random numbers, for
>>> which convergence might fail or the estimated numbers are far away
>>> from the true numbers, so the test tolerance would have to be very
>>> high. If I pick one of the seeds that looks good, then I can use a
>>> tighter unit test tolerance to ensure results are good in a nice
>>> case.
>>>
>>> The problem is that we cannot write robust regression tests without
>>> stable inputs.
>>> E.g. I verified my results with a Monte Carlo with 5000 replications
>>> and 1000 Poisson observations in each.
>>> Results look close to expected and won't depend much on the exact
>>> stream of random variables.
>>> But the Monte Carlo for each variant of the test took about 40
>>> seconds. Doing this for all option combinations and dataset
>>> specifications takes too long to be feasible in a unit test suite.
>>> So I rely on numpy's stable random numbers and hard-code the results
>>> for a specific random sample in the regression unit tests.
>>>
>>> Josef
>>>
>>>> --
>>>> Robert Kern
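For illustration, the regression-test pattern Josef describes might
look like this (hypothetical test; the seed and the hard-coded expected
value are placeholders, not numbers from statsmodels):

    import numpy as np
    from numpy.testing import assert_allclose

    # Placeholder: in a real test, obtained by running the computation
    # once on a verified-good seed and checking it against a slow
    # Monte Carlo before hard-coding.
    EXPECTED_MEAN = 2.958

    def test_poisson_mean_regression():
        # Relies on RandomState(4235) producing the same 500 Poisson
        # draws on every platform and numpy version.
        rng = np.random.RandomState(4235)
        y = rng.poisson(lam=3.0, size=500)
        assert_allclose(y.mean(), EXPECTED_MEAN, atol=1e-3)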
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion