I have used the C API in the past and would like to see a convenient, stable way to do this. Currently I'm using randomgen, but calling into the Python API from C++. The inefficiency is amortized by generating and caching batches of results.
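For concreteness, the batching idea can be sketched roughly as follows. This is only an illustration in Python: `BatchedNormals`, `batch_size` and `refills` are made-up names, and the standard-library `random.Random` stands in for the randomgen generator, so it shows the caching pattern rather than the actual C++ code.

```python
import random


class BatchedNormals:
    """Serve normal variates one at a time from a cached batch (hypothetical helper)."""

    def __init__(self, batch_size=1024, seed=None):
        if batch_size < 1:
            raise ValueError("batch_size must be positive")
        # Assumption: random.Random is a stand-in for the real generator.
        self._rng = random.Random(seed)
        self._batch_size = batch_size
        self._buffer = []
        self._pos = 0
        self.refills = 0  # number of expensive batch calls made so far

    def _refill(self):
        # In the setup described above, this would be the single costly
        # call into the Python API that produces a whole batch at once.
        self._buffer = [self._rng.gauss(0.0, 1.0) for _ in range(self._batch_size)]
        self._pos = 0
        self.refills += 1

    def next(self):
        # Cheap path: hand out cached values until the buffer is exhausted.
        if self._pos >= len(self._buffer):
            self._refill()
        value = self._buffer[self._pos]
        self._pos += 1
        return value


source = BatchedNormals(batch_size=4, seed=12345)
draws = [source.next() for _ in range(10)]
```

With a batch size of 4, ten draws trigger three refills, so the expensive call happens once per batch instead of once per value.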
I thought randomgen was supposed to be the future of numpy random, so I've based my code on that.

On Fri, Sep 20, 2019 at 6:08 AM Ralf Gommers <ralf.gomm...@gmail.com> wrote:
>
> On Fri, Sep 20, 2019 at 5:29 AM Robert Kern <robert.k...@gmail.com> wrote:
>>
>> On Thu, Sep 19, 2019 at 11:04 PM Ralf Gommers <ralf.gomm...@gmail.com> wrote:
>>>
>>> On Thu, Sep 19, 2019 at 4:53 PM Robert Kern <robert.k...@gmail.com> wrote:
>>>>
>>>> On Thu, Sep 19, 2019 at 5:24 AM Ralf Gommers <ralf.gomm...@gmail.com> wrote:
>>>>>
>>>>> On Thu, Sep 19, 2019 at 10:28 AM Kevin Sheppard <kevin.k.shepp...@gmail.com> wrote:
>>>>>>
>>>>>> There are some users of the NumPy C code in randomkit. This was never
>>>>>> officially supported. There has been a long open issue to provide this
>>>>>> officially.
>>>>>>
>>>>>> When I wrote randomgen I supplied .pxd files that make it simpler to
>>>>>> write Cython code that uses the components. The lower-level API has not
>>>>>> had much scrutiny and is in need of a clean-up. I thought this would
>>>>>> also encourage users to extend the random machinery themselves as part
>>>>>> of their project or code so as to minimize the requests for new (exotic)
>>>>>> distributions to be included in Generator.
>>>>>>
>>>>>> Most of the generator functions follow a pattern random_DISTRIBUTION.
>>>>>> Some have a bit more name mangling which can easily be cleaned up (like
>>>>>> random_gauss_zig, which should become PREFIX_standard_normal).
>>>>>>
>>>>>> Ralf Gommers suggested unprefixed names.
>>>>>
>>>>> I suggested that the names should match the Python API, which I think
>>>>> isn't quite the same. The Python API doesn't contain things like "gamma",
>>>>> "t" or "f".
>>>>
>>>> As the implementations evolve, they aren't going to match one-to-one 100%.
>>>> The implementations are shared by the legacy RandomState.
>>>> When we update an algorithm, we'll need to make a new function with the
>>>> better algorithm for Generator to use, then we'll have two C functions
>>>> roughly corresponding to the same method name (albeit on different
>>>> classes). C doesn't give us as many namespace options as Python. We could
>>>> rely on conventional prefixes to distinguish between the two classes of
>>>> function (e.g. legacy_normal vs random_normal).
>>>
>>> That seems simple and clear
>>>
>>>> There are times when it would be nice to be more descriptive about the
>>>> algorithm difference (e.g. random_normal_polar vs random_normal_ziggurat),
>>>
>>> We decided against versioning algorithms in NEP 19, so an update to an
>>> algorithm would mean we'd want to get rid of the older version (unless it's
>>> still in use by legacy). So AFAICT we'd never have both random_normal_polar
>>> and random_normal_ziggurat present at the same time?
>>
>> Well, we must because one's used by the legacy RandomState and one's used by
>> Generator. :-)
>>
>>> I may be missing your point here, but if we have in Python
>>> `Generator.normal` and can switch its implementation from polar to ziggurat
>>> or vice versa without any deprecation, then why would we want to switch
>>> names in the C API?
>>
>> I didn't mean to suggest that we'd have an unbounded number of functions as
>> we improve the algorithms, just that we might have 2 once we decide to
>> change something about the algorithm. We need 2 to support both the improved
>> algorithm in Generator and the legacy algorithm in RandomState. The current
>> implementation of the C function would be copied to a new name (`legacy_foo`
>> or whatever), then we'd make RandomState use that frozen copy, then we make
>> the desired modifications to the main function that Generator is referencing
>> (`random_foo`).
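
The copy-then-modify scheme described above can be sketched roughly as follows. It is written in Python for brevity, although the real functions would be C. The classes `LegacyStateSketch` and `GeneratorSketch` are hypothetical stand-ins for RandomState and Generator, the polar method here is a simplified version that discards the second variate, and Box-Muller merely stands in for whatever improved algorithm (e.g. ziggurat) the main function might adopt.

```python
import math
import random


def legacy_normal(rng):
    """Frozen copy: simplified polar method, never modified again."""
    while True:
        x = 2.0 * rng.random() - 1.0
        y = 2.0 * rng.random() - 1.0
        r2 = x * x + y * y
        if 0.0 < r2 < 1.0:
            return x * math.sqrt(-2.0 * math.log(r2) / r2)


def random_normal(rng):
    """Main implementation: free to change (Box-Muller as a stand-in here)."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


class LegacyStateSketch:
    """Hypothetical stand-in for RandomState: bound to the frozen copy."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def normal(self):
        return legacy_normal(self._rng)


class GeneratorSketch:
    """Hypothetical stand-in for Generator: bound to the main function."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def normal(self):
        return random_normal(self._rng)
```

Both classes expose the same method name, but only the function behind `GeneratorSketch.normal` can change without altering the stream that the legacy class reproduces for a given seed.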
>>
>> Or we could just make those legacy copies now so that people get to use them
>> explicitly under the legacy names, whatever they are, and we can feel more
>> free to modify the main implementations. I suggested this earlier, but
>> convinced myself that it wasn't strictly necessary. But then I admit I was
>> more focused on the Python API stability than any promises about the
>> C/Cython API.
>>
>> We might end up with more than 2 implementations if we need to change
>> something about the function signature, for whatever reason, and we want to
>> retain C/Cython API compatibility with older code. The C functions aren't
>> necessarily going to be one-to-one to the Generator methods. They're just
>> part of the implementation. So for example, if we wanted to, say, precompute
>> some intermediate values from the given scalar parameters so we don't have
>> to recompute them for each element of the `size`-large requested output, we
>> might do that in one C function and pass those intermediate values as
>> arguments to the C function that does the actual sampling. So we'd have two
>> C functions for that one Generator method, and the sampling C function will
>> not have the same signature as it did before the modification that
>> refactored the work into two functions. In that case, I would not be so
>> strict as to require that `Generator.foo` is one to one with `random_foo`.
>
> You're saying "be so strict" as if it were a bad thing, or a major effort. I
> understand that in some cases a C API can not be evolved in the same way as a
> Python API, but in the example you're giving here I'd say you want one
> function to be public, and one private. Making both public just exposes more
> implementation details for no good reason, and will give us more maintenance
> issues long-term.
>
> Anyway, this is not an issue today.
> If we try to keep Python and C APIs matching, we can deal with possible
> difficulties with that if and when they arise - should be infrequent.
>
> Cheers,
> Ralf
>
>>
>> To your point, though, we don't have to use gratuitously different names
>> when there _is_ a one-to-one relationship. `random_gauss_zig` should be
>> `random_normal`.
>>
>> --
>> Robert Kern

--
Those who don't understand recursion are doomed to repeat it
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion