rok commented on issue #47255: URL: https://github.com/apache/arrow/issues/47255#issuecomment-3167259950
Compute random is a good idea!

> Behaviour change: Current random generates numbers in the range [0, 1.0), but for the integer scenario we'd better generate the closed interval [min, max], just as `np.random.randint` does. As a result, the default will generate [0, 1.0] instead of [0, 1.0).

[NumPy's](https://numpy.org/devdocs/reference/random/generated/numpy.random.rand.html) random appears to have the same [0, 1.0) behavior, so replacing it with `pyarrow.compute.random` would be ok. Integers could be done with rounding. A minor potential distribution change here would probably not be an issue for our test requirements.

> For non-numerical types (e.g. bool/string), we may choose not to support them in `random`, or to support them without a min/max limit. I prefer not supporting non-numerical types in `random`.
>
> Any suggestions? [@rok](https://github.com/rok) [@raulcd](https://github.com/raulcd)

For C++ testing purposes we have a [RandomArrayGenerator](https://github.com/apache/arrow/blob/main/cpp/src/arrow/testing/random.h) that would cover our needs here. But we don't expose testing utilities. So perhaps we should create the needed compute kernel (or extend the current one; I'll open an issue) and add the required type generator functionality there. See the [PR for compute.random](https://github.com/apache/arrow/pull/11864).
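To make the "integers from a [0, 1.0) source" idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not how `pyarrow.compute.random` or any Arrow kernel works: the helper names `uniform_ints` and `rounded_ints` are hypothetical, and the standard library's `random.random()` (which also returns floats in [0.0, 1.0)) stands in for the float generator.

```python
import math
import random


def uniform_ints(n, low, high, rng=None):
    """Draw n integers from the closed interval [low, high].

    Each float u in [0.0, 1.0) is scaled by the number of integers in the
    range and floored, so every integer is equally likely.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if rng is None:
        rng = random.Random()
    span = high - low + 1  # count of distinct integers in [low, high]
    return [low + math.floor(rng.random() * span) for _ in range(n)]


def rounded_ints(n, low, high, rng=None):
    """Same range via round(): each endpoint gets about half the weight
    of an interior value, which is the distribution change to be aware of.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if rng is None:
        rng = random.Random()
    return [round(low + rng.random() * (high - low)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs match
    print(uniform_ints(10, 1, 6, rng))
    print(rounded_ints(10, 1, 6, rng))
```

Flooring a scaled value keeps the distribution uniform over the closed interval, while plain rounding under-samples `low` and `high`. Whether that difference matters depends on the test requirements.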
