On Fri, Nov 17, 2023, at 11:07, Robert Kern wrote:
> If the arrays you are drawing indices for are real in-memory arrays for
> present-day 64-bit computers, this should be adequate. If it's a notional
> array that is larger, then you'll need actual arbitrary-sized integer
> sampling. The builtin `random.randrange()` will do arbitrary-sized integers
> and is quite reasonable for this task. If you want it to use our
> BitGenerators underneath for clean PRNG state management, this is quite
> doable with a simple subclass of `random.Random`:
> https://github.com/numpy/numpy/issues/24458#issuecomment-1685022258
Thanks, Robert. The use case is for randomly populating a hypothetical
N-dimensional sparse array object.
In practice, int64 will probably suffice.
I can see how to generate arbitrarily large integers using the pattern above,
but still unsure how to sample without replacement. Aaron mentioned that
conflicts are extremely unlikely, so perhaps fine to assume they won't happen.
Checking for conflicts is expensive.
Attached is a script that implements this solution.
Stéfan
import random
import functools
import itertools
import operator
import numpy as np
def cumulative_prod(arr):
return list(itertools.accumulate(arr, func=operator.mul))
def unravel_index(x, dims):
dim_prod = cumulative_prod([1] + list(dims)[:0:-1])[::-1]
return [list((x[j] // dim_prod[i]) % dims[i] for i in range(len(dims))) for j in range(len(x))]
# From Robert Kern's comment at
# https://github.com/numpy/numpy/issues/24458#issuecomment-1685022258
class PythonRandomInterface(random.Random):
def __init__(self, rng):
self._rng = rng
def getrandbits(self, k):
"""getrandbits(k) -> x. Generates an int with k random bits."""
if k < 0:
raise ValueError('number of bits must be non-negative')
numbytes = (k + 7) // 8 # bits / 8 and rounded up
x = int.from_bytes(self._rng.bytes(numbytes), 'big')
return x >> (numbytes * 8 - k) # trim excess bits
def indices(self, shape, size=1):
D = functools.reduce(lambda x, y: x * y, dims)
indices = [pri.randint(0, D) for i in range(size)]
return unravel_index(indices, shape)
rng = np.random.default_rng()
pri = PythonRandomInterface(rng)
dims = (500, 400, 30, 15, 20, 800, 900, 2000, 800)
k = 5
print(pri.indices(dims, size=k))
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: [email protected]