On Thu, 2018-04-26 at 19:26 +0200, Sebastian Berg wrote:
> On Thu, 2018-04-26 at 09:51 -0700, Hameer Abbasi wrote:
> > Hi Nathan,
> >
> > np.any and np.all call np.logical_or.reduce and
> > np.logical_and.reduce respectively, and unfortunately the
> > underlying function (ufunc.reduce) has no way of detecting that
> > the value isn’t going to change anymore. It’s also used for (for
> > example) np.sum (np.add.reduce), np.prod (np.multiply.reduce),
> > np.min (np.minimum.reduce), np.max (np.maximum.reduce).
>
>
> I would like to point out that this is almost, but not quite, true.
> The boolean versions will short circuit on the innermost level, which
> is probably good enough for all practical purposes.
>
> One way to get around it would be to use a chunked iteration using
> np.nditer in pure python. I admit it is a bit tricky to get started
> on, but it is basically what numexpr uses also (at least in the
> simplest mode), and if your arrays are relatively large, there is
> likely no real performance hit compared to a non-pure python version.
>
I mean something like this:

    import numpy as np

    def check_any(arr, func=lambda x: x, buffersize=0):
        """
        Check if the function is true for any value in arr and stop once
        the first was found.

        Parameters
        ----------
        arr : ndarray
            Array to test.
        func : function
            Function taking a 1D array as argument and returning an array
            (on which ``np.any`` will be called).
        buffersize : int
            Size of the chunk/buffer in the iteration, zero will use the
            default numpy value.

        Notes
        -----
        The stopping does not occur immediately but in buffersize chunks.
        """
        iterflags = ['buffered', 'external_loop', 'refs_ok', 'zerosize_ok']
        for chunk in np.nditer((arr,), flags=iterflags, buffersize=buffersize):
            if np.any(func(chunk)):
                return True

        return False

Not sure how it performs actually, but you can give it a try,
especially if you know you have large arrays, or if "func" is pretty
expensive. If the input is already bool, it will be quite a bit slower
though, I am sure.

- Sebastian


> - Sebastian
>
>
> >
> > You can find more information about this on the ufunc doc page. I
> > don’t think it’s worth it to break this machinery for any and all,
> > as it has numerous other advantages (such as being able to override
> > in duck arrays, etc)
> >
> > Best regards,
> > Hameer Abbasi
> > Sent from Astro for Mac
> >
> > > On Apr 26, 2018 at 18:45, Nathan Goldbaum <nathan12...@gmail.com>
> > > wrote:
> > >
> > > Hi all,
> > >
> > > I was surprised recently to discover that both np.any and
> > > np.all() do not have a way to exit early:
> > >
> > > In [1]: import numpy as np
> > >
> > > In [2]: data = np.arange(1e6)
> > >
> > > In [3]: print(data[:10])
> > > [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
> > >
> > > In [4]: %timeit np.any(data)
> > > 724 us +- 42.4 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)
> > >
> > > In [5]: data = np.zeros(int(1e6))
> > >
> > > In [6]: %timeit np.any(data)
> > > 732 us +- 52.9 us per loop (mean +- std. dev.
> > > of 7 runs, 1000 loops each)
> > >
> > > I don't see any discussions about this on the NumPy issue tracker
> > > but perhaps I'm missing something.
> > >
> > > I'm curious if there's a way to get a fast early-terminating
> > > search in NumPy? Perhaps there's another package I can depend on
> > > that does this? I guess I could also write a bit of cython code
> > > that does this but so far this project is pure python and I don't
> > > want to deal with the packaging headache of getting wheels built
> > > and conda-forge packages set up on all platforms.
> > >
> > > Thanks for your help!
> > >
> > > -Nathan
> > >
> > > _______________________________________________
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion@python.org
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
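
For anyone who wants to try the chunked early-exit idea from the reply above, here is a minimal sketch of how check_any might be exercised. The predicate is_positive, the counter dict, the array size and the buffersize of 4096 are all made up for illustration and are not from the thread. The counter only shows that func is called chunk by chunk; how many chunks are actually visited depends on the buffer size and on nditer internals, so no particular number should be expected.

```python
import numpy as np


def check_any(arr, func=lambda x: x, buffersize=0):
    """Chunked any(): same logic as the function posted in the thread."""
    iterflags = ['buffered', 'external_loop', 'refs_ok', 'zerosize_ok']
    for chunk in np.nditer((arr,), flags=iterflags, buffersize=buffersize):
        if np.any(func(chunk)):
            return True
    return False


# Hypothetical bookkeeping: count how many chunks the predicate sees.
counter = {"chunks": 0}


def is_positive(chunk):
    # Illustrative predicate; called once per chunk handed out by nditer.
    counter["chunks"] += 1
    return chunk > 0


data = np.zeros(1_000_000)
data[10] = 1.0  # a hit very early in the array

found = check_any(data, is_positive, buffersize=4096)
print(found, "after", counter["chunks"], "chunk(s)")

# With no hit at all, every chunk has to be inspected.
print(check_any(np.zeros(1000)))
```

If the hit sits near the end of the array, or there is none, every chunk is still visited, so the benefit depends on where the first true value tends to be and on how expensive func is.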