An example with a 1-D array (where it is easiest to see what I mean) is the following. I will follow Dom Grigonis's suggestion that the range not be provided as a separate argument, since it can just as easily be "folded into" the array by passing a slice. So the call becomes simply:

    idx = first_true(arr, cond)
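For concreteness, the intended short-circuit semantics can be sketched in pure Python (a hypothetical reference implementation only; the proposed function would do this at the C level, and the names here are just illustrative):

```python
import numpy as np

def first_true(arr, cond, start_idx=0):
    # Return the index of the first element at or after start_idx for
    # which cond(element) is True, or the signal value -1 if none is.
    for i in range(start_idx, len(arr)):
        if cond(arr[i]):  # stop immediately: later elements are never tested
            return i
    return -1

with np.errstate(over="ignore"):  # entries past e**709 overflow to inf; harmless here
    search_arr = np.exp(np.arange(0, 1000))

print(first_true(search_arr, lambda x: x > 50))  # -> 4, since e**4 > 50 > e**3
```

This Python loop is of course exactly the slow element-by-element scan the proposal wants to avoid; it only pins down what the fast C version should compute.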
As Dom also points out, "cond" would likely need to be a function pointer (i.e., the name of a function defined elsewhere, making first_true a higher-order function), unless there is some way to pass a parseable expression for simple cases. A few special cases like the first zero/nonzero element could be handled with dedicated options (much as matplotlib does for common colors), but for anything beyond that it gets unwieldy fast. So let's say we have this:

******************
def cond(x):
    return x > 50

search_arr = np.exp(np.arange(0, 1000))

print(np.first_true(search_arr, cond))
******************

This should print 4, because the element of search_arr at index 4 (i.e., the 5th element) is e^4, which is slightly greater than 50 (while e^3 is less than 50). It should return this *without testing the 6th through 1000th elements of the array at all* to see whether they exceed 50 or not. This example is rather contrived, because simply taking the natural log of 50 and rounding up is far superior, not even *evaluating the array of exponentials* (which my example clearly still does). In the use cases I've had for such a function, I can't predict the array elements like this; they come from loaded data, the output of a simulation, etc., and are already in a NumPy array. And in this case, since the values are strictly increasing, np.searchsorted() would work as well. But it illustrates the idea.

On Thu, Oct 26, 2023 at 5:54 AM Dom Grigonis <dom.grigo...@gmail.com> wrote:

> Could you please give a concise example? I know you have provided one, but
> it is engrained deep in verbose text and has some typos in it, which makes
> it hard to understand exactly what inputs should result in what output.
>
> Regards,
> DG
>
> > On 25 Oct 2023, at 22:59, rosko37 <rosk...@gmail.com> wrote:
> >
> > I know this question has been asked before, both on this list as well as
> > several threads on Stack Overflow, etc. It's a common issue.
> > I'm NOT asking for how to do this using existing NumPy functions (as that
> > information can be found in any of those sources). What I'm asking is
> > whether NumPy would accept inclusion of a function that does this, or
> > whether (possibly more likely) such a proposal has already been considered
> > and rejected for some reason.
> >
> > The task is this: there's a large array and you want to find the next
> > element after some index that satisfies some condition. Such elements are
> > common, and the typical number of elements to be searched through is small
> > relative to the size of the array. Therefore, it would greatly improve
> > performance to avoid testing ALL elements against the condition once one
> > is found that returns True. However, all built-in functions that I know of
> > test the entire array.
> >
> > One can obviously jury-rig some ways, for instance a "for" loop over
> > non-overlapping slices of length slice_length, calling something like
> > np.where(cond) on each. That outer "for" loop is much faster than a loop
> > over individual elements, and the inner scan goes at most slice_length-1
> > elements past the first "hit". However, needing such a convoluted piece of
> > code for such a simple task seems to go against the NumPy spirit of having
> > one operation be one function of the form func(arr).
> >
> > A proposed function for this, let's call it np.first_true(arr, start_idx,
> > [stop_idx]), would be best implemented at the C code level, possibly in
> > the same code file that defines np.where. I'm wondering, if I or someone
> > else were to write such a function, whether the NumPy developers would
> > consider merging it as a standard part of the codebase. It's possible that
> > the idea of such a function is bad because it would violate some existing
> > broadcasting or fancy indexing rules.
> > Clearly one could make it possible to pass an "axis" argument to
> > np.first_true() that would select an axis to search over in the case of
> > multi-dimensional arrays; the result would then be an array of indices of
> > one fewer dimension than the original array. So
> > np.first_true(np.array([[1,5],[2,7],[9,10]]), cond, axis=1) would return
> > [1,1,0] for cond(x): x > 4. The case where no elements satisfy the
> > condition would need to return a "signal value" like -1. But maybe there
> > are some weird cases where there isn't a sensible return value, hence why
> > such a function has not been added.
> >
> > -Andrew Rosko
> > _______________________________________________
> > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > To unsubscribe send an email to numpy-discussion-le...@python.org
> > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > Member address: dom.grigo...@gmail.com
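For reference, the chunked workaround described in the quoted message above can be sketched like this (a rough illustration with made-up names, not the proposed C implementation):

```python
import numpy as np

def first_true_chunked(arr, cond, start_idx=0, slice_length=4096):
    # Outer loop over non-overlapping slices; the vectorized test inside
    # each slice overshoots the first hit by at most slice_length - 1
    # elements, and the outer loop stops at the first slice with a hit.
    for lo in range(start_idx, len(arr), slice_length):
        hits = np.flatnonzero(cond(arr[lo:lo + slice_length]))
        if hits.size:
            return lo + int(hits[0])
    return -1  # signal value: no element satisfies the condition

with np.errstate(over="ignore"):  # entries past e**709 overflow to inf; harmless here
    search_arr = np.exp(np.arange(0, 1000))

print(first_true_chunked(search_arr, lambda x: x > 50, slice_length=8))  # -> 4
```

Here cond must be vectorized (e.g. a comparison like x > 50), since it is applied to a whole slice at once; that is the convolution the proposed built-in would hide behind a single call.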