since ndrange is a superset of the features of ndindex, we can implement ndindex with ndrange or keep it as is. ndindex is now a glorified `nditer` object anyway. So it isn't so much of a maintenance burden. As for how ndindex is implemented, I'm a little worried about python 2 performance seeing as range is a list. I would wait on changing the way ndindex is implemented for now.
I agree with Stephan that ndindex should be kept in. Many want backward compatible code. It would be hard for me to justify why a dependency should be bumped up to bleeding edge numpy just for a convenience iterator. Honestly, I was really surprised to see such a speed difference, I thought it would have been closer. Allan, I decided to run a few more benchmarks, the nditer just seems slow for single array access some reason. Maybe a bug? ``` import numpy as np import itertools a = np.ones((1000, 1000)) b = {} for i in np.ndindex(a.shape): b[i] = i %%timeit # op_flag=('readonly',) doesn't change performance for a_value in np.nditer(a): pass 109 ms ± 921 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) %%timeit for i in itertools.product(range(1000), range(1000)): a_value = a[i] 113 ms ± 1.72 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) %%timeit for i in itertools.product(range(1000), range(1000)): c = b[i] 193 ms ± 3.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) %%timeit for a_value in a.flat: pass 25.3 ms ± 278 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) %%timeit for k, v in b.items(): pass 19.9 ms ± 675 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) %%timeit for i in itertools.product(range(1000), range(1000)): pass 28 ms ± 715 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) ``` On Mon, Oct 8, 2018 at 4:26 PM Stephan Hoyer <sho...@gmail.com> wrote: > I'm open to adding ndrange, and "soft-deprecating" ndindex (i.e., > discouraging its use in our docs, but not actually deprecating it). > Certainly ndrange seems like a small but meaningful improvement in the > interface. > > That said, I'm not convinced this is really worth the trouble. I think the > nested loop is still pretty readable/clear, and there are few times when > I've actually found ndindex() be useful. > > On Mon, Oct 8, 2018 at 12:35 PM Allan Haldane <allanhald...@gmail.com> > wrote: > >> On 10/8/18 12:21 PM, Mark Harfouche wrote: >> > 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like >> > `range`, is not an iterator. Changing this behaviour would likely lead >> > to breaking code that uses that assumption. For example anybody using >> > introspection or code like: >> > >> > ``` >> > indx = np.ndindex(5, 5) >> > next(indx) # Don't look at the (0, 0) coordinate >> > for i in indx: >> > print(i) >> > ``` >> > would break if `ndindex` becomes "not an iterator" >> >> OK, I see now. Just like python3 has separate range and range_iterator >> types, where range is sliceable, we would have separate ndrange and >> ndindex types, where ndrange is sliceable. You're just copying the >> python3 api. That justifies it pretty well for me. >> >> I still think we shouldn't have two functions which do nearly the same >> thing. We should only have one, and get rid of the other. I see two ways >> forward: >> >> * replace ndindex by your ndrange code, so it is no longer an iter. >> This would require some deprecation cycles for the cases that break. >> * deprecate ndindex in favor of a new function ndrange. We would keep >> ndindex around for back-compatibility, with a dep warning to use >> ndrange instead. >> >> Doing a code search on github, I can see that a lot of people's code >> would break if ndindex no longer was an iter. I also like the name >> ndrange for its allusion to python3's range behavior. That makes me lean >> towards the second option of a separate ndrange, with possible >> deprecation of ndindex. >> >> > itertools.product + range seems to be much faster than the current >> > implementation of ndindex >> > >> > (python 3.6) >> > ``` >> > %%timeit >> > >> > for i in np.ndindex(100, 100): >> > pass >> > 3.94 ms ± 19.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) >> > >> > %%timeit >> > import itertools >> > for i in itertools.product(range(100), range(100)): >> > pass >> > 231 µs ± 1.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) >> > ``` >> >> If the new code ends up faster than the old code, that's great, and >> further justification for using ndrange instead of ndindex. I had >> thought using nditer in the old code was fastest. >> >> So as far as I am concerned, I say go ahead with the PR the way you are >> doing it. >> >> Allan >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion@python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@python.org > https://mail.python.org/mailman/listinfo/numpy-discussion >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion