I wonder whether numpy is using the "old" iteration protocol (repeatedly calling x[i] for increasing i until IndexError is raised)?  A quick timing shows that it is indeed slower.  ... actually it's not even clear to me what qualifies as a sequence for `np.array`:
class C:
    def __iter__(self):
        return iter(range(10))   # [0 ... 9] under the new iteration protocol
    def __getitem__(self, i):
        raise IndexError         # [] under the old iteration protocol

np.array(C())
===> array(<__main__.C object at 0x7f3f21ffff28>, dtype=object)

So how can np.array(range(...)) even work?

2016-02-14 22:21 GMT-08:00 Ralf Gommers <ralf.gomm...@gmail.com>:
>
> On Sun, Feb 14, 2016 at 10:36 PM, Charles R Harris
> <charlesr.har...@gmail.com> wrote:
>>
>> On Sun, Feb 14, 2016 at 7:36 AM, Ralf Gommers <ralf.gomm...@gmail.com>
>> wrote:
>>>
>>> On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee <antony....@berkeley.edu>
>>> wrote:
>>>>
>>>> re: no reason why...
>>>> This has nothing to do with Python2/Python3 (I personally stopped using
>>>> Python2 at least 3 years ago.)  Let me put it this way instead: if
>>>> Python3's "range" (or Python2's "xrange") were not a builtin type but a
>>>> type provided by numpy, I don't think it would be controversial at all
>>>> to provide an `__array__` special method to efficiently convert it to an
>>>> ndarray.  It would be the same if `np.array` used a
>>>> `functools.singledispatch` dispatcher rather than an `__array__` special
>>>> method (which is obviously not possible for chronological reasons).
>>>>
>>>> re: iterable vs iterator: check for the presence of the __next__
>>>> special method (or isinstance(x, Iterator) vs. isinstance(x, Iterable)
>>>> and not isinstance(x, Iterator)).
>>>
>>> I think it's good to do something about this, but it's not clear what
>>> the exact proposal is.  I could imagine one or both of:
>>>
>>> - special-case the range() object in array (and asarray/asanyarray?)
>>>   such that array(range(N)) becomes as fast as arange(N).
>>> - special-case all iterators, such that array(range(N)) becomes as
>>>   fast as deque(range(N)).
>>
>> I think the last wouldn't help much, as numpy would still need to
>> determine dimensions and type.
>> I assume that is one of the reasons sparse itself doesn't do that.
>
> Not orders of magnitude, but this shows that there's something to optimize
> for iterators:
>
> In [1]: %timeit np.array(range(100000))
> 100 loops, best of 3: 14.9 ms per loop
>
> In [2]: %timeit np.array(list(range(100000)))
> 100 loops, best of 3: 9.68 ms per loop
>
> Ralf
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
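The iterable-vs-iterator distinction described in the thread can be sketched as follows (a minimal illustration; the helper names are mine, not part of any proposed API):

```python
from collections.abc import Iterable, Iterator

def is_iterator(x):
    # The __next__ check mentioned above; the Iterator ABC performs
    # the same structural test via its __subclasshook__.
    return hasattr(x, "__next__")

def is_plain_iterable(x):
    # Iterable but not yet an iterator (e.g. range, list).  Note that
    # Iterator is a subclass of Iterable, so both checks are needed.
    return isinstance(x, Iterable) and not isinstance(x, Iterator)

r = range(10)
print(is_iterator(r), is_plain_iterable(r))              # False True
print(is_iterator(iter(r)), is_plain_iterable(iter(r)))  # True False
```

This is why a range object can be consumed more than once (each iter() call yields a fresh iterator), while an iterator can only be consumed once, which matters if np.array needs a first pass to determine dimensions and type.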
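For reference, the `__array__` special method mentioned in the thread works like this: a sketch with a hypothetical MyRange type standing in for a numpy-provided range, converting via np.arange rather than element-by-element iteration.

```python
import numpy as np

class MyRange:
    """Hypothetical range-like type; illustrates the __array__ hook."""
    def __init__(self, stop):
        self.stop = stop

    def __array__(self, dtype=None, copy=None):
        # np.array() invokes this hook instead of iterating over the
        # object, so conversion can be as fast as np.arange itself.
        return np.arange(self.stop, dtype=dtype)

a = np.array(MyRange(5))
print(a)  # [0 1 2 3 4]
```

A builtin range cannot grow such a method, which is why the thread discusses special-casing it inside np.array instead.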