Robert Cimrman <cimrman3 <at> ntc.zcu.cz> writes:
>
> Hi,
>
> I am starting a new thread, so that it reaches the interested people.
> Let us discuss improvements to arraysetops (array set operations) at [1]
> (allowing non-unique arrays as function arguments, better naming
> conventions and documentation).
>
> r.
>
> [1] http://projects.scipy.org/numpy/ticket/1133
>
Hi,

These changes look good to me. For point (1) I think we should fold the unique and _nu code into a single function. For point (3) I like in1d - it's shorter than isin1d but is still clear.

What about merging unique and unique1d? They're essentially identical for an array input, but unique uses the builtin set() for non-array inputs and so is around 2-3x faster in this case - see the timings below. Is it worth accepting a speed regression for unique to get rid of the function duplication? (Or can they be combined?)

Neil

In [24]: l = list(np.random.randint(100, size=10000))

In [25]: %timeit np.unique1d(l)
1000 loops, best of 3: 1.9 ms per loop

In [26]: %timeit np.unique(l)
1000 loops, best of 3: 793 µs per loop

In [27]: l = list(np.random.randint(100, size=1000000))

In [28]: %timeit np.unique(l)
10 loops, best of 3: 78 ms per loop

In [29]: %timeit np.unique1d(l)
10 loops, best of 3: 233 ms per loop
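
On the question of whether the two can be combined: here is a minimal sketch of one possible approach, not a description of how NumPy actually implements either function. The name unique_combined is hypothetical. It assumes that array inputs should take the sort-based path and that any other iterable of hashable, mutually comparable items can be deduplicated with the builtin set() before conversion.

```python
import numpy as np


def unique_combined(x):
    """Hypothetical merged unique/unique1d (illustration only, not NumPy's code)."""
    if isinstance(x, np.ndarray):
        # Array input: sort a flattened copy, then keep every element
        # that differs from its predecessor.
        ar = np.sort(x.ravel())
        if ar.size == 0:
            return ar
        keep = np.concatenate(([True], ar[1:] != ar[:-1]))
        return ar[keep]
    # Non-array input (list, tuple, ...): deduplicate with the builtin set()
    # first, so only the unique items are sorted and converted to an array.
    return np.asarray(sorted(set(x)))
```

With this shape, a list input keeps the set() fast path while an array input never pays for the conversion to Python objects. Whether the extra isinstance branch is acceptable is a separate design question.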