On Tue, Nov 6, 2012 at 7:52 PM, Warren Weckesser <warren.weckes...@gmail.com> wrote:
>
> On Tue, Nov 6, 2012 at 8:27 PM, Phillip Feldman <phillip.m.feld...@gmail.com> wrote:
>
>> numpy.unique behaves as I would expect for small inputs like the
>> following:
>>
>> In [12]: x = [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
>>
>> In [13]: unique(x, return_index=True)
>> Out[13]: (array([0, 1, 2, 3]), array([0, 2, 5, 9], dtype=int64))
>>
>> But, when I give it something larger, the return index values do not
>> always correspond to the first occurrences in the input. The documentation
>> is silent on the question of how the return index values are chosen when a
>> given element of x appears more than once. Either the documentation should
>> be clarified, or better yet, the behavior should be changed.
>>
>
> In fact, it was changed (in the master branch on github) several months
> ago, but there has not yet been a release with the changes. The sort
> method that np.unique passes to np.argsort is now 'mergesort', and the
> docstring states that the indices returned are for the first occurrences of
> the unique elements. The new docstring is here:
> http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.unique.html#numpy.unique
>
> See
> https://github.com/numpy/numpy/commit/dbf235169ed3386b359caaa9217f5280bf1d6749
> for the commit, and
> https://github.com/numpy/numpy/blob/master/numpy/lib/arraysetops.py for
> the latest version of the source.
>

That change was backported to 1.6.2, but doesn't work for record/object
arrays. That oversight is fixed in master.

Chuck
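
For anyone following along, here is a minimal sketch of why a stable sort gives first-occurrence indices. It is an illustration only and not the actual code in arraysetops.py; the variable names (perm, is_first, first_index) are made up for this example, and it assumes a plain integer array rather than a record/object array.

```python
import numpy as np

x = np.array([0, 0, 1, 0, 1, 2, 0, 1, 2, 3])

# A stable sort keeps equal elements in their original relative order,
# so within each run of equal values the smallest original index comes first.
perm = np.argsort(x, kind='mergesort')
sorted_x = x[perm]

# Mark the start of each run of equal values in the sorted array.
is_first = np.concatenate(([True], sorted_x[1:] != sorted_x[:-1]))

values = sorted_x[is_first]      # the unique values
first_index = perm[is_first]     # index of each value's first occurrence in x

# Compare against np.unique on a NumPy version that includes the change.
u, idx = np.unique(x, return_index=True)
print(values, first_index)
print(u, idx)
```

With a non-stable sort kind, the element that lands at the start of each run is not guaranteed to be the earliest one in x, which would explain the behavior reported for larger inputs.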
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion