Re: [Numpy-discussion] strange behavior of numpy.unique

Charles R Harris Wed, 07 Nov 2012 13:48:16 -0800

On Tue, Nov 6, 2012 at 7:52 PM, Warren Weckesser <warren.weckes...@gmail.com> wrote:


>
>
> On Tue, Nov 6, 2012 at 8:27 PM, Phillip Feldman <
> phillip.m.feld...@gmail.com> wrote:
>
>> numpy.unique behaves as I would expect for small inputs like the
>> following:
>>
>> In [12]: x= [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
>>
>> In [13]: unique(x, return_index=True)
>> Out[13]: (array([0, 1, 2, 3]), array([0, 2, 5, 9], dtype=int64))
>>
>> But, when I give it something larger, the return index values do not
>> always correspond to the first occurrences in the input. The documentation
>> is silent on the question of how the return index values are chosen when a
>> given element of x appears more than once. Either the documentation should
>> be clarified, or better yet, the behavior should be changed.
>>
>
>
> In fact, it was changed (in the master branch on github) several months
> ago, but there has not yet been a release with the changes.  The sort
> method that np.unique passes to np.argsort is now 'mergesort', and the
> docstring states that the indices returned are for the first occurrences of
> the unique elements.  The new docstring is here:
> http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.unique.html#numpy.unique
>
> See
> https://github.com/numpy/numpy/commit/dbf235169ed3386b359caaa9217f5280bf1d6749
> for the commit, and
> https://github.com/numpy/numpy/blob/master/numpy/lib/arraysetops.py for
> the latest version of the source.
>
>
That change was backported to 1.6.2, but doesn't work for record/object
arrays. That oversight is fixed in master.
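
For anyone wondering why the sort kind matters here, the following is a
minimal sketch of the idea, not the actual code in arraysetops.py. It assumes
a plain integer array, and the names perm, is_first, values and first_index
are only illustrative. A stable sort such as 'mergesort' keeps equal elements
in their original order, so the first entry of each run of equal values in the
sorted array is also the first occurrence in the input.

```python
import numpy as np

x = np.array([0, 0, 1, 0, 1, 2, 0, 1, 2, 3])

# Stable argsort: ties are kept in their original input order.
perm = x.argsort(kind='mergesort')
sorted_x = x[perm]

# True at the start of each run of equal values in the sorted array.
is_first = np.concatenate(([True], sorted_x[1:] != sorted_x[:-1]))

values = sorted_x[is_first]     # the unique values, in sorted order
first_index = perm[is_first]    # index of the first occurrence of each value

# What np.unique itself reports for the same input.
u, idx = np.unique(x, return_index=True)
```

With an unstable sort kind the order within each run of equal values is not
guaranteed, which would be consistent with the indices Phillip saw on larger
inputs.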

Chuck

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
