On Thu, Aug 13, 2015 at 9:57 AM, Jaime Fernández del Río <jaime.f...@gmail.com> wrote:
> On Thu, Aug 13, 2015 at 7:59 AM, Nathan Goldbaum <nathan12...@gmail.com>
> wrote:
>
>> On Thu, Aug 13, 2015 at 9:44 AM, Charles R Harris
>> <charlesr.har...@gmail.com> wrote:
>>
>>> On Thu, Aug 13, 2015 at 12:09 AM, Jaime Fernández del Río
>>> <jaime.f...@gmail.com> wrote:
>>>
>>>> On Wed, Aug 12, 2015 at 2:03 PM, Nathan Goldbaum
>>>> <nathan12...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I've been testing the package I spend most of my time on, yt, under
>>>>> numpy 1.10b1 since the announcement went out.
>>>>>
>>>>> I think I've narrowed down and fixed all of the test failures that
>>>>> cropped up except for one last issue. It seems that the behavior of
>>>>> np.digitize with respect to ndarray subclasses has changed since the NumPy
>>>>> 1.9 series. Consider the following test script:
>>>>>
>>>>> ```python
>>>>> import numpy as np
>>>>>
>>>>>
>>>>> class MyArray(np.ndarray):
>>>>>     def __new__(cls, *args, **kwargs):
>>>>>         return np.ndarray.__new__(cls, *args, **kwargs)
>>>>>
>>>>> data = np.arange(100)
>>>>>
>>>>> bins = np.arange(100) + 0.5
>>>>>
>>>>> data = data.view(MyArray)
>>>>>
>>>>> bins = bins.view(MyArray)
>>>>>
>>>>> digits = np.digitize(data, bins)
>>>>>
>>>>> print type(digits)
>>>>> ```
>>>>>
>>>>> Under NumPy 1.9.2, this prints "<type 'numpy.ndarray'>", but under the
>>>>> 1.10 beta, it prints "<class '__main__.MyArray'>"
>>>>>
>>>>> I'm curious why this change was made. Since digitize outputs index
>>>>> arrays, it doesn't make sense to me why it should return anything but a
>>>>> plain ndarray. I see in the release notes that digitize now uses
>>>>> searchsorted under the hood. Is this related?
>>>>>
>>>>
>>>> It is indeed searchsorted's fault, as it returns an object of the same
>>>> type as the needle (the items to search for):
>>>>
>>>> >>> import numpy as np
>>>> >>> class A(np.ndarray): pass
>>>> >>> class B(np.ndarray): pass
>>>> >>> np.arange(10).view(A).searchsorted(np.arange(5).view(B))
>>>> B([0, 1, 2, 3, 4])
>>>>
>>>> I am all for making index-returning functions always return a base
>>>> ndarray, and will be more than happy to send a PR fixing this if there is
>>>> some agreement.
>>>>
>>>
>>> I think that is the right thing to do.
>>>
>>
>> Awesome, I'd appreciate having a PR to fix this. Arguably the return type
>> *could* be the same type as the inputs, but given that it's a behavior
>> change I agree that it's best to add a patch so the output of searchsorted
>> is "sanitized" to be an ndarray before it's returned by digitize.
>>
>
> It is relatively simple to do: just replace Py_TYPE(ap2) with
> &PyArray_Type in this line:
>
> https://github.com/numpy/numpy/blob/maintenance/1.10.x/numpy/core/src/multiarray/item_selection.c#L1725
>
> Then fix all the tests that are expecting searchsorted to return something
> other than a base ndarray. We have already modified nonzero to return base
> ndarrays in this release (see the release notes), so it will go with the
> same theme.
>
> For 1.11 I think we should try to extend this "if it returns an index, it
> will be a base ndarray" to all other functions that don't right now. Then
> sit back and watch AstroPy come down in flames... ;-)))
>
> Seriously, I think this makes a lot of sense, and should be documented as
> the way NumPy handles index arrays.
>
> Anyway, I will try to find time tonight to put this PR together, unless
> someone beats me to it, which I would be totally fine with.
>

PR #6206 it is: https://github.com/numpy/numpy/pull/6206

Jaime

--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his plans for world domination.
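
For anyone who needs plain index arrays regardless of which NumPy version is installed, one caller-side option is to strip the subclass from the result of digitize. The following is only a minimal sketch and says nothing about how PR #6206 is implemented: `digitize_as_ndarray` is a hypothetical helper name, not a NumPy function, and `MyArray` simply mirrors the subclass from the test script earlier in the thread.

```python
import numpy as np


class MyArray(np.ndarray):
    """Trivial ndarray subclass, standing in for a real subclass such as a unit-aware array."""


def digitize_as_ndarray(x, bins, right=False):
    """Hypothetical helper: call np.digitize and force the result to a base ndarray."""
    indices = np.digitize(x, bins, right=right)
    # np.asarray drops any ndarray subclass without copying the data,
    # and passes an existing base ndarray through unchanged.
    return np.asarray(indices)


data = np.arange(100).view(MyArray)
bins = (np.arange(100) + 0.5).view(MyArray)

digits = digitize_as_ndarray(data, bins)
print(type(digits))
```

Because `np.asarray` is a no-op for base ndarrays, the wrapper should behave the same whether or not the underlying digitize call already returns a plain ndarray.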
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion