[Numpy-discussion] Obscure code in concatenate code path?
Hi, While writing some tests for np.concatenate, I ran foul of this code: if (axis = NPY_MAXDIMS) { ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); } else { ret = PyArray_ConcatenateArrays(narrays, arrays, axis); } in multiarraymodule.c So, if the user passes an axis = (by default) 32 the arrays to concatenate get flattened, and must all have the same number of elements (it turns out). This seems obscure. Obviously it is not likely that someone would pass in an axis no = 32 by accident, but if they did, they would not get the result they expect. Is there some code-path that needs this? Is there another way of doing it? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release
On Wed, Sep 12, 2012 at 4:19 PM, Nathaniel Smith n...@pobox.com wrote: On Wed, Sep 12, 2012 at 2:46 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I just noticed that this works for numpy 1.6.1: In [36]: np.concatenate(([2, 3], [1]), 1) Out[36]: array([2, 3, 1]) but the beta release branch: In [3]: np.concatenate(([2, 3], [1]), 1) --- IndexErrorTraceback (most recent call last) /Users/mb312/ipython-input-3-0fa244c8aaa8 in module() 1 np.concatenate(([2, 3], [1]), 1) IndexError: axis 1 out of bounds [0, 1) In the interests of backward compatibility maybe it would be better to raise a warning for this release, rather than an error? Yep, that'd be a good idea. Want to write a patch? :-) https://github.com/numpy/numpy/pull/440 Thanks, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Contiguity of result of astype changed - intentional?
Hi, On Wed, Sep 12, 2012 at 10:24 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi Matt, On Wed, Sep 12, 2012 at 1:27 PM, Travis Oliphant tra...@continuum.io wrote: Is this intended? Is there a performance reason to keep the same strides in 1.7.0? I believe that this could be because in 1.7.0, NumPy was changed so that copying does not always default to C-order but to Keep-order.So, in 1.7.0, the strides of b is governed by the strides of a, while in 1.6.1, the strides of b is C-order (because of the copy). Thanks for the reply. So maybe the bottom line is that the user should not assume any contiguity from ``astype``? If that's the case I'll submit a docstring PR to say that. Yes, that would be a great addition to the docstring. Mark, can you confirm this is the desired behavior? Ondrej, this would be something to put in the release notes, if it isn't already. If you could submit the PR with the docs, that'd be awesome. In the meantime, I've created an issue for it: https://github.com/numpy/numpy/issues/437 and put it into my TODO list: https://github.com/numpy/numpy/issues/396 Sorry - inadequate research on my part - the current docstring for ``astype`` is clear and comprehensive, and the change was obviously intentional. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Obscure code in concatenate code path?
On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, While writing some tests for np.concatenate, I ran foul of this code: if (axis = NPY_MAXDIMS) { ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); } else { ret = PyArray_ConcatenateArrays(narrays, arrays, axis); } in multiarraymodule.c How deeply weird. So, if the user passes an axis = (by default) 32 the arrays to concatenate get flattened, and must all have the same number of elements (it turns out). This seems obscure. Obviously it is not likely that someone would pass in an axis no = 32 by accident, but if they did, they would not get the result they expect. Is there some code-path that needs this? Is there another way of doing it? This behaviour seems to be older -- I can reproduce it empirically with 1.6.2. But the actual code you encountered was introduced along with PyArray_ConcatenateFlattenedArrays itself by Mark Wiebe in 9194b3af. So @Mark, you were the last one to look at this closely, any thoughts? :-) Though, in 1.6.2, there doesn't seem to be any requirement that the arrays have the same length: In [11]: np.concatenate(([[1, 2]], [[3]]), axis=100) Out[11]: array([1, 2, 3]) My first guess is that this was some ill-advised defensive programming thing where someone wanted to do *something* with these weird axis arguments, and picked *something* at more-or-less random. I like that theory better than the one where someone introduced this on purpose and then used it... It might even be that rare case where the best solution is to just rip it out and see if anyone notices. -n ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Obscure code in concatenate code path?
On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote: On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, While writing some tests for np.concatenate, I ran foul of this code: if (axis = NPY_MAXDIMS) { ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); } else { ret = PyArray_ConcatenateArrays(narrays, arrays, axis); } in multiarraymodule.c How deeply weird This is expected behavior. It's how the concatenate Python function manages to handle axis=None to flatten the arrays before concatenation.This has been in NumPy since 1.0 and should not be changed without deprecation warnings which I am -0 on. Now, it is true that the C-API could have been written differently (I think this is what Mark was trying to encourage) so that there are two C-API functions and they are dispatched separately from the array_concatenate method depending on whether or not a None is passed in. But, the behavior is documented and has been for a long time. Reference PyArray_AxisConverter (which turns a None Python argument into an axis=MAX_DIMS). This is consistent behavior throughout the C-API. -Travis ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Obscure code in concatenate code path?
On Thu, Sep 13, 2012 at 9:01 AM, Travis Oliphant tra...@continuum.iowrote: On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote: On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, While writing some tests for np.concatenate, I ran foul of this code: if (axis = NPY_MAXDIMS) { ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); } else { ret = PyArray_ConcatenateArrays(narrays, arrays, axis); } in multiarraymodule.c How deeply weird This is expected behavior. Heh, I guess expected is subjective: In [23]: np.__version__ Out[23]: '1.6.1' In [24]: a = zeros((2,2)) In [25]: b = ones((2,3)) In [26]: concatenate((a, b), axis=0) # Expected error. --- ValueErrorTraceback (most recent call last) /Users/warren/gitwork/class-material/demo/pytables/ipython-input-26-7cefb735e507 in module() 1 concatenate((a, b), axis=0) # Expected error. ValueError: array dimensions must agree except for d_0 In [27]: concatenate((a, b), axis=1) # Normal behavior. Out[27]: array([[ 0., 0., 1., 1., 1.], [ 0., 0., 1., 1., 1.]]) In [28]: concatenate((a, b), axis=2) # Cryptic error message. --- ValueErrorTraceback (most recent call last) /Users/warren/gitwork/class-material/demo/pytables/ipython-input-28-0bce84c34ef1 in module() 1 concatenate((a, b), axis=2) # Cryptic error message. ValueError: bad axis1 argument to swapaxes In [29]: concatenate((a, b), axis=32) # What the... ? Out[29]: array([ 0., 0., 0., 0., 1., 1., 1., 1., 1., 1.]) I would expect an error, consistent with the behavior when 1 axis 32. Warren It's how the concatenate Python function manages to handle axis=None to flatten the arrays before concatenation.This has been in NumPy since 1.0 and should not be changed without deprecation warnings which I am -0 on. Now, it is true that the C-API could have been written differently (I think this is what Mark was trying to encourage) so that there are two C-API functions and they are dispatched separately from the array_concatenate method depending on whether or not a None is passed in. But, the behavior is documented and has been for a long time. Reference PyArray_AxisConverter (which turns a None Python argument into an axis=MAX_DIMS). This is consistent behavior throughout the C-API. -Travis ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Obscure code in concatenate code path?
On Thu, Sep 13, 2012 at 6:39 PM, Warren Weckesser warren.weckes...@enthought.com wrote: I would expect an error, consistent with the behavior when 1 axis 32. In that case, you are hitting the dimension limit. np.concatenate((a,b), axis=31) ValueError: bad axis1 argument to swapaxes Where axis=32, axis=3500, axis=None all return the flattened array. I have been trying with other functions and got something interesting. With the same a, b as before: np.sum((a,b), axis=0) ValueError: operands could not be broadcast together with shapes (2) (3) np.sum((a,b), axis=1) array([[ 0. 0.], [ 2. 2. 2.]], dtype=object) np.sum((a,b), axis=2) ValueError: axis(=2) out of bounds This is to be expected, but now this is not consistent: np.sum((a,b), axis=32) ValueError: operands could not be broadcast together with shapes (2) (3) np.sum((a,b), axis=500) ValueError: axis(=500) out of bounds ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Obscure code in concatenate code path?
Hi, On Thu, Sep 13, 2012 at 3:01 PM, Travis Oliphant tra...@continuum.io wrote: On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote: On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, While writing some tests for np.concatenate, I ran foul of this code: if (axis = NPY_MAXDIMS) { ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); } else { ret = PyArray_ConcatenateArrays(narrays, arrays, axis); } in multiarraymodule.c How deeply weird This is expected behavior. It's how the concatenate Python function manages to handle axis=None to flatten the arrays before concatenation.This has been in NumPy since 1.0 and should not be changed without deprecation warnings which I am -0 on. Now, it is true that the C-API could have been written differently (I think this is what Mark was trying to encourage) so that there are two C-API functions and they are dispatched separately from the array_concatenate method depending on whether or not a None is passed in. But, the behavior is documented and has been for a long time. Reference PyArray_AxisConverter (which turns a None Python argument into an axis=MAX_DIMS). This is consistent behavior throughout the C-API. How about something like: #define NPY_NONE_AXIS NPY_MAXDIMS to make it clearer what is intended? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Obscure code in concatenate code path?
This is expected behavior. It's how the concatenate Python function manages to handle axis=None to flatten the arrays before concatenation. This has been in NumPy since 1.0 and should not be changed without deprecation warnings which I am -0 on. Now, it is true that the C-API could have been written differently (I think this is what Mark was trying to encourage) so that there are two C-API functions and they are dispatched separately from the array_concatenate method depending on whether or not a None is passed in. But, the behavior is documented and has been for a long time. Reference PyArray_AxisConverter (which turns a None Python argument into an axis=MAX_DIMS). This is consistent behavior throughout the C-API. How about something like: #define NPY_NONE_AXIS NPY_MAXDIMS to make it clearer what is intended? +1 -Travis ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release
Hi, On Thu, Sep 13, 2012 at 11:31 AM, Matthew Brett matthew.br...@gmail.com wrote: On Wed, Sep 12, 2012 at 4:19 PM, Nathaniel Smith n...@pobox.com wrote: On Wed, Sep 12, 2012 at 2:46 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I just noticed that this works for numpy 1.6.1: In [36]: np.concatenate(([2, 3], [1]), 1) Out[36]: array([2, 3, 1]) but the beta release branch: In [3]: np.concatenate(([2, 3], [1]), 1) --- IndexErrorTraceback (most recent call last) /Users/mb312/ipython-input-3-0fa244c8aaa8 in module() 1 np.concatenate(([2, 3], [1]), 1) IndexError: axis 1 out of bounds [0, 1) In the interests of backward compatibility maybe it would be better to raise a warning for this release, rather than an error? Yep, that'd be a good idea. Want to write a patch? :-) https://github.com/numpy/numpy/pull/440 Thinking about the other thread, and the 'number of elements' check, I noticed this: In [51]: np.__version__ Out[51]: '1.6.1' In [52]: r4 = range(4) In [53]: r3 = range(3) In [54]: np.concatenate((r4, r3), None) Out[54]: array([0, 1, 2, 3, 0, 1, 2]) but: In [46]: np.__version__ Out[46]: '1.7.0rc1.dev-ea23de8' In [47]: np.concatenate((r4, r3), None) --- ValueErrorTraceback (most recent call last) /Users/mb312/tmp/ipython-input-47-e354b8880702 in module() 1 np.concatenate((r4, r3), None) ValueError: all the input arrays must have same number of elements The change requiring the same number of elements appears to have been added explicitly by Mark in commit 9194b3af . Mark - what was the reason for that check? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release
Yep, that'd be a good idea. Want to write a patch? :-) https://github.com/numpy/numpy/pull/440 Thinking about the other thread, and the 'number of elements' check, I noticed this: In [51]: np.__version__ Out[51]: '1.6.1' In [52]: r4 = range(4) In [53]: r3 = range(3) In [54]: np.concatenate((r4, r3), None) Out[54]: array([0, 1, 2, 3, 0, 1, 2]) but: In [46]: np.__version__ Out[46]: '1.7.0rc1.dev-ea23de8' In [47]: np.concatenate((r4, r3), None) --- ValueErrorTraceback (most recent call last) /Users/mb312/tmp/ipython-input-47-e354b8880702 in module() 1 np.concatenate((r4, r3), None) ValueError: all the input arrays must have same number of elements The change requiring the same number of elements appears to have been added explicitly by Mark in commit 9194b3af . Mark - what was the reason for that check? This looks like a regression. That should still work. -Travis Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion