Re: [Numpy-discussion] Reductions with nditer working only with the last axis
On Thu, Sep 27, 2012 at 6:08 PM, Sergio Pascual sergio.pa...@gmail.com wrote:

> Hello,
>
> I'm trying to understand how to work with nditer to do a reduction, in my case converting a 3d array into a 2d array.
>
> I followed the help here http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html and managed to create a function that applies reduction over the last axis of the input. With this function
>
>     def nditer_sum(data, red_axes):
>         it = numpy.nditer([data, None],
>                           flags=['reduce_ok', 'external_loop'],
>                           op_flags=[['readonly'], ['readwrite', 'allocate']],
>                           op_axes=[None, red_axes])
>         it.operands[1][...] = 0
>         for x, y in it:
>             y[...] = x.sum()
>         return it.operands[1]
>
> I can get something equivalent to data.sum(axis=2)
>
>     data = numpy.arange(2*3*4).reshape((2,3,4))
>     nditer_sum(data, [0, 1, -1])
>     [[ 6 22 38]
>      [54 70 86]]
>     data.sum(axis=2)
>     [[ 6 22 38]
>      [54 70 86]]
>
> So to get something equivalent to data.sum(axis=0) I thought that it was enough to change the argument red_axes to [-1, 0, 1]. But the result is quite different.
>
>     data = numpy.arange(2*3*4).reshape((2,3,4))
>     data.sum(axis=0)
>     [[12 14 16 18]
>      [20 22 24 26]
>      [28 30 32 34]]
>     nditer_sum(data, [-1, 0, 1])
>     [[210 210 210 210]
>      [210 210 210 210]
>      [210 210 210 210]]
>
> In the for loop inside nditer_sum (for x, y in it:), the iterator is looping 2 times and giving an array of length 12 each time, instead of looping 12 times and giving an array of length 2 each time. I have read the numpy documentation several times and googled about this to no avail.
>
> Does anybody have an example of a reduction in the first axis of an array using nditer? Is this a bug?
>
> Regards, Sergio

The example from the link shows how to do a reduction with y[...] += x. If you replace y[...] = x.sum() with y[...] += x, then it works:

    nditer_sum(data, [-1,0,1])
    array([[12, 14, 16, 18],
           [20, 22, 24, 26],
           [28, 30, 32, 34]])
    data.sum(axis=0)
    array([[12, 14, 16, 18],
           [20, 22, 24, 26],
           [28, 30, 32, 34]])
    nditer_sum(data, [0,-1,1])
    array([[12, 15, 18, 21],
           [48, 51, 54, 57]])
    data.sum(axis=1)
    array([[12, 15, 18, 21],
           [48, 51, 54, 57]])

I think that is because sum() already reduces all axes by default, so x.sum() adds up the whole chunk the iterator hands you (here 210 is the sum of all twelve results) instead of accumulating element by element.

Regards, Han

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
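For reference, here is a minimal sketch of the accumulating version, following the reduction pattern from the nditer documentation. Two things are my own assumptions rather than part of the thread: I dropped the 'external_loop' flag so that x and y are single elements (which keeps the example independent of how the iterator chunks the inner loop), and I used the context-manager form that newer NumPy versions recommend for writable operands.

```python
import numpy as np

def nditer_sum(data, red_axes):
    # red_axes maps each axis of `data` to an axis of the output;
    # -1 marks the axis that is reduced away.
    it = np.nditer([data, None],
                   flags=['reduce_ok'],
                   op_flags=[['readonly'], ['readwrite', 'allocate']],
                   op_axes=[None, red_axes])
    with it:
        it.operands[1][...] = 0      # the allocated output starts uninitialised
        for x, y in it:
            y[...] += x              # accumulate instead of overwriting
        return it.operands[1]

data = np.arange(2 * 3 * 4).reshape((2, 3, 4))
sum_axis0 = nditer_sum(data, [-1, 0, 1])
sum_axis1 = nditer_sum(data, [0, -1, 1])
sum_axis2 = nditer_sum(data, [0, 1, -1])
```

Each result should match the corresponding data.sum(axis=k); the per-element loop is slow in pure Python, so this is only meant to illustrate the op_axes mapping.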
Re: [Numpy-discussion] Behavior of .base
On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant tra...@continuum.io wrote:

> Hey all,
>
> In a github-discussion with Gael and Nathaniel, we came up with a proposal for .base that we should put before this list.
>
> Traditionally, .base has always pointed to None for arrays that owned their own memory and to the most immediate array object parent for arrays that did not own their own memory. There was a long-standing issue related to running out of stack space that this behavior created.
>
> Recently this behavior was altered so that .base always points to the original object holding the memory (something exposing the buffer interface). This created some problems for users who relied on the fact that most of the time .base pointed to an instance of an array object.
>
> The proposal here is to change the behavior of .base for arrays that don't own their own memory so that the .base attribute of an array points to the most original object that is still an instance of the type of the array. This would go into the 1.7.0 release so as to correct the issues reported.
>
> What are reactions to this proposal?
>
> -Travis

I think the current behaviour of the .base attribute is much more stable and predictable than past behaviour. For views for instance, this makes sure you don't hold references to 'intermediate' views, but always point to the original *base* object. Also, I think a lot of internal logic depends on this behaviour, so I am not in favour of changing this back (yet) again.

Also, considering that this behaviour already exists in past versions of NumPy, namely 1.6, and is very fundamental to how arrays work, I find it strange that it is now up for change in 1.7 at the last minute.
Re: [Numpy-discussion] Behavior of .base
On Sun, Sep 30, 2012 at 10:35 PM, Travis Oliphant tra...@continuum.io wrote:

> We are not talking about changing it back. The change in 1.6 caused problems that need to be addressed. Can you clarify your concerns? The proposal is not a major change to the behavior on master, but it does fix a real issue.
>
> -- Travis Oliphant (on a mobile)

Well, the current behaviour makes sure you can have an endless chain of views derived from each other without keeping a copy of each view alive. If I understand correctly, you propose to change this behaviour to where it would keep a copy of each view alive.

My concern is that the problems that occurred from the 1.6 change are now seen as paramount above a correct implementation. There are problems with backward compatibility, but most of these are due to lack of documentation and testing. And now there will be a lot of people depending on the new behaviour, which is also something to take into account.
Re: [Numpy-discussion] Behavior of .base
On Sun, Sep 30, 2012 at 10:55 PM, Travis Oliphant tra...@continuum.io wrote:

> I think you are misunderstanding the proposal. The proposal is to traverse the views as far as you can but stop just short of having base point to an object of a different type. This fixes the infinite chain of views problem but also fixes the problem sklearn was having with base pointing to an unexpected mmap object.
>
> -- Travis Oliphant (on a mobile)

Ah, sorry, I get it. You mean to make sure that base is an object of type ndarray. No problems there. :-)
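To make the "chain of views" point concrete, here is a small sketch of what I would expect with plain ndarrays on NumPy 1.7 or later: a view of a view reports the array that owns the memory as its .base, so intermediate views are not kept alive. The variable names are just for illustration.

```python
import numpy as np

a = np.arange(10)   # owns its memory
b = a[2:]           # view of a
c = b[1:]           # view of a view

owner_has_no_base = a.base is None
view_points_to_owner = b.base is a
chain_is_collapsed = c.base is a      # not b: the intermediate view is skipped
c_owns_data = c.flags.owndata         # False, c only borrows a's buffer
```

The case discussed in the thread (an ndarray backed by an mmap or another buffer object) is not covered by this sketch.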
Re: [Numpy-discussion] Status of fixing bugs for the 1.7.0rc1 release
[snip]

> Hello,
>
> I ran some compatibility tests on Windows, using numpy-MKL-1.7.x.dev.win-amd64-py2.7 with packages built against numpy-MKL-1.6.2.
>
> There are new test failures in scipy, bottleneck, pymc, and mvpa2 of the following types:
>
>     IndexError: too many indices
>     ValueError: negative dimensions are not allowed
>
> The test results are at http://www.lfd.uci.edu/~gohlke/pythonlibs/tests/20120916-win-amd64-py2.7-numpy-MKL-1.7.0rc1.dev-50f71cb/
>
> Christoph

Hi,

https://github.com/numpy/numpy/pull/445 should fix "negative dimensions are not allowed"; the other one I have not yet been able to pinpoint.

Regards, Han
Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release
Okay, sent in a pull request: https://github.com/numpy/numpy/pull/443.
Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release
Yeah, that merge was fast. :-)

Regards, Han
Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release
You're welcome. I do not have many expectations; only those you can expect from an open source project. ;-)

On Sat, Sep 15, 2012 at 10:33 PM, Travis Oliphant tra...@continuum.io wrote:

> It's very nice to get your help. I hope I haven't inappropriately set expectations :-)
>
> -Travis
>
> On Sep 15, 2012, at 3:14 PM, Han Genuit wrote:
>
>> Yeah, that merge was fast. :-)
>>
>> Regards, Han
Re: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release
I think there is something wrong with the implementation. I would expect each incoming array in PyArray_ConcatenateFlattenedArrays to be flattened and the sizes of all of them added into a one-dimensional shape. Now the shape is two-dimensional, which does not make sense to me. Also, the requirement that all incoming arrays have equal sizes only makes sense when you want to stack them into a two-dimensional array, which makes it unnecessarily complicated.

The difficulty here is to use PyArray_CopyAsFlat without having to transform/copy each incoming array to the priority dtype, because they can have different item sizes between them, but other than that it should be pretty straightforward, imo.
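At the Python level, the behaviour I would expect looks like the sketch below: with axis=None every input is flattened, the sizes are simply added up, and inputs with different sizes and dtypes are fine. This uses the public np.concatenate call as it behaves in released NumPy, not the C function under discussion, so treat it as an illustration of the expectation rather than of the implementation.

```python
import numpy as np

a = np.arange(6, dtype=np.int32).reshape((2, 3))   # 6 elements, 2-D, int32
b = np.array([1.5, 2.5])                           # 2 elements, 1-D, float64

# axis=None: flatten each input, then join them end to end
flat = np.concatenate((a, b), axis=None)

flat_shape = flat.shape    # one-dimensional, 6 + 2 elements
flat_dtype = flat.dtype    # promoted to a common dtype
```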
Re: [Numpy-discussion] sum and prod
> Is the difference between prod and sum intentional? I would expect that numpy.prod would also work on a generator, just like numpy.sum. Whatever the correct result may be, I would expect them to have the same behavior with respect to a generator argument.

I found out that np.sum() has some special treatment in fromnumeric.py, where in case of a generator argument it uses the Python sum() function instead of the NumPy one. This is not the case for np.prod(), where the generator argument stays NPY_OBJECT in PyArray_GetArrayParamsFromObject.

There is no NumPy code for handling generators, except for np.fromiter(), but that needs a dtype (which cannot be inferred automatically before running the generator). It might be more consistent to add special generator cases to other NumPy functions as well, using Python reduce() or imap(), but I'm not sure about the best way to solve this.
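Until the two functions agree, a workaround is to materialise the generator yourself before reducing. The sketch below shows both routes mentioned above: np.fromiter() with an explicit dtype, and a pure-Python reduce. The make_values() generator is a made-up stand-in for whatever generator you have.

```python
import functools
import operator

import numpy as np

def make_values():
    # hypothetical generator, stands in for the user's own
    return (x for x in [1, 2, 3, 4])

# Route 1: turn the generator into an array first (dtype must be given)
arr = np.fromiter(make_values(), dtype=np.int64)
total = np.sum(arr)
product = np.prod(arr)

# Route 2: stay in pure Python, analogous to what sum() does for np.sum
py_product = functools.reduce(operator.mul, make_values(), 1)
```

Note that a generator can only be consumed once, which is why make_values() is called again for the second route.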
Re: [Numpy-discussion] sum and prod
Hi,

Maybe try something like this?

    args = np.array([4,8])
    np.prod(args > 0)
    1
    np.sum(args > 0)
    2

Cheers, Han
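Spelled out as a script, with the more explicit equivalents next to it. Reading the product of a boolean array as "all true" and the sum as "how many are true" is my interpretation of the snippet above; np.all() and np.count_nonzero() say the same thing more directly.

```python
import numpy as np

args = np.array([4, 8])
positive = args > 0                       # boolean array

all_positive_via_prod = np.prod(positive)          # 1 only if every entry is True
count_positive_via_sum = np.sum(positive)          # number of True entries

all_positive = np.all(positive)                    # same test, as a boolean
count_positive = np.count_nonzero(positive)        # same count
```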
Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)
Hi,

Instead of putting up a pull request that reverts all the 25000 lines of code that have been written to support an NA mask, why don't you set up a pull request that uses the current code base to implement your own ideas on how it should work?
Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)
On Sun, Oct 30, 2011 at 12:47 AM, Eric Firing efir...@hawaii.edu wrote:

> On 10/29/2011 12:02 PM, Olivier Delalleau wrote:
>
>> I haven't been following the discussion closely, but wouldn't it be instead: a.mask[0:2] = True?
>
> That would be consistent with numpy.ma and the opposite of Mark's implementation. I can live with either, but I much prefer the numpy.ma version because it fits with the use of bit-flags for editing data; set bit 1 if it fails check A, set bit 2 if it fails check B, etc. So, if it evaluates as True, there is a problem, and the value is masked *out*.

I think in Mark's implementation it works the same:

    a = np.arange(3, maskna=True)
    a[1] = np.NA
    a
    array([0, NA, 2])
    np.isna(a)
    array([False, True, False], dtype=bool)

This is more consistent than using False to represent an NA mask, I agree.
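For comparison, the numpy.ma convention Eric refers to can be sketched as follows. The maskna=True / np.NA / np.isna calls above belong to the experimental development branch and, as far as I know, are not available in released NumPy, so this example uses only numpy.ma.

```python
import numpy as np

a = np.ma.array(np.arange(3))
a[1] = np.ma.masked                 # mask out the middle element

mask = np.ma.getmaskarray(a)        # True means "masked out"
filled = a.filled(-1)               # replace masked entries to inspect them
```

Here mask plays the role of np.isna(a) in the session above: True flags the problem value.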
Re: [Numpy-discussion] consensus (was: NA masks in the next numpy release?)
To be honest, you have been slandering a lot, also in previous discussions, to get what you wanted. This is not a healthy way of discussion, nor does it help in any way. There have been many people willing to listen and agree with you on points; and this is exactly what discussion is all about, but where they might agree on some, they might disagree on others. When you start pulling the - people who won't listen to me are evil - card, it might have some effect the first time, but the second and third time they see what's coming.. o/
Re: [Numpy-discussion] NA masks in the next numpy release?
There is a way to assign whole masks in the current implementation:

    a = np.arange(9, maskna=True).reshape((3,3))
    a
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    mask = np.array([[False, False, True],
                     [False, True, False],
                     [True, False, True]])
    np.copyto(a, np.NA, where=mask)
    a
    array([[0, 1, NA],
           [3, NA, 5],
           [NA, 7, NA]])

I think the "ValueError: Cannot assign NA to an array which does not support NAs" when trying to copy an array with a mask to an array without a mask is a bug.

    a = np.arange(9, maskna=True).reshape((3,3))
    a.flags.maskna
    True
    b = a.copy(maskna=False)
    b.flags.maskna
    False

It should be possible to remove a mask when copying an array.
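The same two operations (assigning a whole mask, then getting an unmasked array back) can be sketched with numpy.ma, which is what released NumPy offers. This is an analogy under my own assumption that numpy.ma is an acceptable stand-in; it does not use the maskna API shown above.

```python
import numpy as np

data = np.arange(9).reshape((3, 3))
mask = np.array([[False, False, True],
                 [False, True, False],
                 [True, False, True]])

a = np.ma.array(data, mask=mask)     # assign a whole mask at once
n_masked = np.ma.count_masked(a)     # how many entries are masked
visible_sum = a.sum()                # reductions skip masked entries

plain = np.ma.getdata(a)             # plain ndarray again, mask dropped
```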
Re: [Numpy-discussion] NA masks in the next numpy release?
Yes, to further iterate on that, you can also create multiple masked views, each with its own mask properties. It would be ambiguous to mix a bit-pattern NA together with standard NAs in the same mask, but you can make different specialized masked views on the same data.

Also, I like the short and concise abbreviation for 'Not Applicable', NA. It has more common uses than IGNORE. (See also here: http://www.johndcook.com/R_language_for_programmers.html#missing)

Concerning the assignment, it is a bit implicit, I agree, but the representation and application of masks is also implicit. I think you only have to know that NA will be a mask assignment and not a data assignment.
Re: [Numpy-discussion] NA masks in the next numpy release?
There is also:

Missing/accumulating data
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057406.html
An NA compromise idea -- many-NA
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057408.html
NEPaNEP lessons - was: alterNEP
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057435.html
NA/Missing Data Conference Call Summary
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057474.html
HPC missing data - was: NA/Missing Data Conference Call Summary
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057482.html
using the same vocabulary for missing value ideas
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057485.html
towards a more productive missing values/masked arrays discussion...
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057511.html
miniNEP1: where= argument for ufuncs
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057513.html
miniNEP 2: NA support via special dtypes
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057542.html
Missing Data development plan
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057567.html
Missing Values Discussion
  http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057579.html
NA masks for NumPy are ready to test
  http://mail.scipy.org/pipermail/numpy-discussion/2011-August/058103.html
Re: [Numpy-discussion] NA masks in the next numpy release?
Well, if I may have a say, I think that an open source project is especially open when users as developers can contribute to the code base and can participate in discussions on how to improve the existing designs and ideas. I do not think a project is open when it crumbles down into politics.. I have seen a lot of work done by Mark especially to ensure that everyone had a say in what he was doing, up to the point where this might not be fun anymore. And from what I can see at the time, which was back in August, everyone has had plenty of opportunity to discuss or contribute to the specific changes that were made. This was an open contribution to the NumPy code, not some cooked up shady business by high and mighty developers and I, for one, am happy with how it turned out.
Re: [Numpy-discussion] Crash on (un-orthodox) __import__
> Still, it shouldn't segfault, and it's worth figuring out why it does. gdb has been mostly unenlightening for me since gdb won't let me navigate the traceback.

You could try faulthandler, it prints the (python) traceback after a crash: http://pypi.python.org/pypi/faulthandler/
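A minimal way to switch it on, as a sketch: enable the handler before running the code that crashes. The log file name is a placeholder of mine; writing to a file is optional (by default the traceback goes to stderr), and on Python 3.3 and later faulthandler ships with the standard library, so the PyPI package is only needed on older interpreters.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "numpy_import_crash.log")
log_file = open(log_path, "w")   # must stay open while the handler is enabled

# On a fatal signal (e.g. SIGSEGV) dump the Python traceback of all threads.
faulthandler.enable(file=log_file, all_threads=True)

# ... run the unorthodox __import__ call that segfaults here ...
```

After the crash, the log should show which Python line was executing when the segfault happened, which narrows down where to look with gdb.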
Re: [Numpy-discussion] Fancy indexing with masks
2011/9/27 Olivier Delalleau sh...@keba.be

> 2011/9/27 Zbigniew Jędrzejewski-Szmek zbys...@in.waw.pl
>
>> On 09/22/2011 12:09 PM, Pauli Virtanen wrote:
>>
>>> Thu, 22 Sep 2011 08:12:12 +0200, Han Genuit wrote:
>>> [clip]
>>>
>>>> I also noticed that it does strange things when using a list:
>>>>
>>>>     c[[True, False, True]]
>>>>     array([[3, 4, 5],
>>>>            [0, 1, 2],
>>>>            [3, 4, 5]])
>>>
>>> It casts the list with booleans to an integer array. Probably shouldn't work like that...
>>
>> Changing that would require looping over the list first to check if everything is a boolean, or maybe just looking at the first few elements. Either way pretty ugly.
>>
>> So I guess that the current (slightly surprising) behaviour has to stay.
>
> Ugly implementation is better than ugly behavior IMO.
>
> -=- Olivier

It should also be possible to convert the list to a NumPy array before doing the actual indexing, then you won't lose consistency.
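On the user side, the conversion I mean looks like the sketch below: turn the list into a boolean array explicitly, so it is always treated as a mask and never as integer indices. The names are illustrative, and the mask has one entry per row of c.

```python
import numpy as np

c = np.arange(12).reshape((4, 3))
flags = [True, False, True, False]        # plain Python list of booleans

mask = np.asarray(flags, dtype=bool)      # explicit boolean array
selected = c[mask]                        # rows where the mask is True

# The equivalent integer index, for comparison
int_index = np.flatnonzero(mask)
selected_by_index = c[int_index]
```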
Re: [Numpy-discussion] Fancy indexing with masks
2011/9/20 Stéfan van der Walt ste...@sun.ac.za

> On Tue, Sep 20, 2011 at 12:43 AM, Robert Kern robert.k...@gmail.com wrote:
>
>> If the array is short in a dimension, it gets implicitly continued with Falses. You can see this in one dimension: [...] I honestly don't know if this is documented or tested anywhere or even if this existed in older versions.
>
> The behaviour is already present in 1.4, so I guess it's too late to insert a shape check now?

There already is a shape check present in the development version[1]:

    a = np.arange(10)
    b = np.array([False, True, False])
    a[b]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: operands could not be broadcast together with shapes (10) (3)

But it does not seem to work on multidimensional arrays:

    c = np.arange(12).reshape((4,3))
    c[b]
    array([[3, 4, 5]])

I also noticed that it does strange things when using a list:

    c[[True, False, True]]
    array([[3, 4, 5],
           [0, 1, 2],
           [3, 4, 5]])

Regards, Han

[1] See also: http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057870.html
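If you want the check regardless of NumPy version, it can be done by hand before indexing. The helper below, checked_mask_index, is a hypothetical name of mine and not a NumPy function; it requires the mask shape to match the leading dimensions of the array and raises otherwise.

```python
import numpy as np

def checked_mask_index(arr, mask):
    """Index arr with a boolean mask after an explicit shape check."""
    mask = np.asarray(mask, dtype=bool)
    expected = arr.shape[:mask.ndim]      # leading dimensions the mask must cover
    if mask.shape != expected:
        raise ValueError("mask shape %s does not match array shape %s"
                         % (mask.shape, expected))
    return arr[mask]

c = np.arange(12).reshape((4, 3))
rows = checked_mask_index(c, [False, True, False, True])   # rows 1 and 3

try:
    checked_mask_index(c, [False, True, False])            # too short for 4 rows
    short_mask_rejected = False
except ValueError:
    short_mask_rejected = True
```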