Re: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython
Yang Zhang yanghatespam at gmail.com writes:

I'm curious how to disable threads in numpy (not an ideal solution). Googling seems to point me to setting NPY_ALLOW_THREADS to 0 somewhere. Anyone?

It appears I have run into this very issue, which I reported on the NumPy Trac: http://projects.scipy.org/numpy/ticket/2213

I just tried your suggestion: setting NPY_ALLOW_THREADS to 0 in numpy/core/include/numpy/ndarraytypes.h. It allowed my atomic example to run without stalling, and also fixed the issue in my application.

However, I'm not entirely satisfied with this workaround, which might slow down heavy computations. I also find it too intrusive in the numpy source code, and I don't wish to maintain a powerless numpy fork.

Has anyone else settled on this fix? Or does anybody have any other suggestions or comments?

Thanks.
Raphael.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Double-ended queues
Hi all,

Within a loop, I want to be able to:

a) apply a mathematical operation to all elements in a vector (can be done atomically), then
b) pop zero or more elements from one end of the vector, and
c) push zero or more elements onto the other end.

So far I've used a collections.deque to store my vector, as it should be more efficient than a numpy array for appending and deleting elements. However, I was wondering whether performance could be improved through the use of a homogeneously-typed double-ended queue, i.e. a linked-list equivalent of numpy.ndarray.

Has anyone previously considered whether it would be worth including such a thing within the numpy package?

Cheers,
Will
[Numpy-discussion] variable number of columns in loadtxt/genfromtxt
Hi,

I commonly have to deal with legacy ASCII files which don't have a constant number of columns. The standard is 10 values per row, but sometimes there are fewer columns. loadtxt doesn't support this, and in genfromtxt, the rows which have fewer than 10 values are excluded from the resulting array. Is there any way around this?

Thanks for your insight,
Andreas.
Re: [Numpy-discussion] Double-ended queues
On Tue, Sep 25, 2012 at 10:03 AM, William Furnass w...@thearete.co.uk wrote:

Hi all,

Within a loop, I want to be able to:

a) apply a mathematical operation to all elements in a vector (can be done atomically), then
b) pop zero or more elements from one end of the vector, and
c) push zero or more elements onto the other end.

So far I've used a collections.deque to store my vector, as it should be more efficient than a numpy array for appending and deleting elements. However, I was wondering whether performance could be improved through the use of a homogeneously-typed double-ended queue, i.e. a linked-list equivalent of numpy.ndarray.

Implementing a ring buffer on top of ndarray would be pretty straightforward and would probably work better than a linked-list implementation.

-n
Re: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython
Can you expand a bit? Are you trying to disable threads at compile time or at run time? Which threaded functionality are you trying to disable? Are you using numpy as a computational library with multiple threads making calls into its functions?

I think NPY_ALLOW_THREADS is for interacting with the GIL, but I have not played with it much.

A

On Mon, Sep 24, 2012 at 6:54 PM, Raphael de Feraudy fera...@phimeca.com wrote:

Yang Zhang yanghatespam at gmail.com writes:

I'm curious how to disable threads in numpy (not an ideal solution). Googling seems to point me to setting NPY_ALLOW_THREADS to 0 somewhere. Anyone?

It appears I have run into this very issue, which I reported on the NumPy Trac: http://projects.scipy.org/numpy/ticket/2213

I just tried your suggestion: setting NPY_ALLOW_THREADS to 0 in numpy/core/include/numpy/ndarraytypes.h. It allowed my atomic example to run without stalling, and also fixed the issue in my application.

However, I'm not entirely satisfied with this workaround, which might slow down heavy computations. I also find it too intrusive in the numpy source code, and I don't wish to maintain a powerless numpy fork.

Has anyone else settled on this fix? Or does anybody have any other suggestions or comments?

Thanks.
Raphael.
Re: [Numpy-discussion] Double-ended queues
On 25.09.2012 11:38, Nathaniel Smith wrote:

Implementing a ring buffer on top of ndarray would be pretty straightforward and probably work better than a linked-list implementation.

Amazingly, many do not know that a ring buffer is simply an array indexed modulo its length:

    foo = np.zeros(n)
    i = 0
    while 1:
        foo[i % n]  # access ringbuffer
        i += 1

Also, instead of writing a linked list, consider collections.deque. A deque is by definition a double-ended queue. It is just a waste of time to implement a deque (double-ended queue) and hope it will perform better than Python's standard library collections.deque object.

Sturla
[Numpy-discussion] API, ABI compatibility
Re: [Numpy-discussion] Double-ended queues
On Tue, Sep 25, 2012 at 12:31 PM, Sturla Molden stu...@molden.no wrote:
On 25.09.2012 11:38, Nathaniel Smith wrote:

Implementing a ring buffer on top of ndarray would be pretty straightforward and probably work better than a linked-list implementation.

Amazingly, many do not know that a ring buffer is simply an array indexed modulo its length:

    foo = np.zeros(n)
    i = 0
    while 1:
        foo[i % n]  # access ringbuffer
        i += 1

Good trick, but to be reliable I think you need to either be willing for i to overflow into a long (arbitrary width) integer, or else make sure that i is an unsigned integer and that n is 2**k where k = sizeof(i)? Just doing i %= n on each pass through the loop might be less error-prone.

Also, instead of writing a linked list, consider collections.deque. A deque is by definition a double-ended queue. It is just a waste of time to implement a deque (double-ended queue) and hope it will perform better than Python's standard library collections.deque object.

The original poster is using collections.deque now, but wants a version that supports efficient vectorized operations.

-n
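A minimal sketch of the ring-buffer idea discussed in this thread, assuming a fixed capacity and a single dtype. The class name `RingDeque` and its methods are hypothetical and not part of NumPy. The head index is wrapped with a modulo on every update, so it never grows without bound.

```python
import numpy as np


class RingDeque:
    """Fixed-capacity double-ended queue stored in one ndarray (hypothetical helper)."""

    def __init__(self, capacity, dtype=np.float64):
        self._buf = np.zeros(capacity, dtype=dtype)
        self._head = 0   # physical index of the leftmost element
        self._size = 0   # number of stored elements

    def __len__(self):
        return self._size

    def _indices(self):
        # Physical positions of the logical elements, wrapped modulo capacity.
        return (self._head + np.arange(self._size)) % len(self._buf)

    def append(self, values):
        """Push one or more values onto the right end."""
        values = np.atleast_1d(values)
        if self._size + len(values) > len(self._buf):
            raise OverflowError("ring buffer is full")
        pos = (self._head + self._size + np.arange(len(values))) % len(self._buf)
        self._buf[pos] = values
        self._size += len(values)

    def popleft(self, count=1):
        """Pop `count` values from the left end and return them as an array."""
        if count > self._size:
            raise IndexError("pop from an empty ring buffer")
        pos = (self._head + np.arange(count)) % len(self._buf)
        out = self._buf[pos].copy()
        self._head = (self._head + count) % len(self._buf)  # keep the index bounded
        self._size -= count
        return out

    def apply(self, func):
        """Apply a vectorized function to every stored element in place."""
        pos = self._indices()
        self._buf[pos] = func(self._buf[pos])

    def to_array(self):
        """Return the stored elements in logical order (a copy)."""
        return self._buf[self._indices()]
```

The fancy-indexed reads copy data, so this favours simplicity over speed; a tuned version would probably operate on at most two contiguous slices instead.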
Re: [Numpy-discussion] ANN: NumPy 1.7.0b2 release
On Tue, Sep 25, 2012 at 1:27 AM, Ondřej Čertík ondrej.cer...@gmail.com wrote:
On Mon, Sep 24, 2012 at 3:49 PM, Nathaniel Smith n...@pobox.com wrote:
On Mon, Sep 24, 2012 at 10:47 PM, Charles R Harris charlesr.har...@gmail.com wrote:
On Mon, Sep 24, 2012 at 2:25 PM, Frédéric Bastien no...@nouiz.org wrote:

Hi, I tested this new beta on Theano and discovered an interface change that was not there in beta 1.

New behavior:

    numpy.ndindex().next()
    (0,)

Old behavior:

    numpy.ndindex().next()
    ()

This breaks some Theano code that looks like this:

    import numpy
    shape=()
    out_shape=[12]
    random_state=numpy.random.RandomState()
    out = numpy.zeros(out_shape, int)
    for i in numpy.ndindex(*shape):
        out[i] = random_state.permutation(5)

I suppose this is a regression, as the only mention of ndindex in the first email about this change is that it is faster.

I think this problem has been brought up on the list. It is interesting that it turned up after the first beta. Could you do a bisection to discover which commit is responsible?

No need, the problem is already known. It was introduced by that ndindex speed-up patch, PR #393, which was backported into the first beta as well. There's a follow-up patch in PR #445 that fixes both of these issues, though it also exposes some more fundamental issues with the nditer API, so there's lots of discussion there about whether we want some more changes... this is a good summary: https://github.com/numpy/numpy/pull/445#issuecomment-8740982

For 1.7 purposes though, the bottom line is that we already have multiple acceptable solutions, so both the issues reported here should definitely be fixed.

Should we just remove (revert) this PR #393 patch from the release branch? It shouldn't have been there in the first place; the only reason I included it is that other patches depended on it and I would have had to fix collisions, and we thought it would be harmless to just include it. That turned out to be a mistake, for which I apologize.

That way we'll feel confident that the branch works, and we can get the right solution into master and test it there. So I am actually convinced I should simply revert this patch in the release branch. Let me know what you think.

Sounds good to me. (I also thought it would be harmless to include it, and also missed that the other patch that depended on it was part of the same change and could be reverted too.)

-n
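For reference, a small check of the behaviour that the Theano snippet quoted above relies on. This is a sketch under my understanding of how `numpy.ndindex` behaves in released NumPy versions other than the affected beta: an empty shape yields exactly one empty index tuple, so the loop body runs once and `out[()]` addresses the whole array. The output length is set to 5 here (an adjustment of mine) so that the assignment of a 5-element permutation fits.

```python
import numpy as np

shape = ()
indices = list(np.ndindex(*shape))   # expected: one empty index tuple

out = np.zeros(5, dtype=int)
rng = np.random.RandomState(0)
for i in np.ndindex(*shape):
    # With i == (), out[i] refers to the whole array.
    out[i] = rng.permutation(5)
```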
Re: [Numpy-discussion] Double-ended queues
On Tue, Sep 25, 2012 at 6:50 AM, Nathaniel Smith n...@pobox.com wrote:
On Tue, Sep 25, 2012 at 12:31 PM, Sturla Molden stu...@molden.no wrote:
On 25.09.2012 11:38, Nathaniel Smith wrote:

Implementing a ring buffer on top of ndarray would be pretty straightforward and probably work better than a linked-list implementation.

Amazingly, many do not know that a ring buffer is simply an array indexed modulo its length:

    foo = np.zeros(n)
    i = 0
    while 1:
        foo[i % n]  # access ringbuffer
        i += 1

Good trick, but to be reliable I think you need to either be willing for i to overflow into a long (arbitrary width) integer, or else make sure that i is an unsigned integer and that n is 2**k where k = sizeof(i)? Just doing i %= n on each pass through the loop might be less error-prone.

Also, instead of writing a linked list, consider collections.deque. A deque is by definition a double-ended queue. It is just a waste of time to implement a deque (double-ended queue) and hope it will perform better than Python's standard library collections.deque object.

The original poster is using collections.deque now, but wants a version that supports efficient vectorized operations.

The C++ stdlib has an efficient deque object, but it moves through memory. Hmm, it wouldn't be easy to make that work with numpy arrays, what with views and all. Efficient circular lists are often implemented using powers of two so that modulo indexing can be done using a mask.

Chuck
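The power-of-two remark can be illustrated with a short sketch: when the capacity `n` is a power of two, `i & (n - 1)` selects the same slot as `i % n` for non-negative `i`. The variable names are illustrative only.

```python
import numpy as np

n = 8                      # capacity; must be a power of two for the mask trick
assert n & (n - 1) == 0    # power-of-two check
mask = n - 1

buf = np.zeros(n)
for i in range(20):
    buf[i & mask] = i      # same slot as buf[i % n]

# The identity also holds elementwise on non-negative integer arrays.
idx = np.arange(100)
same = bool(np.all((idx & mask) == (idx % n)))
```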
Re: [Numpy-discussion] ANN: NumPy 1.7.0b2 release
Hi,

thanks for that script. It seems very useful for that case. As other people know about this problem, I won't need to bisect.

thanks

Fred

On Mon, Sep 24, 2012 at 6:52 PM, Pauli Virtanen p...@iki.fi wrote:
25.09.2012 00:55, Frédéric Bastien wrote:
On Mon, Sep 24, 2012 at 5:47 PM, Charles R Harris [clip]

I think this problem has been brought up on the list. It is interesting that it turned up after the first beta. Could you do a bisection to discover which commit is responsible?

I'll check that. Do I need to reinstall numpy from scratch every time, or is there a better way to do that?

Reinstallation is needed, but this is reasonably simple to automate; check git bisect run. For instance like so: https://github.com/pv/scipy-build-makefile/blob/master/bisectrun.py

--
Pauli Virtanen
Re: [Numpy-discussion] ANN: NumPy 1.7.0b2 release
On Thu, Sep 20, 2012 at 12:24 AM, Ondřej Čertík ondrej.cer...@gmail.com wrote:

Hi,

I'm pleased to announce the availability of the second beta release of NumPy 1.7.0b2.

Sources and binary installers can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.7.0b2/

Please test this release and report any issues on the numpy-discussion mailing list. Since beta1, we've fixed most of the known (back then) issues, except:

http://projects.scipy.org/numpy/ticket/2076
http://projects.scipy.org/numpy/ticket/2101
http://projects.scipy.org/numpy/ticket/2108
http://projects.scipy.org/numpy/ticket/2150

And many other issues that were reported since the beta1 release. The log of changes is attached. The full list of issues that we still need to work on is at: https://github.com/numpy/numpy/issues/396

Any help is welcome, the best is to send a PR fixing any of the issues -- against master, and I'll then back-port it to the release branch (unless it is something release specific, in which case just send the PR against the release branch).

Cheers,
Ondrej

* f217517 Release 1.7.0b2
* 50f71cb MAINT: silence Cython warnings about changes dtype/ufunc size.
* fcacdcc FIX: use py24-compatible version of virtualenv on Travis
* d01354e FIX: loosen numerical tolerance in test_pareto()
* 65ec87e TST: Add test for boolean insert
* 9ee9984 TST: Add extra test for multidimensional inserts.
* 8460514 BUG: Fix for issues #378 and #392
  This should fix the problems with numpy.insert(), where the input values were not checked for all scalar types and where values did not get inserted properly, but got duplicated by default.
* 07e02d0 BUG: fix npymath install location.
* 6da087e BUG: fix custom post_check.
* 095a3ab BUG: forgot to build _dotblas in bento build.
* cb0de72 REF: remove unused imports in bscript.
* 6e3e289 FIX: Regenerate mtrand.c with Cython 0.17
* 3dc3b1b Retain backward compatibility. Enforce C order.
* 5a471b5 Improve ndindex execution speed.
* 2f28db6 FIX: Add a test for Ticket #2066
* ca29849 BUG: Add a test for Ticket #2189
* 1ee4a00 BUG: Add a test for Ticket #1588
* 7b5dba0 BUG: Fix ticket #1588/gh issue #398, refcount error in clip
* f65ff87 FIX: simplify the import statement
* 124a608 Fix returned copy
* 996a9fb FIX: bug in np.where and recarray swapping
* 7583adc MAINT: silence DeprecationWarning in np.safe_eval().
* 416af9a pavement.py: rename yop to atlas
* 3930881 BUG: fix bento build.
* fbad4a7 Remove test_recarray_from_long_formats
* 5cb80f8 Add test for long number in shape specifier of dtype string
* 24da7f6 Add test for long numbers in numpy.rec.array formats string
* 77da3f8 Allow long numbers in numpy.rec.array formats string
* 99c9397 Use PyUnicode_DecodeUTF32()
* 31660d0 Follow the C guidelines
* d5d6894 Fix memory leak in concatenate.
* 8141e1e FIX: Make sure the tests produce valid unicode
* d67785b FIX: Fixes the PyUnicodeObject problem in py-3.3
* a022015 Re-enable unpickling optimization for large py3k bytes objects.
* 470486b Copy bytes object when unpickling an array
* d72280f Fix tests for empty shape, strides and suboffsets on Python 3.3
* a1561c2 [FIX] Add missing header so separate compilation works again
* ea23de8 TST: set raise-on-warning behavior of NoseTester to release mode.
* 28ffac7 REL: set version number to 1.7.0rc1-dev.

Ticket #2218 needs to be fixed.

Chuck
Re: [Numpy-discussion] ANN: NumPy 1.7.0b2 release
On Tue, Sep 25, 2012 at 8:56 AM, Charles R Harris charlesr.har...@gmail.com wrote:
On Thu, Sep 20, 2012 at 12:24 AM, Ondřej Čertík ondrej.cer...@gmail.com wrote:

[clip]

Ticket #2218 needs to be fixed.

The all method fails also.

    In [1]: a = zeros(5, complex)
    In [2]: a.imag = 1
    In [3]: a.all()
    Out[3]: False

Chuck
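The session above can be written as a standalone check. An array whose elements are all `1j` is nonzero everywhere, so `all()` is expected to return True on a correctly working NumPy; the False shown in the report is the bug being described.

```python
import numpy as np

a = np.zeros(5, dtype=complex)
a.imag = 1                 # every element is now 1j, which is nonzero

result = bool(a.all())     # expected True when the bug is not present
```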
Re: [Numpy-discussion] variable number of columns in loadtxt/genfromtxt
On Tue, Sep 25, 2012 at 2:31 AM, Andreas Hilboll li...@hilboll.de wrote:

I commonly have to deal with legacy ASCII files which don't have a constant number of columns. The standard is 10 values per row, but sometimes there are fewer columns. loadtxt doesn't support this, and in genfromtxt, the rows which have fewer than 10 values are excluded from the resulting array. Is there any way around this?

The trick is: what does it mean when there are fewer values in a row? There is no way to universally define that.

Anyway, I'd just punt on using a standard ASCII file reader. In the time it took to write this question, you'd be halfway to writing a custom file parser -- it's really easy in Python, at least if you don't need absolutely top performance (which loadtxt and genfromtxt don't give you anyway).

-Chris

Thanks for your insight,
Andreas.

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE  (206) 526-6329   fax
Seattle, WA 98115       (206) 526-6317   main reception
chris.bar...@noaa.gov
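A minimal sketch of such a custom parser, assuming whitespace-separated numeric values and that short rows should be padded on the right with NaN. The padding rule is an assumption, since what a short row means depends on the file format; the function name `load_ragged` is hypothetical.

```python
import io
import numpy as np


def load_ragged(lines, ncols=10, fill=np.nan):
    """Parse whitespace-separated rows, padding short rows with `fill`."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) > ncols:
            raise ValueError("row has more than %d values" % ncols)
        row = np.full(ncols, fill, dtype=float)
        row[:len(fields)] = [float(f) for f in fields]
        rows.append(row)
    if not rows:
        return np.empty((0, ncols))
    return np.vstack(rows)


# Usage with an in-memory file; a file object from open(path) works the same way.
sample = io.StringIO("1 2 3\n4 5\n\n6 7 8\n")
data = load_ragged(sample, ncols=3)
```

If missing values belong somewhere other than the end of the row, only the slice assignment needs to change.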
Re: [Numpy-discussion] Double-ended queues
On Tue, Sep 25, 2012 at 4:31 AM, Sturla Molden stu...@molden.no wrote:

Also, instead of writing a linked list, consider collections.deque. A deque is by definition a double-ended queue. It is just a waste of time to implement a deque (double-ended queue) and hope it will perform better than Python's standard library collections.deque object.

Not for insertion, deletion, etc., but there _may_ be a benefit to a class that stores the data in a homogeneous data buffer compatible with numpy:

- you could use non-standard data types (uint, etc.)
- it would be more memory efficient (not having to store all those Python objects for each value)
- you could round-trip to/from numpy arrays without data copying (or with efficient data copying...) for other operations.

Whether it's worth the work would depend on the use case, of course. Writing such a thing in Cython would be pretty easy though, particularly if you only needed to support a couple of types.

-Chris
Re: [Numpy-discussion] variable number of columns in loadtxt/genfromtxt
On Tue, Sep 25, 2012 at 2:31 AM, Andreas Hilboll li...@hilboll.de wrote:

I commonly have to deal with legacy ASCII files which don't have a constant number of columns. The standard is 10 values per row, but sometimes there are fewer columns. loadtxt doesn't support this, and in genfromtxt, the rows which have fewer than 10 values are excluded from the resulting array. Is there any way around this?

The trick is: what does it mean when there are fewer values in a row? There is no way to universally define that.

Anyway, I'd just punt on using a standard ASCII file reader. In the time it took to write this question, you'd be halfway to writing a custom file parser -- it's really easy in Python, at least if you don't need absolutely top performance (which loadtxt and genfromtxt don't give you anyway).

Actually, that's just what I did before writing this question ;) I was just wondering if there was some solution available which I didn't know about.

Cheers,
Andreas.
Re: [Numpy-discussion] API, ABI compatibility
is a good thing :)

On Tue, Sep 25, 2012 at 2:06 PM, David Cournapeau courn...@gmail.com wrote:
Re: [Numpy-discussion] API, ABI compatibility
Ok, so since many people asked: this was sent by mistake, and was intended to be a discarded draft instead.

David

On Tue, Sep 25, 2012 at 7:13 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:

is a good thing :)

On Tue, Sep 25, 2012 at 2:06 PM, David Cournapeau courn...@gmail.com wrote: