Re: [Numpy-discussion] Sorting objects with ndarrays
Sun, 2010-02-28 at 10:25 +0100, Gael Varoquaux wrote:
[clip]
> The problem is that ndarrays cannot be compared. So I have tried to
> override the 'cmp' in the 'sorted' function, however I am comparing
> fairly complex objects, and I am having a hard time predicting which
> member of the object will contain the array.

I don't understand what "predicting which member of the object" means?
Do you mean that in the array, you have classes that contain ndarrays
as their attributes, and the classes have __cmp__ implemented?

If not, can you tell why

    def xcmp(a, b):
        a_nd = isinstance(a, ndarray)
        b_nd = isinstance(b, ndarray)
        if a_nd and b_nd:
            pass              # compare ndarrays in some way
        elif a_nd:
            return 1          # sort ndarrays last
        elif b_nd:
            return -1         # sort ndarrays last
        else:
            return cmp(a, b)  # ordinary compare

does not work?

Cheers,
Pauli
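For completeness, a comparator like the above plugs into sorting as
follows (a minimal usage sketch; `items` is a hypothetical list mixing
ndarrays and other objects):

    # Python 2: sorted() accepts a comparator directly
    result = sorted(items, cmp=xcmp)

    # Python 3 dropped the cmp argument; the equivalent is a key adapter:
    # from functools import cmp_to_key
    # result = sorted(items, key=cmp_to_key(xcmp))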
Re: [Numpy-discussion] setting decimal accuracy in array operations (scikits.timeseries)
Wed, 2010-03-03 at 21:09 +0100, Marco Tuckner wrote:
> I am using scikits.timeseries to convert an hourly timeseries to a
> lower frequency using the appropriate function [1]. When I compare
> the result to the values calculated with a Pivot table in Excel there
> is a difference in the values which reaches quite high values in the
> total sum of all monthly values. I found out that the difference
> arises from different decimal settings: In Python the numbers show:
> 12. whereas in Excel I see: 12.8

Typically, the internal precision used in Python and Numpy is
significantly more than what is printed. Most likely, your problem has
a different cause. Are you sure Excel is using a high enough accuracy?

If you want more help, it would be useful to post a self-contained
code that demonstrates the error.

-- 
Pauli Virtanen
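As an aside, the gap between displayed and stored precision is easy to
demonstrate (an illustration, not from the original thread):

    >>> import numpy as np
    >>> a = np.array([2.0 / 3.0])
    >>> print(a)                 # the display is rounded
    [ 0.66666667]
    >>> '%.17f' % a[0]           # the stored value is a full double
    '0.66666666666666663'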
Re: [Numpy-discussion] PSF GSoC 2010 (Py3K focus)
Mon, 08 Mar 2010 22:39:00 -0700, Charles R Harris wrote:
> On Mon, Mar 8, 2010 at 10:29 PM, Jarrod Millman
> <mill...@berkeley.edu> wrote:
>> I added Titus' email regarding the PSF's focus on Py3K-related
>> projects to our SoC ideas wiki page:
>>
>>     http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas
>>
>> Given Titus' email, this is the most likely list of projects we will
>> get accepted this year:
>>
>> - finish porting NumPy to Py3K
>
> I think Numpy is pretty much done. It needs use testing to wring out
> any small oddities, but it doesn't look to me like a GSOC project at
> the moment. Maybe Pauli can weigh in here.

I think it's pretty much done. Even f2py should work. What's left is
mostly testing it, and fixing any issues that crop up.

Note that the SVN Numpy should preferably still see more testing on
Python 2 against real-world packages that use it, before release.
There's been a significant amount of code churn.

>> - port SciPy to Py3K
>
> This project might be more appropriate, although I'm not clear on
> what needs to be done.

I think porting Scipy proceeds in these steps:

1. Set up a similar 2to3-using build system for Python 3 as currently
   in Numpy.

2. Change the code so that it works both on Python 2 and Python 3.
   This can be done one submodule at a time.

   For C code, the changes required are mainly PyString usage. Some
   macros need to be defined to allow the same code to build on Py2
   and Py3. Scipy is probably easier to port than Numpy here, since
   IIRC it doesn't mess much with strings, and its use of the Python
   C-API is much more limited. Also, parts written with Cython need
   essentially no porting.

   For Python code, the main issue is again bytes vs. unicode fixes,
   mainly inserting numpy.compat.asbytes() at correct locations to
   make e.g. bytes literals (see the example after this mail). All I/O
   code should be changed to work solely with bytes. Since 2to3 takes
   care of most of the other things, the remaining work is in fixing
   things it mishandles.

I don't have a good estimate for how much work is in making Scipy
work. I'd guess the work needed is probably slightly less than for
Numpy.

-- 
Pauli Virtanen
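The asbytes() idiom mentioned above looks like this in practice (a
small sketch; the magic-string value is just an example):

    # numpy.compat.asbytes returns a bytes object on both Python 2 and
    # Python 3, so one source line serves as a "bytes literal" on both:
    from numpy.compat import asbytes

    magic = asbytes('\x93NUMPY')   # e.g. the .npy format magic string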
Re: [Numpy-discussion] crash at prompt exit after running test
Tue, 2010-03-09 at 21:14 +0100, Johann Cohen-Tanugi wrote:
> thinking about it, this morning there was a fedora update to python,
> so I am using 2.6.2-4.fc12. Looks like the problem is in python
> itself, hence this piece of info.

That the problem would be in Python is not so clear to me. Can you try
running it with the previous Python shipped by Fedora? Do you see the
problem then? What's the previous version, btw?

Memory errors are somewhat difficult to debug. Can you try running
only a certain subset of the tests, first

    nosetests numpy.core

Be sure to set PYTHONPATH so that you get the correct Numpy version.
If it segfaults, proceed to (under numpy/core/tests)

    nosetests test_multiarray.py
    nosetests test_multiarray.py:TestNewBufferProtocol

Since the crash occurs in cyclic garbage collection, I'm doubting a
bit the numpy/core/src/multiarray/numpymemoryview.c implementation,
since that's the only part in Numpy that supports that.

Alternatively, just replace numpymemoryview.c with the attached one
which has cyclic GC stripped, and see if you still get the crash.

Cheers,
Pauli

/*
 * Simple PyMemoryView'ish object for Python 2.6 compatibility.
 *
 * On Python >= 2.7, we can use the actual PyMemoryView objects.
 *
 * Some code copied from the CPython implementation.
 */

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include "structmember.h"

#define _MULTIARRAYMODULE
#define NPY_NO_PREFIX
#include "numpy/arrayobject.h"
#include "numpy/arrayscalars.h"

#include "npy_config.h"
#include "npy_3kcompat.h"

#include "numpymemoryview.h"

#if (PY_VERSION_HEX >= 0x02060000) && (PY_VERSION_HEX < 0x02070000)

/*
 * Memory allocation
 */

static void
memorysimpleview_dealloc(PyMemorySimpleViewObject *self)
{
    Py_CLEAR(self->base);
    if (self->view.obj != NULL) {
        PyBuffer_Release(&self->view);
        self->view.obj = NULL;
    }
}

static PyObject *
memorysimpleview_new(PyTypeObject *subtype, PyObject *args, PyObject *kwds)
{
    PyObject *obj;
    static char *kwlist[] = {"object", 0};
    if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:memorysimpleview",
                                     kwlist, &obj)) {
        return NULL;
    }
    return PyMemorySimpleView_FromObject(obj);
}

/*
 * Buffer interface
 */

static int
memorysimpleview_getbuffer(PyMemorySimpleViewObject *self,
                           Py_buffer *view, int flags)
{
    return PyObject_GetBuffer(self->base, view, flags);
}

static void
memorysimpleview_releasebuffer(PyMemorySimpleViewObject *self,
                               Py_buffer *view)
{
    PyBuffer_Release(view);
}

static PyBufferProcs memorysimpleview_as_buffer = {
    (readbufferproc)0,              /* bf_getreadbuffer */
    (writebufferproc)0,             /* bf_getwritebuffer */
    (segcountproc)0,                /* bf_getsegcount */
    (charbufferproc)0,              /* bf_getcharbuffer */
    (getbufferproc)memorysimpleview_getbuffer,          /* bf_getbuffer */
    (releasebufferproc)memorysimpleview_releasebuffer,  /* bf_releasebuffer */
};

/*
 * Getters
 */

static PyObject *
_IntTupleFromSsizet(int len, Py_ssize_t *vals)
{
    int i;
    PyObject *o;
    PyObject *intTuple;

    if (vals == NULL) {
        Py_INCREF(Py_None);
        return Py_None;
    }
    intTuple = PyTuple_New(len);
    if (!intTuple) return NULL;
    for (i = 0; i < len; i++) {
        o = PyInt_FromSsize_t(vals[i]);
        if (!o) {
            Py_DECREF(intTuple);
            return NULL;
        }
        PyTuple_SET_ITEM(intTuple, i, o);
    }
    return intTuple;
}

static PyObject *
memorysimpleview_format_get(PyMemorySimpleViewObject *self)
{
    return PyUString_FromString(self->view.format);
}

static PyObject *
memorysimpleview_itemsize_get(PyMemorySimpleViewObject *self)
{
    return PyLong_FromSsize_t(self->view.itemsize);
}

static PyObject *
memorysimpleview_shape_get(PyMemorySimpleViewObject *self)
{
    return _IntTupleFromSsizet(self->view.ndim, self->view.shape);
}

static PyObject *
memorysimpleview_strides_get(PyMemorySimpleViewObject *self)
{
    return _IntTupleFromSsizet(self->view.ndim, self->view.strides);
}

static PyObject *
memorysimpleview_suboffsets_get(PyMemorySimpleViewObject *self)
{
    return _IntTupleFromSsizet(self->view.ndim, self->view.suboffsets);
}

static PyObject *
memorysimpleview_readonly_get(PyMemorySimpleViewObject *self)
{
    return PyBool_FromLong(self->view.readonly);
}

static PyObject *
memorysimpleview_ndim_get(PyMemorySimpleViewObject *self)
{
    return PyLong_FromLong(self->view.ndim);
}

static PyGetSetDef memorysimpleview_getsets[] = {
    {"format", (getter)memorysimpleview_format_get, NULL, NULL, NULL},
    {"itemsize", (getter)memorysimpleview_itemsize_get, NULL, NULL, NULL},
    {"shape", (getter)memorysimpleview_shape_get, NULL, NULL, NULL},
    {"strides", (getter)memorysimpleview_strides_get, NULL, NULL, NULL},
    {"suboffsets", (getter)memorysimpleview_suboffsets_get, NULL, NULL, NULL},
    {"readonly", (getter)memorysimpleview_readonly_get, NULL, NULL, NULL},
    {"ndim",
Re: [Numpy-discussion] crash at prompt exit after running test
> more fun :
>
> [co...@jarrett tests]$ pwd
> /home/cohen/sources/python/numpy/numpy/core/tests
> [co...@jarrett tests]$ python -c 'import test_ufunc'
> python: Modules/gcmodule.c:277: visit_decref: Assertion
> `gc->gc.gc_refs != 0' failed.
> Aborted (core dumped)

What happens if you only import the umath_tests module (or something,
it's a .so under numpy.core)? Just importing test_ufunc.py probably
doesn't run a lot of extension code.

Since you don't get crashes only by importing numpy, I really don't
see many alternatives any more...

Thanks,
Pauli
Re: [Numpy-discussion] crash at prompt exit after running test
Wed, 10 Mar 2010 10:28:07 +0100, Johann Cohen-Tanugi wrote:
> On 03/10/2010 01:55 AM, Pauli Virtanen wrote:
>>> more fun :
>>>
>>> [co...@jarrett tests]$ pwd
>>> /home/cohen/sources/python/numpy/numpy/core/tests
>>> [co...@jarrett tests]$ python -c 'import test_ufunc'
>>> python: Modules/gcmodule.c:277: visit_decref: Assertion
>>> `gc->gc.gc_refs != 0' failed.
>>> Aborted (core dumped)
>>
>> What happens if you only import the umath_tests module (or
>> something, it's a .so under numpy.core)?
>
> [co...@jarrett core]$ export PYTHONPATH=/home/cohen/.local/lib/python2.6/site-packages/numpy/core:$PYTHONPATH
> [co...@jarrett core]$ python
> Python 2.6.2 (r262:71600, Jan 25 2010, 18:46:45)
> [GCC 4.4.2 20091222 (Red Hat 4.4.2-20)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import umath_tests
> python: Modules/gcmodule.c:277: visit_decref: Assertion
> `gc->gc.gc_refs != 0' failed.
> Aborted (core dumped)
>
> so this import also triggers the crash at exit...

Then it is clear that the umath_tests module does something that is
not permitted. It's possible that there is some sort of a refcount
error somewhere in the generalized ufuncs mechanisms -- that part of
Numpy is not heavily used.

Bug spotting challenge: Start from umath_tests.c.src:initumath_tests,
follow the execution, and spot the bug (if any).

Pauli

PS. it might be a good idea to file a bug ticket now
Re: [Numpy-discussion] crash at prompt exit after running test
Wed, 10 Mar 2010 15:40:04 +0100, Johann Cohen-Tanugi wrote:
> Pauli, isn't it hopeless to follow the execution of the source code
> when the crash actually occurs when I exit, and not when I execute. I
> would have to understand enough of this umath_tests.c.src to spot a
> refcount error or things like that

Yeah, it's not easy, and requires knowing how to track this type of
errors. I didn't actually mean that you should try to do it, just
posed it as a general challenge to all interested parties :)

On a more serious note, maybe there's a compilation flag or something
in Python that warns when refcounts go negative (or something).

Cheers,
Pauli
Re: [Numpy-discussion] subclassing ndarray in python3
Hi Darren,

Thu, 2010-03-11 at 11:11 -0500, Darren Dale wrote:
> Now that the trunk has some support for python3, I am working on
> making Quantities work with python3 as well. I'm running into some
> problems related to subclassing ndarray that can be illustrated with
> a simple script, reproduced below. It looks like there is a problem
> with the reflected operations, I see problems with __rmul__ and
> __radd__, but not with __mul__ and __add__:

Thanks for testing. I wish the test suite was more complete (hint!
hint! :)

Yes, Python 3 introduced some semantic changes in how subclasses of
builtin classes (= written in C) inherit the __r*__ operations. Below
I'll try to explain what is going on.

We probably need to change some things to make things work better on
Py3, within the bounds we are able to. Suggestions are welcome. The
most obvious one could be to explicitly implement __rmul__ etc. on
Python 3.

[clip]
>     class A(np.ndarray):
>         def __new__(cls, *args, **kwargs):
>             return np.ndarray.__new__(cls, *args, **kwargs)
>
>     class B(A):
>         def __mul__(self, other):
>             return self.view(A).__mul__(other)
>         def __rmul__(self, other):
>             return self.view(A).__rmul__(other)
>         def __add__(self, other):
>             return self.view(A).__add__(other)
>         def __radd__(self, other):
>             return self.view(A).__radd__(other)
[clip]
>     print('A __rmul__:')
>     print(a.__rmul__(2))                   # yields NotImplemented
>     print(a.view(np.ndarray).__rmul__(2))  # yields NotImplemented

Correct. ndarray does not implement __rmul__, but relies on an
automatic wrapper generated by Python. The automatic wrapper
(wrap_binaryfunc_r) does the following:

1. Is `type(other)` a subclass of `type(self)`? If yes, call __mul__
   with swapped arguments.

2. If not, bail out with NotImplemented.

So it bails out.

Previously, the ndarray type had a flag that made Python skip the
subclass check. That does not exist any more on Python 3, and is the
root of this issue.

>     print(2*a)  # ok !!??

Here, Python checks

1. Does nb_multiply from the left op succeed? Nope, since floats don't
   know how to multiply ndarrays.

2. Does nb_multiply from the right op succeed? Here the execution
   passes *directly* to array_multiply, completely skipping the
   __rmul__ wrapper.

Note also that in the C-level number protocol there is only a single
multiplication function for both left and right multiplication.

[clip]
>     print('B __rmul__:')
>     print(b.__rmul__(2))                   # yields NotImplemented
>     print(b.view(A).__rmul__(2))           # yields NotImplemented
>     print(b.view(np.ndarray).__rmul__(2))  # yields NotImplemented
>     print(2*b)  # yields: TypeError: unsupported operand type(s)
>                 # for *: 'int' and 'B'

But here, the subclass calls the wrapper ndarray.__rmul__, which wants
to be careful with types, and hence fails.

Yes, probably explicitly defining __rmul__ for ndarray could be the
right solution. Please file a bug report on this.

Cheers,
Pauli
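A rough Python rendering of the wrapper behaviour described above (an
illustration only; CPython's wrap_binaryfunc_r is C code):

    def rmul_wrapper(self, other):
        # 1. subclass check: proceed only if `other` is an instance of
        #    (a subclass of) type(self)
        if isinstance(other, type(self)):
            # call __mul__ with swapped arguments
            return type(self).__mul__(other, self)
        # 2. otherwise bail out
        return NotImplemented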
Re: [Numpy-discussion] problem with applying patch
Sat, 13 Mar 2010 12:06:24 -0700, z99719 z99719 wrote:
[clip]
> The patch files are placed in the top level numpy sandbox directory
> and are in xml format rather than a diff file. I used "save link as"
> on the link to download the patch on the Attachments section of the
> link page.

That gives you the HTML page. Click on the link, and choose "Download
in other formats: Original Format" from the bottom of the page.
Re: [Numpy-discussion] integer division rule?
Sat, 2010-03-13 at 15:32 -0500, Alan G Isaac wrote:
> Francesc Altet once provided an example that for integer division,
> numpy uses the C99 rule: round towards 0. This is different than
> Python's rule for integer division: round towards negative infinity.
> But I cannot reproduce his example. (NumPy 1.3.) Did this behavior
> change at some point?

It was changed in r5888. What the rationale was is not clear from the
commit message.

Pauli
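The two rounding rules differ only for operands of mixed sign, which
is easy to check (illustration, not from the original mail):

    # Python's rule: floor, i.e. round towards negative infinity
    print(-7 // 2)        # -4

    # C99's rule: truncate, i.e. round towards zero
    print(int(-7 / 2.0))  # -3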
Re: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal
Anne Archibald wrote:
> I'm not knocking numpy; it does (almost) the best it can. (I'm not
> sure of the optimality of the order in which ufuncs are executed; I
> think some optimizations there are possible.)

Ufuncs and reductions are not performed in a cache-optimal fashion;
IIRC, dimensions are always traversed in order from left to right.
Large speedups are possible in some cases, but in a quick try I didn't
manage to come up with an algorithm that would always improve the
speed (there was a thread about this last year or so, and there's a
ticket). Things varied between computers, so this probably depends a
lot on the actual cache arrangement.

But perhaps numexpr has such heuristics, and we could steal them?

-- 
Pauli Virtanen
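The effect of the fixed left-to-right traversal can be seen with a
transposed array (a rough sketch; actual timings are machine-dependent):

    import numpy as np

    x = np.zeros((1000, 1000))   # C-order data
    y = x.T                      # same data, reversed strides

    # In IPython:
    #   %timeit x + x   # traversal matches the memory layout
    #   %timeit y + y   # typically slower: the C-order output forces
    #                   # a strided access pattern on the inputs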
Re: [Numpy-discussion] Array bug with character (regression from 1.4.0)
Ryan May wrote:
> The following code, which works with numpy 1.4.0, results in an error:

Python 2.6, I presume?

>     In [1]: import numpy as np
>     In [2]: v = 'm'
>     In [3]: dt = np.dtype('c')
>     In [4]: a = np.asarray(v, dt)

On SVN trunk:

    ValueError: assignment to 0-d array

>     In [5]: np.__version__
>     Out[5]: '2.0.0.dev8297'
>
> Thoughts?

Nope, but it's likely my bad. Smells a bit like the dtype 'c' has size
0 that doesn't get automatically adjusted in the array constructor, so
you end up assigning a size-1 string to a size-0 array element...
Which is strange since I don't think this particular code path has
been changed at all. One would have to follow the C execution path
with gdb to find out what goes wrong.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Array bug with character (regression from 1.4.0)
Sun, 2010-03-21 at 16:13 -0600, Charles R Harris wrote:
> I was wondering if this was related to Michael's fixes for character
> arrays? A little bisection might help localize the problem.

It's a bug I introduced in r8144... I forgot one *can* assign strings
to 0-d arrays, and strings are indeed one sequence type. I'm going to
back that changeset out, since it was only a cosmetic fix.

That particular part of code needs some cleanup (it's a bit too hairy
if things like this can slip), but I don't have the time at the moment
to come up with a more complete fix.

Cheers,
Pauli
Re: [Numpy-discussion] [OT] Starving CPUs article featured in IEEE's ComputingNow portal
Sat, 2010-03-20 at 17:36 -0400, Anne Archibald wrote:
> I was in on that discussion. My recollection of the conclusion was
> that on the one hand they're useful, carefully applied, while on the
> other hand they're very difficult to reliably detect (since you don't
> want to forbid operations on non-overlapping slices of the same
> array).

I think one alternative brought up was "copy if unsure whether the
slices overlap", which would make

    A[whatever] = A[whatever2]

be always identical in functionality to

    A[whatever] = A[whatever2].copy()

which is how things should work. This would permit optimizing simple
cases (at least 1D), and avoids running into NP-completeness (for
numpy, the exponential growth is however limited by NPY_MAXDIMS which
is 64, IIRC).

This would be a change in semantics, but in a very obscure corner that
hopefully nobody relies on.

-- 
Pauli Virtanen
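The aliasing problem in question, illustrated (not from the original
mail):

    import numpy as np

    a = np.arange(6)
    a[1:] = a[:-1]         # overlapping views: the element-wise copy
                           # may read values already overwritten

    b = np.arange(6)
    b[1:] = b[:-1].copy()  # always well-defined: the right-hand side
                           # is fully evaluated before the assignment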
Re: [Numpy-discussion] StringIO test failure with Python3.1.2
Wed, 2010-03-24 at 09:20 -0600, Charles R Harris wrote:
> What would be the best fix? Should we rename io to something like
> npyio?

That, or: Disable import conversions in tools/py3tool.py for that
particular file, and fix any import errors manually so that the same
code works both for Python 2 and Python 3. Actually, I suspect there
are no import errors, since the top-level imports in that file are
already absolute.

Anyway, it's a bug in 2to3. I suppose it first converts

    from StringIO import StringIO

to

    from io import StringIO

and then runs the import fixer, which does

    from .io import StringIO

and that's a bug.

Pauli
Re: [Numpy-discussion] StringIO test failure with Python3.1.2
Wed, 2010-03-24 at 10:28 -0500, Robert Kern wrote:
> utils.py is the only file in there that imports StringIO.

It should probably do a local import

    from io import BytesIO

because io.py already contains some Python3-awareness:

    if sys.version_info[0] >= 3:
        import io
        BytesIO = io.BytesIO
    else:
        from cStringIO import StringIO as BytesIO

The lookfor stuff in utils.py deals with docstrings, so it probably
has to use StringIO instead of BytesIO for unicode-cleanliness.

Pauli
Re: [Numpy-discussion] StringIO test failure with Python3.1.2
Wed, 24 Mar 2010 13:35:51 -0500, Bruce Southey wrote:
[clip]
>     elif isinstance(item, collections.Callable):
>   File "/usr/local/lib/python3.1/abc.py", line 121, in __instancecheck__
>     subclass = instance.__class__
> AttributeError: 'PyCapsule' object has no attribute '__class__'

Seems like another Python bug. All objects probably should have the
__class__ attribute...

-- 
Pauli Virtanen
Re: [Numpy-discussion] should ndarray implement __round__ for py3k?
Darren Dale wrote:
> A simple test in python 3:
>
>     >>> import numpy as np
>     >>> round(np.arange(10))
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: type numpy.ndarray doesn't define __round__ method

I implemented this for array scalars already, but forgot about arrays.
Anyway, it could be nice to have.

-- 
Pauli Virtanen
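In Python terms, such a method would be little more than a forward to
np.round (a minimal sketch, not the eventual implementation):

    import numpy as np

    # Python 3 dispatches the round() builtin to type(x).__round__;
    # with one argument, ndigits keeps its default.
    def __round__(self, ndigits=0):
        return np.round(self, decimals=ndigits)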
Re: [Numpy-discussion] Dealing with roundoff error
Mike Sarahan wrote:
> However, even linspace shows roundoff error:
>
>     a = np.linspace(0.0, 10.0, endpoint=False)
>     b = np.linspace(0.1, 10.1, endpoint=False)
>     np.sum(a[1:] == b[:-1])   # gives me 72, not 100

Are you sure equally spaced floating point numbers having this
property even exist? 0.1 does not have a terminating representation in
base-2:

    0.1_10 = 0.0001100110011001100110011..._2

-- 
Pauli Virtanen
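The non-terminating representation is visible directly from Python,
since Decimal can show a float's exact stored value:

    >>> from decimal import Decimal
    >>> Decimal(0.1)
    Decimal('0.1000000000000000055511151231257827021181583404541015625')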
Re: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy
Mon, 2010-03-29 at 19:13 +0900, David Cournapeau wrote:
> I have worked on porting scipy to py3k, and it is mostly working. One
> thing which would be useful is to install something similar to
> npy_3kcompat.h in numpy, so that every scipy extension could share
> the compat header. Is the current python 3 compatibility header
> usable in the wild, or will it still significantly change (thus
> requiring a different, more stable one)?

I believe it's reasonably stable, as it contains mostly simple stuff.
Something perhaps can be added later, but I don't think anything will
need to be removed. At least, I don't see what I would like to change
there.

The only thing I wouldn't perhaps like to have in the long run are the
PyString and possibly PyInt redefinition macros. But if the header is
going to be used elsewhere, we can as well freeze it now and promise
not to remove or change anything existing.

Pauli
Re: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy
2010/3/30 David Cournapeau <da...@silveregg.co.jp>:
> Pauli Virtanen wrote:
> [clip]
>> At least, I don't see what I would like to change there. The only
>> thing I wouldn't perhaps like to have in the long run are the
>> PyString and possibly PyInt redefinition macros.
>
> I would also prefer a new name, instead of macro redefinition, but I
> can do it.

For strings, the new names are PyBytes, PyUnicode and PyUString
(unicode on Py3, bytes on Py2). One of these should be used instead of
PyString in all places, but redefining the macro reduced the work in
making a quick'n'dirty port of numpy.

For integers, perhaps a separate PyInt on Py2, PyLong on Py3 type
should be defined. Not sure if this would reduce work in practice.

> The header would not be part of the public API proper anyway (I will
> put it somewhere else than numpy/include), it should never be pulled
> in implicitly when including one of the .h in numpy/include.

Agreed.

Pauli
Re: [Numpy-discussion] Making a 2to3 distutils command ?
2010/3/30 David Cournapeau <da...@silveregg.co.jp>:
> Currently, when building numpy with python 3, the 2to3 conversion
> happens before calling any distutils command. Was there a reason for
> doing it as it is done now?

This allowed 2to3 to also port the various setup*.py files and
numpy.distutils, and implementing it this way required the minimum
amount of work and understanding of distutils -- you need to force it
to proceed with the build using the set of output files from 2to3.

> I would like to make a proper numpy.distutils command for it, so that
> it can be more finely controlled (in particular, using the -j
> option). It would also avoid duplication in scipy.

Are you sure you want to mix distutils in this? Wouldn't it only
obscure how things work? If the aim is in making the 2to3 processing
reusable, I'd rather simply move tools/py3tool.py under
numpy.distutils (+ perhaps do some cleanups), and otherwise keep it
completely separate from distutils.

It could be nice to have the 2to3 conversion parallelizable, but there
are probably simple ways to do it without mixing distutils in.

But if you think this is really worth doing, go ahead.

Pauli
Re: [Numpy-discussion] Making a 2to3 distutils command ?
Tue, 2010-03-30 at 07:18 -0600, Ryan May wrote:
> Out of curiosity, is there something wrong with the support for 2to3
> that already exists within distutils? (Other than it just being
> distutils)
>
> http://bruynooghe.blogspot.com/2010/03/using-lib2to3-in-setuppy.html

That AFAIK converts only those Python files that will be installed,
and I don't know how to tell it to disable some conversion on a
per-file basis. (Numpy also contains some files necessary for building
that will not be installed...)

But OK, to be honest, I didn't look closely at whether one could make
it work using the bundled 2to3 command. Things might be cleaner if it
can be made to work.

Pauli
[Numpy-discussion] Ufunc memory access optimizations (Was: ufuncs on funny strides ...)
Thu, 2010-04-01 at 11:30 -0700, M Trumpis wrote:
[clip]
> Actually I realized later that the main slow-down comes from the fact
> that my array was strided in fortran order (ascending strides). But
> from the point of view of a ufunc that is operating independently at
> each value, why would it need to respect striding?

Correct. There has been discussion about improving ufunc performance
by optimizing the memory access pattern.

The main issue in your case is that the output array is in C order, so
that it is *not* possible to access both the input and the output
arrays in the optimal order.

Fixing this issue requires allowing ufuncs to allocate arrays that are
in non-C order. This needs a design decision that has not so far been
made. I'd be for this, I don't think it can break anything.

The second issue is that there is no universal access pattern choice
for every case that is optimal on all processor cache layouts. This
forces us to use heuristics to determine the access pattern, which is
not so simple to get right, and usually would require some information
about the processor's cache architecture.

(Even some code has been written along these lines, though mostly
addressing the reduction:

    http://github.com/pv/numpy-work/tree/ticket/1143-speedup-reduce
    http://projects.scipy.org/numpy/ticket/1143

Not production quality so far, and the non-C-output order would
definitely help also here.)

-- 
Pauli Virtanen
Re: [Numpy-discussion] Proposal for new ufunc functionality
Sat, 2010-04-10 at 12:23 -0500, Travis Oliphant wrote:
[clip]
> Here are my suggested additions to NumPy:
>
> ufunc methods:
[clip]
>     * reducein (array, indices, axis=0)
>       similar to reduce-at, but the indices provide both the start
>       and end points (rather than being fence-posts like reduceat).

Is the `reducein` important to have, as compared to `reduceat`?

[clip]
> numpy functions (or methods):

I'd prefer functions here. ndarray already has a huge number of
methods.

>     * segment(array)
>       (produce an array of integers from an array producing the
>       different regions of an array: segment([10,20,10,20,30,30,10])
>       would produce ([0,1,0,1,2,2,0])

Sounds like `np.digitize(x, bins=np.unique(x)) - 1` (see the check
below). What would the behavior be with structured arrays?

>     * edges(array, at=True)
>       produce an index array providing the edges (with either
>       fence-post like syntax for reduce-at or both boundaries like
>       reducein.

This can probably easily be based on segment().

Thoughts on the general idea?

One question is whether these methods should be stuffed to the main
namespace, or under numpy.rec. Another addition to ufuncs that should
be thought about is specifying the Python-side interface to
generalized ufuncs.

-- 
Pauli Virtanen
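A quick check of the digitize one-liner against the segment() example
above:

    import numpy as np

    x = np.array([10, 20, 10, 20, 30, 30, 10])
    print(np.digitize(x, bins=np.unique(x)) - 1)
    # -> [0 1 0 1 2 2 0]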
Re: [Numpy-discussion] Is this a small bug in numpydoc?
Wed, 14 Apr 2010 01:17:37 -0700, Fernando Perez wrote:
[clip]
> Exception occurred:
>   File "/home/fperez/ipython/repo/trunk-lp/docs/sphinxext/numpydoc.py",
>     line 71, in mangle_signature
>     'initializes x; see ' in pydoc.getdoc(obj.__init__)):
> AttributeError: class Bunch has no attribute '__init__'
>
> The full traceback has been saved in /tmp/sphinx-err-H1VlaY.log, if
> you want to report the issue to the developers.
>
> so it seems indeed that accessing __init__ without checking it's
> there isn't a very good idea. But I didn't write numpydoc and I'm
> tired, so I don't want to commit this without a second pair of
> eyes...

Yeah, it's a bug, I think.

-- 
Pauli Virtanen
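The fix would amount to guarding the attribute access, along these
lines (a sketch only; the actual change to numpydoc may differ):

    import pydoc

    def init_doc_matches(obj, text='initializes x; see '):
        # old-style classes on Python 2 may lack __init__ entirely
        init = getattr(obj, '__init__', None)
        return init is not None and text in pydoc.getdoc(init)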
Re: [Numpy-discussion] My GSoC Proposal to Implement a Subset of NumPy for PyPy
Sun, 2010-04-18 at 19:05 +0900, David Cournapeau wrote:
[clip]
> First, an aside: with the recent announcement from pypy concerning
> the new way of interfacing with C, wouldn't it make more sense to go
> the other way around - i.e. starting from full numpy, and replace
> some parts in rpython? Of course, this assumes that numpy can work
> with pypy using the new C API emulation, but this may not be a lot of
> work. This would give the advantage of having something useful from
> the start.
>
> I think this project is extremely interesting in nature, and the
> parts which are the most interesting to implement are (IMHO of
> course):
>
>  - indexing, fancy indexing
>  - broadcasting
>  - basic ufunc support, although I am not sure how to plug this with
>    the C math library for math functions (cos, log, etc...)

I agree here: if it's possible to supplement the existing Numpy code
base by interpreter-level (JIT-able) support for indexing, fancy
indexing, and broadcasting, this would reduce the roadblock in writing
effective algorithms without using vectorization.

Also, if you can incrementally replace parts of Numpy, the code could
be immediately usable in the real world, being less of a
proof-of-concept subset.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Disabling Extended Precision in NumPy (like -ffloat-store)
Wed, 2010-04-21 at 10:47 -0400, Adrien Guillon wrote:
[clip]
> My understanding, however, is that Intel processors may use extended
> precision for some operations anyways unless this is explicitly
> disabled, which is done with gcc via the -ffloat-store option. Since
> I am prototyping algorithms for a different processor architecture,
> where the extended precision registers simply do not exist, I would
> really like to force NumPy to limit itself to using single-precision
> operations throughout the calculation (no extended precision in
> registers).

Probably the only way to ensure this is to recompile Numpy (and
possibly, also Python) with the -ffloat-store option turned on.

When compiling Numpy, you should be able to insert the flag via

    OPT="-ffloat-store" python setup.py build

-- 
Pauli Virtanen
[Numpy-discussion] Adding an ndarray.dot method
Wed, 28 Apr 2010 14:12:07 -0400, Alan G Isaac wrote:
[clip]
> Here is a related ticket that proposes a more explicit alternative:
> adding a ``dot`` method to ndarray.
>
> http://projects.scipy.org/numpy/ticket/1456

I kind of like this idea. Simple, obvious, and leads to clear code:

    a.dot(b).dot(c)

or in another multiplication order,

    a.dot(b.dot(c))

And here's an implementation:

    http://github.com/pv/numpy-work/commit/414429ce0bb0c4b7e780c4078c5ff71c113050b6

I think I'm going to apply this, unless someone complains, as I don't
see any downsides (except maybe adding one more to the huge list of
methods ndarray already has).

Cheers,
Pauli
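Semantically, the new method is just a bound form of np.dot (a Python
sketch; the linked commit implements it in C):

    import numpy as np

    def dot(self, other):
        """Matrix product of two arrays; same as np.dot(self, other)."""
        return np.dot(self, other)

This is what lets a.dot(b).dot(c) read left to right, instead of the
nested np.dot(np.dot(a, b), c).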
Re: [Numpy-discussion] Datetime overflow error, attn Stefan.
Sat, 08 May 2010 18:44:59 -0600, Charles R Harris wrote:
[clip]
> Because Windows' longs are always 32 bit, I think this is a good
> indication that somewhere a long is being used instead of an intp.

The test itself used 32-bit integers. Fix'd.

Pauli
Re: [Numpy-discussion] pareto docstring
Tue, 11 May 2010 00:23:52 -0700, T J wrote:
[clip]
> It seems reasonable that we might have to follow the deprecation
> route, but I'd be happier with a faster fix.
>
> 1.5
>  - Provide numpy.random.lomax. Make numpy.random.pareto raise a
>    DeprecationWarning and then call lomax.
> 2.0 (if there is no 1.6)
>  - Make numpy.random.pareto behave as Pareto distribution of 1st
>    kind.

I think the next Numpy release will be 2.0.

How things were done with changes in the histogram function were:

1) Add a new=False keyword argument, and raise a DeprecationWarning if
   new==False. The user then must call it with pareto(..., new=True)
   to get the correct behaviour.

2) In the next release, change the default to new=True.

Another option would be to add a correct implementation with a
different name, e.g. `pareto1` to signal it's the first kind, and
deprecate the old function altogether.

A third option would be just to silently fix the bug. In any case the
change should be mentioned noticeably in the release notes.

-- 
Pauli Virtanen
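The histogram-style transition sketched above would look roughly like
this (names and message are illustrative only; _old_pareto and
_fixed_pareto are hypothetical helpers):

    import warnings

    def pareto(a, size=None, new=False):
        if not new:
            warnings.warn("pareto's behaviour will change; pass "
                          "new=True to get the fixed distribution",
                          DeprecationWarning)
            return _old_pareto(a, size)    # hypothetical: old behaviour
        return _fixed_pareto(a, size)      # hypothetical: Pareto, 1st kind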
Re: [Numpy-discussion] savetxt not working with python3.1
Mon, 17 May 2010 13:22:18 -0400, Skipper Seabold wrote:
[clip]
> What version of Numpy? Can you file a bug ticket with a failing
> example? I couldn't replicate it with r8417 on Python 3.

It was fixed in r8411.

-- 
Pauli Virtanen
Re: [Numpy-discussion] dtype and array creation
Thu, 20 May 2010 13:04:11 -0700, Keith Goodman wrote:
> Why do the following expressions give different dtype?
>
>     >>> np.array([1, 2, 3], dtype=str)
>     array(['1', '2', '3'], dtype='|S1')
>     >>> np.array(np.array([1, 2, 3]), dtype=str)
>     array(['1', '2', '3'], dtype='|S8')

Scalars seem to be handled specially. Anyway, automatic determination
of the string size is a bit dangerous to rely on with non-strings in
the array:

    >>> np.array([np.array(12345)], dtype=str)
    array(['1234'], dtype='|S4')

When I looked at this the last time, it wasn't completely obvious how
to make this do something more sensible.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Build failure at rev8246
Fri, 21 May 2010 08:09:55 -0700, T J wrote:
> I tried upgrading today and had trouble building numpy (after rm -rf
> build). My full build log is here:
>
> http://www.filedump.net/dumped/build1274454454.txt

Your SVN checkout might be corrupted, containing a mix of old and new
files. Try building from a clean checkout.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Extending documentation to c code
Hi,

Mon, 2010-05-24 at 12:01 -0600, Charles R Harris wrote:
> I'm wondering if we could extend the current documentation format to
> the c source code. The string blocks would be implemented something
> like
[clip]

I'd perhaps stick closer to C conventions and use something like

    /**
     * Spam foo out of the parrots.
     *
     * Parameters
     * ----------
     * a
     *     Amount of spam
     * b
     *     Number of parrots
     */
    int foo(int a, int b)
    {
    }

Using some JavaDoc-type syntax might also be possible, although I
personally find it rather ugly.

Also, parsing C might not be too difficult, as projects aimed at doing
that in Python exist, for instance

    http://code.google.com/p/pycparser/

-- 
Pauli Virtanen
Re: [Numpy-discussion] Extending documentation to c code
Wed, 26 May 2010 06:57:27 +0900, David Cournapeau wrote:
[clip: doxygen]
> It is yet another format to use inside C sources (I don't think
> doxygen supports rest), and I would rather have something that is
> similar, ideally integrated into sphinx. It also generates rather
> ugly doc by default,

Anyway, we can probably nevertheless just agree on a readable
plain-text/rst format, and then just use doxygen to generate the docs,
as a band-aid.

    http://github.com/pv/numpycdoc

-- 
Pauli Virtanen
Re: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
Wed, 26 May 2010 10:50:19 +0200, Sebastian Walter wrote:
> I'm a potential user of the C-API and therefore I'm very interested
> in the outcome. In the previous discussion
> (http://comments.gmane.org/gmane.comp.python.numeric.general/37409)
> many different views on what the new C-API should be were expressed.

I believe the aim of the refactor is to *not* change the C-API at all,
but separate it internally from the routines that do the heavy
lifting. Externally, Numpy would still look the same, but be easier to
maintain.

The routines that do the heavy lifting could then be good for reuse
and be easier to maintain, but I think how and where they would be
exposed hasn't been discussed so far...

-- 
Pauli Virtanen
Re: [Numpy-discussion] Reading and writing binary data to/from file objects
Wed, 2010-05-26 at 14:07 +0200, Christoph Bersch wrote:
>     f = open(fname2, 'w')
[clip]
> Am I doing something substantially wrong or is this a bug?

You are opening files in text mode. Use mode 'wb' instead.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Extending documentation to c code
Wed, 26 May 2010 07:15:08 -0600, Charles R Harris wrote:
> On Wed, May 26, 2010 at 2:59 AM, Pauli Virtanen <p...@iki.fi> wrote:
>> Wed, 26 May 2010 06:57:27 +0900, David Cournapeau wrote:
>> [clip: doxygen]
>>> It is yet another format to use inside C sources (I don't think
>>> doxygen supports rest), and I would rather have something that is
>>> similar, ideally integrated into sphinx. It also generates rather
>>> ugly doc by default,
>>
>> Anyway, we can probably nevertheless just agree on a readable
>> plain-text/rst format, and then just use doxygen to generate the
>> docs, as a band-aid.
>>
>>     http://github.com/pv/numpycdoc
>
> Neat. I didn't quite see how you connected the rst documentation and
> doxygen.

I didn't :)

But I just did: doing this was actually a 10 min job, since Doxygen
accepts HTML -- now it parses the comments as RST and renders it
properly as HTML in the Doxygen output. Of course getting links etc.
to work would require more effort, but that's left as an exercise for
someone else to finish.

Pauli
Re: [Numpy-discussion] NotImplemented returns
Thu, 27 May 2010 10:21:14 -0600, Charles R Harris wrote:
[clip]
> Maybe an enhancement ticket. The NotImplemented return is appropriate
> for some functions, but for functions with a single argument we
> should probably raise an error.

A NotImplemented value leaking to the user is a bug. The function
should raise a ValueError instead. NotImplemented is meant only for
use as a placeholder singleton in implementations of rich comparison
operations etc., and we shouldn't introduce any new meanings IMHO.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate?
Sat, 29 May 2010 11:29:56 -0700, Wayne Watson wrote:
[clip]
> SciPy. Hmm, this looks reasonable. http://scipy.org/Getting_Started.
> Somehow I missed it and instead landed on
> http://www.scipy.org/wikis/topical_software/Tutorial, which looks
> fairly imposing. Those are tar and gzip files. There's also the PDF
> file that contains the main text. In any case, the whole page looks
> quite imposing. Where did you find the simple one below?

The scipy.org wiki is organically grown and confusing at times (I
moved the page you landed on to a more clear place). The correct
places where this stuff should be are

    http://docs.scipy.org/
    http://scipy.org/Additional_Documentation

-- 
Pauli Virtanen
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
Tue, 01 Jun 2010 16:59:47 +0900, David Cournapeau wrote:
> I have looked back into the way to convert the existing numpy svn
> repository into git. It went quite smoothly using svn2git (developed
> by the KDE team for their own transition), but there are a few
> questions which need to be answered:
>
> - Shall we keep the old svn branches? I think most of them are just
>   cruft, and can be safely removed. We would then just keep the
>   release branches

They mostly seem like leftovers and mostly merged to trunk, that's
true. At least a prefix svnbranch/** should be added to them, if we
are going to keep them at all. Personally, I don't see problems in
leaving them out.

> (in maintenance/***)

Why not release/** or releases/**? I'd suggest singular form here.
What do other projects use? Does having a prefix here imply something
to clones?

> - Tag conversion: svn has no notion of tags, so translating them into
>   git tags cannot be done automatically in a safe manner (and we do
>   have some rewritten tags in the svn repo). I was thinking about
>   creating a small script to create them manually afterwards for the
>   releases, in the svntags/***.

Sounds OK.

> - Author conversion: according to git, there are around 50 committers
>   in numpy. Several of them are doubles and should be merged I think
>   (kern vs rkern, Travis' accounts as well), but there is also the
>   option to set up real emails. Since emails are private, I don't
>   want to just scrape them without asking permission first. I don't
>   know how we should proceed here.

I don't think correcting the email addresses in the SVN history is
very useful. Best probably just use some dummy form, maybe
forename.surn...@localhost or something similar to keep things simple.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
Fri, 04 Jun 2010 15:28:52 -0700, Matthew Brett wrote:
> I think it should be opt-in. How would opt-out work? Would someone
> create new accounts for all the contributors and then give them
> access?

Just to be clear, this has nothing to do with accounts on github, or
any registered thing. This is *only* about username/email as
recognized by git itself (as recorded in the commit objects).

> Actually - isn't it better if people do give you their github
> username / email combo - assuming that's the easiest combo to work
> with later?

I think the Github user name is not really needed here, as what goes
into the history is the Git ID: name + email address.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
Fri, 04 Jun 2010 15:19:27 -0700, Dan Roberts wrote:
> Sorry to interrupt, but do you have a usable, up to date cloneable
> git repository somewhere? I noticed that you had a repository on
> github, but it's about a month out of date. I understand if what
> you're working on isn't ready to be cloned from yet. In the meantime
> I can use git-svn, but I figured if you had something better, I'd use
> that.

See also http://projects.scipy.org/numpy/wiki/GitMirror
Re: [Numpy-discussion] numpy.savez does /not/ compress!?
Tue, 2010-06-08 at 12:03 +0200, Hans Meine wrote:
> On Tuesday 08 June 2010 11:40:59 Scott Sinclair wrote:
>> The savez docstring should probably be clarified to provide this
>> information.
>
> I would prefer to actually offer compression to the user.
> Unfortunately, adding another argument to this function will never be
> 100% secure, since currently, all kwargs will be saved into the zip,
> so it could count as a behaviour change.

Yep, that's the only question to be resolved. I suppose "compression"
is not so usual as a variable name, so it probably wouldn't break
anyone's code.

-- 
Pauli Virtanen
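The kwargs issue mentioned above, illustrated: savez stores every
keyword argument as an array, so a new option name can collide with a
user's saved variable:

    import numpy as np

    np.savez('data.npz', compression=np.arange(3))
    # the keyword is silently stored as an array named 'compression':
    print(np.load('data.npz')['compression'])   # [0 1 2]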
Re: [Numpy-discussion] Problems with get_info installing scipy
Tue, 08 Jun 2010 09:47:41 -0400, Jeff Hsu wrote:
> I tried to install scipy, but I get the error with not being able to
> find get_info() from numpy.distutils.misc_util. I read that you need
> the SVN version of numpy to fix this. I recompiled numpy and
> reinstalled from the SVN, which says is version 1.3.0 (was using
> 1.4.1 version before) and that function is not found within either
> versions. What version of numpy should I use? Or maybe I'm not
> removing numpy correctly.

It's included in 1.4.1 and in SVN (which is 1.5.x). You almost
certainly have an older version of numpy installed somewhere that
overrides the new one. Check

    import numpy
    print numpy.__file__

to see which one is imported.

-- 
Pauli Virtanen
Re: [Numpy-discussion] NumPy re-factoring project
Thu, 10 Jun 2010 18:48:04 +0200, Sturla Molden wrote:
[clip]
> 5. Allow OpenMP pragmas in the core. If arrays are above a certain
>    size, it should switch to multi-threading.

Some places where OpenMP could probably help are in the inner ufunc
loops. However, improving the memory efficiency of the data access
pattern is another low-hanging fruit for multidimensional arrays.

I'm not sure how OpenMP plays together with thread-safety, especially
if used in outer constructs. Whether this is possible to do easily is
not so clear, as Numpy uses e.g. iterator and loop objects, which
probably cannot be shared between different OpenMP threads. So one
would have to do some extra work to parallelize these parts manually.

But probably if multithreading is desired, OpenMP sounds like the way
to go...

-- 
Pauli Virtanen
Re: [Numpy-discussion] NumPy re-factoring project
Thu, 10 Jun 2010 23:56:56 +0200, Sturla Molden wrote:
[clip]
> Also about array iterators in NumPy's C base (i.e. for doing
> something along an axis): we don't need those. There is a different
> way of coding which leads to faster code.
>
> 1. Collect an array of pointers to each subarray (e.g. using
>    std::vector<dtype*> or dtype**)
> 2. Dispatch on the pointer array...

This is actually what the current ufunc code does.

The innermost dimension is handled via the ufunc loop, which is a
simple for loop with constant-size step and is given a number of
iterations. The array iterator objects are used only for stepping
through the outer dimensions. That is, it essentially steps through
your dtype** array, without explicitly constructing it.

An abstraction for iteration through an array is useful, also for
building the dtype** array, so we should probably retain them.

For multidimensional arrays, you run here into the optimization
problem of choosing the order of axes so that the memory access
pattern makes maximally efficient use of the cache. Currently, this
optimization heuristic is really simple, and makes the wrong choice
even in the simple 2D case.

We could in principle use OpenMP to parallelize the outer loop, as
long as the inner ufunc part is implemented in C, and does not go back
into Python. But if the problem is memory-bound, it is not clear that
parallelization helps.

Another task for optimization would perhaps be to implement
specialized loops for each operation; currently we do one function
indirection per iteration, which probably costs something. But again,
it may be that the operations are already bounded by memory speed, and
this would not improve performance.

-- 
Pauli Virtanen
Re: [Numpy-discussion] NumPy re-factoring project
Fri, 11 Jun 2010 10:29:28 +0200, Hans Meine wrote:
[clip]
> Ideally, algorithms would get wrapped in between two additional
> pre-/postprocessing steps:
>
> 1) Preprocessing: After broadcasting, transpose the input arrays such
>    that they become C order. More specifically, sort the strides of
>    one (e.g. the first/the largest) input array decreasingly, and
>    apply the same transposition to the other arrays (in + out if
>    given).
> 2) Perform the actual algorithm, producing a C-order output array.
> 3) Apply the reverse transposition to the output array.
[clip]

The pre-/postprocessing steps are fairly easy to implement in the
ufunc machinery. The problem here was that this would result in a
subtle semantic change, as output arrays from ufuncs would no longer
necessarily be in C order. We need to explicitly *decide* to make this
change, before proceeding. I'd say that we should simply do this, as
nobody should depend on this assumption.

I think there was some code ready to implement this shuffling. I'll
try to dig it out and implement the shuffling.

-- 
Pauli Virtanen
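In Python terms, the proposed shuffling amounts to the following (a
rough sketch; c_order_algorithm is a hypothetical stand-in for the
actual inner computation):

    import numpy as np

    def apply_in_memory_order(c_order_algorithm, x):
        # 1) sort the axes by decreasing stride, i.e. towards C order
        order = np.argsort(x.strides)[::-1]
        xt = x.transpose(order)
        # 2) run the algorithm, producing a C-order output for `xt`
        out = c_order_algorithm(xt)
        # 3) undo the transposition with the inverse permutation
        return out.transpose(np.argsort(order))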
Re: [Numpy-discussion] NumPy re-factoring project
Fri, 11 Jun 2010 15:31:45 +0200, Sturla Molden wrote:
[clip]
>> The innermost dimension is handled via the ufunc loop, which is a
>> simple for loop with constant-size step and is given a number of
>> iterations. The array iterator objects are used only for stepping
>> through the outer dimensions. That is, it essentially steps through
>> your dtype** array, without explicitly constructing it.
>
> Yes, exactly my point. And because the iterator does not explicitly
> construct the array, it sucks for parallel programming (e.g. with
> OpenMP):
>
> - The iterator becomes a bottleneck to which access must be
>   serialized with a mutex.
> - We cannot do proper work scheduling (load balancing)

I don't necessarily agree: you can do

    for parallelized outer loop {
        critical section {
            p = get iterator pointer
            ++iterator
        }
        inner loop in region `p`
    }

This does allow load balancing etc., as a free processor can
immediately grab the next available slice. Also, it would be easier to
implement with OpenMP pragmas in the current code base.

Of course, the assumption here is that the outer iterator overhead is
small compared to the duration of the inner loop. This must then be
compared to the memory access overhead involved in the dtype** array.

-- 
Pauli Virtanen
Re: [Numpy-discussion] Tensor contraction
Sat, 12 Jun 2010 23:15:16 +0200, Friedrich Romstedt wrote:
[clip]
> But note that for:
>
>     T[:, I, I]
>
> the shape is reversed with respect to that of:
>
>     T[I, :, I] and T[I, I, :]
>
> I think it should be written in the docs how the shape is derived.

It's explained there in detail (although maybe not in the simplest
possible words):

    http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer

Let "index dimension" be a dimension for which you supply an array
(rather than the :-slice) as an index. Let "broadcast shape of
indices" be the shape to which all the index arrays are broadcast
(= same as (i1+i2+...+in).shape). The rule is that

1) if all the index dimensions are next to each other, then the shape
   of the result array is

       (dimensions preceding the index dimensions)
       + (broadcast shape of indices)
       + (dimensions after the index dimensions)

2) if the index dimensions are not all next to each other, then the
   shape of the result array is

       (broadcast shape of indices) + (non-index dimensions)

Might be a bit surprising, but at this point it's not going to be
changed.

-- 
Pauli Virtanen
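The two rules in action (shapes chosen so the difference is visible):

    import numpy as np

    T = np.zeros((4, 5, 6))
    i = np.zeros(3, dtype=int)

    # Rule 1: index dimensions adjacent ->
    #   (preceding dims) + (broadcast shape of indices)
    print(T[:, i, i].shape)   # (4, 3)

    # Rule 2: index dimensions not adjacent ->
    #   (broadcast shape of indices) + (non-index dims)
    print(T[i, :, i].shape)   # (3, 5)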
Re: [Numpy-discussion] Tensor contraction
Sat, 12 Jun 2010 16:30:14 -0400, Alan Bromborsky wrote:
> If I have a single numpy array, for example with 3 indices T_{ijk}
> and I want to sum over two of them in the sense of tensor contraction
> - T_{k} = \sum_{i=0}^{n-1} T_{iik}. Is there an easy way to do this
> with numpy?

HTH, (not really tested, so you get what you pay for)

    import numpy as np

    def tensor_contraction_single(tensor, dimensions):
        """Perform a single tensor contraction over the dimensions given"""
        swap = [x for x in range(tensor.ndim)
                if x not in dimensions] + list(dimensions)
        x = tensor.transpose(swap)
        for k in range(len(dimensions) - 1):
            x = np.diagonal(x, axis1=-2, axis2=-1)
        return x.sum(axis=-1)

    def _preserve_indices(indices, removed):
        """Adjust values of indices after some items are removed"""
        for r in reversed(sorted(removed)):
            indices = [j if j <= r else j - 1 for j in indices]
        return indices

    def tensor_contraction(tensor, contractions):
        """Perform several tensor contractions"""
        while contractions:
            dimensions = contractions.pop(0)
            tensor = tensor_contraction_single(tensor, dimensions)
            contractions = [_preserve_indices(c, dimensions)
                            for c in contractions]
        return tensor

    # b_{km} = sum_{ij} a_{i,j,k,j,i,m}
    a = np.random.rand(3, 2, 4, 2, 3, 5)
    b = tensor_contraction(a, [(0, 4), (1, 3)])

    b2 = np.zeros((4, 5))
    for i in range(3):
        for j in range(2):
            b2 += a[i, j, :, j, i, :]

    assert (abs(b - b2) < 1e-12).all()
Re: [Numpy-discussion] NumPy re-factoring project
Sun, 13 Jun 2010 06:54:29 +0200, Sturla Molden wrote:
[clip: memory management only in the interface]

You forgot views: if memory management is done in the interface layer,
it must also make sure that the memory pointed to by a view is never
moved around, and not freed before all the views are freed. This
probably cannot be done with existing Python objects, since IIRC, they
always own their own memory. So one should refactor the memory
management out of ndarray on the interface side. The core probably
also needs to decide when to create views (indexing), so a callback to
the interface is needed.

> Second: If we cannot figure out how much to allocate before starting
> to use C (very unlikely),

The point is that while you in principle can figure it out, you don't
*want to* do this work in the interface layer. Computing the output
size is just code duplication.

> we can always use a callback to Python. Or we can

This essentially amounts to having a NpyArray_resize in the interface
layer, which the core can call.

Here, however, one has the requirement that all arrays needed by the
core are always passed in to it, including temporary arrays. Sounds a
bit like Fortran 77 -- it's not going to be bad, if the number of
temporaries is not very large.

If the core needs to allocate arrays by itself, this will inevitably
lead to memory management that cannot be avoided by having the
allocation done in the interface layer, as the garbage collector must
not mistake a temporary array referenced only by the core as unused.
An alternative here is to have an allocation routine pair

    NpyArray_new_tmp
    NpyArray_del_tmp

whose allocated memory is released automatically, if left dangling,
after the call to the core is completed. This would be useful for
temporary arrays, although then one would have to be careful not ever
to return memory allocated by this back to the caller.

> have two C functions, the first determining the amount of allocation,
> the second doing the computation.

That sounds like a PITA, think about LAPACK.

-- 
Pauli Virtanen
Re: [Numpy-discussion] PyArray_Scalar() and Unicode
Sat, 12 Jun 2010 17:33:13 -0700, Dan Roberts wrote: [clip: refactoring PyArray_Scalar] There are a few problems with this. The biggest problem for me is that it appears PyUCS2Buffer_FromUCS4() doesn't produce UCS2 at all, but rather UTF-16, since it produces surrogate pairs for code points above 0xFFFF. My first question is: is there any time when the data produced by PyUCS2Buffer_FromUCS4() wouldn't be parseable by a standards compliant UTF-16 decoder? Since UTF-16 = UCS-2 + surrogate pairs, as far as I know, the data produced should always be parseable by DecodeUTF16. Conversion to real UCS-2 from UCS-4 would be a lossy procedure, since not all code points can be represented with 2 bytes. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
ti, 2010-06-15 kello 13:18 +0900, David kirjoitti: On 06/15/2010 12:15 PM, Vincent Davis wrote: Is this http://github.com/cournape/numpy going to become the git repo? No, this is just a mirror of the svn repo, and used by various developers to do their own stuff. That's a mirror of a mirror, the master SVN mirror is here: http://projects.scipy.org/git/numpy/ -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Ufunc memory access optimization
pe, 2010-06-11 kello 10:52 +0200, Hans Meine kirjoitti: On Friday 11 June 2010 10:38:28 Pauli Virtanen wrote: [clip] I think there was some code ready to implement this shuffling. I'll try to dig it out and implement the shuffling. That would be great! Ullrich Köthe has implemented this for our VIGRA/numpy bindings: http://tinyurl.com/fast-ufunc At the bottom you can see that he basically wraps all numpy.ufuncs he can find in the numpy top-level namespace automatically. Ok, here's the branch:

http://github.com/pv/numpy-work/compare/master...feature;ufunc-memory-access-speedup

Some samples (the reported times in braces are times without the optimization):

    x = np.zeros([100,100])
    %timeit x + x
    1 loops, best of 3: 106 us (99.1 us) per loop
    %timeit x.T + x.T
    1 loops, best of 3: 114 us (164 us) per loop
    %timeit x.T + x
    1 loops, best of 3: 176 us (171 us) per loop

    x = np.zeros([100,5,5])
    %timeit x.T + x.T
    1 loops, best of 3: 47.7 us (61 us) per loop

    x = np.zeros([100,5,100]).transpose(2,0,1)
    %timeit np.cos(x)
    100 loops, best of 3: 3.77 ms (9 ms) per loop

As expected, some improvement can be seen. There also appears to be an additional 5 us (~ 700 inner loop operations, it seems) overhead coming from somewhere; perhaps this can still be reduced. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Ufunc memory access optimization
ti, 2010-06-15 kello 10:10 -0400, Anne Archibald kirjoitti: Correct me if I'm wrong, but this code still doesn't seem to make the optimization of flattening arrays as much as possible. The array you get out of np.zeros((100,100)) can be iterated over as an array of shape (1,), which should yield very substantial speedups. Since most arrays one operates on are like this, there's potentially a large speedup here. (On the other hand, if this optimization is being done, then these tests are somewhat deceptive.) It does perform this optimization, and unravels the loop as much as possible. If all arrays are wholly contiguous, iterators are not even used in the ufunc loop. Check the part after

    /* Determine how many of the trailing dimensions are contiguous */

However, in practice it seems that this typically is not a significant win -- I don't get speedups over the unoptimized numpy code even for shapes (2,)*20 where you'd think that the iterator overhead could be important:

    x = np.zeros((2,)*20)
    %timeit np.cos(x)
    10 loops, best of 3: 89.9 ms (90.2 ms) per loop

This is actually consistent with the result

    x = np.zeros((2,)*20)
    y = x.flatten()
    %timeit x+x
    10 loops, best of 3: 20.9 ms per loop
    %timeit y+y
    10 loops, best of 3: 21 ms per loop

that you can probably verify with any Numpy installation. This result may actually be because the Numpy ufunc inner loops are not especially optimized -- they don't use fixed stride sizes etc., and so are not much more efficient than using array iterators. On the other hand, it seems to me there's still some question about how to optimize execution order when the ufunc is dealing with two or more arrays with different memory layouts. In such a case, which one do you reorder in favour of? The axis permutation is chosen so that the sum (over different arrays) of strides (for each dimension) is decreasing towards the inner loops. I'm not sure, however, that this is the optimal choice. Things may depend quite a bit on the cache architecture here. Is it acceptable to return freshly-allocated arrays that are not C-contiguous? Yes, it returns non-C-contiguous arrays. The contiguity is determined so that the ufunc loop itself can access the memory with a single stride. This may cause some speed loss in some expressions. Perhaps there should be a subtle bias towards C-order? But I'm not sure this is worth the bother. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Ufunc memory access optimization
-defined, but only for one that is good enough in practice. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] checking for array type in C extension
pe, 2010-06-18 kello 12:49 +0200, Berthold Hoellmann kirjoitti: [clip]

    tst.inttestfunc(np.array((1,2),dtype=np.int))
    tst.inttestfunc(np.array((1,2),dtype=np.int8))
    tst.inttestfunc(np.array((1,2),dtype=np.int16))
    tst.inttestfunc(np.array((1,2),dtype=np.int32))
    tst.inttestfunc(np.array((1,2),dtype=np.int64))

    h...@pc090498 ~/pytest $ PYTHONPATH=build/lib.win32-2.5/ python xx.py
    1.4.1
    ['C:\\Python25\\lib\\site-packages\\numpy']
    PyArray_TYPE(array): 7; NPY_INT: 5 NPY_INT not found
    PyArray_TYPE(array): 1; NPY_INT: 5 NPY_INT not found
    PyArray_TYPE(array): 3; NPY_INT: 5 NPY_INT not found
    PyArray_TYPE(array): 7; NPY_INT: 5 NPY_INT not found
    PyArray_TYPE(array): 9; NPY_INT: 5 NPY_INT not found

NPY_INT32 is 7, but shouldn't NPY_INT correspond to numpy.int? And what kind of int is NPY_INT in this case? I think the explanation is the following: - NPY_INT is a virtual type that is either int32 or int64, depending on the native platform size. - It has its own type code, distinct from NPY_SHORT (= 3 = int16) and NPY_LONG (= 7 = int32 on this platform). - But the type specifier is replaced either by NPY_SHORT or NPY_LONG on array creation, so no array is ever of this dtype. Pauli ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] checking for array type in C extension
pe, 2010-06-18 kello 16:43 +0200, Berthold Hoellmann kirjoitti: [clip] The documentation (I refer to NumPy User Guide, Release 1.5.0.dev8106) claims that numpy.int is platform int and that NPY_INT is a C based name, thus referring to int also. So if I want to use a platform-independent int in my extension module, what do I do to check whether the correct data is provided? I want to avoid one of the casting routines, such as PyArray_FROM_OTF or PyArray_FromAny. Do I have to use PyArray_ITEMSIZE? But PyArray_ITEMSIZE indicates that numpy.int are not platform integers. It returns 8 on Linux64, where sizeof(int)==4. So is this a bug in 1.4.1, in the 1.5 beta documentation, or a misunderstanding on my side? I guess the documentation needs clarification, as numpy.int is not actually the platform-size integer, but the Python-size integer. The platform-size integer corresponding to C int is IIRC numpy.intc, which should result in the same sizes as NPY_INT. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
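A quick way to check the correspondence from Python (a sketch; the printed item sizes depend on the platform):

    import numpy as np

    # numpy.intc corresponds to C int, i.e. NPY_INT
    print(np.dtype(np.intc).itemsize)   # 4 on most platforms
    # numpy.int_ corresponds to Python int, i.e. C long
    print(np.dtype(np.int_).itemsize)   # 8 on Linux64, 4 on 32-bit platforms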
Re: [Numpy-discussion] Multiplying numpy floats and python lists
su, 2010-06-20 kello 13:56 -0400, Tony S Yu kirjoitti: I came across some strange behavior when multiplying numpy floats and python lists: the list is returned unchanged: In [18]: np.float64(1.2) * [1, 2] Out[18]: [1, 2] Probably a bug, it seems to round the result first to an integer, and then do the usual Python thing (try for example np.float64(2.2)). I'd suggest filing a bug ticket. Looking at CPython source code, the issue seems to be that np.float64 for some reason passes PyIndex_Check. But (without whipping out gdb) I can't understand how this can be: the float64 scalar type is not supposed to have a non-NULL in the nb_index slot. Indeed, a np.float64.__index__ method does not exist, and the following indeed does not work: [1, 2, 3][np.float64(2.2)] Strange. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
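A hedged illustration of the truncation described above (outputs as observed on the affected Numpy versions; try it to check your own build):

    import numpy as np

    # the scalar is apparently truncated to an integer, after which
    # Python's ordinary sequence repetition takes over:
    np.float64(1.2) * [1, 2]   # -> [1, 2]        (repeated once)
    np.float64(2.2) * [1, 2]   # -> [1, 2, 1, 2]  (repeated twice)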
Re: [Numpy-discussion] Possible to use numexpr with user made ufuncs/scipy ufuncs?
Hi, la, 2010-06-26 kello 14:24 +0200, Francesc Alted kirjoitti: [clip] Yeah, you need to explicitly code the support for new functions in numexpr. But another possibility, more doable, would be to code the scipy.special functions by using numexpr as a computing back-end. Would it be possible to add generic support for user-supplied ufuncs into numexpr? This would maybe be more of a convenience feature than a speed improvement, although perhaps some speed could be gained by evaluating ufuncs per-block. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Possible to use numexpr with user made ufuncs/scipy ufuncs?
Sat, 26 Jun 2010 17:51:52 +0100, Francesc Alted wrote: [clip] Well, I'd say that this support can be faked in numexpr easily. For example, if one wants to compute a certain ufunc called, say, sincos(x) defined as sin(cos(x)) (okay, that's very simple, but it will suffice for demonstration purposes), he can express that as sincos = "sin(cos(%s))", and then use it in a more complex expression like:

    "%s+1*cos(%s)" % (sincos % 'x', 'y')

that will be expanded as:

    "sin(cos(x))+1*cos(y)"

Of course, this is a bit crude, but I'd say that's more than enough for allowing the evaluation of moderately complex expressions. But what if such an expression does not exist? For example, there is no finite closed form expression for the Bessel functions in terms of elementary functions. An accurate implementation is rather complicated, and usually piecewise defined. For convenience reasons, it could be useful if one could do something like

    numexpr.evaluate("cos(iv(0, x))", functions=dict(iv=scipy.special.iv))

and this would be translated to numexpr bytecode that would make a Python function call to obtain iv(0, x) for each block of data required, assuming iv is a vectorized function. It's of course possible to precompute the value of iv(0, x), but this is extra hassle and requires additional memory. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
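Until something like a functions= argument exists, the precomputation workaround might look like this (a sketch; numexpr.evaluate picks up variables from the caller's namespace, and the extra temporary array is the cost mentioned above):

    import numpy as np
    import numexpr
    import scipy.special

    x = np.linspace(0, 10, 1000000)
    iv0 = scipy.special.iv(0, x)           # precomputed temporary array
    result = numexpr.evaluate("cos(iv0)")  # evaluated blockwise by numexpr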
Re: [Numpy-discussion] numpy.random.poisson docs missing Returns
Sat, 26 Jun 2010 17:37:22 -0700, David Goldsmith wrote: On Sat, Jun 26, 2010 at 3:22 PM, josef.p...@gmail.com wrote: [clip] Is there a chance that some changes got lost? (Almost) anything's possible... :-( There's practically no chance of edits getting lost. There's a chance of them being hidden if things are moved around in the source code, causing duplicate work, but that's not the case here. Well, here's what happened in the particular case of numpy's pareto: The promotion to Needs review took place - interestingly - 2008-06-26 (yes, two years ago today), despite the lack of a Returns section; the initial check-in of HOWTO_DOCUMENT.txt - which does specify that a Returns section be included (when applicable) - was one week before, 2008-06-19. So, it's not that surprising that this slipped through the cracks. Pauli (or anyone): is there a way to search the Wiki, e.g., using a SQL-like query, for docstrings that saw a change in status before a date, or between two dates? No. The review status is not versioned, so the necessary information is not there. The only chance would be to search for docstrings that haven't been edited after a certain date. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] arr.copy(order='F') doesn't agree with docstring: what is intended behavior?
Sun, 27 Jun 2010 13:41:23 -0500, Kurt Smith wrote: [clip] I misspoke (mistyped...). So long as it has the 'F' behavior I'm happy, and I'll gladly file a bug report / patch to get the code to match the docs, assuming that's the intended behavior. The problem is probably that the function is using PyArg_ParseTuple instead of PyArg_ParseTupleAndKeywords. Who do I contact about getting a trac account? Go to http://projects.scipy.org/numpy/ and select Register. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Possible to use numexpr with user made ufuncs/scipy ufuncs?
ma, 2010-06-28 kello 09:48 +0200, Francesc Alted kirjoitti: [clip] But again, the nice thing would be to implement such a special functions in terms of numexpr expressions so that the evaluation itself can be faster. Admittedly, that would take a bit more time. Quite often, you need to evaluate a series or continued fraction expansion to get the value of a special function at some point, limiting the number of terms by stopping when a certain convergence criterion is satisfied. Also, which series to sum typically depends on the point where things are evaluated. These things don't vectorize very nicely. If I understand correctly, numexpr bytecode interpreter does not support conditionals and loops like this at the moment? The special function implementations are typically written in bare C/Fortran. Do you think numexpr could give speedups there? As I see it, speedups (at least with MKL) could come from using faster implementations of sin/cos/exp etc. basic functions. Using SIMD to maximum effect would then require an amount of cleverness in re-writing the evaluation algorithms, which sounds like a major amount of work. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
ke, 2010-06-23 kello 12:46 +0200, Ruben Salvador kirjoitti: [clip] how can I detect the EOF to stop reading

    r = f.read(1)
    if not r:
        break  # EOF
    else:
        f.seek(-1, 1)

-- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
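Put together, reading an unknown number of arrays from a single file could look like this (a sketch; the file name is just for illustration, and np.load accepts an open file object):

    import numpy as np

    arrays = []
    f = open('data.npy', 'rb')
    while True:
        if not f.read(1):
            break              # EOF reached
        f.seek(-1, 1)          # push the probe byte back
        arrays.append(np.load(f))
    f.close()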
Re: [Numpy-discussion] numpy.load raising IOError but EOFError expected
ma, 2010-06-28 kello 15:48 -0400, Anne Archibald kirjoitti: [clip] That said, good exception hygiene argues that np.load should throw EOFErrors rather than the more generic IOErrors, but I don't know how difficult this would be to achieve. np.load is in any case unhygienic, since it tries to unpickle, if it doesn't see the .npy magic header. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] sum up to a specific value
Thu, 01 Jul 2010 06:17:50 -0300, Renato Fabbri wrote: I need to find which elements of an array sum up to a specific value. Any idea of how to do this? Sounds like the knapsack problem http://en.wikipedia.org/wiki/Knapsack_problem ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] sum up to a specific value
to, 2010-07-01 kello 11:46 -0300, Renato Fabbri kirjoitti: just a solution (not all of them), and the application happens to come up with something like 10k values in the array. I don't care about waiting, but... As said, the problem is a well-known one, and it's not really Python or Numpy-specific, so slightly off-topic for this list. Numpy and Scipy don't ship pre-made algorithms for solving these. But anyway, you'll probably find that the brute force algorithm (e.g. the one from Vincent) takes exponential time (and exp(10000) is a big number). So you need to do something more clever. First stop, Wikipedia, http://en.wikipedia.org/wiki/Knapsack_problem http://en.wikipedia.org/wiki/Subset_sum_problem and if you are looking for pre-cooked solutions, second stop stackoverflow, http://stackoverflow.com/search?q=subset+sum+problem Some search words you might want to try on Google: http://www.google.com/search?q=subset%20sum%20dynamic%20programming Generic advice only this time, sorry; I don't have pre-made code for solving this at hand, but hopefully the above links give some pointers for what to do. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
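For integer values, the textbook dynamic programming solution is short enough to sketch here (pseudo-polynomial time, so it is feasible only when the sums involved stay moderate; generic illustration, not tuned code):

    def subset_sum(values, target):
        # parent[s] = (previous_sum, value_used), for each reachable sum s
        parent = {0: None}
        for v in values:
            for s in list(parent):
                if s + v not in parent:
                    parent[s + v] = (s, v)
        if target not in parent:
            return None
        # walk back through the parents to recover one subset
        subset = []
        while parent[target] is not None:
            s, v = parent[target]
            subset.append(v)
            target = s
        return subset

    print(subset_sum([7, 3, 5, 12], 15))   # -> [5, 3, 7], one valid subset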
Re: [Numpy-discussion] memory leak using numpy and cvxopt
Fri, 02 Jul 2010 14:56:47 +0200, Tillmann Falck wrote: I am hitting a memory leak with the combination of numpy and cvxopt.matrix. As I am not sure where it occurs, I am cross posting. Probably a bug in cvxopt, as also the following leaks memory:

    from cvxopt import matrix

    N = 2000
    X = [0]*N
    Y = matrix(0.0, (N, N))

    while True:
        Y[:N, :1] = X

-- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy 1.5 for Python 2.7 and 3.1
Fri, 25 Jun 2010 12:31:36 -0700, Christoph Gohlke wrote: at http://projects.scipy.org/numpy/ticket/1520 I have posted a patch against numpy svn trunk r8464 that: 1) removes the datetime functionality 2) restores ABI compatibility with numpy 1.4.1 3) enables numpy to build and run on Python 2.7 and 3.1 (at least on Windows) [clip] This should make it easier to keep track of it: http://github.com/pv/numpy-work/tree/1.5.x I note that the patch contains some additional 3K porting. We/I'll need to split the patch into two parts: - Py3K or other miscellaneous fixes that should also go to the trunk - removing datetime and restoring the ABI I can probably take a look at this during EuroScipy. I'd also like to manually check the diff to svn/1.4.x. So far, I think the following changes look like they should also go to the trunk: - numpy/numarray/_capi.c - numpy/lib/tests/test_io.py - numpy/core/include/numpy/npy_math.h -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] PATCH: reference leaks for 'numpy.core._internal' module object
Sat, 03 Jul 2010 01:11:20 -0300, Lisandro Dalcin wrote: The simple test below show the issue. [clip: patch] Thanks, applied in r8469. [clip] PS: I think that such implementation should at least handle the very simple one/two character format (eg, 'i', 'f', 'd', 'Zf', 'Zd', etc.) without calling Python code... Of course, complaints without patches should not be taken too seriously ;-) We'll optimize once someone complains this makes their code slow ;) -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] [OT] Re: debian benchmarks
Sun, 04 Jul 2010 04:32:14 +0200, Sturla Molden wrote: I was just looking at Debian's benchmark. LuaJIT is now (on median) beating Intel Fortran! Consider that Lua is a dynamic language very similar to Python. I know it's just a benchmark but this has to count as insanely impressive. I guess you're talking about shootout.alioth.debian.org tests? Beating Intel Fortran with a dynamic scripting language... How is that even possible? It's possible that in the cases where Lua wins, the Lua code is not completely equivalent to the Fortran code, or uses stuff such as strings for which Lua's default implementation may be efficient. At least in the mandelbrot example some things differ. I wonder if Lua there takes advantage of SIMD instructions because the author of the code has manually changed the inmost loop to process two elements at once? -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] subtle behavior when subtracting sub-arrays
Mon, 05 Jul 2010 16:03:56 +0200, Steve Schmerler wrote: [clip] To sum up, I find it a bit subtle that a = a - a[...,0][...,None] works as expected, while a -= a[...,0][...,None] does not. I guess the reason is that in the latter case (and the corresponding loop), a[...,0] itself is changed during the loop, while in the former case, numpy makes a copy of a[...,0] ? Correct. Is this intended? Not really. It's a feature we're planning to get rid of eventually, once a way to do it without sacrificing performance in safe cases is implemented. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
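A small demonstration of the difference (a sketch; the exact garbled values in the in-place case depend on the iteration order, so treat the second result as undefined):

    import numpy as np

    a = np.arange(6.).reshape(2, 3)
    b = a - a[..., 0][..., None]   # the slice is copied first: correct result

    a = np.arange(6.).reshape(2, 3)
    a -= a[..., 0][..., None]      # a[...,0] is zeroed while still being read,
                                   # so later columns may get 0 subtracted
    print(b)
    print(a)                       # typically differs from b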
Re: [Numpy-discussion] numpy, matplotlib and masked arrays
Sat, 10 Jul 2010 11:03:47 +1000, Peter Isaac wrote: I'm not sure if this is a problem with numpy, matplotlib, EPD or me but here goes. The general problem is that either matplotlib has stopped plotting masked arrays correctly or numpy has changed the way masked arrays are presented. Or there is something strange with the Enthought Python Distribution or me. Many would agree with the latter ... [clip] Since the matplotlib version is the same for the repository Python installation and the EPD-6.1-1 installation and the script works in the first but not the second, it suggests the problem is not just matplotlib. That seems to leave the numpy version change (from 1.3 to 1.4) or something in the EPD. Or me. Note that EPD-6.2-2 works fine with this script on WinXP. It works fine for me with Numpy 1.4.0 and matplotlib 0.99.1.2, on Ubuntu. So it seems your problem is localised on the older EPD-6.1-1. The question is: do you need to support this older version of EPD at all? The problem does not appear to be that matplotlib has stopped plotting masked arrays properly, or that something crucial has changed in Numpy. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ANN: Numpy runs on Python 3
Hi, As many of you probably already know, Numpy works fully on Python 3 and Python 2, with a *single code base*, since March. This work is scheduled to be included in the next releases 1.5 and 2.0. Porting Scipy to work on Python 3 has proved to be much less work, and will probably be finished soon. (Ongoing work is here: http://github.com/cournape/scipy3/commits/py3k , http://github.com/pv/scipy-work/commits/py3k ) For those who are interested in already starting to port their stuff to Python 3, you can use Numpy's SVN trunk version. Grab it:

    svn clone http://svn.scipy.org/svn/numpy/trunk/ numpy
    cd numpy
    python3 setup.py build

An important point is that supporting Python 3 and Python 2 in the same code base can be done, and it is not very difficult either. It is also much preferable from the maintenance POV to creating separate branches for Python 2 and 3. We attempted to log changes needed in Numpy at http://projects.scipy.org/numpy/browser/trunk/doc/Py3K.txt which may be useful (although not completely up-to-date) information for people wanting to make the same transition in their own code. (Announcement as recommended by our PR department @ EuroScipy :) -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Difference between shape=() and shape=(1,)
ti, 2010-07-13 kello 10:06 -0700, Keith Goodman kirjoitti: No need to use where. You can just do a[np.isnan(a)] = 0. But you do have to watch out for 0d arrays, can't index into those. You can, but the index must be appropriate: x = np.array(4) x[()] = 3 x array(3) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] isinf raises in inf
Thu, 15 Jul 2010 09:54:12 -0500, John Hunter wrote: [clip]

    In [4]: np.isinf(x)
    Warning: invalid value encountered in isinf
    Out[4]: True

As far as I know, isinf has always created NaNs -- since 2006 it has been defined on unsupported platforms as

    (!isnan((x)) && isnan((x)-(x)))

I'll replace it by the obvious

    ((x) == NPY_INFINITY || (x) == -NPY_INFINITY)

which is true only for +-inf, and cannot raise any FPU exceptions. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] isinf raises in inf
Testing with arithmetic can raise overflows and underflows. I think the correct isinf is to compare to NPY_INFINITY and -NPY_INFINITY. Patch is attached to #1500 - Original message - On Thu, Jul 15, 2010 at 6:42 PM, John Hunter jdh2...@gmail.com wrote: On Thu, Jul 15, 2010 at 7:27 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Jul 15, 2010 at 6:11 PM, John Hunter jdh2...@gmail.com wrote: On Thu, Jul 15, 2010 at 6:14 PM, Eric Firing efir...@hawaii.edu wrote: Is it certain that the Solaris compiler lacks isinf? Is it possible that it has it, but it is not being detected? Just to clarify, I'm not using the sun compiler, but gcc-3.4.3 on solaris x86 Might be related to this thread. What version of numpy are you using? svn HEAD (2.0.0.dev8480) After reading the thread you suggested, I tried forcing the CFLAGS=-DNPY_HAVE_DECL_ISFINITE flag to be set, but this is apparently a bad idea for my platform...

    File "/home/titan/johnh/dev/lib/python2.4/site-packages/numpy/core/__init__.py", line 5, in ?
        import multiarray
    ImportError: ld.so.1: python: fatal: relocation error: file /home/titan/johnh/dev/lib/python2.4/site-packages/numpy/core/multiarray.so: symbol isfinite: referenced symbol not found

so while I think my bug is related to that thread, I don't see anything in that thread to help me fix my problem. Or am I missing something? Can you try out this patch without David's fixes?

    diff --git a/numpy/core/include/numpy/npy_math.h b/numpy/core/include/numpy/npy_math.h
    index d53900e..341fb58 100644
    --- a/numpy/core/include/numpy/npy_math.h
    +++ b/numpy/core/include/numpy/npy_math.h
    @@ -151,13 +151,13 @@ double npy_spacing(double x);
     #endif

     #ifndef NPY_HAVE_DECL_ISFINITE
    -        #define npy_isfinite(x) !npy_isnan((x) + (-x))
    +        #define npy_isfinite(x) (((x) + (x)) != (x) && (x) == (x))
     #else
             #define npy_isfinite(x) isfinite((x))
     #endif

     #ifndef NPY_HAVE_DECL_ISINF
    -        #define npy_isinf(x) (!npy_isfinite(x) && !npy_isnan(x))
    +        #define npy_isinf(x) (((x) + (x)) == (x) && (x) != 0)
     #else
             #define npy_isinf(x) isinf((x))
     #endif

Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] isinf raises in inf
Thu, 15 Jul 2010 19:09:15 -0600, Charles R Harris wrote: [clip] PS, of course we should fix the macro also. Since the bit values of +/- infinity are known we should be able to define them as constants using a couple of ifdefs and unions. We already do that, NPY_INFINITY and -NPY_INFINITY. [clip]

    int (isinf)(double x)
    {
        if ((-1.0 < x) && (x < 1.0))  /* x is small, and thus finite */
            return (0);
        else if ((x + x) == x)        /* only true if x == Infinity */
            return (1);
        else                          /* must be finite (normal or subnormal), or NaN */
            return (0);
    }

This function can generate overflows, for example for x = 0.6 * np.finfo(np.float64).max -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] [SciPy-User] Saving Complex Numbers
Thu, 15 Jul 2010 20:00:09 -0400, David Warde-Farley wrote: (CCing NumPy-discussion where this really belongs) On 2010-07-08, at 1:34 PM, cfra...@uci.edu wrote: Need Complex numbers in the saved file. Ack, this has come up several times according to list archives and no one's been able to provide a real answer. It seems that there is nearly no formatting support for complex numbers in Python. For a single value,

    "{0.real:.18e}{0.imag:+.18e}".format(val)

will get the job done, but because of the way numpy.savetxt creates its format string this isn't a trivial fix. Anyone else have ideas on how complex number format strings can be elegantly incorporated in savetxt? The bug here is that

    >>> z = np.complex64(1+2j)
    >>> '%.18e' % z
    '1.000000000000000000e+00'
    >>> float(np.complex64(1+2j))
    1.0

This should raise an exception. I guess the culprit is at scalarmathmodule.c.src:1023. It should at least be changed to raise a ComplexWarning, which we now anyway do in casting. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
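As a workaround until savetxt handles complex data, the real and imaginary parts can be stored as separate float columns (a sketch; the file name is just for illustration):

    import numpy as np

    z = np.array([1 + 2j, 3 - 4j])
    # complex128 viewed as float64 gives interleaved (real, imag) columns
    np.savetxt('z.txt', z.view(float).reshape(-1, 2))
    # reassemble after loading
    z2 = np.loadtxt('z.txt').view(complex).ravel()
    assert (z == z2).all()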
Re: [Numpy-discussion] __array__struct__: about using PyCapsule instead of PyCObject for Python 2.7
Wed, 30 Jun 2010 12:13:38 -0600, Charles R Harris wrote: [clip] Grrr... I didn't see the point, myself, I'm tempted to deprecate 2.7 just to get even. There are some routines in the numpy/core/src includes that you might want to copy, they will allow you to use a common interface for PyCObject and PyCapsule if you need to do that. I've already fixed my code for PyCapsule. What's not clear to me is how to build (i.e., use the old cobject or the new capsules) __array_struct__ across NumPy and Python version combinations. Will NumPy 1.x series ever support Python 2.7? In such case, should I use cobjects or capsules? We do support 2.7, but with PyCapsule. You might want to take a look at f2py also, as it also uses PyCapsule for Python >= 2.7. I think we need to change this decision. PyCObject is still available on Python 2.7, and will only raise a PendingDeprecationWarning, which does not show up by default. I believe the Python devs reversed the full deprecation before the final 2.7 release. So I think we should just stick with PyCObject on 2.x, as we have done so far. I'll just bump the version checks so that PyCapsule is used only on 3.x. I'll commit this to Numpy SVN soon. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] 1.5.x branched
Dear all, Based on patches contributed by Christoph Gohlke, I've created a branch for 1.5.x: http://svn.scipy.org/svn/numpy/branches/1.5.x It should be a) Binary compatible with Numpy 1.4 on Python 2.x. This meant rolling back the datetime and a couple other changes. b) Support Python 3.1 and 2.7. I expect it will be easy to track changes in the trunk, even if preparing the release will still take some time. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] PyDocWeb the doc editor and github transition
Sat, 17 Jul 2010 13:11:46 -0600, Vincent Davis wrote: I have not seen anything about the move to github and how this affects pydocweb, or did I just miss it. I am willing to adapt pydocweb to work with github but not sure what the desired integration would be. This should not be too difficult but should/needs to be part of the plan and documented. You do not need to make any changes to the doc editor because of the github move -- only to its configuration. The documentation editor app itself does not know or care what the version control system is. The required change is essentially replacing

    svn checkout ...
    svn update

by

    git clone ...
    git pull ...

in a shell script, and clearing up the checkout directories. I can take care of this on the server machine, once the move to git has been made. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy svn down
Sat, 17 Jul 2010 16:06:40 -0600, Charles R Harris wrote: At the moment... Chuck Worksforme at the moment. Pauli ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.5.x branched
Sun, 18 Jul 2010 21:13:43 +0800, Ralf Gommers wrote: [clip] Builds fine on OS X 10.6 with both 2.7 and 3.1, and all tests pass. With one exception: in-place build for 3.1 is broken. Does anyone know if this is a distutils or numpy issue? The problem is that on import numpy.__config__ can not be found. The inplace build goes to build/py3k. I don't think we can easily support inplace build in the main directory. [clip] Don't think that's going to happen, but it would be good to discuss the release schedule. What changes need to go in before a first beta, and how much time do you all need for that? I don't have anything urgent. It might be good to glance through the tickets, however, to see if there's something serious pending. Also, I'm not 100% sure what's the Python 3 status on Windows, so that should be checked too. Pauli ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Caching 2to3 results
[shameless plug] For those of you tired of waiting for 2to3 after rm -rf build: http://github.com/pv/lib2to3cache and

    cd numpy/
    USE_2TO3CACHE=1 python3 setup.py build

-- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy Trac emails
Sun, 18 Jul 2010 23:59:24 +0800, Ralf Gommers wrote: Scipy Trac seems to work very well now, I get notification emails for comments on tickets etc. For numpy Trac, nothing right now. Can this be fixed? Works for me. Did you check if your email address is correct there? (-> Preferences) -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Complex nan ordering
Hi, The current way Numpy handles ordering of complex nan is not very well defined. We should attempt to clarify this for 1.5.0. For example, what should these return:

    r1 = np.maximum(complex(1, nan), complex(2, 0))
    r2 = np.complex64(complex(1, nan)) < np.complex64(complex(2, 0))

or, what should `r3` be after this:

    r3 = np.array([complex(3, nan), complex(1, 0), complex(nan, 2)])
    r3.sort()

Previously, we have defined a lexical ordering relation for complex numbers,

    x < y iff x.real < y.real or (x.real == y.real and x.imag < y.imag)

but applying this to the above can cause some surprises:

    amax([1, 2, 4, complex(3, nan)]) == 4

which breaks nan propagation, and moreover the result depends on the order of the items, and the precise way the algorithm is written.

***

Numpy IIRC has specified a lexical order between complex numbers for some time now, unlike Python in which complex numbers are unordered. So we won't change how finite numbers are handled, only the nan handling needs to be specified.

***

I suggest the following, aping the way the real nan works:

- (z, nan), (nan, z), (nan, nan), where z is any fp value, are all equivalent representations of cnan, as far as comparisons, sort order, etc are concerned.

- The ordering between (z, nan), (nan, z), (nan, nan) is undefined. This means e.g. that maximum([cnan_1, cnan_2]) can return either cnan_1 or cnan_2 if both are some cnans.

- Moreover, all comparisons <, >, ==, <=, >= where one or more operands is a cnan are false.

- Except that when sorting, cnans are to be placed last.

The advantages are now that nan propagation is easier to implement, and we get faster code. Moreover, complex nans start to behave more similarly to their real counterparts in comparisons etc.; for instance in the above cases

    r1 = (1, nan)
    r2 = False
    r3 = [complex(1, 0), complex(3, nan), complex(nan, 2)]

where in `r3` the order of the last two elements is unspecified. This is in fact how the SVN trunk now works (final tweaks in r8508, r8509). Comments are welcome. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Complex nan ordering
Sun, 18 Jul 2010 15:57:47 -0600, Charles R Harris wrote: On Sun, Jul 18, 2010 at 3:36 PM, Pauli Virtanen p...@iki.fi wrote: [clip] I suggest the following, aping the way the real nan works: - (z, nan), (nan, z), (nan, nan), where z is any fp value, are all equivalent representations of cnan, as far as comparisons, sort order, etc are concerned. - The ordering between (z, nan), (nan, z), (nan, nan) is undefined. This means e.g. that maximum([cnan_1, cnan_2]) can return either cnan_1 or cnan_2 if both are some cnans. The sort and cmp order was defined in 1.4.0, see the release notes. (z,z), (z, nan), (nan, z), (nan, nan) are in correct order and there are tests to enforce this. Sort and searchsorted need to work together. Ok, now we're diving into an obscure corner that hopefully many people don't care about :) There are several issues here:

1) We should not use lexical order in comparison operations, since this contradicts real-valued nan arithmetic. Currently (and in 1.4) we do some weird sort of mixture, which seems inconsistent.

2) maximum/minimum should propagate nans, fmax/fmin should not

3) sort/searchsorted, and amax/argmax need to play together

4) as long as 1)-3) are valid, I don't think anybody cares what exactly we mean by a complex nan, as long as np.isnan(complex nan) == True. The fact that there happen to be several different representations of a complex nan should not be important.

***

1) Unless we want to define

    (complex(nan, 0) > complex(0, 0)) == True

we cannot strictly follow the lexical order in comparisons. And if we define it like this, we contradict real-valued nan arithmetic, which IMHO is quite bad. Here, it would make sense to me to lump all the different complex nans into a single cnan, as far as the arithmetic comparison operations are concerned. Then, z OP cnan == False for all comparison operations. In 1.4.1 we have

    >>> import numpy as np
    >>> np.__version__
    '1.4.1'
    >>> x = np.complex64(complex(np.nan, 1))
    >>> y = np.complex64(complex(0, 1))
    >>> x >= y
    False
    >>> x < y
    False
    >>> x = np.complex64(complex(1, np.nan))
    >>> y = np.complex64(complex(0, 1))
    >>> x >= y
    True
    >>> x < y
    False

which seems an obscure mix of real-valued nan arithmetic and lexical ordering -- I don't think it's the correct choice... Of course, the practical importance of this decision approaches zero, but it would be nice to be consistent.

***

2) For maximum/amax, strict lexical order contradicts nan propagation:

    maximum(1+nan*j, 2+0j) == 2+0j ???

I don't see why we should follow the lexical order when both arguments are nans. The implementation will be faster if we don't. Also, this way argmax (which should be nan-propagating) can stop looking once it finds the first nan -- and it does not need to care if later on in the array there would be a greater nan.

***

3) For sort/searchsorted we have a technical reason to do something more, and there the strict lexical order seems the correct decision. For `argmax` it was possible to be compatible with `amax` when lumping cnans in maximum -- just return the first cnan.

***

4) As far as np.isnan is concerned,

    >>> np.isnan(complex(0, nan))
    True
    >>> np.isnan(complex(nan, 0))
    True
    >>> np.isnan(complex(nan, nan))
    True

So I think nobody should care which complex nan a function such as maximum or amax returns. We can of course give up some performance to look for the greatest nan in these cases, but I do not think that it would be very worthwhile. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Complex nan ordering
However, nans have been propagated by maximum and minimum since 1.4.0. There was a question, discussed on the list, as to what 'nan' complex to return in the propagation, but it was still a nan complex in your definition of such objects. The final choice was driven by using the first of the already available complex nans as it was the easiest thing to do. However, (nan, nan) would be just as easy to do now that nans are available. Then we would be creating an inconsistency between amax/argmax. Of course if we say all cnans are equivalent, it doesn't matter. I'm not sure what your modifications to the macros buys us, what do you want to achieve? 1) fix bugs in nan propagation, maximum(1, complex(0, nan)) used to return 1. The result also depended previously on the order of arguments. 2) make complex nan behave similarly as a real nan in comparisons (non-sorting). I think both of these are worthwhile. Pauli ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Complex128
Sun, 18 Jul 2010 21:15:15 -0500, Ross Harder wrote: mac os x leopard 10.5, EPD installed. i just don't understand why i get one thing when i ask for another. i can get what i want, but only by not asking for it. Do you get the same behavior also from

    import numpy as np
    np.array([0,0], dtype=np.complex256)

-- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] recarray slow?
Wed, 21 Jul 2010 15:12:14 -0400, wheres pythonmonks wrote: I have a recarray -- the first column is date. I have the following function to compute the number of unique dates in my data set:

    def byName():
        return(len(list(set(d['Date']))))

What this code does is:

1. d['Date']

   Extract an array slice containing the dates. This is fast.

2. set(d['Date'])

   Make copies of each array item, and box them into Python objects. This is slow. Insert each of the objects in the set. Also this is somewhat slow.

3. list(set(d['Date']))

   Get each item in the set, and insert them to a new list. This is somewhat slow, and unnecessary if you only want to count.

4. len(list(set(d['Date'])))

   Take the length of the list. This is fast.

So the slowness arises because the code is copying data around, and boxing it into Python objects. You should try using Numpy functions (these don't re-box the data) to do this. http://docs.scipy.org/doc/numpy/reference/routines.set.html -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
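Concretely, the set-routine counterpart of the above would be something like (a sketch, assuming d['Date'] has a plain non-object dtype):

    import numpy as np

    def byName():
        # np.unique sorts and deduplicates the raw array data,
        # without boxing each element into a Python object
        return len(np.unique(d['Date']))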
Re: [Numpy-discussion] summarizing blocks of an array using a moving window
Thu, 22 Jul 2010 00:47:20 -0400, Robin Kraft wrote: [clip] Let's say the image looks like this:

    >>> np.random.randint(0,2, 16).reshape(4,4)
    array([[0, 0, 0, 1],
           [0, 0, 1, 1],
           [1, 1, 0, 0],
           [0, 0, 0, 0]])

I want to use a square, non-overlapping moving window for resampling, so that I get a count of all the 1's in each 2x2 window.

    0, 0, 0, 1
    0, 0, 1, 1        0 3
                  =>
    1, 1, 0, 0        2 0
    0, 0, 0, 0

In another situation with similar data I'll need the average, or the maximum value, etc.. Non-overlapping windows can be done by reshaping:

    x = np.array([[0, 0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0, 0],
                  [1, 1, 0, 0, 1, 1],
                  [0, 0, 0, 0, 1, 1],
                  [1, 0, 1, 0, 1, 1],
                  [0, 0, 1, 0, 0, 0]])

    y = x.reshape(3,2,3,2)
    y2 = y.sum(axis=3).sum(axis=1)
    # -> array([[0, 3, 2],
    #           [2, 0, 4],
    #           [1, 2, 2]])

    y2 = x.reshape(3,2,3,2).transpose(0,2,1,3).reshape(3,3,4).sum(axis=-1)
    # -> array([[0, 3, 2],
    #           [2, 0, 4],
    #           [1, 2, 2]])

The above requires no copying of data, and should be relatively fast. If you need overlapping windows, those can be emulated with strides:

http://mentat.za.net/numpy/scipy2009/stefanv_numpy_advanced.pdf
http://conference.scipy.org/scipy2010/slides/tutorials/stefan_vd_walt_numpy_advanced.pdf

-- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
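For completeness, a hedged sketch of the overlapping-window strides trick from those slides, here for 2x2 windows (as_strided only creates a view into the same memory, so nothing is copied; the shape/stride arithmetic is easy to get wrong, so double-check it before relying on it):

    import numpy as np
    from numpy.lib.stride_tricks import as_strided

    x = np.random.randint(0, 2, (4, 4))
    s0, s1 = x.strides
    windows = as_strided(x,
                         shape=(x.shape[0] - 1, x.shape[1] - 1, 2, 2),
                         strides=(s0, s1, s0, s1))
    counts = windows.sum(axis=-1).sum(axis=-1)   # 3x3 array of 2x2 window sums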
Re: [Numpy-discussion] Can't get ufunc to work for integers
Thu, 22 Jul 2010 08:49:09 -0700, John Salvatier wrote: I am trying to learn how to create ufuncs, and I got a ufunc to compile correctly with the signature int - double, but I can't get it to accept any arguments. My function is testfunc and I used NPY_INT as the first signature and NPY_DOUBLE as the second signature. What should I look at to figure this out? I think you need to specify NPY_LONG instead of NPY_INT -- the NPY_INT is IIRC sort of a virtual type that you can use when creating arrays, but no array is ever of type NPY_INT, since it maps to either NPY_LONG or whatever the C int type happens to be. Maybe someone can correct me here, but AFAIK it goes like so. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] subtract.reduce behavior
Thu, 22 Jul 2010 15:00:50 -0500, Johann Hibschman wrote: [clip] Now, I'm on an older version (1.3.0), which might be the problem, but which is correct here, the code or the docs? The documentation is incorrect. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Arrays of Python Values
Fri, 23 Jul 2010 01:16:41 -0700, Ian Mallett wrote: [clip] Because I've never used arrays of Python objects (and Googling didn't turn up any examples), I'm stuck on how to sort the corresponding array in NumPy in the same way. I doubt you will gain any speed by switching from Python lists to Numpy arrays containing Python objects. Of course, perhaps I'm just trying something that's absolutely impossible, or there's an obviously better way. I get the feeling that having no Python objects in the NumPy array would speed things up even more, but I couldn't figure out how I'd handle the different attributes (or specifically, how to keep them together during a sort). What're my options? One option could be to use structured arrays to store the data, instead of Python objects. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] subtract.reduce behavior
Fri, 23 Jul 2010 10:29:47 -0400, Alan G Isaac wrote: [clip]

    >>> np.subtract.reduce([])
    0.0

Getting a right identity for an empty array is surprising. Matching Python's behavior (raising a TypeError) seems desirable. (?) I don't think matching Python's behavior is a sufficient argument for a change. As far as I see, it'd mostly cause unnecessary breakage, with no significant gain. Besides, it's rather common to define

    sum_{z in Z} z = 0
    prod_{z in Z} z = 1

if Z is an empty set -- this can then be extended to other reduction operations. Note that changing reduce behavior would require us to special-case the above two operations. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
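These conventions are easy to check at the prompt (values as returned by current Numpy):

    >>> import numpy as np
    >>> np.sum([])
    0.0
    >>> np.prod([])
    1.0
    >>> np.add.reduce([])
    0.0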
Re: [Numpy-discussion] subtract.reduce behavior
Fri, 23 Jul 2010 11:17:56 -0400, Alan G Isaac wrote: [clip] I also do not understand why there would have to be any special cases. That's a technical issue: e.g. prod() is implemented via np.multiply.reduce, and it is not clear to me whether it is possible, in the ufunc machinery, to leave the identity undefined or whether it is needed in some code paths (as the right identity). It's possible to define binary Ufuncs without an identity element (e.g. scipy.special.beta), so in principle the machinery to do the right thing is there. Returning a *right* identity for an operation that is otherwise a *left* fold is very odd, no matter how you slice it. That is what looks like special casing... I think I see your point now. -- Pauli Virtanen ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion