Re: [Numpy-discussion] NumPy re-factoring project
Den 11.06.2010 03:02, skrev Charles R Harris:
> But for an initial refactoring it probably falls in the category of premature optimization. Another thing to avoid on the first go around is micro-optimization, as it tends to complicate the code and often doesn't do much for performance.

First, to refactor NumPy we must get rid of array iterators as Python objects. Retaining them as Python objects defies the whole purpose.

Second, creating an array of pointers has the advantage of being more friendly to OpenMP and multi-core CPUs: we don't need to serialize access to an iterator, and we allow OpenMP to do the load balancing. For things like FFTs on vectors along an axis of a multidimensional array (e.g. in multi-dimensional FFTs), this is not a micro-optimization. Not to mention that it is easier to iterate over an array of pointers than to use a complicated iterator object in C.

Sturla
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
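The idea of collecting the independent 1-D lines along an axis (so each can be handed to a worker without a shared iterator) can be sketched in Python; `lines_along_axis` is a made-up illustrative name, and in C one would of course build an array of pointers rather than copy:

```python
import numpy as np

def lines_along_axis(a, axis):
    """Gather all 1-D lines along `axis` of `a` into the rows of a
    2-D array (copying if the memory layout requires it).  Each row
    is independent, so a C version holding an array of pointers to
    such lines could let OpenMP load-balance e.g. an FFT per line."""
    # Move the target axis last, then flatten the leading axes.
    moved = np.rollaxis(a, axis, a.ndim)
    return moved.reshape(-1, a.shape[axis])

a = np.arange(24.0).reshape(2, 3, 4)
lines = lines_along_axis(a, axis=1)
# 2 * 4 = 8 independent lines of length 3
```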
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
I also tried to install numpy with Intel MKL 9.1. I still used gfortran for the numpy installation, as Intel MKL 9.1 supports the GNU compiler.

I would suggest using GotoBLAS instead of ATLAS. It is easier to build than ATLAS (basically no configuration), and has even better performance than MKL. http://www.tacc.utexas.edu/tacc-projects/

S.M.
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
Sturla Molden wrote:
> I would suggest using GotoBLAS instead of ATLAS. http://www.tacc.utexas.edu/tacc-projects/

That does look promising -- any idea what the license is? They don't make it clear on the site.

UT TACC Research License (Source Code)

The Texas Advanced Computing Center of The University of Texas at Austin has developed certain software and documentation that it desires to make available without charge to anyone for academic, research, experimental or personal use. This license is designed to guarantee freedom to use the software for these purposes. If you wish to distribute or make other use of the software, you may purchase a license to do so from the University of Texas. The accompanying source code is made available to you under the terms of this UT TACC Research License (this UTTRL). By clicking the ACCEPT button, or by installing or using the code, you are consenting to be bound by this UTTRL. If you do not agree to the terms and conditions of this license, do not click the ACCEPT button, and do not install or use any part of the code. The terms and conditions in this UTTRL not only apply to the source code made available by UT TACC, but also to any improvements to, or derivative works of, that source code made by you and to any object code compiled from such source code, improvements or derivative works.

1. DEFINITIONS.
1.1 Commercial Use shall mean use of Software or Documentation by Licensee for direct or indirect financial, commercial or strategic gain or advantage, including without limitation: (a) bundling or integrating the Software with any hardware product or another software product for transfer, sale or license to a third party (even if distributing the Software on separate media and not charging for the Software); (b) providing customers with a link to the Software or a copy of the Software for use with hardware or another software product purchased by that customer; or (c) use in connection with the performance of services for which Licensee is compensated. 1.2 Derivative Products means any improvements to, or other derivative works of, the Software made by Licensee. 1.3 Documentation shall mean all manuals, user documentation, and other related materials pertaining to the Software that are made available to Licensee in connection with the Software. 1.4 Licensor shall mean The University of Texas. 1.5 Licensee shall mean the person or entity that has agreed to the terms hereof and is exercising rights granted hereunder. 1.6 Software shall mean the computer program(s) referred to as GotoBLAS2 made available under this UTTRL in source code form, including any error corrections, bug fixes, patches, updates or other modifications that Licensor may in its sole discretion make available to Licensee from time to time, and any object code compiled from such source code. 2. GRANT OF RIGHTS. Subject to the terms and conditions hereunder, Licensor hereby grants to Licensee a worldwide, non-transferable, non-exclusive license to (a) install, use and reproduce the Software for academic, research, experimental and personal use (but specifically excluding Commercial Use); (b) use and modify the Software to create Derivative Products, subject to Section 3.2; and (c) use the Documentation, if any, solely in connection with Licensee's authorized use of the Software. 3. 
RESTRICTIONS; COVENANTS. 3.1 Licensee may not: (a) distribute, sub-license or otherwise transfer copies or rights to the Software (or any portion thereof) or the Documentation; (b) use the Software (or any portion thereof) or Documentation for Commercial Use, or for any other use except as described in Section 2; (c) copy the Software or Documentation other than for archival and backup purposes; or (d) remove any product identification, copyright, proprietary notices or labels from the Software and Documentation. This UTTRL confers no rights upon Licensee except those expressly granted herein. 3.2 Licensee hereby agrees that it will provide a copy of all Derivative Products to Licensor and that its use of the Derivative Products will be subject to all of the same terms, conditions, restrictions and limitations on use imposed on the Software under this UTTRL. Licensee hereby grants Licensor a worldwide, non-exclusive, royalty-free license to reproduce, prepare derivative works of, publicly display, publicly perform, sublicense and distribute Derivative Products. Licensee also hereby grants Licensor a worldwide, non-exclusive, royalty-free patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Derivative Products under those patent claims licensable by Licensee that are necessarily infringed by the Derivative Products. 4. PROTECTION OF SOFTWARE. 4.1 Confidentiality. The Software and Documentation are the confidential and proprietary information of Licensor. Licensee agrees to take adequate steps to protect the Software and Documentation from unauthorized
Re: [Numpy-discussion] non-standard standard deviation
Colin J. Williams skrev:
> When one has a smallish sample size, what gives the best estimate of the variance?

What do you mean by "best estimate"? Unbiased? Smallest standard error?

> In the widely used Analysis of Variance (ANOVA), the degrees of freedom are reduced for each mean estimated,

That is for statistical tests, not for computing estimators.
Re: [Numpy-discussion] non-standard standard deviation
Colin J. Williams skrev:
> suggested that 1 (one) would be a better default but Robert Kern told us that it won't happen.

I don't even see the need for this keyword argument, as you can always multiply the variance by n/(n-1) to get what you want. Also, normalization by n gives the ML estimate (yes, it has a bias, but it is better anyway). It is a common novice mistake to use 1/(n-1) as normalization, probably due to poor advice in introductory statistics textbooks. It also seems that frequentists are more scared of this bias boogey monster than Bayesians. It may actually help beginners to avoid this mistake if numpy's implementation prompts them to ask why the normalization is 1/n.

If numpy is to change the implementation of std, var, and cov, I suggest using the two-pass algorithm to reduce rounding error. (I can provide C code.) This is much more important than changing the normalization to a bias-free but otherwise inferior value.

Sturla
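The two-pass algorithm Sturla alludes to (he offers C code; this is only a Python sketch of the same idea) first computes the mean, then sums squared deviations from it, with a small correction term for rounding error in the computed mean:

```python
import numpy as np

def two_pass_var(x, ddof=0):
    """Two-pass variance: pass 1 computes the mean, pass 2 sums
    squared deviations from it.  This has much smaller rounding
    error than the textbook one-pass sum-of-squares formula."""
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    m = x.mean()                      # pass 1: the mean
    d = x - m
    # pass 2: sum of squared deviations, minus a correction term
    # that compensates for rounding error in the computed mean
    ss = np.dot(d, d) - d.sum() ** 2 / n
    return ss / (n - ddof)
```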
Re: [Numpy-discussion] non-standard standard deviation
Colin J. Williams skrev:
> Where the distribution of a variate is not known a priori, then I believe that it can be shown that the n-1 divisor provides the best estimate of the variance.

Have you ever been shooting with a rifle? What would you rather do:

- Hit 9 or 10, with a bias to the right.
- Hit 7 or better, with no bias.

Do you think it can be shown that the latter option is the better? No?

Sturla Molden
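The rifle analogy can be checked numerically. A quick Monte Carlo sketch (assuming normally distributed data; `ddof` is numpy's existing keyword) shows the biased ML estimator landing closer to the true variance on average:

```python
import numpy as np

# Monte Carlo check of the bias/variance tradeoff: for normal samples,
# the ML estimator (divide by n) has a bias but a smaller mean squared
# error than the unbiased estimator (divide by n - 1).
rng = np.random.RandomState(0)     # seeded for reproducibility
n, trials, true_var = 5, 100000, 1.0
x = rng.randn(trials, n)
v_ml = x.var(axis=1)               # ddof=0: divide by n
v_ub = x.var(axis=1, ddof=1)       # ddof=1: divide by n - 1
mse_ml = np.mean((v_ml - true_var) ** 2)
mse_ub = np.mean((v_ub - true_var) ** 2)
# mse_ml comes out smaller: biased, but closer on average
```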
Re: [Numpy-discussion] Is anyone knowledgeable about dll deployment on windows ?
David Cournapeau skrev:
> If every python package starts to put its extensions (*.pyd) into a directory, what happens when two different packages have an extension with the same name (e.g. package foo has a multiarray.pyd)? I would also be really annoyed if a 3rd party extension starts polluting my C:\Python26.

In distutils, data_files can install DLLs where Python wants them. Just pass 'DLLs' as the path, and the rest is up to distutils. If anyone pollutes your C:\Python26, it is distutils, not a third-party extension. This is not any different from installing other data files.

Sturla
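Concretely, a relative target path in data_files is typically resolved against the installation prefix, so passing 'DLLs' drops the files into Python's own DLLs directory. A sketch of where such a file would land ('mydeps.dll' is a made-up example name):

```python
import os
import sys

# As it would be passed to distutils.core.setup(data_files=...):
# a relative target directory is interpreted relative to the
# installation prefix (sys.prefix on a typical Windows install).
data_files = [('DLLs', ['mydeps.dll'])]

target_dir, files = data_files[0]
install_path = os.path.join(sys.prefix, target_dir, files[0])
# e.g. C:\Python26\DLLs\mydeps.dll on a Windows install
```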
Re: [Numpy-discussion] Is anyone knowledgeable about dll deployment on windows ?
David Cournapeau skrev:
> We are talking about the numpy extensions here, which are not installed through the install_data command. The problem is about how Windows looks for DLLs with the manifest mechanism, and how to build/install extensions when the C runtime (or any other system DLL) is not installed in the SxS cache.

Do we need to support Windows 2000?

S.M.
Re: [Numpy-discussion] Is anyone knowledgeable about dll deployment on windows ?
David Cournapeau skrev:
> We are talking about the numpy extensions here, which are not installed through the install_data command. The problem is about how Windows looks for DLLs with the manifest mechanism, and how to build/install extensions when the C runtime (or any other system DLL) is not installed in the SxS cache.

Is COM (aka ActiveX) out of the question?
Re: [Numpy-discussion] Numpy on Python3
Pauli Virtanen skrev:
> XXX: 3K: numpy.random is disabled for now, uses PyString_*
> XXX: 3K: numpy.ma is disabled for now -- some issues

I thought numpy.random uses Cython? Is it just a matter of recompiling the pyx file? I remember Dag was working on this a bit: how far did it go?

Cython's include file numpy.pxd has an ndarray class that extends PyArrayObject with PEP 3118 buffer compatibility.
Re: [Numpy-discussion] bus error in embedded numpy
Robin skrev:
> I had assumed when matlab unloads the mex function it would also unload python - but it looks like other dynamic libs pulled in from the mex function (in this case python and in turn numpy) aren't unloaded...

Matlab MEX functions are DLLs, the Python interpreter is a DLL, NumPy .pyd files are DLLs... If you cannot unload Python or NumPy, you cannot unload MEX functions either. The same OS constraints apply.

If you are using Windows, I would recommend that you expose your NumPy code as a COM object using pywin32, and make a Matlab wrapper (an ordinary .m file) that calls the COM object. If you are not using Windows, you could create an XML-RPC server in Python and call it using Matlab's built-in Java VM. Or you can spawn a separate Python process from Matlab and communicate using pipes or a TCP/IP socket (Matlab's Java or MEX). There are many solutions that are easier than embedding Python in Matlab.

Sturla
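The XML-RPC route Sturla mentions might look like the minimal sketch below (the service function `solve` and the port are made-up examples; XML-RPC marshals nested lists, not ndarrays, hence the conversions):

```python
import numpy as np
try:
    from xmlrpc.server import SimpleXMLRPCServer   # Python 3
except ImportError:
    from SimpleXMLRPCServer import SimpleXMLRPCServer  # Python 2

def solve(a, b):
    """Hypothetical service routine: solve the linear system a x = b
    and return the result as a plain list (XML-RPC marshallable)."""
    x = np.linalg.solve(np.array(a, dtype=float), np.array(b, dtype=float))
    return x.tolist()

if __name__ == '__main__':
    server = SimpleXMLRPCServer(('localhost', 8000))
    server.register_function(solve)
    server.serve_forever()   # Matlab would call this via a Java XML-RPC client
```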
Re: [Numpy-discussion] bus error in embedded numpy
Robin skrev:
> So far the only remotely tricky thing I did was redirect sys.stdout and sys.stderr to a wrapper that uses mexPrintf so output goes to the matlab console.

Be careful when you are using file handles. You have to be sure that Matlab, Python and NumPy are all linked against the same CRT.

Sturla
Re: [Numpy-discussion] bus error in embedded numpy
Robin skrev:
> Ah, I hadn't realised it was an OS constraint - I thought it was possible to unload dlls - and that was why matlab provides the clear function. mex automatically clears a function when you rebuild it - I thought that was how you can rebuild and reload mex functions without restarting matlab.

It is OS dependent whether DLLs can be unloaded or not. In Windows you can call CloseHandle on the DLL handle and it's gone. If you can unload a MEX function, you can also unload Python or NumPy. But when you unload the MEX function, the DLLs loaded by the MEX are not automatically cleared. There is no garbage collector.

In Windows, a load and an unload will result in calls to a function called DllMain in the DLL. And if the DLL has other DLLs to load or clear, that is where you need to place it. You can put a custom DllMain function in a MEX file. But be careful: a DLL is only loaded once, i.e. DLLs are imported as singletons in the process. If two MEX functions load Python26.dll, they get handles to the same instance. If you unload Python26.dll in one MEX, it is unloaded globally from the process, so the other MEX will crash. There is no reference counting by the OS kernel.

The solution to this type of DLL Hell in Windows is COM. A COM object is a DLL, but not a singleton: you can have multiple instances of the same COM object in one process. On Mac I guess your options are to either statically link everything into the MEX file, or find a way for multiple MEX files to share a Python interpreter, e.g. implement some sort of reference counting scheme, which by the way is what COM does on Windows.

> I really want a cross platform solution so that rules out COM. CORBA or XMLRPC seem to be the standards. I'm not sure I would use either. Do you think I'm likely to run into more problems? I got the feeling from asking questions on IRC that embedding Python is kind of discouraged but I thought in this case it seemed the neatest way.

It is discouraged because you get a singleton. You can create subinterpreters (cf. Apache's mod_python), but that will bring problems of its own. For example, you cannot use extensions that require the simplified GIL API (e.g. ctypes). I think this is a major design flaw of CPython. For example, with Java's JNI you get a context pointer to the VM, so you don't have any of these problems. But with CPython, both the interpreter and extensions are basically implemented to be loaded as singletons.

> I was aware of this - I thought I would be OK on the mac - at least Python and Numpy and my mex function are built with apple gcc although I'm not sure about Matlab. I guess Windows will be more difficult... But in any case I don't plan to pass any file handles around.

It applies to any CRT resource, not just files. The compiler is not important; which CRT is loaded is. And if you statically link the same CRT twice, that becomes two CRTs that cannot share resources. In Windows, Microsoft has made sure there are multiple versions of their CRT (msvcrt.dll, msvcr71.dll, msvcr80.dll, msvcr90.dll, ...) that cannot share resources. And anyone not careful with this can experience crashes at random locations.
Re: [Numpy-discussion] Multiple Regression
Alexey Tigarev skrev:
> I have implemented multiple regression in a following way:

You should be using QR or SVD for this.

Sturla
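Sturla's QR suggestion can be sketched as follows (toy data invented for illustration): factor the design matrix as X = QR and back-substitute, instead of forming the normal equations (X'X)b = X'y, which squares the condition number of the problem.

```python
import numpy as np

rng = np.random.RandomState(42)       # seeded toy data
n = 100
x1 = rng.randn(n)
x2 = rng.randn(n)
X = np.column_stack((np.ones(n), x1, x2))   # design matrix with intercept
y = 1.0 + 2.0 * x1 - 3.0 * x2               # exact linear relation

# Least squares via QR: X = Q R, then solve the triangular system
# R beta = Q^T y.  No X^T X is ever formed.
Q, R = np.linalg.qr(X)
beta = np.linalg.solve(R, np.dot(Q.T, y))
# beta recovers [1, 2, -3] up to rounding
```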
Re: [Numpy-discussion] Random int64 and float64 numbers
David Cournapeau wrote:
> On Fri, Nov 6, 2009 at 6:54 AM, David Goldsmith d.l.goldsm...@gmail.com wrote:
>> Interesting thread, which leaves me wondering two things: is it documented somewhere (e.g., at the IEEE site) precisely how many *decimal* mantissae are representable using the 64-bit IEEE standard for float representation (if that makes sense); and are such decimal mantissae uniformly distributed
> They are definitely not uniformly distributed: that's why two numbers are close around 1 when they have only a few EPS difference, but around 1e100, you have to add quite a few EPS to even get a different number at all. That may be my audio processing background, but I like to think about float as numbers which have the same relative precision at any level - a kind of dB scale. If you want numbers with a fixed number of decimals, you need a fixed point representation.

David Goldsmith was asking about the mantissae. For a double, the mantissa is a 53-bit signed integer. I.e. you have a 52-bit fractional part (bits 0-51), an 11-bit exponent (bits 52-62), and one sign bit (bit 63). The mantissa is uniformly distributed like any signed integer; it takes 2**53 different integer values, from -2**52 to 2**52-1. But the value of a floating point number is

    value = (-1)**signbit * 2**(exponent - bias) * (1 + fraction)

with bias = 1023 for a double. Thus, floating point numbers are not uniformly distributed, but the mantissa is. For numerical illiterates this might come as a surprise, but in numerical mathematics the resolution is in the number of significant digits, not in the number of decimals: 101 and .00201 have the same numerical precision.

A decimal, on the other hand, can be thought of as a floating point number using base 10 instead of base 2 for the exponent:

    value = (-1)**signbit * 10**(exponent - bias) * (1 + fraction)

Decimals and floats are not fundamentally different. There are numbers exactly representable with a decimal that cannot be exactly represented with a float. But numerical computations do not become more precise with a decimal than with a float.
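The bit layout described above can be verified directly by unpacking a double; `decompose` is an illustrative helper name, and the formula below holds for normal (not denormal) numbers:

```python
import struct

def decompose(x):
    """Unpack an IEEE 754 double into (signbit, exponent, fraction):
    sign bit 63, 11-bit exponent in bits 52-62, 52-bit fraction in
    bits 0-51 (returned scaled to [0, 1))."""
    bits, = struct.unpack('<Q', struct.pack('<d', x))
    signbit = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = (bits & ((1 << 52) - 1)) / 2.0 ** 52
    return signbit, exponent, fraction

s, e, f = decompose(-6.5)
# For a normal number: value = (-1)**s * 2**(e - 1023) * (1 + f)
value = (-1) ** s * 2.0 ** (e - 1023) * (1 + f)
```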
Re: [Numpy-discussion] Accessing LAPACK and BLAS from the numpy C API
Jake VanderPlas wrote:
> Does anybody know a way to directly access the numpy.linalg routines from a C extension, without the overhead of a python callback? Thanks for the help.

You find a C function pointer wrapped in a CObject in the ._cpointer attribute.
Re: [Numpy-discussion] Single view on multiple arrays
Anne Archibald skrev:
> The short answer is, you can't.

Not really true. It is possible to create an array (sub)class that stores memory addresses (pointers) instead of values. It is doable, but I am not wasting my time implementing it.

Sturla
Re: [Numpy-discussion] Single view on multiple arrays
Bill Blinn skrev:
> v = multiview((3, 4))
> #the idea of the following lines is that the 0th row of v is
> #a view on the first row of a. the same would hold true for
> #the 1st and 2nd row of v and the 0th rows of b and c, respectively
> v[0] = a[0]

This would not even work, because a[0] does not return a view array but a scalar array, which is a different type of numpy object. To get a view, you will need:

    v = a[0:1] # view of row 0 of a

Also, you cannot assign to v[0], as that would trigger a copy as well.

> v[1] = b[0]
> v[2] = c[0]

As I mentioned in the answer to Anne, it would take a completely different array object, which would internally store an array of memory addresses. I have not made up my mind whether ndarray can be subclassed for this, or whether it takes a completely different object (e.g. similar to numpy.memmap). What it would require is a __setitem__ that stores pointers and a __getitem__ that dereferences them (returning an ndarray of values). Good luck hacking; it is not even difficult, just tedious.

Sturla
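At the Python level, the pointer-storing idea can at least be mimicked by holding references to row views instead of copying values; `MultiView` below is a minimal made-up sketch, not the C-level object Sturla describes:

```python
import numpy as np

class MultiView(object):
    """Minimal sketch of the 'multiview' idea: store references to
    row views of other arrays instead of copying their values."""
    def __init__(self, rows):
        self._rows = list(rows)    # each entry is a view into a source array
    def __getitem__(self, i):
        return self._rows[i]       # dereference: hand back the stored view
    def __setitem__(self, i, row_view):
        self._rows[i] = row_view   # store the reference, no copy

a = np.zeros((3, 4))
b = np.ones((3, 4))
v = MultiView([a[0], b[0]])
a[0, 2] = 42.0    # mutate the source array...
# ...and the change is visible through the multiview: v[0][2] is 42.0
```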
Re: [Numpy-discussion] Random int64 and float64 numbers
Thomas Robitaille skrev:
> np.random.random_integers(np.iinfo(np.int32).min, high=np.iinfo(np.int32).max, size=10)
> which gives
> array([-1506183689, 662982379, -1616890435, -1519456789, 1489753527, -604311122, 2034533014, 449680073, -444302414, -1924170329])

This fails on my computer (Python 2.6.4, NumPy 1.3.0 on Win32):

>>> np.random.random_integers(np.iinfo(np.int32).min, high=np.iinfo(np.int32).max, size=10)
Traceback (most recent call last):
  File "<pyshell#33>", line 2, in <module>
    (np.int32).max, size=10)
  File "mtrand.pyx", line 950, in mtrand.RandomState.random_integers
  File "mtrand.pyx", line 746, in mtrand.RandomState.randint
OverflowError: long int too large to convert to int

It might have something to do with this:

>>> 2**31-1
2147483647L
>>> -2**31
-2147483648L

In light of this annoying behaviour:

    def random_int64(size):
        a0 = np.random.random_integers(0, 0xFFFF, size=size).astype(np.uint64)
        a1 = np.random.random_integers(0, 0xFFFF, size=size).astype(np.uint64)
        a2 = np.random.random_integers(0, 0xFFFF, size=size).astype(np.uint64)
        a3 = np.random.random_integers(0, 0xFFFF, size=size).astype(np.uint64)
        a = a0 + (a1 << 16) + (a2 << 32) + (a3 << 48)
        return a.view(dtype=np.int64)

Sturla
Re: [Numpy-discussion] Random int64 and float64 numbers
Robert Kern skrev:
> 64-bit and larger integers could be done, but it requires modification. The integer distributions were written to support C longs, not anything larger.

You could also use .bytes() and np.fromstring(). But as of Python 2.6.4, even 32-bit integers fail, at least on Windows.

Sturla
Re: [Numpy-discussion] Random int64 and float64 numbers
Robert Kern skrev:
> Then let me clarify: it was written to support integer ranges up to sys.maxint. Absolutely, it would be desirable to extend it.

I know, but look at this:

>>> import sys
>>> sys.maxint
2147483647
>>> 2**31-1
2147483647L

An integer equal to sys.maxint can thus become a long, which is what confuses mtrand.
Re: [Numpy-discussion] Random int64 and float64 numbers
Robert Kern skrev:
> Then let me clarify: it was written to support integer ranges up to sys.maxint. Absolutely, it would be desirable to extend it.

Actually it only supports integers up to sys.maxint-1, as random_integers calls randint. random_integers includes the upper bound, but randint excludes it. Thus, this happens on line 1153 in mtrand.pyx:

    return self.randint(low, high+1, size)

The main source of the problem is that a number smaller than sys.maxint can become a long. (I have asked why on python-dev; it does not make any sense.) So when random_integers passes high+1 to randint, it is unnecessarily converted to a long. Then there is an exception on line 847:

    hi = high

With hi previously declared as a C long, Cython refuses the conversion. Now, we could try a downcast to int like this:

    hi = int(high)

which would make Cython only raise an exception in case of an actual integer overflow:

>>> int(2**31)
2147483648L
>>> int(2**31-1)
2147483647

If there is no overflow, high becomes an int and conversion to a C long is allowed. Still, this will only support integer ranges up to sys.maxint - 1. We thus have to swap the roles of randint and random_integers: the one with the inclusive upper bound should call rk_interval.

Sturla
Re: [Numpy-discussion] Random int64 and float64 numbers
Sturla Molden skrev:
> Robert Kern skrev:
>> Then let me clarify: it was written to support integer ranges up to sys.maxint. Absolutely, it would be desirable to extend it.
> Actually it only supports integers up to sys.maxint-1, as random_integers calls randint. random_integers includes the upper bound, but randint excludes it. Thus, this happens on line 1153 in mtrand.pyx:
>     return self.randint(low, high+1, size)
> The one with the inclusive upper bound should call rk_interval.

I love this one:

    cdef long lo, hi, diff
    [...]
    diff = hi - lo - 1

which silently overflows, and is the reason for this strange exception:

>>> np.random.random_integers(-2147483648, high=2147483646, size=10)
Traceback (most recent call last):
  File "<pyshell#68>", line 1, in <module>
    np.random.random_integers(-2147483648, high=2147483646, size=10)
  File "mtrand.pyx", line 950, in mtrand.RandomState.random_integers
  File "mtrand.pyx", line 750, in mtrand.RandomState.randint
ValueError: low >= high

I'll call this a bug.

Sturla
[Numpy-discussion] Syntax highlighting for Cython and NumPy
Here is an XML file for Cython syntax highlighting in katepart (e.g. KATE and KDevelop). I made this because KATE is my favourite text editor (feel free to call me a heretic for not using emacs). Unfortunately, the Python highlighting for KDE contains several bugs, and the Pyrex/Cython version that circulates on the web builds on it and introduces a couple more. I have tried to clean it up.

Note that this will also highlight numpy.* or np.*, if * is a type or function you get from cimport numpy or import numpy.

This works on Windows as well, if you have installed KDE for Windows. Just copy the XML to:

    ~/.kde/share/apps/katepart/syntax/
    C:\kde\share\apps\katepart\syntax (or wherever you have KDE installed)

and "Cython with NumPy" shows up under Sources. Anyway, this is the syntax highlighter I use to write Cython. Feel free to use it as you wish.

P.S. I am also cleaning up the Python highlighting for KDE. Not done yet, but I will post a Python with NumPy highlighter later on if this is interesting.

P.P.S. This also covers Pyrex, but adds in some Cython stuff.

Sturla Molden

?xml version=1.0 encoding=UTF-8?
!DOCTYPE language !-- Python syntax highlightning v0.9 by Per Wigren -- !-- Python syntax highlighting v1.9 by Michael Bueker (improved keyword differentiation) -- !-- Python syntax highlighting v1.97 by Paul Giannaros -- !-- Pyrex syntax highlighting v0.1 by Matthew Marshall -- !-- Cython syntax highlighting v1.0 by Martin Gysel -- !-- Cython syntax highlighting v1.1 by Sturla Molden -- language name=Cython with NumPy version=1.1 kateversion=2.4 section=Sources extensions=*.pyx;*.pxi;*.pxd mimetype=application/x-cython;text/x-cython casesensitive=1 author=Sturla Molden license= highlighting list name=imports item import /item item from /item item as /item /list list name=cimport !-- might be used to trigger special highlighting after cimport cython and cimport numpy -- item cimport /item /list list name=prep item DEF /item item IF /item item ELIF /item item ELSE /item item include /item /list list name=defs item class /item item cdef /item item cpdef /item item extern /item item ctypedef /item item def /item item del /item item global /item item lambda /item item struct /item item enum /item item property /item item readonly /item item public /item item nogil /item item gil /item /list list name=operators item and /item item assert /item item in /item item is /item item not /item item or /item item sizeof /item /list list name=commands !-- item exec /item -- item print /item /list list name=flow item break /item item continue /item item elif /item item else /item item except /item item finally /item item for /item item if /item item pass /item item raise /item item return /item item try /item item while /item item with /item !-- item yield /item -- /list list name=builtinfuncs item __import__ /item item abs /item item all /item item any /item item apply /item item basestring /item item buffer /item item callable /item item chr /item item classmethod /item item cmp /item item coerce /item item compile /item item delattr /item item dir /item item divmod /item item 
enumerate /item item eval /item item execfile /item item filter /item item getattr /item !-- item globals /item -- item hasattr /item item hash /item item hex /item item id /item item input /item item intern /item item isinstance /item item issubclass /item item iter /item item len /item !-- item locals /item -- item map /item item max /item item min /item item oct /item item open /item item ord /item item pow /item !-- item property /item -- item range /item item raw_input /item item reduce /item item reload /item item repr /item item reversed /item item round /item item setattr /item item sorted /item item staticmethod /item item sum /item item super /item item type /item item unichr /item item unicode /item !-- item vars /item -- item xrange /item item zip /item /list list name=ctypes !-- native types -- item unsigned /item item void /item item enum /item item double /item item long /item item short /item item char /item item Py_ssize_t /item item Py_intptr_t /item item Py_buffer /item item bint /item !-- python types -- /list list name=pytypes item int /item item float /item item object /item item list /item item tuple /item item str /item item dict /item
Re: [Numpy-discussion] [Cython] Syntax highlighting for Cython and NumPy
Sturla Molden skrev: and "Cython with NumPy" shows up under Sources. Anyway, this is the syntax highlighter I use to write Cython. It seems I posted the wrong file. :-( S.M.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE language>
<!-- Python syntax highlighting v0.9 by Per Wigren -->
<!-- Python syntax highlighting v1.9 by Michael Bueker (improved keyword differentiation) -->
<!-- Python syntax highlighting v1.97 by Paul Giannaros -->
<!-- Pyrex syntax highlighting v0.1 by Matthew Marshall -->
<!-- Cython syntax highlighting v1.0 by Martin Gysel -->
<!-- Cython syntax highlighting v1.1 by Sturla Molden -->
<language name="Cython with NumPy" version="1.1" kateversion="2.4" section="Sources" extensions="*.pyx;*.pxi;*.pxd" mimetype="application/x-cython;text/x-cython" casesensitive="1" author="Sturla Molden" license="">
<highlighting>
<list name="as">
  <item>as</item>
</list>
<list name="imports">
  <item>cimport</item> <item>import</item> <item>from</item> <item>as</item>
</list>
<list name="prep">
  <item>DEF</item> <item>IF</item> <item>ELIF</item> <item>ELSE</item> <item>include</item>
</list>
<list name="defs">
  <item>class</item> <item>cdef</item> <item>cpdef</item> <item>extern</item> <item>ctypedef</item> <item>def</item> <item>del</item> <item>global</item> <item>lambda</item> <item>struct</item> <item>enum</item> <item>property</item> <item>readonly</item> <item>public</item> <item>nogil</item> <item>gil</item> <item>inline</item>
</list>
<list name="operators">
  <item>and</item> <item>assert</item> <item>in</item> <item>is</item> <item>not</item> <item>or</item> <item>sizeof</item>
</list>
<list name="commands">
  <!-- <item>exec</item> -->
  <item>print</item>
</list>
<list name="flow">
  <item>break</item> <item>continue</item> <item>elif</item> <item>else</item> <item>except</item> <item>finally</item> <item>for</item> <item>if</item> <item>pass</item> <item>raise</item> <item>return</item> <item>try</item> <item>while</item> <item>with</item>
  <!-- <item>yield</item> -->
</list>
<list name="builtinfuncs">
  <item>__import__</item> <item>abs</item> <item>all</item> <item>any</item> <item>apply</item> <item>basestring</item> <item>buffer</item> <item>callable</item> <item>chr</item> <item>classmethod</item> <item>cmp</item> <item>coerce</item> <item>compile</item> <item>delattr</item> <item>dir</item> <item>divmod</item> <item>enumerate</item> <item>eval</item> <item>execfile</item> <item>filter</item> <item>getattr</item>
  <!-- <item>globals</item> -->
  <item>hasattr</item> <item>hash</item> <item>hex</item> <item>id</item> <item>input</item> <item>intern</item> <item>isinstance</item> <item>issubclass</item> <item>iter</item> <item>len</item>
  <!-- <item>locals</item> -->
  <item>map</item> <item>max</item> <item>min</item> <item>oct</item> <item>open</item> <item>ord</item> <item>pow</item>
  <!-- <item>property</item> -->
  <item>range</item> <item>raw_input</item> <item>reduce</item> <item>reload</item> <item>repr</item> <item>reversed</item> <item>round</item> <item>setattr</item> <item>sorted</item> <item>staticmethod</item> <item>sum</item> <item>super</item> <item>type</item> <item>unichr</item> <item>unicode</item>
  <!-- <item>vars</item> -->
  <item>xrange</item> <item>zip</item>
</list>
<list name="ctypes">
  <!-- native types -->
  <item>unsigned</item> <item>void</item> <item>enum</item> <item>double</item> <item>long</item> <item>short</item> <item>char</item> <item>Py_ssize_t</item> <item>Py_intptr_t</item> <item>Py_buffer</item> <item>bint</item>
  <!-- python types -->
</list>
<list name="pytypes">
  <item>int</item> <item>float</item> <item>object</item> <item>list</item> <item>tuple</item> <item>str</item> <item>dict</item> <item>set</item> <item>frozenset</item> <item>slice</item> <item>bool</item> <item>complex</item> <item>file</item>
  <!-- numpy types -->
</list>
<list name="numpyselector">
  <item>np</item> <item>numpy</item>
</list>
<list name="numpymodule">
  <item>numpy</item>
</list>
<list name="cythonmodule">
  <item>cython</item>
</list>
<list name="numpytypes">
  <item>dtype</item> <item>flatiter</item> <item>broadcast</item> <item>ndarray</item> <item>int8_t</item> <item>int16_t</item> <item>int32_t</item> <item>int64_t</item> <item>uint8_t</item> <item>uint16_t</item> <item>uint32_t</item> <item>uint64_t</item> <item>float32_t</item>
Re: [Numpy-discussion] [Cython] Syntax highlighting for Cython and NumPy
Lisandro Dalcin skrev: Is there any specific naming convention for these XML files to work with KATE? Would it be fine to call it 'cython-mode-kate.xml' to push it to the repo? Will it still work (I mean, with that name) when placed appropriately in KATE config dirs or whatever? ... Just concerned that 'cython.xml' is a bit too generic filename...

You can name it anything you want. The file has an entry like this:

<language name="Cython with NumPy" version="1.1" kateversion="2.4" section="Sources" extensions="*.pyx;*.pxi;*.pxd">

that tells KDE what this is and where to put it. I was thinking of changing section to "Scientific", because that is where I want to put "Python with NumPy" as well. It makes sense to have NumPy highlighting alongside Matlab and Octave. S.M. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] fftpack_lite question
Ralf Gommers skrev: If anyone with knowledge of the differences between the C and Fortran versions could add a few notes at the above link, that would be great.

The most notable difference (from a user's perspective) is that the Fortran version has more transforms, such as discrete sine and cosine transforms. It also supports both single and double precision. The older Fortran version is used in SciPy. FFTs from FFTW and MKL tend to be faster than FFTPACK, at least on Intel hardware. FFTPACK was originally written to run fast on vector machines like the Cray and NEC.

FFTPACK-lite: http://projects.scipy.org/numpy/browser/trunk/scipy/basic/fftpack_lite?rev=1676
Older Fortran version: http://www.netlib.org/fftpack/
Fortran 90 version (no license): http://orion.math.iastate.edu/burkardt/f_src/fftpack/fftpack.html
Another C version: http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=fftpack/fft.c

S.M.
Re: [Numpy-discussion] numpy and C99
Dag Sverre Seljebotn skrev: Microsoft's compilers don't support C99 (or, at least, the versions that still have to be used don't).

Except for automatic (variable-length) arrays, they do support some of the more important parts of C99 as extensions to C89:

- inline functions
- the restrict qualifier
- for (int i=0; i<n; i++)

Personally I think all of NumPy's C base should be moved to Cython. With your excellent syntax for PEP 3118 buffers, I see no reason to keep NumPy in C. This would make porting to Py3k as well as maintenance easier. When Cython can build Sage, it can be used for a smaller project like NumPy as well. The question of using C89, C99 or C++ would be deferred to the Cython compiler. We could use C++ on one platform (MSVC) and C99 on another (GCC). We would also get direct support for C99 _Complex and C++ std::complex types. I'd also suggest that ndarray subclass memoryview in Py3k. S.M.
Re: [Numpy-discussion] Objected-oriented SIMD API for Numpy
Robert Kern skrev: No, I think you're right. Using SIMD to refer to numpy-like operations is an abuse of the term not supported by any outside community that I am aware of. Everyone else uses SIMD to describe hardware instructions, not the application of a single syntactical element of a high level language to a non-trivial data structure containing lots of atomic data elements.

Then you should pick up a book on parallel computing. It is common to differentiate between four classes of computers: SISD, MISD, SIMD, and MIMD machines.

A SISD system is the classical von Neumann machine. A MISD system is a pipelined von Neumann machine, for example the x86 processor.

A SIMD system is one that has one CPU dedicated to control, and a large collection of subordinate ALUs for computation. Each ALU has a small amount of private memory. The IBM Cell processor is a typical SIMD machine.

A special class of SIMD machines are the so-called vector machines, of which the most famous is the Cray C90. The MMX and SSE instructions in Intel Pentium processors are an example of vector instructions. Some computer scientists regard vector machines as a subtype of MISD systems, orthogonal to pipelines, because there are no subordinate ALUs with private memory.

MIMD systems have multiple independent CPUs. MIMD systems come in two categories: shared-memory processors (SMP) and distributed-memory machines (also called cluster computers). The dual- and quad-core x86 processors are shared-memory MIMD machines.

Many people associate the word SIMD with SSE due to Intel marketing. But to the extent that vector machines are MISD, orthogonal to pipelined von Neumann machines, SSE cannot be called SIMD.

NumPy is a software-simulated vector machine, usually executed on MISD hardware. To the extent that vector machines (such as SSE and C90) are SIMD, we must call NumPy an object-oriented SIMD library. S.M.
Re: [Numpy-discussion] Objected-oriented SIMD API for Numpy
Matthieu Brucher skrev: I agree with Sturla, for instance nVidia GPUs do SIMD computations with blocks of 16 values at a time, but the hardware behind can't compute on so much data at a time. It's SIMD from our point of view, just like Numpy does ;)

A computer with a CPU and a GPU is a SIMD machine by definition, due to the single CPU and the multiple subordinate ALUs in the GPU. But with modern computers, these classifications become a bit unclear. S.M.
Re: [Numpy-discussion] Objected-oriented SIMD API for Numpy
Mathieu Blondel skrev: Peter Norvig suggested to merge Numpy into Cython but he didn't mention SIMD as the reason (this one is from me).

I don't know what Norvig said or meant. However: There is NumPy support in Cython. Cython has a general syntax applicable to any PEP 3118 buffer. (As NumPy is not yet PEP 3118 compliant, NumPy arrays are converted to Py_buffer structs behind the scenes.) Support for optimized vector expressions might be added later. Currently, slicing works as it does with NumPy in Python, producing slice objects and invoking NumPy's own code, instead of being compiled to fast inlined C. The PEP 3118 buffer syntax in Cython can be used to port NumPy to Py3k, replacing the current C source. That might be what Norvig meant if he suggested merging NumPy into Cython. S.M.
Re: [Numpy-discussion] Objected-oriented SIMD API for Numpy
Mathieu Blondel skrev: As I wrote earlier in this thread, I confused Cython and CPython. PN was suggesting to include Numpy in the CPython distribution (not Cython). The reason why was also given earlier.

First, that would currently not be possible, as NumPy does not support Py3k. Second, the easiest way to port NumPy to Py3k is Cython, and depending on Cython would prevent adoption into the Python standard library; at least they would have to change their current policy. Also, with NumPy in the standard library, any modification to NumPy would require a PEP. But Python should have a PEP 3118 compliant buffer object in the standard library, which NumPy could subclass. S.M.
Re: [Numpy-discussion] Objected-oriented SIMD API for Numpy
Robert Kern skrev: I would be delighted to see a reference to one that refers to a high level language's API as SIMD. Please point one out to me. It's certainly not any of the ones I have available to me.

Numerical Recipes in Fortran 90 (pages 964 and 985-986) describes the syntax of Fortran 90 and 95 as SIMD. Peter Pacheco's book on MPI describes the difference between von Neumann machines and vector machines as analogous to the difference between Fortran 77 and Fortran 90 (with an example of Fortran 90 array slicing). He is ambiguous as to whether vector machines really are SIMD, or more closely related to pipelined von Neumann machines. Grama et al., Introduction to Parallel Computing, describe SIMD as an architecture, and it is more or less clear that they mean hardware. They do say the Fortran 90 "where" statement is a primitive used to support selective execution on SIMD processors, as conditional execution (if statements) is detrimental to performance. So here we have at least three books claiming that Fortran is a language with special primitives for SIMD processors.

That's a fair point, but unrelated to whether or not numpy can be labeled SIMD. These all refer to hardware.

Actually I don't think the distinction is that important, as we are talking about Turing machines. Also, a lot of what we call hardware is actually implemented as software on the chip: the most extreme example would be Transmeta, which completely software-emulated x86 processors. The vague distinction between hardware and software is why we get patents on software in Europe, although pure software patents are prohibited. One can always argue that the program and the computer together constitute a physical device, and that circumventing patents by moving hardware into software should not be allowed. The distinction between hardware and software is not as clear as programmers tend to believe.
Another thing is that performance issues for vector machines and vector languages (Fortran 90, Matlab, NumPy) are similar. Precisely the same situations that make NumPy and Matlab code slow are detrimental on SIMD/vector hardware. That would for example be long for loops with conditional if statements. On the other hand, vectorized operations over arrays, possibly using where/find masks, are fast. So although NumPy is not executed on a vector machine like the Cray C90, it certainly behaves like one performance-wise. I'd say that a MIMD machine running NumPy is a Turing machine emulating a SIMD/vector machine. And now I am done with this stupid discussion... Sturla Molden
Re: [Numpy-discussion] Objected-oriented SIMD API for Numpy
Mathieu Blondel skrev: Hello, About one year ago, a high-level, object-oriented SIMD API was added to Mono. For example, there is a class Vector4f for vectors of 4 floats and this class implements methods such as basic operators, bitwise operators, comparison operators, min, max, sqrt, shuffle directly using SIMD operations.

I think you are confusing SIMD with Intel's MMX/SSE instruction set. SIMD means "single instruction, multiple data". NumPy is inherently an object-oriented SIMD API:

array1[:] = array2 + array3

is a SIMD instruction by definition.

SIMD instructions in hardware for length-4 vectors are mostly useful for 3D graphics. But they are not used a lot for that purpose, because GPUs are getting common. SSE is mostly for rendering 3D graphics without a GPU.

There is nothing that prevents NumPy from having a Vector4f dtype that internally stores four float32 values and is aligned on a 16-byte boundary. But it would not be faster than the current float32 dtype. Do you know why? The reason is that memory access is slow and computation is fast. Modern CPUs are starved. The speed of NumPy is not limited by not using MMX/SSE whenever possible. It is limited by having to create and delete temporary arrays all the time. You are suggesting to optimize in the wrong place.

There is a lot that can be done to speed up computation: There are optimized BLAS libraries like ATLAS and MKL; NumPy uses BLAS for things like matrix multiplication. There is OpenMP for better performance on multicores. There are OpenCL and CUDA for moving computation from the CPU to the GPU. But the main boost you get from going from NumPy to hand-written C or Fortran comes from reduced memory use.

Memory alignment is an important related issue, since non-aligned movs can tank the performance.
You can align an ndarray on a 16-byte boundary like this:

def aligned_array(N, dtype):
    d = numpy.dtype(dtype)
    tmp = numpy.zeros(N * d.itemsize + 16, dtype=numpy.uint8)
    address = tmp.__array_interface__['data'][0]
    offset = (16 - address % 16) % 16
    return tmp[offset:offset + N * d.itemsize].view(dtype=dtype)

Sturla Molden
Re: [Numpy-discussion] Optimized sum of squares
Skipper Seabold skrev: I'm curious about this as I use ss, which is just np.sum(a*a, axis), in statsmodels and didn't think much about it. Do the number of loops matter in the timings, and is dot always faster even without the BLAS dot?

The thing is that a*a returns a temporary array with the same shape as a, which is then passed to np.sum. The BLAS dot product doesn't need to allocate and deallocate temporary arrays. S.M.
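The equivalence is easy to check; a minimal sketch (timings will vary with the BLAS being used, so only the values are compared here):

```python
import numpy as np

a = np.random.rand(1000000)

# Sum of squares via a temporary array (a*a is allocated first)...
ss_temp = np.sum(a * a)

# ...versus the BLAS dot product, which needs no temporary.
ss_dot = np.dot(a, a)

# Both compute the same value; dot is typically faster for large 1-D arrays.
```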
Re: [Numpy-discussion] object array alignment issues
Francesc Alted skrev: The response is clear: avoid memcpy() if you can. It is true that memcpy() performance has improved quite a lot in the latest gcc (it has been quite good in Windows versions for many years), but working with data in-place (i.e. avoiding a memory copy) is always faster (and most especially for large arrays that don't fit in the processor's cache). My own experiments say that, with an Intel Core2 processor, the typical speed-ups from avoiding memcpy() are 2x.

If the underlying array is strided, I have seen the opposite as well. "Copy-in copy-out" is a common optimization used by Fortran compilers when working with strided arrays. The catch is that the work array has to fit in cache for this to make any sense. Anyhow, you cannot use memcpy for this kind of optimization - it assumes both buffers are contiguous. But working with arrays directly instead of copies is not always the faster option. S.M.

And I've read somewhere that both AMD and Intel are trying to make unaligned operations go even faster in the next architectures (the goal is that there should be no speed difference in accessing aligned or unaligned data).

I believe the memcpy approach is used for other unaligned parts of void types. There is an inherent performance penalty there, but I don't see how it can be avoided when using what are essentially packed structures. As to memcpy, its performance seems to depend on the compiler/compiler version; old versions of gcc had *horrible* implementations of memcpy. I believe the situation has since improved. However, I'm not sure we should be coding to compiler issues unless it is unavoidable or the gain is huge.

IMO, NumPy can be improved for unaligned data handling. For example, Numexpr is using this small snippet:

from cpuinfo import cpu
if cpu.is_AMD() or cpu.is_Intel():
    is_cpu_amd_intel = True
else:
    is_cpu_amd_intel = False

for detecting AMD/Intel architectures and allowing the code to avoid memcpy() calls for the unaligned arrays.
The above code uses the excellent ``cpuinfo.py`` module from Pearu Peterson, which is distributed with NumPy, so it should not be too difficult to take advantage of this for avoiding unnecessary copies in this scenario.
Re: [Numpy-discussion] Resize Method for Numpy Array
Robert Kern skrev: While this description is basically true of numpy arrays, I would caution you that every language has a different lexicon, and the same word can mean very different things in each. For example, Python lists are *not* linked lists; they are like C++'s std::vector with a preallocation strategy to make appending cheap on average.

In Java and .NET jargon, Python lists are array lists, not linked lists. It is sad there is no cons or llist built-in type, something like:

mycons = cons(car, cdr)
mylist = llist(iterable)

Of course we can write [car, cdr] or (car, cdr) for making linked lists in pure Python (without having to define class types), but both have issues: the first is storage inefficient, the latter is not mutable. Yes, I know Guido left out linked lists on purpose, so there is probably no use complaining on the python-ideas or python-dev lists... S.M.
Re: [Numpy-discussion] Resize Method for Numpy Array
Robert Kern skrev: collections.deque() is a linked list of 64-item chunks.

Thanks for that useful information. :-) But it would not help much for a binary tree... Since we are on the NumPy list... One could imagine making linked lists using NumPy arrays with dtype=object. They are storage efficient like tuples, and mutable like lists.

def cons(a, b):
    return np.array((a, b), dtype=object)

But I guess the best way is to implement a real linked-list extension type in Cython. S.M.
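To illustrate the idea, here is a minimal sketch of cons cells built from 2-element object arrays; the llist and tolist helpers are made-up names for construction and traversal. (np.empty is used rather than np.array((a, b), dtype=object) to sidestep array-coercion surprises when a cell's car or cdr is itself an array.)

```python
import numpy as np

def cons(car, cdr):
    # A cons cell: a mutable 2-element object array.
    cell = np.empty(2, dtype=object)
    cell[0] = car
    cell[1] = cdr
    return cell

def llist(iterable):
    # Build a linked list by consing from the right.
    head = None
    for x in reversed(list(iterable)):
        head = cons(x, head)
    return head

def tolist(cell):
    # Walk the cdr chain back into a Python list.
    out = []
    while cell is not None:
        out.append(cell[0])
        cell = cell[1]
    return out
```

Unlike a tuple-based cons, the cells are mutable: assigning to cell[0] rebinds the car in place.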
Re: [Numpy-discussion] numpy and cython in pure python mode
Sebastian Haase skrev: I know that cython's numpy support is still getting better and better over time, but is it already possible today to have numpy support when using Cython in pure python mode?

I'm not sure. There is this odd memoryview syntax:

import cython
view = cython.int[:,:](my2darray)
print view[3,4] # fast when compiled, according to Sverre

S.M.
Re: [Numpy-discussion] Best way to insert C code in numpy code
René Dudfield skrev: Another way is to make your C function, then load it with ctypes (or wrap it with something else) and pass it pointers with array.ctypes.data. You can find the shape of the array in Python, and pass it to your C function. The benefit is that it's just C code, and you can avoid the GIL too if you want. Then if you keep your C code separate from the Python stuff, other people can use your C code in other languages more easily.

numpy.ctypeslib.ndpointer is preferred when using ndarrays with ctypes.

You can do this with Cython as well; just use Cython for the glue code. The important difference is this: Cython is a language for writing C extensions, ctypes is a module for calling DLLs. One important advantage of Cython is deterministic clean-up code. If you put a __dealloc__ method in a cdef class, it will be called on garbage collection.

Another nice way of interfacing C with numpy is f2py. It also works with C, not just Fortran.

Yet another way (Windows specific) is to use win32com.client and pass array.ctypes.data. That is nice if you have an ActiveX control; for Windows you often get commercial libraries like that. Also if you have .NET or Java objects, you can easily expose them to COM.
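A minimal sketch of the ndpointer approach; the shared library and its process() function are hypothetical, but the type declaration and the implicit argument checking are real numpy.ctypeslib behaviour:

```python
import ctypes
import numpy as np
from numpy.ctypeslib import ndpointer

# Declare the argument type a C function would expect: a contiguous
# 1-D float64 array. ctypes calls from_param() on it automatically
# when the function is invoked, rejecting mismatched arrays.
arg_t = ndpointer(dtype=np.float64, ndim=1, flags="C_CONTIGUOUS")

# With a real shared library you would write something like:
#   lib = ctypes.CDLL("./mylib.so")              # hypothetical library
#   lib.process.argtypes = [arg_t, ctypes.c_size_t]
#   lib.process(a, a.size)

a = np.arange(4.0)
arg_t.from_param(a)   # accepted: dtype, ndim and flags all match
```

Passing an array of the wrong dtype, dimensionality, or layout raises TypeError before the C function is ever entered, which is the main advantage over raw array.ctypes.data pointers.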
Re: [Numpy-discussion] Best way to insert C code in numpy code
René Dudfield skrev: Another way is to make your C function then load it with ctypes

Also, one should be aware that ctypes is a stable part of the Python standard library, while Cython is still unstable and in rapid development. Pyrex is more stable than Cython, but interfacing with ndarrays is harder. If you have a requirement not to use experimental code, then Cython is not an option.
Re: [Numpy-discussion] Best way to insert C code in numpy code
Xavier Gnata skrev: I have a large 2D numpy array as input and a 1D array as output. In between, I would like to use C code. C is a requirement because it has to be fast and because the algorithm cannot be written in a numpy-oriented way :( (no way... really).

There are certain algorithms that cannot be vectorized, particularly those that are recursive/iterative. One example is MCMC methods such as the Gibbs sampler. You can get around this by running multiple Markov chains in parallel, and vectorizing that parallelism with NumPy, but you cannot vectorize one long chain. Vectorizing with NumPy only applies to data-parallel problems.

But there is a nice tool you should know about: Cython in "pure Python" mode. You just add some annotations to the Python code, and the .py file can be compiled to efficient C.

http://wiki.cython.org/pure

This is quite similar in spirit to the optional static typing that makes certain implementations of Common Lisp (CMUCL, SBCL, Franz) so insanely fast.
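The many-chains trick mentioned above can be sketched with a toy model: Gibbs sampling a bivariate normal with correlation rho. The model and the helper name are illustrative, not from the thread:

```python
import numpy as np

def gibbs_parallel(n_chains, n_iter, rho, seed=0):
    # One chain is inherently sequential, but many independent chains
    # can advance in lock-step: each update is vectorized over chains.
    rng = np.random.RandomState(seed)
    x = np.zeros(n_chains)
    y = np.zeros(n_chains)
    s = np.sqrt(1.0 - rho ** 2)
    for _ in range(n_iter):                      # sequential in iterations
        x = rho * y + s * rng.randn(n_chains)    # vectorized over chains
        y = rho * x + s * rng.randn(n_chains)
    return x, y
```

Each iteration costs two vectorized operations regardless of the number of chains, which is exactly the data-parallelism NumPy is good at; the iteration loop itself stays a Python loop because it cannot be vectorized.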
Re: [Numpy-discussion] matlab for numpy users
Christian K. skrev: Is there something like broadcasting in matlab?

Matlab does not do automatic broadcasting like NumPy and Fortran 95. You have to broadcast manually, mostly using repmat (but there are other ways as well). This should help:

http://home.online.no/~pjacklam/matlab/doc/mtt/doc/mtt.pdf

Sturla Molden
Re: [Numpy-discussion] Dot product performance on python 2.6 (windows)
V. Armando Solé skrev:

import numpy
import time
a = numpy.arange(1000000.)
a.shape = 1000, 1000
t0 = time.time()
b = numpy.dot(a.T, a)
print "Elapsed time =", time.time() - t0

reports an elapsed time of 1.4 seconds under Python 2.5 and 15 seconds under Python 2.6.

My computer reports 0.34 seconds (Python 2.6.2, Win32). S.M.
Re: [Numpy-discussion] Dot product performance on python 2.6 (windows)
V. Armando Solé skrev: In python 2.6:

>>> import numpy.core._dotblas as dotblas
...
ImportError: No module named _dotblas

>>> import numpy.core._dotblas as dotblas
>>> dotblas.__file__
'C:\\Python26\\lib\\site-packages\\numpy\\core\\_dotblas.pyd'
Re: [Numpy-discussion] Fwd: GPU Numpy
Citi, Luca skrev: That is exactly why numexpr is faster in these cases. I hope one day numpy will be able to perform such optimizations.

I think it is going to require lazy evaluation. Whenever possible, an operator would just return a symbolic representation of the operation. This would gradually build up a tree of operators and buffers. When someone tries to read the data from an array, the buffer is created on demand by flushing the procrastinated expressions. One must be sure that the buffers referenced in an incomplete expression never change; this would be easiest to ensure with immutable buffers.

Numexpr is the kind of back-end a system like this would require. But a lot of the code in numexpr could be omitted, because Python creates the parse tree; we would not need numexpr's expression parser as a frontend. Well... this plan is gradually getting closer to a specialized SciPy JIT-compiler. It would be fun to make, if I could find time for it.

Sturla Molden
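A toy sketch of the lazy-evaluation idea (the class and helper names are made up; a real implementation would need far more care with buffer ownership and immutability):

```python
import numpy as np

class Lazy:
    # A node in the expression tree: an operation plus its operands.
    def __init__(self, f, *args):
        self.f, self.args = f, args
    def __add__(self, other):
        return Lazy(np.add, self, other)
    def __mul__(self, other):
        return Lazy(np.multiply, self, other)
    def eval(self):
        # Flush the procrastinated expression: evaluate children first.
        args = [a.eval() if isinstance(a, Lazy) else a for a in self.args]
        return self.f(*args)

def lazy(a):
    # Wrap a concrete array as a leaf node.
    return Lazy(lambda x: x, a)
```

With this, (lazy(a) * lazy(b) + lazy(c)).eval() builds the whole tree before touching any data, so a smarter eval() could fuse the loops the way numexpr does instead of evaluating node by node.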
Re: [Numpy-discussion] Fwd: GPU Numpy
Rohit Garg skrev: gtx280--141GBps--has 1GB ati4870--115GBps--has 1GB ati5870--153GBps (launches sept 22, 2009)--2GB models will be there too

That is going to help if buffers are kept in graphics memory. But the problem is that graphics memory is a scarce resource. S.M.
Re: [Numpy-discussion] Fwd: GPU Numpy
George Dahl skrev: I know that for my work, I can get around a 50-fold speedup over numpy using a python wrapper for a simple GPU matrix class. So I might be dealing with a lot of matrix products where I multiply a fixed 512 by 784 matrix by a 784 by 256 matrix that changes between each matrix product, although to really see the largest gains I use a 4096 by 2048 matrix times a bunch of 2048 by 256 matrices.

Matrix multiplication is at the core of 3D graphics, and the raison d'être for GPUs; it is specifically what they are designed to do. Matrix multiplication scales O(n**3) in floating point operations and O(n**2) in memory access. That is how GPUs give fast 3D graphics (matrix multiplications): by speeding up floating point operations.

GPUs make sense for certain level-3 BLAS calls, but that really belongs in BLAS, not in NumPy's core. One could e.g. consider linking with a BLAS wrapper that directs these special cases to the GPU and the rest to ATLAS / MKL / netlib BLAS.

Sturla Molden
Re: [Numpy-discussion] Fwd: GPU Numpy
James Bergstra skrev: Suppose you want to evaluate dot(a*b+c*sqrt(d), e). The GPU is great for doing dot(),

The CPU is equally great (or better?) for doing dot(). In both cases:

- memory access scales O(n) for dot products
- computation scales O(n) for dot products
- memory is slow
- computation is fast (faster on the GPU)

In both cases, the floating point unit is starved: it could do a lot more work if memory were faster. For the GPU to be faster than the CPU, you have to have a situation where computation dominates over memory access. Matrix-matrix multiplication is one such example. This is what GPUs are designed to do, as it is the major bottleneck in 3D graphics.

The proper way to speed up dot(a*b+c*sqrt(d), e) is to get rid of the temporary intermediates. That is, in Python pseudo-code:

result = 0
for i in range(n):
    result += (a[i]*b[i] + c[i]*sqrt(d[i])) * e[i]

instead of:

tmp0 = empty(n)
for i in range(n):
    tmp0[i] = a[i] * b[i]
tmp1 = empty(n)
for i in range(n):
    tmp1[i] = sqrt(d[i])
tmp2 = empty(n)
for i in range(n):
    tmp2[i] = c[i] * tmp1[i]
tmp3 = empty(n)
for i in range(n):
    tmp3[i] = tmp0[i] + tmp2[i]
result = 0
for i in range(n):
    result += tmp3[i] * e[i]

It is this complication that makes NumPy an order of magnitude slower than hand-crafted C (but still much faster than pure Python!). Adding in GPUs will not change this. The amount of computation (flop count) is the same, so it cannot be the source of the slowness.

Sturla Molden
Re: [Numpy-discussion] Huge arrays
Daniel Platz skrev:

data1 = numpy.zeros((256,200),dtype=int16)
data2 = numpy.zeros((256,200),dtype=int16)

This works for the first array data1. However, it returns with a memory error for array data2. I have read somewhere that there is a 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still be below that? I use Windows XP Pro 32 bit with 3GB of RAM.

There is a 2 GB limit on user space in Win32; in practice about 1.9 GB of it is usable. You have other programs running as well, so this is still too much. Also, Windows reserves 50% of RAM for itself, so you have less than 1.5 GB to play with. S.M.
Re: [Numpy-discussion] object dtype questions
Mark Wendell skrev:

for i in range(5):
    for j in range(5):
        a[i,j].myMethod(var3,var4)
        print a[i,j].attribute1

Again, is there a quicker way than the above to call myMethod or access attribute1?

One option is to look up the method unbound, and then use the built-in function map:

map(cls.myMethod, a.flat)

is similar to:

[aa.myMethod() for aa in a.flat]

Using map avoids looking up the name 'myMethod' for each instance; it is done only once. map can be a lot faster or hardly faster at all, depending on the amount of work done by the function: the more work being performed, the less you benefit from map. You can pass in the variables (var3, var4) by giving additional sequences to map. S.M.
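A small self-contained sketch of the unbound-method trick; the Point class and its scaled method are made up for illustration:

```python
import numpy as np

class Point(object):
    def __init__(self, x):
        self.x = x
    def scaled(self, k):
        return k * self.x

# An object array holding instances, as in the question.
a = np.empty((2, 3), dtype=object)
for i in range(2):
    for j in range(3):
        a[i, j] = Point(10 * i + j)

# Look up the method once on the class, then map it over every element,
# passing the extra argument via a second sequence.
results = list(map(Point.scaled, a.flat, [2] * a.size))
```

a.flat iterates over the elements of the 2-D array in row-major order, which is what map needs here.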
Re: [Numpy-discussion] object dtype questions
Alan G Isaac skrev: http://article.gmane.org/gmane.comp.python.general/630847

Yes, but here you still have to look up the name 'f' from locals in each iteration. map is written in C; once it has a PyObject* to the callable it does not need to look up the name anymore. The dictionary lookup(s) to get the callable are done just once. map is also more thread-safe: during the iteration another thread could rebind the name of the callable, but map is impervious to that. S.M.
Re: [Numpy-discussion] fwrite() failure in PyArray_ToFile
David Warde-Farley skrev: The odd values might be from the format code in the error message: PyErr_Format(PyExc_ValueError, "%ld requested and %ld written", (long) size, (long) n); Yes, I saw that. My C is rusty, but wouldn't the cast take care of it? n is of type size_t, which is pretty big, and a cast to long shouldn't be an issue. And if (hopefully) PyErr_Format is correct...

A long is (usually) 32 bits, even on 64 bit systems. This means that size is cast to an integer between -2**31 and 2**31 - 1. As 2**31 bytes is 2 GB, the expression (long) size will overflow if a write of 2 GB or more failed. Thus you get bogus numbers in the formatted message. There is thus a bug in the call to PyErr_Format, as it only works reliably on 32 bit systems. S.M.
Re: [Numpy-discussion] fwrite() failure in PyArray_ToFile
Charles R Harris skrev: The size of long depends on the compiler as well as the operating system. On linux x86_64, IIRC, it is 64 bits; on Windows 64 I believe it is 32. Ints always seem to be 32 bits.

If I remember the C standard correctly, a long is guaranteed to be at least 32 bits, whereas an int is guaranteed to be at least 16 bits. A short is at least 16 bits and a long long is at least 64 bits. Then there is intptr_t, which is wide enough to store a pointer, but could be wider. A size_t usually has the size of a pointer, but not always (on segment-and-offset architectures they might differ).

Considering PEP 353, should we be indexing with Py_ssize_t instead of npy_intp? I believe (correct me if I'm wrong) npy_intp is typedef'ed to Py_intptr_t. S.M.
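The guarantees above are easy to check from Python via ctypes; the printed sizes are platform dependent, but the standard's minimums hold everywhere:

```python
import ctypes

# C type sizes on this platform, in bytes.
sizes = {
    "short": ctypes.sizeof(ctypes.c_short),
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "long long": ctypes.sizeof(ctypes.c_longlong),
    "size_t": ctypes.sizeof(ctypes.c_size_t),
    "void *": ctypes.sizeof(ctypes.c_void_p),
}
for name, n in sizes.items():
    print("%-9s %d bytes" % (name, n))
```

On a typical LP64 Linux box this prints long as 8 bytes; on Win64 (LLP64) long stays at 4, which is exactly the portability trap discussed in the previous message.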
Re: [Numpy-discussion] A faster median (Wirth's method)
Chad Netzer skrev: That's right, Robert. Basically, I meant doing a median on a square (or rectangular) view of an array, without first having to ravel(), thus generally saving a copy. But actually, since my selection based median overwrites the source array, it may not save a copy anyway. Avoiding copies of tiny buffers is a futile optimization. QuickSelect is overkill for tiny buffers like common filter kernels. Insertion sort is fine. Getting rid of loops and multiple function calls in Python helps a lot. If memory is not an issue, you can actually create quite an efficient median filter with np.median using a 3D ndarray. For example, if you have an image of 640 x 480 pixels and want a 9 pixel median filter, you can put the shifted images in a 640 x 480 x 9 ndarray and call median with axis=2. Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
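A sketch of that trick (the helper name is hypothetical; edge padding is one assumed choice for handling the borders): build the 9 shifted copies along a third axis, then make a single vectorized np.median call.

```python
import numpy as np

def median_filter_3x3(img):
    # Build an (H, W, 9) stack holding the 9 shifted copies of the image,
    # then take the median along the last axis -- one vectorized call.
    H, W = img.shape
    padded = np.pad(img, 1, mode='edge')
    stack = np.empty((H, W, 9), dtype=img.dtype)
    k = 0
    for dy in range(3):
        for dx in range(3):
            stack[:, :, k] = padded[dy:dy + H, dx:dx + W]
            k += 1
    return np.median(stack, axis=2)

img = np.zeros((5, 5))
img[2, 2] = 100.0              # a single outlier pixel
out = median_filter_3x3(img)
print(out[2, 2])               # 0.0 -- the outlier is removed
```

This trades a 9x memory overhead for a single call into compiled code, which is exactly the "if memory is not an issue" trade-off described above.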
Re: [Numpy-discussion] A faster median (Wirth's method)
Dag Sverre Seljebotn skrev: a) Is the cast to numpy.npy_intp really needed? I'm pretty sure shape is defined as numpy.npy_intp*. I don't know Cython internals in detail, but you do, so I take your word for it. I thought shape was a tuple of Python ints. b) If you want higher performance with contiguous arrays (which occur a lot as inplace=False is default I guess) you can do np.ndarray[T, ndim=1, mode='c'] to tell the compiler the array is contiguous. That doubles the number of function instances though... Thanks. I could either double the number of specialized select functions, or I could make a local copy using numpy.ascontiguousarray in the select function. Quickselect touches the discontiguous array on average 2*n times, whereas numpy.ascontiguousarray touches the discontiguous array n times (but in order). Then there is the question of cache use: contiguous arrays are the more friendly case, and numpy.ascontiguousarray is more friendly than quickselect. Also, if quickselect is not done inplace (the common case for medians), we always have contiguous arrays, so mode='c' is almost always wanted. And when quickselect is done inplace, we usually have a contiguous input. This is also why I used a C pointer instead of your buffer syntax in the first version. Then I changed my mind, not sure why. So I'll try with a local copy first then. I don't think we want close to a megabyte of Cython-generated gibberish C just for the median. Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] A faster median (Wirth's method)
Citi, Luca skrev: Hello Sturla, In _median how can you, if n==2, use s[] if s is not defined? What if n==1? That was a typo. Also, I think when returning an empty array, it should be of the same type you would get in the other cases. Currently median returns numpy.nan for empty input arrays. I'll do that instead. I want it to behave exactly as the current implementation, except for the sorting. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to concatenate two arrays without duplicating memory?
V. Armando Solé skrev: I am looking for a way to have a non contiguous array C in which the left (1, 2000) elements point to A and the right (1, 4000) elements point to B. Any hint will be appreciated. If you know in advance that A and B are going to be concatenated, you can use views: C = np.zeros((1, 6000)); A = C[:, :2000]; B = C[:, 2000:]. Now C is A and B concatenated horizontally. If you can't do this, you could write the data to a temporary file and read it back, but it would be slow. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
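The effect is easy to verify: A and B are views, so writing through them writes into C's buffer, and no copy is ever made.

```python
import numpy as np

C = np.zeros((1, 6000))
A = C[:, :2000]     # view into the left part of C
B = C[:, 2000:]     # view into the right part of C

A[0, 0] = 1.0
B[0, -1] = 2.0
print(C[0, 0], C[0, -1])          # writes went straight into C's buffer

# neither view owns its data; both share C's memory
print(A.flags.owndata, B.flags.owndata)
```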
Re: [Numpy-discussion] How to concatenate two arrayswithout duplicating memory?
Sebastian Haase skrev: A mockarray is initialized with a list of nd-arrays. The result is a mock array having one additional dimension in front. This is important, because often in the case of 'concatenation' a real concatenation is not needed. But then there is a common tool called Matlab, which unlike Python has no concept of lists: C = [A, B] is a horizontal concatenation in Matlab, and it makes numerical programmers expect the same elsewhere. Too much exposure to Matlab cripples the mind easily. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fastest way to parsing a specific binay file
Gökhan Sever skrev: What would be wisest and fastest way to tackle this issue? Get the format, read the binary data directly, and skip the ascii/regex part. I sometimes use recarrays with formatted binary data; just construct a dtype and use numpy.fromfile to read. That works when the binary file stores C structs written successively. Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
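A sketch of the recarray approach, with a hypothetical record layout (an int32 id followed by a float64 value, little-endian and packed). np.fromfile works on a real file; np.frombuffer is the in-memory equivalent used here so the example is self-contained.

```python
import numpy as np
import struct

# Hypothetical record layout: int32 id + float64 value, little-endian, packed,
# matching a C struct written record after record.
rec = np.dtype([('id', '<i4'), ('value', '<f8')])

# Fake a binary file containing two such structs written successively.
raw = struct.pack('<id', 1, 3.5) + struct.pack('<id', 2, 7.25)

# For a real file: data = np.fromfile(filename, dtype=rec)
data = np.frombuffer(raw, dtype=rec)
print(data['id'])      # field access by name
print(data['value'])
```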
Re: [Numpy-discussion] A faster median (Wirth's method)
Chad Netzer skrev: By the way, as far as I can tell, the above algorithm is exactly the same idea as a non-recursive Hoare (i.e. quicksort) selection: Do the partition, then only proceed to the sub-partition that must contain the nth element. My version is a bit more general, allowing partitioning on a range of elements rather than just one, but the concept is the same. The numpy quicksort already does non-recursive sorting. I'd also like to, if possible, have a specialized 2D version, since image median filtering is one of my interests, and the C version works on 1D (raveled) arrays only. I agree. NumPy (or SciPy) could have a select module similar to the sort module. If the select function takes an axis argument similar to the sort functions, only a small change to the current np.median would be needed. Take a look at this: http://projects.scipy.org/numpy/attachment/ticket/1213/_selectmodule.pyx Here is a select function that takes an axis argument. There are specialized versions for 1D, 2D, and 3D. Input can be contiguous or not. For 4D and above, axes are found by recursion on the shape array. Thus it should be fast regardless of dimensions. I haven't tested the Cython code /thoroughly/, but at least it does compile. Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
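For reference, later NumPy releases grew exactly this kind of select function as numpy.partition, which takes an axis argument (shown here as a sketch of the idea being discussed, not something available at the time of this thread):

```python
import numpy as np

a = np.array([[3.0, 1.0, 2.0],
              [9.0, 7.0, 8.0]])

k = 1
# Selection along axis=1: element k ends up in its sorted position per row,
# with smaller elements to its left and larger elements to its right.
s = np.partition(a, k, axis=1)
print(s[:, k])    # the k-th smallest value in each row
```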
Re: [Numpy-discussion] A faster median (Wirth's method)
Robert Kern skrev: When he is talking about 2D, I believe he is referring to median filtering rather than computing the median along an axis. I.e., replacing each pixel with the median of a specified neighborhood around the pixel. That's not something numpy's median function should be specialized to do. IMHO, median filtering belongs to scipy. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ticket for O(n) median function?
I could not find any, so I'll ask if it's ok to create one. I have a patch for /numpy/lib/function_base.py that uses any 'select' function to obtain the median. I'll also submit the Cython code for quickselect. Attachment (median.py.gz) contains the suggested implementation of median. I disabled overwrite_input because the median function calls numpy.apply_along_axis. Regards, Sturla Molden median.py.gz Description: application/gzip ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] A faster median (Wirth's method)
Dag Sverre Seljebotn skrev: Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the right type to use in this case? Yup. You are right. Thanks. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] A faster median (Wirth's method)
Dag Sverre Seljebotn skrev: Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the right type to use in this case? By the way, here is a more polished version, does it look ok? http://projects.scipy.org/numpy/attachment/ticket/1213/generate_qselect.py http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.pyx Cython needs something like Java's generics by the way :-) Regards, Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] A faster median (Wirth's method)
Sturla Molden skrev: By the way, here is a more polished version, does it look ok? No it doesn't... Got to keep the GIL for the general case (sorting object arrays). Fixing that. SM ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] A faster median (Wirth's method)
Sturla Molden skrev: http://projects.scipy.org/numpy/attachment/ticket/1213/generate_qselect.py http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.pyx My suggestion for a new median function is here: http://projects.scipy.org/numpy/attachment/ticket/1213/median.py The quickselect extension module can be compiled from C source: http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.c.gz Feel free to look at it when you have time. :-) Regards, Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] A faster median (Wirth's method)
We recently had a discussion regarding an optimization of NumPy's median to average O(n) complexity. After some searching, I found out there is a selection algorithm competitive in speed with Hoare's quick select. It has the advantage of being a lot simpler to implement. In plain Python:

import numpy as np

def wirthselect(array, k):
    """ Niklaus Wirth's selection algorithm """
    a = np.ascontiguousarray(array)
    if a is array:
        a = a.copy()
    l = 0
    m = a.shape[0] - 1
    while l < m:
        x = a[k]
        i = l
        j = m
        while 1:
            while a[i] < x: i += 1
            while x < a[j]: j -= 1
            if i <= j:
                tmp = a[i]
                a[i] = a[j]
                a[j] = tmp
                i += 1
                j -= 1
            if i > j: break
        if j < k: l = i
        if k < i: m = j
    return a

Now, the median can be obtained in average O(n) time as:

def median(x):
    """ median in average O(n) time """
    n = x.shape[0]
    k = n >> 1
    s = wirthselect(x, k)
    if n & 1:
        return s[k]
    else:
        return 0.5*(s[k] + s[:k].max())

The beauty of this is that Wirth select is extremely easy to migrate to Cython:

import numpy as np
cimport numpy

ctypedef numpy.double_t T # or whatever

def wirthselect(numpy.ndarray[T, ndim=1] array, int k):
    cdef int i, j, l, m
    cdef T x, tmp
    cdef T *a
    _array = np.ascontiguousarray(array)
    if _array is array:
        _array = _array.copy()
    a = <T *> _array.data
    l = 0
    m = <int> _array.shape[0] - 1
    with nogil:
        while l < m:
            x = a[k]
            i = l
            j = m
            while 1:
                while a[i] < x: i += 1
                while x < a[j]: j -= 1
                if i <= j:
                    tmp = a[i]
                    a[i] = a[j]
                    a[j] = tmp
                    i += 1
                    j -= 1
                if i > j: break
            if j < k: l = i
            if k < i: m = j
    return _array

For example, we could have a small script that generates wirthselect for all NumPy dtypes (T as template), and use a dict as a jump table. Chad, you can continue to write quick select using NumPy's C quick sort in numpy/core/src/_sortmodule.c.src. When you are done, it might be about 10% faster than this. :-) Reference: http://ndevilla.free.fr/median/median.pdf Best regards, Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] A faster median (Wirth's method)
Sturla Molden skrev: We recently had a discussion regarding an optimization of NumPy's median to average O(n) complexity. After some searching, I found out there is a selection algorithm competitive in speed with Hoare's quick select. Reference: http://ndevilla.free.fr/median/median.pdf After some more googling, I found this text by Wirth himself: http://www.oberon2005.ru/book/ads2004.pdf The method is described on page 61 (57 in the PDF) as Hoare's quick select. So it seems it's just a less optimized version of the one in Numerical Recipes, and the first reference (Devillard) was confused. Anyhow, it still has the advantage of looking nice in Cython and being very different from the Numerical Recipes code. We can rename wirthselect to quickselect then. Sorry for the confusion. I should have checked the source better. Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] A better median function?
Chad Netzer skrev: My current plan of attack is to deliver a partition() function that basically returns an array such that elements less than the pivot(s) come first, then the pivot(s), then the elements greater than the pivot(s). I'm actually trying to write a fast median replacement myself. I was thinking along the same lines, except I don't store those two arrays. I just keep track of the counts in them. For the even case, I also keep track of the elements closest to the pivot (smaller and bigger). It's incredibly simple actually. So let's see who gets there first :-) Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] PRNGs and multi-threading
I am not sure if this is the right place to discuss this issue. However, a problem I keep running into in Monte Carlo simulations is generating pseudorandom numbers with multiple threads. PRNGs such as the Mersenne Twister keep an internal state, which prevents the PRNG from being re-entrant and thread-safe. Possible solutions:

1. Use multiple instances of PRNG states (numpy.random.RandomState), one for each thread. This should give no contention, but is this mathematically acceptable? I don't know. At least I have not seen any proof that it is.

2. Protect the PRNG internally with a spinlock. In Windows lingo, that is:

#include <windows.h>

static volatile long spinlock = 0;

#define ACQUIRE_SPINLOCK while(InterlockedExchangeAcquire(&spinlock, 1));
#define RELEASE_SPINLOCK InterlockedExchangeAcquire(&spinlock, 0);

Problem: possible contention between threads, idle work if threads spin a lot.

3. Use a conventional mutex object to protect the PRNG (e.g. threading.Lock in Python or CRITICAL_SECTION in Windows). Problem: contention, context shifting, and mutexes tend to be slow. Possibly the worst solution.

4. Put the PRNG in a dedicated thread, fill up rather big arrays with pseudo-random numbers, and write them to a queue. Problem: context shifting unless a CPU is dedicated to this task. Unless producing random numbers constitutes a major portion of the simulation, this should not lead to much contention.
import threading
import Queue
import numpy
from numpy.random import rand

class PRNG(threading.Thread):

    ''' a thread that generates arrays with random
        numbers and dumps them to a queue '''

    def __init__(self, nthreads, shape):
        super(PRNG, self).__init__()
        self.shape = shape
        self.count = numpy.prod(shape)
        self.queue = Queue.Queue(4 * nthreads) # magic number
        self.stop_evt = threading.Event()

    def generate(self, block=True, timeout=None):
        return self.queue.get(block, timeout)

    def join(self, timeout=None):
        self.stop_evt.set()
        super(PRNG, self).join(timeout)

    def run(self):
        tmp = rand(self.count).reshape(self.shape)
        while 1:
            try:
                self.queue.put(tmp, block=True, timeout=2.0)
            except Queue.Full:
                if self.stop_evt.isSet():
                    break
            else:
                tmp = rand(self.count).reshape(self.shape)

Do you have any view on this? Is there any way of creating multiple independent random states that will work correctly? I know of SPRNG (Scalable PRNG), but it is made to work with MPI (which I don't use). Regards, Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] PRNGs and multi-threading
Robert Kern skrev: Although you don't really have re-entrancy issues, you will usually want one PRNG per thread for determinism. I see... numpy.random.rand does not have re-entrancy issues because of the GIL, but I get indeterminism from the OS scheduling the threads. RandomState might not release the GIL either, but preserves determinism in presence of multiple threads. Thanks. :-) Regards, Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
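A sketch of solution 1 from the original post, with one RandomState per thread: each thread gets its own seeded state, so the results are deterministic regardless of how the OS schedules the threads (the seed-per-thread scheme shown is just one illustrative choice).

```python
import threading
import numpy as np

def worker(seed, out, i):
    # A private PRNG state per thread: no shared state, no contention,
    # and reproducible output for a given seed.
    rng = np.random.RandomState(seed)
    out[i] = rng.rand(1000).sum()

def run_simulation(nthreads=4):
    out = [None] * nthreads
    threads = [threading.Thread(target=worker, args=(1000 + i, out, i))
               for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

# Two runs give identical results: determinism despite thread scheduling.
print(run_simulation() == run_simulation())
```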
Re: [Numpy-discussion] PRNGs and multi-threading
Sturla Molden skrev: It seems there is a special version of the Mersenne Twister for this. The code is LGPL (annoying for SciPy but ok for me). http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf Basically it encodes the thread-ids in the characteristic polynomial of the MT, producing multiple small-period, independent MTs. That solves it then. Too bad this is LGPL. It would be a very useful enhancement to RandomState. Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] PRNGs and multi-threading
Xavier Saint-Mleux skrev: Of course, the mathematically correct way would be to use a correct jumpahead function, but all the implementations that I know of are GPL. A recent article about this is: www.iro.umontreal.ca/~lecuyer/myftp/papers/jumpmt.pdf I know of no efficient jumpahead function for MT. Several seconds for 1000 jumps ahead is not impressive -- just generating the deviates is faster! With DCMT it is easy to create independent MTs with smaller periods. Independence here means that the characteristic polynomials are relatively prime to each other. A small period of e.g. 2**521 - 1 means that if we produce 1 billion deviates per minute, it would still take the MT about 10**143 years to cycle. Chances are we will not be around to see that happen. It also seems that nvidia has endorsed this method: http://developer.download.nvidia.com/compute/cuda/sdk/website/projects/MersenneTwister/doc/MersenneTwister.pdf S.M. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: GPU Numpy
Erik Tollerud skrev: NumPy arrays on the GPU memory is an easy task. But then I would have to write the computation in OpenCL's dialect of C99? This is true to some extent, but also probably difficult to do given the fact that parallelizable algorithms are generally more difficult to formulate in straightforward ways. Then you have misunderstood me completely. Creating an ndarray that has a buffer in graphics memory is not too difficult, given that graphics memory can be memory mapped. This has nothing to do with parallelizable algorithms or not. It is just memory management. We could make an ndarray subclass that quickly puts its content in a buffer accessible to the GPU. That is not difficult. But then comes the question of what you do with it. I think many here misunderstand the issue: Teraflops peak performance of modern GPUs is impressive. But NumPy cannot easily benefit from that. In fact, there is little or nothing to gain from optimising in that end. In order for a GPU to help, computation must be the time-limiting factor. It is not. There is not more to say about using GPUs in NumPy right now. Take a look at the timings here: http://www.scipy.org/PerformancePython It shows that computing with NumPy is more than ten times slower than using plain C. This is despite NumPy being written in C. The NumPy code does not incur 10 times more floating point operations than the C code. The floating point unit does not run in turtle mode when using NumPy. NumPy's relative slowness compared to C has nothing to do with floating point computation. It is due to inferior memory use (temporary buffers, multiple buffer traversals) and memory access being slow. Moving computation to the GPU can only make this worse. Improved memory usage - e.g. through lazy evaluation and JIT compilation of expressions - can give up to a tenfold increase in performance. That is where we must start optimising to get a faster NumPy.
Incidentally, this will also make it easier to leverage modern GPUs. Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] GPU Numpy
Olivier Grisel wrote: As usual, MS reinvents the wheel with DirectX Compute but vendors such as AMD and nvidia propose both the OpenCL API +runtime binaries for windows and their DirectX Compute counterpart, based on mostly the same underlying implementation, e.g. CUDA in nvidia's case. Here is a DirectX Compute tutorial I found: http://www.gamedev.net/community/forums/topic.asp?topic_id=516043 It pretty much says all we need to know. I am not investing any of my time learning that shitty API. Period. Lets just hope OpenCL makes it to Windows without Microsoft breaking it for security reasons (as they did with OpenGL). Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: GPU Numpy
Now linear algebra or FFTs on a GPU would probably be a huge boon, I'll admit - especially if it's in the form of a drop-in replacement for the numpy or scipy versions. NumPy generates temporary arrays for expressions involving ndarrays. This extra allocation and copying often takes more time than the computation. With GPGPUs, we have to bus the data to and from VRAM as well. D. Knuth quoted Hoare saying that premature optimization is the root of all evil. Optimizing computation when the bottleneck is memory is premature. In order to improve on this, I think we have to add lazy evaluation to NumPy. That is, an operator should not return a temporary array but a symbolic expression. So if we have an expression like y = a*x + b it should not evaluate a*x into a temporary array. Rather, the operators would build up a parse tree like y = add(multiply(a,x),b) and evaluate the whole expression later on. This would require two things: First we need dynamic code generation, which incidentally is what OpenCL is all about. I.e. OpenCL is a dynamically invoked compiler; there is a function clCreateProgramWithSource, which does just what it says. Second, we need arrays to be immutable. This is very important. If arrays are not immutable, code like this could fail: y = a*x + b x[0] = 1235512371235 With lazy evaluation, the memory overhead would be much smaller. The GPGPU would also get more complex expressions to use as kernels. There should be an option of running this on the CPU, possibly using OpenMP for multi-threading. We could either depend on a compiler (C or Fortran) being installed, or use opcodes for a dedicated virtual machine (cf. what numexpr does). In order to reduce the effect of immutable arrays, we could introduce a context-manager. Inside the with statement, all arrays would be immutable. Second, the __exit__ method could trigger the code generator and do all the evaluation.
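A minimal sketch of the parse-tree idea (hypothetical classes, not an existing NumPy API): the operators build nodes instead of computing temporaries, and nothing is evaluated until eval() is called on the whole tree.

```python
import numpy as np

class Expr(object):
    # Operators build a tree instead of computing temporary arrays.
    def __add__(self, other): return Node(np.add, self, _wrap(other))
    def __mul__(self, other): return Node(np.multiply, self, _wrap(other))

class Leaf(Expr):
    def __init__(self, value): self.value = np.asarray(value, dtype=float)
    def eval(self): return self.value

class Node(Expr):
    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs
    def eval(self):
        # A real implementation would generate one fused kernel here
        # instead of evaluating node by node.
        return self.op(self.lhs.eval(), self.rhs.eval())

def _wrap(x):
    return x if isinstance(x, Expr) else Leaf(x)

a, x, b = Leaf(2.0), Leaf([1.0, 2.0, 3.0]), Leaf(10.0)
y = a * x + b       # builds add(multiply(a, x), b); no arrays computed yet
print(y.eval())     # the whole expression evaluated in one pass
```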
So we would get something like this:

# normal numpy here

with numpy.accelerator():
    # arrays become immutable
    # lazy evaluation
    # code generation and evaluation on exit

# normal numpy continues here

Thus, here is my plan:

1. a special context-manager class
2. immutable arrays inside the with statement
3. lazy evaluation: expressions build up a parse tree
4. dynamic code generation
5. evaluation on exit

I guess it is possible to find ways to speed this up as well. If a context manager would always generate the same OpenCL code, the with statement would only need to execute once (we could raise an exception on enter to jump directly to exit). It is possible to create a superfast NumPy. But just plugging GPGPUs into the current design would be premature. In NumPy's current state, with mutable ndarrays and operators generating temporary arrays, there is not much to gain from introducing GPGPUs. It would only be beneficial in computationally demanding parts like FFTs and solvers for linear algebra and differential equations. Ufuncs with transcendental functions might also benefit. SciPy would certainly benefit more from GPGPUs than NumPy. Just my five cents :-) Regards, Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: GPU Numpy
Robert Kern wrote: I believe that is exactly the point that Erik is making. :-) I wasn't arguing against him, just suggesting a solution. :-) I have big hopes for lazy evaluation, if we can find a way to to it right. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: GPU Numpy
Charles R Harris wrote: Whether the code that gets compiled is written using lazy evaluation (ala Sturla), or is expressed some other way seems like an independent issue. It sounds like one important thing would be having arrays that reside on the GPU. Memory management is slow compared to computation. Operations like malloc, free and memcpy are not faster for VRAM than for RAM. There will be no benefit from the GPU if the bottleneck is memory. That is why we need to get rid of the creation of temporary arrays, hence lazy evaluation. Having arrays reside in VRAM would reduce the communication between RAM and VRAM, but the problem with temporary arrays is still there. Also, VRAM tends to be a limited resource. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: GPU Numpy
Sturla Molden wrote: Memory management is slow compared to computation. Operations like malloc, free and memcpy is not faster for VRAM than for RAM. Actually it's not VRAM anymore, but whatever you call the memory dedicated to the GPU. It is cheap to put 8 GB of RAM into a computer, but graphics cards with more than 1 GB memory are expensive and uncommon on e.g. laptops. And this memory will be needed for other things as well, e.g. graphics. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: GPU Numpy
Charles R Harris wrote: I mean, once the computations are moved elsewhere numpy is basically a convenient way to address memory. That is how I mostly use NumPy, though. Computations I often do in Fortran 95 or C. Putting NumPy arrays in GPU memory is an easy task. But then I would have to write the computation in OpenCL's dialect of C99? But I'd rather program everything in Python if I could. Details like GPU and OpenCL should be hidden away. Nice looking Python with NumPy is much easier to read and write. That is why I'd like to see a code generator (i.e. JIT compiler) for NumPy. Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: GPU Numpy
James Bergstra wrote: The plan you describe is a good one, and Theano (www.pylearn.org/theano) almost exactly implements it. You should check it out. It does not use 'with' syntax at the moment, but it could provide the backend machinery for your mechanism if you want to go forward with that. Theano provides - symbolic expression building for a big subset of what numpy can do (and a few things that it doesn't) - expression optimization (for faster and more accurate computations) - dynamic code generation - cacheing of compiled functions to disk. Thank you James, theano looks great. :-D Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] improved NumPy support for boolean arrays?
If x and y are numpy arrays of bools, I'd like to be able to create expressions like the following: not x (to invert each element of x), x and y, x or y, x xor y, (not x) or y. The usual array broadcasting rules should apply. Is there any chance of getting something like this into NumPy? There is a reason for this related to Python. In Python an object will often have a boolean truth value. How would you cast an ndarray to bool? If you write something like (x and y), the Python interpreter expects this to evaluate to True or False. Thus it cannot evaluate to an ndarray of booleans. NumPy cannot change the syntax of Python. Another thing: An empty list evaluates to False in a boolean context, whereas a non-empty list evaluates to True. ndarrays behave differently. Why? Sturla Molden ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
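What NumPy does offer is elementwise boolean operators, which is usually what the original poster wants: Python's and/or/not keywords cannot be overloaded, but the &, |, ^ and ~ operators can, and they broadcast as usual.

```python
import numpy as np

x = np.array([True, True, False, False])
y = np.array([True, False, True, False])

print(~x)        # elementwise not
print(x & y)     # elementwise and
print(x | y)     # elementwise or
print(x ^ y)     # elementwise xor
print(~x | y)    # (not x) or y
```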
Re: [Numpy-discussion] Compiling for free on Windows32
On 4/15/2009 6:44 PM, Matthieu Brucher wrote: There is a Python limitation for the compiler. There is a common misunderstanding that only VS2003 can be used to compile extension objects for Python 2.5. Don't believe it. There is no Python limitation for the compiler. There is a Python limitation for the C runtime (crt), but only if you share crt objects with Python. That is, if you e.g. open a FILE* pointer in C, create a Python file object from the pointer, and then read that file in Python. In this case Python would do a fread on the pointer, and it is your responsibility to make sure the same crt DLL does the fopen, fread and fclose. If you don't share any crt resources with Python, there is no limitation on the crt either. If you do a malloc and a corresponding free in C, it does not matter what crt you use. It may even be used by Python objects in the meantime. mingw (gcc, gfortran) can be used freely, and afaik is used to compile the official NumPy and SciPy releases for Windows. The gfortran team has made a binary installer for version 4.4 available in their Wiki. When using mingw you have to link with the same crt as Python if crt resources are shared. That is -lmsvcr71 for Python 2.5 and -lmsvcr90 for Python 2.6. If crt resources are unshared, link with whatever crt you want. Sturla Molden ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Compiling for free on Windows32
On 4/17/2009 10:50 AM, David Cournapeau wrote: I think Matthieu meant you have to use VS2003 as a MS compiler. Mingw is obviously fine, since that's how numpy binaries are built for quite a long time That is what I thought he meant as well, and it seems to be a widespread belief. The fact that numpy is built with mingw proves it wrong, obviously. I just wanted to point out what the real limitation is: don't mix and blend CRTs and CRT resources. This is obviously not a Windows or Python specific problem. gfortran 4.4 is pre-released software, though, so take care. I know for sure that gfortran + MS compiler can mix on 32 bits http://gcc.gnu.org/wiki/GFortranBinaries I use this 32 bit mingw binary to build my Cython and f2py extensions. It works like a charm. I have licenses for Intel compilers at work, but I prefer gfortran 4.4. Sturla Molden ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Compiling for free on Windows32
On 4/17/2009 11:50 AM, Gael Varoquaux wrote: Could you elaborate on your reason. Probably silly reasons though... I have a distutils.cfg file and a build system set up that works. I don't want to bother setting up a different build system when I already have one that works. I can use the Intel compilers at home, but only if I have an open VPN connection to work. I think just using a free gcc compiler is easier. I don't want to remember switches for two compilers. Syntax for inline assembly is different (on Windows), requiring preprocessor magic and code duplication. Object files and libraries are named differently (on Windows). There are numerous reasons, but they all boil down to me being lazy. S.M. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] dpss windows
NumPy has support for various window functions (von Hann, hamming, blackman, bartlett, kaiser). DPSS windows (e.g. used by multitaper spectral estimators) are missing. Is there any particular reason for this?

(No, this is not a request; I already have my own dpss code. I just noticed while reading the NumPy and SciPy docs today that dpss windows are missing.)

Sturla Molden
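For context, DPSS tapers are conventionally obtained as the leading eigenvectors of a symmetric tridiagonal matrix (the textbook formulation, e.g. Percival and Walden); the following plain-NumPy sketch illustrates that approach and is not the code mentioned above:

```python
import numpy as np

def dpss_windows(N, NW, K):
    """First K discrete prolate spheroidal sequences of length N with
    time-bandwidth product NW, via the classic tridiagonal eigenproblem."""
    W = NW / float(N)
    t = np.arange(N)
    # Tridiagonal matrix whose top eigenvectors are the DPSS tapers.
    diag = ((N - 1 - 2.0 * t) / 2.0) ** 2 * np.cos(2.0 * np.pi * W)
    offdiag = t[1:] * (N - t[1:]) / 2.0
    A = np.diag(diag) + np.diag(offdiag, 1) + np.diag(offdiag, -1)
    _, vecs = np.linalg.eigh(A)        # eigenvalues in ascending order
    tapers = vecs[:, ::-1][:, :K].T    # rows = tapers, best-concentrated first
    return tapers                      # the sign of each taper is arbitrary

tapers = dpss_windows(64, 4.0, 7)
```

Because the tapers are eigenvectors of a symmetric matrix, they come out mutually orthonormal, which is exactly the property multitaper estimators rely on.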
Re: [Numpy-discussion] numpy.ctypeslib.ndpointer and the restype attribute
Jens Rantil wrote:
Thanks Sturla. However numpy.uint8 seems to be lacking attributes 'str' and 'descr'. I'm using installed Ubuntu package 1:1.1.1-1. Is it too old or is the code broken?

Oops, my fault :) The attributes live on the dtype object, not on the scalar type:

    import numpy

    def fromaddress(address, nbytes, dtype=float):
        class Dummy(object):
            pass
        d = Dummy()
        bytetype = numpy.dtype(numpy.uint8)
        d.__array_interface__ = {
            'data' : (address, False),
            'typestr' : bytetype.str,
            'descr' : bytetype.descr,
            'shape' : (nbytes,),
            'strides' : None,
            'version' : 3
            }
        return numpy.asarray(d).view(dtype=dtype)

You will have to make sure the address is an integer.

Also, could you elaborate why dtype=float would work better? Because there is no such thing as a double type in Python?

Sturla Molden
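For completeness, here is the corrected function exercised end-to-end with a ctypes buffer; the function is repeated so the snippet stands alone, and the ctypes usage is a hypothetical illustration, not from the thread:

```python
import ctypes
import numpy

def fromaddress(address, nbytes, dtype=float):
    """View nbytes of raw memory at an integer address as a NumPy array."""
    class Dummy(object):
        pass
    d = Dummy()
    bytetype = numpy.dtype(numpy.uint8)
    d.__array_interface__ = {
        'data': (address, False),   # False means writable
        'typestr': bytetype.str,
        'descr': bytetype.descr,
        'shape': (nbytes,),
        'strides': None,
        'version': 3,
    }
    return numpy.asarray(d).view(dtype=dtype)

# Wrap a ctypes array of four doubles without copying.
buf = (ctypes.c_double * 4)(1.0, 2.0, 3.0, 4.0)
arr = fromaddress(ctypes.addressof(buf), ctypes.sizeof(buf))
```

Because the array is a view on the original memory, writes through the NumPy array are visible through the ctypes buffer as well.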
Re: [Numpy-discussion] String arrays from Python to Fortran
How did you import the function? f2py? What did you put in your .pyf file?

*My Fortran code:*

    subroutine print_string (a, c)
        implicit none
        character(len=255), dimension(c), intent(inout) :: a
        integer, intent(in) :: c
        integer :: i
        do i = 1, size(a)
            print *, a(i)
        end do
    end subroutine print_string

*My Python code:*

    from test import *
    from numpy import *

    a = "this is the test string."
    a = a.split()
    b = a
    a = char.array(a, itemsize=1, order='Fortran')
    print_string(a, len(a))  # this is imported from the compiled Fortran code
Re: [Numpy-discussion] numpy.ctypeslib.ndpointer and the restype attribute
Sturla Molden wrote:
def fromaddress(address, nbytes, dtype=double):

I guess dtype=float works better...

S.M.
Re: [Numpy-discussion] Concatenating string arrays
On 3/18/2009 7:30 PM, Thomas Robitaille wrote:
import numpy as np
arr1 = np.array(['a','b','c'])
arr2 = np.array(['d','e','f'])
I would like to produce a third array that would contain ['ad','be','cf']. Is there an efficient way to do this? I could do this element by element, but I need a faster method, as I need to do this on arrays with several million elements.

    >>> arr1 = np.array(['a','b','c'])
    >>> arr2 = np.array(['d','e','f'])
    >>> arr3 = np.zeros(6, dtype='|S1')
    >>> arr3[::2] = arr1
    >>> arr3[1::2] = arr2
    >>> arr3.view(dtype='|S2')
    array(['ad', 'be', 'cf'], dtype='|S2')

Does this help?

Sturla Molden
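As a side note, NumPy's char module can do this concatenation directly (it may be slower than the interleave-and-view trick on huge arrays, but it is the obvious spelling); a small sketch:

```python
import numpy as np

arr1 = np.array(['a', 'b', 'c'])
arr2 = np.array(['d', 'e', 'f'])

# Elementwise string concatenation without manual interleaving.
arr3 = np.char.add(arr1, arr2)
```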
Re: [Numpy-discussion] Superfluous array transpose (cf. ticket #1054)
On 3/16/2009 9:27 AM, Pearu Peterson wrote:
If an operation produces a new array then the new array should have the storage properties of the lhs operand.

That would not be enough, as 1+a would behave differently from a+1. The former would change storage order and the latter would not. Broadcasting arrays adds further to the complexity of the problem.

It seems necessary to do something like this to avoid the trap when using f2py:

    def some_fortran_function(x):
        if x.flags['C_CONTIGUOUS']:
            # x.T is an F-contiguous view with reversed shape (no copy),
            # so f2py will not make a temporary copy of the input.
            _y = _f2py_wrapper(x.T)
            return _y.T
        else:
            return _f2py_wrapper(x)

And then preferably never use Fortran ordered arrays directly.

Sturla Molden
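The underlying observation is that a C-contiguous array and its transpose share one buffer: the transpose is already the F-contiguous reinterpretation with reversed shape, and no data is copied. A minimal sketch:

```python
import numpy as np

x = np.arange(6, dtype=np.float64).reshape(2, 3)  # C-contiguous

# The transpose is an F-contiguous *view*: reversed shape, same memory.
xt = x.T
```

This is why passing x.T to a routine that expects Fortran order avoids the superfluous copy the thread is about.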
Re: [Numpy-discussion] Enhancements for NumPy's FFTs
1) I have noticed that fftpack_litemodule.c does not release the GIL around calls to functions in fftpack.c. I cannot see any obvious reason for this. As far as I can tell, the functions in fftpack.c are re-entrant.

2) If fftpack_lite released the GIL, it would allow functions in numpy.fft to use multithreading for multiple FFTs in parallel (threading.Thread is ok, no special compilation needed).

Both are fine to modify for 1.3.

There is a version of fftpack_litemodule.c, fftpack.c and fftpack.h that does this attached to ticket #1055. The two important changes are releasing the GIL and using npy_intp for 64-bit support.

Minor changes: There is a restrict qualifier in fftpack.c. If it is not compiled with C99, it tries to use similar GNU or MS extensions. There are some OpenMP pragmas in fftpack_litemodule.c. If you don't compile with OpenMP support, they do nothing. If you do compile with OpenMP, they will make certain FFTs run in parallel. I can comment them out if you prefer.

Sturla Molden
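To illustrate what releasing the GIL buys: once the GIL is dropped inside the FFT core, ordinary Python threads can transform independent vectors concurrently. A hypothetical sketch of the usage pattern, using the modern concurrent.futures API (not part of the patch itself):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Transform each row of a 2-D array on a thread pool.  With the GIL
# released inside the FFT routines, the workers can run in parallel.
rng = np.random.RandomState(0)
signals = rng.rand(8, 1024)

with ThreadPoolExecutor(max_workers=4) as pool:
    spectra = np.array(list(pool.map(np.fft.fft, signals)))
```

The threaded result is identical to the serial transform along the last axis; only the wall-clock time changes.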
Re: [Numpy-discussion] Enhancements for NumPy's FFTs
Would it be possible to make the changes as a patch (svn diff) - this makes things easier to review.

I've added diff files to ticket #1055.

Yes, I would be more comfortable without them (for 1.3). This is typically the kind of small changes which can be a PITA to deal with just before a release because it breaks some platforms in non obvious ways.

Ok, they are commented out.

For the restrict keyword support, I will add a distutils check to avoid the compiler-specifics (again after 1.3).

I've added a header file npy_restrict.h that defines a NPY_RESTRICT symbol.

Best regards, Sturla
Re: [Numpy-discussion] Enhancements for NumPy's FFTs
Mon, 16 Mar 2009 00:33:28 +0900, David Cournapeau wrote:
Also, you could post the patch on the http://codereview.appspot.com site. Then it would be easier to both review and to keep track of its revisions.

I have posted the files here: http://projects.scipy.org/numpy/ticket/1055

Sturla
Re: [Numpy-discussion] Enhancements for NumPy's FFTs
Well, that's nearly as good. (Though submitting a single svn diff containing all changes could have been a bit more easy to handle than separate patches for each file. But a small nitpick only.)

The problem is I am really bad at using these tools. I have TortoiseSVN installed, but no idea how to use it. :( I cannot delete any file attachment in trac, but I can overwrite the files I've posted.

S.M.
[Numpy-discussion] Superfluous array transpose (cf. ticket #1054)
Regarding ticket #1054. What is the reason for this strange behaviour?

    >>> a = np.zeros((10,10), order='F')
    >>> a.flags
      C_CONTIGUOUS : False
      F_CONTIGUOUS : True
      OWNDATA : True
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False
    >>> (a+1).flags
      C_CONTIGUOUS : True
      F_CONTIGUOUS : False
      OWNDATA : True
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False

Sturla Molden
Re: [Numpy-discussion] Reminder: code freeze for beta at the end of the WE
Will memmap be fixed to use offsets correctly before 1.3?

hi, Just a friendly reminder that I will close the trunk for 1.3.0 at the end of 15th March (I will more likely do it at the end of Monday Japan time which roughly corresponds to 15th March midnight Pacific time), cheers, David
Re: [Numpy-discussion] Reminder: code freeze for beta at the end of the WE
Will memmap be fixed to use offsets correctly before 1.3?

I posted this to scipy-dev (possibly wrong list) on March 9, so I'll repeat it here:

In Python 2.6, mmap has an offset keyword. NumPy's memmap should use this to allow big files to be memory mapped on 32 bit systems. Only a minor change is required:

    if float(sys.version[:3]) > 2.5:
        bytes = bytes - offset
        mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=offset)
        self = ndarray.__new__(subtype, shape, dtype=descr, buffer=mm,
                               offset=0, order=order)
    else:
        mm = mmap.mmap(fid.fileno(), bytes, access=acc)
        self = ndarray.__new__(subtype, shape, dtype=descr, buffer=mm,
                               offset=offset, order=order)

Instead of just:

    mm = mmap.mmap(fid.fileno(), bytes, access=acc)
    self = ndarray.__new__(subtype, shape, dtype=descr, buffer=mm,
                           offset=offset, order=order)

Regards, Sturla Molden

(attached: the current memmap.py)

    __all__ = ['memmap']

    import warnings
    from numeric import uint8, ndarray, dtype
    import sys

    dtypedescr = dtype
    valid_filemodes = ["r", "c", "r+", "w+"]
    writeable_filemodes = ["r+", "w+"]

    mode_equivalents = {
        "readonly" : "r",
        "copyonwrite" : "c",
        "readwrite" : "r+",
        "write" : "w+"
        }

    class memmap(ndarray):
        """Create a memory-map to an array stored in a file on disk.

        Memory-mapped files are used for accessing small segments of large
        files on disk, without reading the entire file into memory.  Numpy's
        memmaps are array-like objects.  This differs from Python's ``mmap``
        module, which uses file-like objects.

        Parameters
        ----------
        filename : string or file-like object
            The file name or file object to be used as the array data buffer.
        dtype : data-type, optional
            The data-type used to interpret the file contents.
            Default is `uint8`.
        mode : {'r+', 'r', 'w+', 'c'}, optional
            The file is opened in this mode:

            +------+------------------------------------------------------+
            | 'r'  | Open existing file for reading only.                 |
            +------+------------------------------------------------------+
            | 'r+' | Open existing file for reading and writing.          |
            +------+------------------------------------------------------+
            | 'w+' | Create or overwrite existing file for reading and    |
            |      | writing.                                             |
            +------+------------------------------------------------------+
            | 'c'  | Copy-on-write: assignments affect data in memory,    |
            |      | but changes are not saved to disk.  The file on disk |
            |      | is read-only.                                        |
            +------+------------------------------------------------------+

            Default is 'r+'.
        offset : integer, optional
            In the file, array data starts at this offset.  `offset` should
            be a multiple of the byte-size of `dtype`.  Requires
            `shape=None`.  The default is 0.
        shape : tuple, optional
            The desired shape of the array.  By default, the returned array
            will be 1-D with the number of elements determined by file size
            and data-type.
        order : {'C', 'F'}, optional
            Specify the order of the ndarray memory layout: C (row-major) or
            Fortran (column-major).  This only has an effect if the shape is
            greater than 1-D.  The default order is 'C'.

        Methods
        -------
        close
            Close the memmap file.
        flush
            Flush any changes in memory to file on disk.  When you delete a
            memmap object, flush is called first to write changes to disk
            before removing the object.

        Notes
        -----
        The memmap object can be used anywhere an ndarray is accepted.
        Given a memmap ``fp``, ``isinstance(fp, numpy.ndarray)`` returns
        ``True``.

        Memory-mapped arrays use the Python memory-map object which (prior
        to Python 2.5) does not allow files to be larger than a certain size
        depending on the platform.  This size is always < 2GB even on 64-bit
        systems.

        Examples
        --------
        >>> data = np.arange(12, dtype='float32')
        >>> data.resize((3,4))

        This example uses a temporary file so that doctest doesn't write
        files to your directory.  You would use a 'normal' filename.

        >>> from tempfile import mkdtemp
        >>> import os.path as path
        >>> filename = path.join(mkdtemp(), 'newfile.dat')

        Create a memmap with dtype and shape that matches our data:

        >>> fp = np.memmap(filename, dtype='float32', mode='w+', shape=(3,4))
        >>> fp
        memmap([[ 0.,  0.,  0.,  0.],
                [ 0.,  0.,  0.,  0.],
                [ 0.,  0.,  0.,  0.]], dtype=float32)

        Write data to memmap array:

        >>> fp[:] = data[:]
        >>> fp
        memmap([[  0.,   1.,   2.,   3.],
                [  4.,   5.,   6.,   7.],
                [  8.,   9.,  10.,  11.]], dtype=float32)

        Deletion flushes memory changes to disk before removing the object:

        >>> del
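For illustration, here is how the offset keyword is used once memmap supports it. The file layout below (a 16-byte header followed by float64 data) is invented for the example:

```python
import os
import tempfile
import numpy as np

# Create a file: a 16-byte header, then four float64 values.
fd, fname = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'\x00' * 16)
    f.write(np.arange(4, dtype=np.float64).tobytes())

# Skip the header: map only the payload starting at byte 16.
# With shape omitted, the length is inferred from the remaining bytes.
fp = np.memmap(fname, dtype=np.float64, mode='r', offset=16)
```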
Re: [Numpy-discussion] Reminder: code freeze for beta at the end of the WE
Can you open a ticket for this?

Done. Ticket #1053

Sturla