[Numpy-discussion] ANN: PyCPX 0.01, a numpy/cython wrapper for CPlex.
Hello,

I'm pleased to announce the first release of a wrapper I wrote for IBM's CPlex Optimizer Suite. It focuses on ease of use and seamless integration with numpy, and allows one to naturally specify constrained linear and quadratic optimization problems over real, boolean, and integer variables. The wrapper is written in cython/C++ for speed and released under the LGPL license.

For documentation and examples, please see the website: http://www.stat.washington.edu/~hoytak/code/pycpx/index.html.

Please send me any comments, questions and suggestions! I am quite open to feedback.

Thanks,
--Hoyt
Re: [Numpy-discussion] Linear assignment problem: Jonker-Volgenant algorithm
>> Well, the Hungarian algorithm has a theoretical upper bound of O(n^3), with n being the number of nodes, which is pretty much the best you can do if you have a dense graph and make no assumptions on capacities.
>
> OK, your input is making my motivation to fight with Jonker-Volgenant go down. I'll see with the other people involved if we still target Jonker-Volgenant, or if we stick with the Hungarian algorithm, in which case the problem is solved.

I'd go that route. From my (somewhat anecdotal) observations, a well-coded version of the Hungarian algorithm will beat moderately well-coded implementations of other algorithms on most common types of problems; the data structures in the other algorithms are harder to get right. Also, those algorithms often tend to show their strengths only on sufficiently large problems. I'd guess that for a speedup, your time would be better spent profiling and maybe cythonizing your current implementation of the Hungarian algorithm (also note that there are a few BSD/MIT C++ implementations floating around; check the Wikipedia page).

>> Hope that answers your questions :-).
>
> It does. Thanks a lot, it is very useful to have expert knowledge available.

Well, I don't know if I'm an expert in this area, but I rub shoulders with a few, and I'm glad to be of help.

-- Hoyt
Re: [Numpy-discussion] Linear assignment problem: Jonker-Volgenant algorithm
On Mon, May 16, 2011 at 12:18 AM, Gael Varoquaux gael.varoqu...@normalesup.org wrote:
> Following a suggestion by Joseph, I am trying to implement the Jonker-Volgenant algorithm for best linear assignment in Python, using numpy. Unsurprisingly, it is proving time-costly. I cannot afford to spend too much time on this, as it is not to solve a problem of mine, but for scikits.learn. Thus I was wondering if someone had a BSD-licensed Python version of the algorithm that he would be willing to share.

There are a number available. The lemon graph library (http://lemon.cs.elte.hu/trac/lemon) has solvers for this problem and has python bindings. It's under the boost license (is that ok?). It might be a bit heavyweight for this, but it's great software.

-- Hoyt
Re: [Numpy-discussion] Linear assignment problem: Jonker-Volgenant algorithm
>> In this case, lemon exists only as C++ header files, with no compiled code. A few cython functions should make it pretty simple. However, yeah, it is a bit heavyweight.
>
> Thanks for the suggestion. Unfortunately, I do not want to add compiled code (or an external dependency) to the scikit for this solver, as it is a fairly minor step for us. In particular, chances are that it will never be a computational bottleneck on our problems.

If that's the case, I suggest just using the Hungarian algorithm. It's a little slower on large graphs, but it is still pretty quick. And there are tons of implementations out there, as it's often a standard coding project in undergrad algorithms courses. http://en.wikipedia.org/wiki/Hungarian_algorithm has links to a few.

-- Hoyt
Re: [Numpy-discussion] Linear assignment problem: Jonker-Volgenant algorithm
> I might go that way: I already have pure-Python code that implements it and that I have been using for a year or so.

Fair enough -- though you'd probably get a big speedup moving to cython.

>> It's a little slower on large graphs, but it still is pretty quick.
>
> Can you put any numbers on the difference, or rules of thumb? Is there an algorithmic complexity difference? If so, what kind of dependency are we talking about? Is it only in the prefactors?

Well, the Hungarian algorithm has a theoretical upper bound of O(n^3), with n being the number of nodes, which is pretty much the best you can do if you have a dense graph and make no assumptions on capacities. If you have a sparse graph, other algorithms can exploit that structure a bit more efficiently (e.g. lemon's implementation is O(nm log n), with m being the number of edges). Other algorithms can use capacities and include the capacity bounds as part of the theoretical performance of the algorithm, e.g. http://arxiv.org/abs/1105.1569. However, all these bounds are generally very loose upper bounds, and actual practical performance varies greatly. The Hungarian algorithm is popular mainly because it's the best one that can be coded up efficiently in a few hundred lines of code.

Hope that answers your questions :-).

--Hoyt
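(As a concrete reference point for the discussion above: here is a minimal brute-force solver for the linear assignment problem -- O(n!), so only usable on tiny cost matrices, but handy for validating a Hungarian or Jonker-Volgenant implementation. This is an illustrative sketch, not code from any of the libraries mentioned.)

    import itertools
    import numpy as np

    def brute_force_assignment(cost):
        # Exact minimum-cost assignment by enumerating all n! permutations.
        # Only feasible for very small n, but useful as a correctness check.
        n = cost.shape[0]
        best_perm, best_cost = None, np.inf
        for perm in itertools.permutations(range(n)):
            c = sum(cost[i, j] for i, j in enumerate(perm))
            if c < best_cost:
                best_perm, best_cost = perm, c
        return best_perm, best_cost

    cost = np.array([[4., 1., 3.],
                     [2., 0., 5.],
                     [3., 2., 2.]])
    print(brute_force_assignment(cost))   # ((1, 0, 2), 5.0)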
Re: [Numpy-discussion] Compiling numpy using icc gets missing library error
> Thanks for the thorough description of getting everything to work.

You're welcome. I'm glad people find it helpful :-).

>> -ipo -- okay for the linker / C++, but not elsewhere. For C, it causes the long_double_representation() function in setup_common() to fail (I'll note this on the corresponding ticket), and causes f2py to fail to generate correct wrappers.
>
> Would it be enough to put a comment in the intelccompiler.py code not to use the -ipo flag, or do you think there's something to fix here? Doesn't seem to be a new problem, by the way: http://cens.ioc.ee/pipermail/f2py-users/2006-January/001229.html

Well, to be honest, it's debatable how this should be fixed, if at all. The issue is that the intermediate object files and archives are in Intel's own format -- hence the need for xiar instead of ar -- and I suspect that that's what's messing up f2py and the long_double_representation() stuff (I don't actually know how either really works, so I could be wrong here). Reverse-engineering those formats doesn't make sense. Now, it's possible that there may be a better way of determining long_double_representation(), but I don't know. IMO it's not worth it.

>> -O3 -- okay for C++ / Fortran, but not C. For C, it causes einsum to hang.
>
> -O3 is the default optimization level, so this is a bug I guess. There's also another report in #1378 that -O3 doesn't work with numpy 1.4.0 (which does not have einsum). Should it be lowered to -O2 by default?

I'm not sure what's going on here. I think it is likely a bug in icc brought out by the really low-level code in einsum. However, the documentation says that the extra optimizations enabled by -O3 are likely not going to help except in large, predictable loops where more aggressive loop transformations are doable and helpful. Thus my vote is to use -O3 for C++ and Fortran, but -O2 for C (because of this bug). However, if the einsum bug gets fixed, nice! (BTW, I think it may also be possible to change per-file optimization settings with Intel-specific pragmas, but I don't know for sure. If so, that could be a workaround.)

>> Fortran: -static -DMKL_LP64 -mkl -xHOST -O3 -fPIC -fp-model strict
>> C:       -static -DMKL_LP64 -mkl -xHOST -O2 -fno-alias -fPIC -fp-model strict
>> C++:     -static -DMKL_LP64 -mkl -xHOST -O3 -fno-alias -fPIC -fp-model strict
>> Linker:  -DMKL_LP64 -mkl -xHOST -O2 -fno-alias -fPIC -fp-model strict -openmp -lpthread
>
> I'm not sure which of those flags would be appropriate as a default in distutils, perhaps only fp-model strict? If you could help put together a patch for numpy.distutils, that would be very helpful I think. The rest of your description could be put at http://scipy.org/Installing_SciPy/Linux.

I'd say -fp-model strict and -fno-alias, as the latter is analogous to the -fno-strict-aliasing gcc flag required by python. The -static flag is probably optional. -xHOST is the same as gcc's -march=native, so given the nature of the distutils system it probably isn't appropriate as a default.

I'll try to get a patch together for the distutils. However, I'd want someone to review it, since I'm not that confident in my knowledge of the distutils code. I can also try to turn this into a more complete description for the wiki.

-- Hoyt
Re: [Numpy-discussion] Compiling numpy using icc gets missing library error
Okay, so here's a follow-up on my progress. Apologies in advance for the long email, but I'd like to be thorough about this before I forget.

For the sake of completeness, here's my setup. I'm running python 2.7.1 compiled from source with icc, on Ubuntu 10.10, on one of Intel's new processors (an i7-2600). The goal is to compile numpy and scipy, both with Intel's compiler and Intel's MKL.

I finally got numpy to compile with icc/ifort with pretty much all of the tests passing. It's a bit of work, partly because I was trying to be an optimization junkie, but I thought I'd share my discoveries. Scipy also compiles, but with some errors (which are likely due to me not configuring f2py correctly).

First, I wanted to compile things with Intel's interprocedural optimization (-ipo) enabled, and that seems to work, but only if -O2 is used for the compiling stage and -O1 is used for the linking stage. If -O3 is given for the compiling stage, then the einsum test goes into some sort of infinite loop and hangs. If -O2 or -O3 is given to the linker, then there are random other segfaults (I forget where). However, with these optimization levels, things are stable. Also, if I turn off -ipo, then -O3 works fine for compiling. I'm not sure if this reflects bugs in the flags I'm passing to the Intel compiler or in icc/ifort itself.

Second, to use -ipo, it's critical that xiar is used instead of ar to create object archives. This needed to be changed in fcompiler/intel.py and intelccompiler.py. I've attached a diff of these files that gives working options for me. I don't know if these options are set in the correct place or not, but perhaps they would be helpful. The essence of it is the following (from intelccompiler.py):

linker_flags = '-O1 -ipo -openmp -lpthread -fno-alias -xHOST -fPIC '
compiler_opt_flags = '-static -ipo -xHOST -O2 -fPIC -DMKL_LP64 -mkl -wd188 -g -fno-alias '
icc_run_string = 'icc ' + compiler_opt_flags
icpc_run_string = 'icpc ' + compiler_opt_flags
linker_run_string = 'icc ' + linker_flags + ' -shared '

with the rest of the diff setting these options. In this case, the -openmp and -lpthread are required for linking with the threaded layer of the MKL; this could possibly be ripped out of there. Also, the -fno-alias is critical for the C compiler -- random segfaults and memory corruptions occur without it. The -DMKL_LP64 is to ensure proper linking with the lp64 (32-bit indices) part of MKL, instead of the ilp64 (64-bit indices) part; the latter isn't supported by the lapack_lite module -- things compile, but don't work. -mkl may or may not help things.

For the Fortran side, this was the compiler string:

compiler_opt_flags = '-static -ipo -xHOST -fPIC -DMKL_LP64 -mkl -wd188 -g -fno-alias -O3'

Here you don't need the -fno-alias, and -O3 seems to work.

Third, it was a bit of a pain to figure out how to get the linking/detection done correctly, as somehow order matters, and it was easy to get undefined symbols, runtime errors, etc. Very annoying.
In the end, my site.cfg file looked like this:

[DEFAULT]
library_dirs = /usr/intel/current/mkl/lib/intel64
include_dirs = /usr/intel/current/mkl/include
mkl_libs = mkl_rt, mkl_core, mkl_intel_thread, mkl_intel_lp64
blas_libs = mkl_blas95_lp64
lapack_libs = mkl_lapack95_lp64

[lapack_opt]
library_dirs = /usr/intel/current/mkl/lib/intel64
include_dirs = /usr/intel/current/mkl/include/intel64/lp64
libraries = mkl_lapack95_lp64

[blas_opt]
library_dirs = /usr/intel/current/mkl/lib/intel64
include_dirs = /usr/intel/current/mkl/include/intel64/lp64
libraries = mkl_blas95_lp64

where /usr/intel/current/ points to my Intel install location. It's critical that the mkl_libs are given in that order; I didn't find another combination that worked.

Finally, I attached my bash setup script for environment variables. I don't know how much of a role those play in things, but I had them in place when things started working, so I should put them here.

Now, on to scipy. With all these options in place, scipy compiles fine. However, there are two problems, and these don't seem to go away at any optimization level. I'm looking for suggestions; I'm guessing it's some sort of configuration error.

1) The CloughTocher2DInterpolator segfaults every time it's called to interpolate values. I couldn't manage to track it down -- it's in the cython code somewhere -- but I can give more details next time. I've disabled it for now.

2) f2py isn't getting the interfaces right. When I run the test suite, I get about 250 errors, all of the form:

ValueError: failed to create intent(cache|hide)|optional array -- must have defined dimensions but got (5,5,)

and so on, with different tuples on the end. Other than these errors, everything seemed to work great. What might I be doing wrong there?

Thanks!

-- Hoyt
[Numpy-discussion] ANN: TreeDict 0.12, a lightweight library for hierarchical parameter/data management
Hello,

I thought the numpy/scipy community might be particularly interested in this library, so here goes:

I'm pleased to announce a release of the next version of TreeDict, a dictionary-like, hierarchical python container to simplify the bookkeeping surrounding parameters, variables and data. It aims to be fast, lightweight, intuitive, feature-rich and stable. While intended for general python coding, it includes a number of features particularly useful for scientific programming. It is similar in basic functionality to MATLAB structures in terms of concise syntax and implicit branch creation, which, I've found, makes organizing parameters/data much easier than with other methods. In addition, TreeDict implements all the methods of regular dictionaries, pickling, fast non-intersecting hashing for efficient cache lookups, manipulations on the tree structure, and a system for forward-referencing branches to make lists of parameters more readable; it is BSD-licensed and comes with a full suite of unit tests.

More info can be found at http://www.stat.washington.edu/~hoytak/code/treedict/index.html. Questions, comments and feedback are welcome.

Thanks!
--Hoyt

P.S. One thought regarding scipy -- since this library is fairly similar to MATLAB structs, it may be easy to integrate it with the current MATLAB io routines to translate things between the two more easily. In this direction, I've added options to two methods, fromdict and convertTo, to ease conversion to/from the existing python representations of MATLAB objects. Discussion/comments on this point are welcome.
Re: [Numpy-discussion] Compiling numpy using icc gets missing library error
Okay, last update. I finally have got everything to work. It turns out the problems I had earlier with f2py were due to Intel's -ipo flag. So the only place this flag works is with the C++ code, not Fortran or C.

Also, I forgot to mention -- the qhull_a.h header has a workaround for some aspect of Intel's compiler that is no longer needed and in fact causes an error. It's for a macro that simply suppresses unused-variable warnings. In my opinion, it could be removed, as it's only used in two places, and scipy spits out enough warnings that that is hardly an issue. Thus my change was around line 102 in qhull_a.h. Replace

#if defined(__INTEL_COMPILER) && !defined(QHULL_OS_WIN)
template <typename T>
inline void qhullUnused(T &x) { (void)x; }
#  define QHULL_UNUSED(x) qhullUnused(x);
#else
#  define QHULL_UNUSED(x) (void)x;
#endif

with

#define QHULL_UNUSED(x)

Also, I could still not get the CloughTocher2DInterpolator to not segfault, so I had to disable it by raising an exception in its init method. With this in place, everything compiles and the unit tests pretty much all run, with 5 failures mostly due to numerical accuracy issues and 9 errors due to the interpolator.

In summary, my final environment variables that give the flags for compiling stuff are:

export FLAGS='-xHOST -static -fPIC -g -fltconsistency'
export CFLAGS="$FLAGS -O2 -fno-alias"
export CPPFLAGS="$FLAGS -fno-alias -ipo -O3"
export CXXFLAGS=$CPPFLAGS
export FFLAGS="$FLAGS -O3"
export F77FLAGS=$FFLAGS
export F90FLAGS=$FFLAGS
export LDFLAGS='-xHOST -O1 -openmp -lpthread -fPIC'

And the arguments given to the Fortran compiler in fcompiler/intel.py are:

compiler_opt_flags = '-static -xHOST -fPIC -DMKL_LP64 -mkl -g -O3'

I'd be happy to answer any more questions about the process as needed. Now, back to my real work.

-- Hoyt
Re: [Numpy-discussion] Compiling numpy using icc gets missing library error
Okay, even better news. After a little bit of probing around, it turns out that some of the errors, namely the one in the interpolator, were due to Intel's handling of floating point values at higher optimization levels. Thus the two rather important flags, if you wish to avoid difficult-to-explain segfaults, are:

-fp-model strict  (disables unsafe fp optimizations)
-fno-alias        (C & C++, not Fortran)

Now here's a summary of the other errors:

-ipo -- okay for the linker / C++, but not elsewhere. For C, it causes the long_double_representation() function in setup_common() to fail (I'll note this on the corresponding ticket), and causes f2py to fail to generate correct wrappers.

-O3 -- okay for C++ / Fortran, but not C. For C, it causes einsum to hang.

Thus the highest optimization levels with which everything compiled and ran were:

Fortran: -static -DMKL_LP64 -mkl -xHOST -O3 -fPIC -fp-model strict
C:       -static -DMKL_LP64 -mkl -xHOST -O2 -fno-alias -fPIC -fp-model strict
C++:     -static -DMKL_LP64 -mkl -xHOST -O3 -fno-alias -fPIC -fp-model strict
Linker:  -DMKL_LP64 -mkl -xHOST -O2 -fno-alias -fPIC -fp-model strict -openmp -lpthread

Enjoy,
--Hoyt

On Fri, Mar 25, 2011 at 2:30 PM, Pauli Virtanen p...@iki.fi wrote:
> On Fri, 25 Mar 2011 13:46:41 -0700, Hoyt Koepke wrote:
> [clip]
>> Also, I could still not get the CloughTocher2DInterpolator to not segfault.
>
> Backtrace would be useful here. It's probably best to recompile with -O0 and some debug flags enabled in the compiler to get something reasonable out.
>
> -- Pauli Virtanen
[Numpy-discussion] Compiling numpy using icc gets missing library error
/libpthread.so.0 (0x7f5915b1)
        libc.so.6 => /lib/libc.so.6 (0x7f591578c000)
        libdl.so.2 => /lib/libdl.so.2 (0x7f5915588000)
        libutil.so.1 => /lib/libutil.so.1 (0x7f5915385000)
        /lib64/ld-linux-x86-64.so.2 (0x7f591770c000)

I also attached the build log here (which is the same for the first error up to the crash). Any suggestions of where to look or where something could have gone wrong?

Thanks!
-- Hoyt

[attachments: build.log.gz, lines_dump.p.gz]
[Numpy-discussion] ndarray special methods should raise NotImplemented instead of TypeError
Apologies for cross-posting -- I posted this to the scipy developer list a few days ago, since it's really a development question, but after getting no response I reread my question and realized it was a bit unclear and also more appropriate for numpy. So here goes.

Currently, the numerical special methods of the ndarray class, like __add__, __gt__, __mul__, etc., raise a TypeError when they encounter a class or type they don't understand. I do not think this is the correct behavior. Instead, they should return NotImplemented.

Here's the logic behind this. Classes can also define reflected ("right") operators that correspond to each of the above methods -- e.g. __radd__, __lt__, __rmul__. However, these are only called under three conditions (see http://docs.python.org/reference/datamodel.html#object.__radd__):

1. When the left object doesn't implement the left operator.

2. When the left operator returns NotImplemented.

3. When the right object is an instance of a subclass of the left object's class; in this case the right operator is tried first.

In my use case, I've been working on implementing a matrix-like class of symbolic objects that have to interact seamlessly with numpy arrays. It's a wrapper for CPlex's Concert library, which effectively presents variables in an optimization problem as symbolic C++ objects that can be used to build the model. I'm trying to wrap this to work with numpy (I'd be happy to share my code when it's done), and the current behavior is causing a bit of a headache.

Consider working with A + X, where A is an ndarray and X is an instance of my class that represents my symbolic objects, and let's suppose my matrix class implements __radd__ sensibly. Obviously, 1. doesn't apply here. 2. doesn't help either: instead of returning NotImplemented, ndarray.__add__ raises a TypeError, which blows things apart. So the only option I have is 3. However, it isn't possible for me to use the underlying array machinery provided by ndarray, as the python class simply wraps other C++ containers. Thus, if I subclass ndarray, the underlying array is meaningless -- while I do implement slicing and various other array-like methods, the only reason I'm subclassing ndarray is (3). The big problem with this is that many of the attributes and methods of ndarray don't make sense for my class, so I have to cover them up with dummy methods that simply raise exceptions. Doable, but far from ideal.

For now, I'm doing (3) to get it to work, but it would be really nice if ndarray returned NotImplemented instead of raising a TypeError. Is there any reason / use case justifying the current behavior? Would it be possible/easy to change it?

Thanks!
-- Hoyt
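(To make the dispatch rules above concrete, here is a minimal, self-contained sketch in plain Python, independent of numpy, showing why returning NotImplemented from the left operand is what allows __radd__ to fire:)

    class Left(object):
        def __add__(self, other):
            if isinstance(other, (int, float)):
                return "Left.__add__"
            return NotImplemented   # lets Python try other.__radd__

    class Right(object):
        def __radd__(self, other):
            return "Right.__radd__"

    print(Left() + Right())         # 'Right.__radd__' -- rule 2 above

    class BadLeft(object):
        def __add__(self, other):
            raise TypeError("unsupported operand")   # what ndarray does

    try:
        BadLeft() + Right()         # __radd__ never gets a chance
    except TypeError:
        print("TypeError raised; Right.__radd__ was never tried")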
Re: [Numpy-discussion] ndarray special methods should raise NotImplemented instead of TypeError
> You can also define __array_priority__ in your class. If your class isn't a subclass, then numpy will always call your class's methods, whether your instance is on the right or the left. Sounds like you don't want to subclass ndarray here, so that should work.

Excellent! This sounds like exactly what I was hoping for. I assume that in my case I just need __array_priority__ > 10 to gain priority over a matrix (from reading http://docs.scipy.org/doc/numpy/reference/arrays.classes.html). Is that correct?

Thanks!
--Hoyt
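(A small illustrative sketch of how this plays out -- the Symbolic class and its string return values are made up for the demonstration; the behavior relied on is that ndarray defers to a non-array operand with a higher __array_priority__ that defines the reflected method:)

    import numpy as np

    class Symbolic(object):
        __array_priority__ = 100.0   # above ndarray (0.0) and matrix (10.0)

        def __add__(self, other):
            return "Symbolic.__add__"

        def __radd__(self, other):
            return "Symbolic.__radd__"

    A = np.arange(3)
    print(A + Symbolic())   # 'Symbolic.__radd__' -- numpy defers to our class
    print(Symbolic() + A)   # 'Symbolic.__add__'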
Re: [Numpy-discussion] Slicing slower than matrix multiplication?
> One thing to note is that dot uses the optimized ATLAS if available, which makes it quite a bit faster than equivalent operations done purely in numpy. I doubt that's the reason here, since the arrays are small, but that's something to keep in mind when performance matters: use dot wherever possible, it is generally faster than prod/sum.

This is quite true; I once had a very large matrix (600 x 200,000) that I needed to normalize. Using .sum() and /= took about 30 minutes. When I switched to using dot() to do the same operation (matrix multiplication with a vector of 1's, then turning that into a diagonal matrix and using dot() again to normalize), it dropped the computation time down to about 2 minutes. Most of the gain was likely due to ATLAS using all the cores while numpy uses only 1, but I was still impressed.

--Hoyt
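(A sketch of the trick being described, with the sizes shrunk so it runs quickly. The two results agree; on a multi-core ATLAS/BLAS build, the dot-based version is the one that takes the accelerated path:)

    import numpy as np

    A = np.random.rand(600, 2000)   # smaller stand-in for 600 x 200,000

    # sum/divide version
    row_sums = A.sum(axis=1)
    A1 = A / row_sums[:, np.newaxis]

    # dot-based version: a ones-vector matvec gives the row sums, and a
    # diagonal matrix of their reciprocals rescales each row via BLAS
    row_sums2 = np.dot(A, np.ones(A.shape[1]))
    A2 = np.dot(np.diag(1.0 / row_sums2), A)

    print(np.allclose(A1, A2))      # True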
Re: [Numpy-discussion] Faster way to generate a rotation matrix?
Hello,

> def rotation(theta, R = np.zeros((3,3))):
>     cx,cy,cz = np.cos(theta)
>     sx,sy,sz = np.sin(theta)
>     R.flat = (cx*cz - sx*cy*sz, cx*sz + sx*cy*cz, sx*sy,
>               -sx*cz - cx*cy*sz, -sx*sz + cx*cy*cz, cx*sy,
>               sy*sz, -sy*cz, cy)
>     return R
>
> Pretty evil looking ;) but still wouldn't mind somehow getting it faster

I would definitely encourage you to check out cython. I have to write lots of numerically intensive stuff in my python code, and I tend to cythonize it a lot. In cython, the above would be (something like):

import numpy as np
from numpy cimport ndarray

cdef extern from "math.h":
    double cos(double)
    double sin(double)

def rotation(ndarray[double] theta, ndarray[double, ndim=2] R = np.zeros((3,3))):
    cdef double cx = cos(theta[0]), cy = cos(theta[1]), cz = cos(theta[2])
    cdef double sx = sin(theta[0]), sy = sin(theta[1]), sz = sin(theta[2])

    R[0,0] = cx*cz - sx*cy*sz
    R[0,1] = cx*sz + sx*cy*cz
    R[0,2] = sx*sy
    ...
    R[2,2] = cy

    return R

And that will probably be orders of magnitude faster than what you currently have, as everything but the function call and the return statement becomes C code. Compilers these days are very good at optimizing that kind of thing, too.

--Hoyt
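(For trying out a snippet like this without writing a setup script, pyximport works well. A sketch, assuming the cython function above -- with the elided rows filled in -- has been saved as rotation_cy.pyx; the module name is hypothetical:)

    import numpy as np
    import pyximport
    pyximport.install(setup_args={"include_dirs": [np.get_include()]})

    from rotation_cy import rotation   # compiles rotation_cy.pyx on import

    R = rotation(np.array([0.1, 0.2, 0.3]))
    print(R)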
Re: [Numpy-discussion] how to get the corresponding eigenvector for a specific eigen value?
> I have tried with linalg.solve(a, b), where I put a as the matrix A - eigenvalue * unit matrix, and b as the zero vector. But the solution returned is a zero vector, which I really find disappointing.

So if you're trying to solve (A - \lambda I) x = b with b = 0, try appending an extra row of all ones to your (A - \lambda I) matrix; the output of that row will be the sum of the elements of your eigenvector. Append a 1 to the end of your b. This will exclude x = 0 as a valid solution, as you'll then be requiring that \sum_i x_i = 1. You'll probably want to renormalize properly afterwards. Quick and dirty, but it should work.

--Hoyt
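(A sketch of that recipe in numpy -- the augmented system has n+1 equations and n unknowns, so lstsq is the natural solver. One caveat: if the true eigenvector happens to sum to zero, the ones row fails, and a row of random coefficients should be used instead:)

    import numpy as np

    def eigvec_for(A, lam):
        # augmented system: (A - lam*I) x = 0 plus the extra row sum(x) = 1
        n = A.shape[0]
        M = np.vstack([A - lam * np.eye(n), np.ones((1, n))])
        b = np.zeros(n + 1)
        b[-1] = 1.0                      # excludes the trivial solution x = 0
        x = np.linalg.lstsq(M, b)[0]     # least squares on the tall system
        return x / np.linalg.norm(x)     # renormalize afterwards

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])           # eigenvalues 3 and 1
    v = eigvec_for(A, 3.0)
    print(v)                             # ~ [0.7071, 0.7071]
    print(np.allclose(np.dot(A, v), 3.0 * v))   # True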
Re: [Numpy-discussion] glibc error
Not sure what the problem is in your case, but I had the same error a while back. For some reason, ATLAS thought it should be compiled as 64-bit but numpy as 32-bit. It went away after I passed the -b 32 flag to configure in ATLAS (I think). Thus that's one thing to check. Also, make sure you're using ATLAS 3.8.x, as I had problems with the 3.9 series (don't know if that would still be an issue).

--Hoyt

On Sat, Jan 24, 2009 at 8:47 PM, Gideon Simpson simp...@math.toronto.edu wrote:
> Having built an up-to-date lapack and ATLAS against gcc 4.3.2, I tried installing numpy 1.2.1 on Python 2.5.4. When testing I get:
>
> Python 2.5.4 (r254:67916, Jan 24 2009, 00:27:20)
> [GCC 4.3.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy
> >>> numpy.test()
> Running unit tests for numpy
> NumPy version 1.2.1
> NumPy is installed in /usr/local/nonsystem/simpson/lib/python2.5/site-packages/numpy
> Python version 2.5.4 (r254:67916, Jan 24 2009, 00:27:20) [GCC 4.3.2]
> nose version 0.10.4
> ..F
> K *** glibc detected *** python: free(): invalid next size (fast): 0x1196b550 ***
>
> I then have to kill python to get control again.
>
> -gideon
Re: [Numpy-discussion] glibc memory corruption when running numpy.test()
Hi,

Sorry; my first message wasn't under 40 KB with the attachments, so here's the same message but with the log files at http://www.stat.washington.edu/~hoytak/logs.tar.bz2.

> Which ones?

Sorry; ATLAS = 3.9.4 and lapack = 3.2. I'll give 3.8.2 a shot per your advice.

> You should not do that, it won't work as you would expect. It is a good rule to assume that you should never set the *FLAGS variables unless you really know what you are doing.

Fair enough. In my case I was having some issues with 32-bit and 64-bit mismatches (I think that fftw defaulted to 32-bit), so I set the flags variables. I also wanted to get the extra few percent of performance by using the tuning flags. I'll back off a bit before playing with them now, though.

> First, can you try without any blas/lapack (do BLAS=None LAPACK=None ATLAS=None python setup.py )?

This now works in the sense that it doesn't hang. I still get a number of test failures, however (build + test logs attached).

Thanks a lot for the help!

--Hoyt
Re: [Numpy-discussion] glibc memory corruption when running numpy.test()
Hi,

> I honestly don't think those flags matter much in the case of numpy/scipy. In particular, using SSE and co. automatically is simply impossible in numpy's case, since the C code is very generic (non-aligned, non-contiguous items) and the compiler has no way to know at compile time which cases are contiguous.

Good to know. I'll try to suppress my desire to optimize and not care about them :-).

> Those errors seem linked to the flags you have been using. Some errors are really strange (4- vs 8-byte types), but I don't see how they could be explained by a mismatch of 32- vs 64-bit machine code (to the best of my knowledge, you can't mix 32- and 64-bit machine code in one binary). Maybe a compiler bug when using the -march flag. Please try building numpy w/o BLAS/LAPACK and w/o compiler flags first, to test that the bare configuration does work, and that the problems are not due to some bugs in your toolchain/OS/etc... The test suite should run without any failure in this case; then, we can work on the BLAS/LAPACK thing.
>
> cheers,
> David

I believe the logs I attached (or rather linked to) don't involve ATLAS or lapack or any compiler flags. I agree that they are strange, and I may have something weird floating around. It's getting late here, so I'll double-check everything in the morning and may try to run gcc's test suite to verify that isn't the problem.

Thanks again!

--Hoyt
[Numpy-discussion] glibc memory corruption when running numpy.test()
Hello,

Sorry about this bug report -- I know how much we programmers like these kinds of bugs :-/. In testing the latest svn version of numpy (r6083), I get a memory corruption error:

test_csingle (test_linalg.TestSolve) ... ok
test_double (test_linalg.TestSolve) ... ok
test_empty (test_linalg.TestSolve) ... ok
Check that matrix type is preserved. ... ok
Check that matrix type is preserved. ... ok
test_nonarray (test_linalg.TestSolve) ... ok
test_single (test_linalg.TestSolve) ... ok
Ticket #652 ... *** glibc detected *** python: free(): invalid next size (fast): 0x02c80ef0 ***

... and it hangs. I don't get any errors in the installation itself. The system is a 64-bit Intel Xeon server running debian unstable; the compiler is gcc-4.3.2. This *doesn't* happen with version 1.2.1. I do get some other test failures with that version, which I can post if you all think they are related, but they don't seem to be at first glance. I've attached the full log.

If there is anything more you want me to do with this, I'd be happy to.

Thanks!

--Hoyt

[attachment: numpy_test.log.gz]
Re: [Numpy-discussion] glibc memory corruption when running numpy.test()
Sorry; I also added '-fPIC' to the compile flags, though it may work without having it there. I just got errors related to not having it, and so threw it at everything...

--Hoyt

On Wed, Nov 19, 2008 at 11:41 PM, Hoyt Koepke [EMAIL PROTECTED] wrote:
> Attached. I had a bunch of issues getting things to install with lapack and ATLAS. In the end I had to specify the following environment variables (+ appropriate command line options to ATLAS & lapack) to get it to work. If there's an easier way, let me know.
>
> export FLAGS='-march=core2 -mtune=core2 -m64'
> export CFLAGS=$FLAGS
> export CPPFLAGS=$FLAGS
> export CXXFLAGS=$FLAGS
> export FFLAGS=$FLAGS
> export F77FLAGS=$FLAGS
> export LDFLAGS=$FLAGS
>
> When compiling numpy, I had to manually add '-shared' to LDFLAGS to get it to work.
>
> Thanks!
> --Hoyt
>
> On Wed, Nov 19, 2008 at 11:33 PM, David Cournapeau [EMAIL PROTECTED] wrote:
>> On Wed, 2008-11-19 at 23:23 -0800, Hoyt Koepke wrote:
>>> Hello,
>>>
>>> Sorry about this bug report -- I know how much we programmers like these kinds of bugs :-/. In testing the latest svn version of numpy (r6083), I get a memory corruption error:
>>>
>>> [... test output as above ...]
>>>
>>> ... and it hangs. I don't get any errors in the installation itself. The system is a 64-bit Intel Xeon server running debian unstable; the compiler is gcc-4.3.2.
>>
>> Could you give us the complete build log (e.g. when building from scratch: remove the build directory and give us the log of python setup.py build > build.log)? You have tens of failures which are more likely due to a build problem.
>>
>> David
Re: [Numpy-discussion] numpy, swig and TNT-Arrays
Hi Rolf,

Just curious -- have you considered using the blitz++ library (http://www.oonumerics.org/blitz/)? There seems to be a lot of overlap in terms of functionality. If you use blitz++, it's largely included in scipy as part of weave. Additionally, I already have code that generates wrappers to functions taking such arrays using weave. If blitz++ would work, I'll send them to you.

-- Hoyt

On Tue, Nov 11, 2008 at 2:28 AM, Rolf Wester [EMAIL PROTECTED] wrote:
> Charles R Harris wrote:
>> On Tue, Nov 11, 2008 at 2:19 AM, Rolf Wester [EMAIL PROTECTED] wrote:
>>> Charles R Harris wrote:
>>>> On Tue, Nov 11, 2008 at 1:24 AM, Rolf Wester [EMAIL PROTECTED] wrote:
>>>>> Hi all, I would like to wrap some C++ classes that use TNT arrays. Is it possible to pass numpy arrays to C++ functions that expect TNT arrays as function parameters? Does anybody know how the wrappers could be generated using swig? I would be very appreciative of any help. With kind regards
>>>>
>>>> IIRC, TNT does vectors and matrices, they have constructors, and they are contiguous. I think you can make wrappers, but it isn't going to be anything straightforward unless you can reuse the memory from a numpy array, and I don't recall that that sort of constructor is available. Is TNT still active? It looked pretty dead last time I looked several years ago. Chuck
>>>
>>> TNT has constructors like TNT::Array1D<double>(int n, double * data) which do not allocate a new C array but use data as their data array.
>>
>> I don't think there is any easy way to do what you want without writing some code somewhere along the line. You can expose the C++ functions and TNT to python, but to use numpy arrays you will need some way to get the data back and forth between TNT arrays and numpy arrays. I suspect you will end up just copying data into TNT arrays, calling your function, and then copying data back out of the result. Cython might be an alternative to swig for that. It would help to have a better idea of what you want to do. Do you just want to wrap an existing bunch of functions that use TNT? Chuck
>
> It's my own code, so I have the choice of how to do it. Until now I used the typemaps defined in numpy.i, so I had either to use 1-dimensional arrays even for multidimensional data or to copy the data. I wondered whether there is a more elegant way of using numpy arrays on the python side and TNT::Arrays on the C++ side without having to explicitly write extra code.
>
> Rolf
Re: [Numpy-discussion] problem installing numpy using scons
Excellent; thank you -- I will test it out soon.

-- Hoyt

On Sat, Oct 25, 2008 at 2:57 AM, David Cournapeau [EMAIL PROTECTED] wrote:
> On Sun, Oct 19, 2008 at 7:57 PM, David Cournapeau [EMAIL PROTECTED] wrote:
>> The trunk requires some changes which are not released yet. It will be in the next release of numscons (0.9.3).
>
> Which has just been released.
>
> David
[Numpy-discussion] problem installing numpy using scons
Hello,

I'm trying to install the latest numpy using setupscons.py, but it gives me an error that I can't track down. This is with the latest numpy svn revision (5946) and the latest numscons from the 0.9.3 branch. Here's the last part of the output:

is bootstrapping ? True
Executing scons command (pkg is numpy.core): /usr/local2/bin/python /usr/local2/lib/python2.5/site-packages/numscons-0.9.3dev-py2.5.egg/numscons/scons-local/scons.py -f numpy/core/SConstruct -I. scons_tool_path= src_dir=numpy/core pkg_name=numpy.core log_level=50 distutils_libdir=../../../../build/lib.linux-x86_64-2.5 cc_opt=gcc cc_opt_path=/usr/bin f77_opt=gfortran f77_opt_path=/usr/bin cxx_opt=g++ cxx_opt_path=/usr/bin include_bootstrap=../../../../numpy/core/include silent=0 bootstrapping=1
scons: Reading SConscript files ...
Checking for C header file Python.h... yes
Checking size of short ... yes
Checking size of int ... yes
Checking size of long ... yes
Checking size of float ... yes
Checking size of double ... yes
Checking size of long double ... yes
Checking size of Py_intptr_t ... yes
Checking whether PY_LONG_LONG is declared... yes
Checking size of PY_LONG_LONG ... yes
Checking whether CHAR_BIT is declared... yes
Checking whether PRIdPTR is declared... yes
Checking if math lib [] is usable for numpy ... No !
Checking for C library m... yes
Checking if math lib ['m'] is usable for numpy ... Yes !
AttributeError: SConfBase instance has no attribute 'CheckFuncsAtOnce':
  File "/usr/local2/src/numpy/numpy/core/SConstruct", line 2:
    GetInitEnvironment(ARGUMENTS).DistutilsSConscript('SConscript')
  File "/usr/local2/lib/python2.5/site-packages/numscons-0.9.3dev-py2.5.egg/numscons/core/numpyenv.py", line 108:
    build_dir = '$build_dir', src_dir = '$src_dir')
  File "/usr/local2/lib/python2.5/site-packages/numscons-0.9.3dev-py2.5.egg/numscons/scons-local/scons-local-1.0.1/SCons/Script/SConscript.py", line 533:
    return apply(_SConscript, [self.fs,] + files, subst_kw)
  File "/usr/local2/lib/python2.5/site-packages/numscons-0.9.3dev-py2.5.egg/numscons/scons-local/scons-local-1.0.1/SCons/Script/SConscript.py", line 256:
    exec _file_ in call_stack[-1].globals
  File "/usr/local2/src/numpy/build/scons/numpy/core/SConscript", line 165:
    check_funcs(optional_stdfuncs)
  File "/usr/local2/src/numpy/build/scons/numpy/core/SConscript", line 154:
    st = config.CheckFuncsAtOnce(funcs)
error: Error while executing scons command. See above for more information. If you think it is a problem in numscons, you can also try executing the scons command with the --log-level option for more detailed output of what numscons is doing, for example --log-level=0; the lower the level, the more detailed the output.

I'm trying to install it on an 8-processor, 64-bit Intel machine. I could post more details of the installation if needed, but they don't seem relevant here (except that I've had a lot of issues getting the regular install script to work with the correct compile parameters, which is why I thought I'd try scons). If there's something I'm doing wrong, let me know; otherwise I'd be happy to post a bug report.

--Hoyt
Re: [Numpy-discussion] Slicing without a priori knowledge of dimension
Try using slice (a python builtin) to create slice objects (these are what are created implicitly by :5, 1:20, etc.). slice takes the same arguments as range. A list of these (7 in your case, with slice(None) on the two free axes) can then be passed to A[...] as a tuple; a sketch is below. That's how I would do it, but maybe someone else has a better idea or can correct me.

--Hoyt

On Wed, Aug 6, 2008 at 4:02 PM, Matthew Czesarski [EMAIL PROTECTED] wrote:
> Dear list,
>
> I've got a feeling that what I'm trying to do *should* be easy, but at the moment I can't find a non-brute-force method. I'm working with quite a high-rank matrix: 7 dimensions filled with chi**2 values. Its form is something like this:
>
> chi2 = numpy.ones((3,4,5,6,7,8,9))
>
> What I need to do is slice out certain 2-dimensional grids that I then want to plot for confidence estimation; make a nice graph. This is fine -- as easy as it gets -- if the 2 dimensions are known. E.g. for the 2nd and 5th axes, one could hardcode something like this:
>
> subChi2 = chi2[ ia, ib, :, id, ie, :, ig ]
>
> The difficulty is that the user is to state which plane (s)he wants to slice out, and we can't hardcode the above. The function itself has to convert 2 axis numbers into an expression for a slice, and I can't currently figure out how to do that. There are a large number of options (well, there are 7*6=42). If one could just manually write a colon into a tuple like (2,2,:,2,2,:,2), or something like (2,2,0:len(2ndDim),2,2,0:len(5thDim),2), things would be fine. But that doesn't seem to be an option. An alternative would be to set up 2 loops and incrementally read out the individual elements of the slice to a new numpy.zeros((len(2ndDim), len(5thDim))), but again we seem to be diverging from elegance.
>
> There must be something that I'm missing. Could somebody have a pointer as to what it is?
>
> Thanks in advance,
> Matt
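(A sketch of what this looks like for the chi2 array from the question; slice(None) plays the role of the bare colon, and the plane helper is a hypothetical name for illustration:)

    import numpy as np

    chi2 = np.ones((3, 4, 5, 6, 7, 8, 9))

    def plane(arr, free_axes, fixed_values):
        # build the index tuple: slice(None) on the two free axes,
        # the next fixed value everywhere else (in axis order)
        fixed = iter(fixed_values)
        idx = tuple(slice(None) if ax in free_axes else next(fixed)
                    for ax in range(arr.ndim))
        return arr[idx]

    sub = plane(chi2, free_axes=(2, 5), fixed_values=(0, 1, 2, 3, 4))
    print(sub.shape)   # (5, 8) -- axes 2 and 5 survive as the plot grid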
Re: [Numpy-discussion] Large symmetrical matrix
> The factor of two memory hit is not really the problem; it is the O(N^2) scaling which is the limitation.

Have you thought about using kd-trees or similar structures? If you are only concerned about the minimum distances between points, that is the ideal data structure. It would reduce your running time / memory usage for generating / using the data from O(n^2) to O(n log n), with an O(log n) lookup time for the closest distance to each point. Without doing something like that, looking at the distance between each pair of points is \Omega(n^2).

--Hoyt
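(For instance, using scipy's kd-tree -- a sketch, assuming a scipy recent enough to ship scipy.spatial:)

    import numpy as np
    from scipy.spatial import cKDTree

    points = np.random.rand(100000, 3)   # the full 1e5 x 1e5 distance matrix
                                         # would take ~80 GB; the tree is tiny

    tree = cKDTree(points)               # O(n log n) construction
    d, idx = tree.query(points, k=2)     # k=2: the nearest hit is the point itself
    nearest_dist = d[:, 1]               # distance to each point's true neighbor
    nearest_neighbor = idx[:, 1]
    print(nearest_dist.mean())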
Re: [Numpy-discussion] segmentation fault
In my experience tracking down these sorts of things, if the effect is delayed and detected by glibc, it almost always means that a few bytes beyond the end of the data part of an array have been overwritten. This causes glibc's memory management to crash later on when the object is deallocated (or something like that). Of course, I should say I was doing the overwriting in my own C code...

--Hoyt

On Wed, May 28, 2008 at 8:27 AM, Pauli Virtanen [EMAIL PROTECTED] wrote:
> ke, 2008-05-28 kello 10:59 -0400, Scott Ransom kirjoitti:
>> Hmmm. Interesting. I'm on a 64-bit Debian Unstable system with numpy 1.0.4 and python 2.5.2 and I don't get this:
>>
>> In [1]: import numpy as np
>> In [2]: np.__version__
>> Out[2]: '1.0.4'
>> In [3]: def fn():
>>    ...:     x = np.random.rand(5,2)
>>    ...:     x.cumsum(None, out=x)
>>    ...:     return x
>>    ...:
>> In [4]: fn()
>> Out[4]:
>> array([[ 0.40329303, 0.45335328],
>>        [ 0.85664631, 0.84798294],
>>        [ 1.71329262, 0.05877989],
>>        [ 2.56127556, 0.99401291],
>>        [ 4.27456818, 0.79275409]])
>>
>> Wonder if the 64-bit thing could be the difference?
>
> Try running fn() again to make the bug bomb out:
>
> $ python
> Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42)
> [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np
> >>> def fn():
> ...     x = np.random.rand(5,2)
> ...     x.cumsum(None, out=x)
> ...     return x
> ...
> >>> np.__version__
> '1.0.5.dev5024'
> >>> fn()
> array([[ 0.59654253, 0.12577169],
>        [ 0.72231422, 0.30600244],
>        [ 1.44462843, 0.35849553],
>        [ 1.75063088, 0.56925858],
>        [ 3.19525931, 0.77487798]])
> >>> fn()
> *** glibc detected *** python: munmap_chunk(): invalid pointer: 0x08439d28 ***
>
> -- Pauli Virtanen
Re: [Numpy-discussion] Quick Question about Optimization
> The time has now been shaved down to ~9 seconds with this suggestion, from the original 13-14 s, with the inclusion of Eric Firing's suggestions. This is without scipy.weave, which at the moment I can't get to work for all lines; when I just replace one of them successfully, it seems to run more slowly, I assume because it is converting data back and forth.

There's a nontrivial constant-time cost to using scipy.weave.blitz, but it isn't copying the data. Thus it will slow you down on smaller arrays, but you'll notice quite a speed-up on much larger ones. I should have mentioned that earlier -- I assumed your arrays were really large.

> Are there any major pitfalls to be aware of? It sounds like if I do f = a[n,:] I get a reference, but if I did something like g = a[n,:]*2 I would get a copy.

Well, if you do f = a[n, :], you get a view: another object that shares the data in memory with a but is a separate object. Something like g = a[n,:]*2 creates a new array.

> If anyone has any clues on why scipy.weave is blowing up (http://pastebin.com/m79699c04) using the code I attached, I wouldn't mind knowing. Most of the times I've attempted using weave, I've been a little baffled about why things aren't working. I also don't have a sense for whether all numpy functions should be weave-able, or if it's just general array operations that can be put through weave.

Sorry I didn't get back to you earlier on this -- I was a bit busy yesterday. It looks like weave.blitz isn't working on your second line because you're not explicitly putting slices in some of the dimensions. In numpy, v[0:2] works for 1, 2, 3, or 4 dimensions, but in blitz you have to use v[0:2,:] for a 2d array, v[0:2,:,:] for 3d, and so on; it's a bit more picky. I think that's the problem with your second line -- try replacing v[:] with v[0,:] and theta[1-curidx] with theta[1-curidx, :]. (I may have missed some others.) Also, weave.blitz is currently limited to just array operations; it doesn't really support the numpy functions.

Hope that helps a little,
-- Hoyt
Re: [Numpy-discussion] Quick Question about Optimization
> for n in range(0,time_milliseconds):
>     self.u = self.expfac_m * self.prev_u + (1-self.expfac_m) * self.aff_input[n,:]
>     self.v = self.u + self.sigma * np.random.standard_normal(size=(1,self.naff))
>     self.theta = self.expfac_theta * self.prev_theta - (1-self.expfac_theta)
>     idx_spk = np.where(self.v >= self.theta)
>     self.S[n,idx_spk] = 1
>     self.theta[idx_spk] = self.theta[idx_spk] + self.b
>     self.prev_u = self.u
>     self.prev_theta = self.theta

Copying elements into array objects that already exist will always be faster than creating a new object with separate data. However, in this case, you don't need to do any copying or creation if you use a flip-flopping index to keep track of the previous values. If I drop the selfs, you can translate the above into (untested):

curidx = 0   # prev is 1-curidx

u = empty( (2, naff) )
v = empty( naff )
theta = empty( (2, naff) )

stdnormrvs = np.random.standard_normal(size=(time_milliseconds, naff))

for n in xrange(time_milliseconds):
    u[curidx, :] = expfac_m * u[1-curidx, :] + (1-expfac_m) * aff_input[n,:]
    v[:] = u[curidx, :] + sigma * stdnormrvs[n, :]
    theta[curidx, :] = expfac_theta * theta[1-curidx, :] - (1-expfac_theta)

    idx_spk = np.where(v >= theta[curidx, :])
    S[n, idx_spk] = 1
    theta[curidx, idx_spk] += b

    # flip to handle the previous stuff
    curidx = 1 - curidx

This should give you a substantial speedup. Also, I have to say that this is begging for weave.blitz, which compiles such statements using templated C++ code to avoid temporaries. It doesn't work on all systems, but if it does in your case, here's what your code might look like:

import scipy.weave as wv

curidx = 0

u = empty( (2, naff) )
v = empty( (1, naff) )
theta = empty( (2, naff) )

stdnormrvs = np.random.standard_normal(size=(time_milliseconds, naff))

for n in xrange(time_milliseconds):
    wv.blitz("u[curidx, :] = expfac_m * u[1-curidx, :] + (1-expfac_m) * aff_input[n,:]")
    wv.blitz("v[:] = u[curidx, :] + sigma * stdnormrvs[n, :]")
    wv.blitz("theta[curidx, :] = expfac_theta * theta[1-curidx, :] - (1-expfac_theta)")

    idx_spk = np.where(v >= theta[curidx, :])
    S[n, idx_spk] = 1
    theta[curidx, idx_spk] += b

    # flip to handle the previous stuff
    curidx = 1 - curidx

-- Hoyt

On Mon, May 19, 2008 at 11:08 AM, James Snyder [EMAIL PROTECTED] wrote:
> Hi -
>
> First off, I know that optimization is evil, and I should make sure that everything works as expected prior to bothering with squeezing out extra performance, but the situation is that this particular block of code works, but it is about half as fast with numpy as in matlab, and I'm wondering if there's a better approach than what I'm doing.
>
> I have a chunk of code, below, that generally iterates over 2000 iterations, and the vectors that are being worked on at a given step generally have ~14000 elements in them. In matlab, doing pretty much exactly the same thing takes about 6-7 seconds; it's always around 13-14 with numpy on the same machine. I've gotten this on both Linux and Mac OS X.
>
> self.aff_input has the bulk of the data in it (a 2000x14000 array), and the various steps are for computing the state of some afferent neurons (if there's any interest, here's a paper that includes the model: Brandman, R. and Nelson ME (2002) A simple model of long-term spike train regularization. Neural Computation 14, 1575-1597.)
>
> I've imported numpy as np. Is there anything in practice here that could be done to speed this up?
> I'm looking more for general numpy usage tips that I can use while writing further code, and not so much things that would be obscure or difficult to maintain in the future.
>
> Also, the results of this are a binary array. I'm wondering if there's anything more compact for expressing this than using 8 bits to represent each single bit. I've poked around, but I haven't come up with any clean and unhackish ideas :-)
>
> Thanks!
>
> I can provide the rest of the code if needed, but it's basically just filling some vectors with random and empty data and initializing a few things.
>
> for n in range(0,time_milliseconds):
>     self.u = self.expfac_m * self.prev_u + (1-self.expfac_m) * self.aff_input[n,:]
>     self.v = self.u + self.sigma * np.random.standard_normal(size=(1,self.naff))
>     self.theta = self.expfac_theta * self.prev_theta - (1-self.expfac_theta)
>     idx_spk = np.where(self.v >= self.theta)
>     self.S[n,idx_spk] = 1
>     self.theta[idx_spk] = self.theta[idx_spk] + self.b
>     self.prev_u = self.u
>     self.prev_theta = self.theta
>
> --
> James Snyder
> Biomedical Engineering
> Northwestern University
Re: [Numpy-discussion] Quick Question about Optimization
On Mon, May 19, 2008 at 12:53 PM, Robin [EMAIL PROTECTED] wrote:
> Hi,
>
> I think my understanding is somehow incomplete... It's not clear to me why (simplified case)
>
> a[curidx,:] = scalar * a[1-curidx,:]
>
> should be faster than
>
> a = scalar * b
>
> In both cases I thought the scalar multiplication results in a new array (new memory allocated), and then the difference between copying that result into the existing array a[curidx,:] or reassigning the reference a to that result should be very similar? If anything, I would have thought the direct assignment would be quicker, since then there is no copying. What am I missing?

Actually, I think you are correct. My bad. I was mainly thinking in terms of weave.blitz, where it would make a difference, then translating back...

--Hoyt
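(For completeness, the way to genuinely avoid the temporary in plain numpy is the ufunc out= argument -- a sketch; this is the effect that weave.blitz-style tools buy you automatically for whole expressions:)

    import numpy as np

    b = np.random.rand(1000, 1000)
    a = np.empty_like(b)

    a1 = 3.0 * b                 # allocates a new result array
    a[:] = 3.0 * b               # still allocates a temporary, then copies it
    np.multiply(b, 3.0, out=a)   # no temporary: writes directly into a

    print(np.allclose(a, a1))    # True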
Re: [Numpy-discussion] Faster
and then update the others only if the row you're updating has that minimum value in it. Then, when scanning for the min dist, you only need to scan O(n) rows. Sorry, let me clarify -- update the entries corresponding to entries in the row you're updating only if they equal the minimum distance; in that case you need to rescan the row. Also, sorry, you only need to scan O(n) entries in your cached array.
Re: [Numpy-discussion] Faster
You know, for linkage clustering and BHC, I've found it a lot easier to work with an intermediate 1d map of indices and never resize the distance matrix. I then just remove one element from this map at each iteration, which is a LOT faster than removing a column and a row from a matrix. If idxmap were the map, you would work with X[idxmap[i], idxmap[j]] instead of X[i, j]. Also, you could even use a python list in this case, as they are a lot better for deletions than an array. (A small sketch follows after the quoted thread below.) --Hoyt On Fri, May 2, 2008 at 5:47 PM, Keith Goodman [EMAIL PROTECTED] wrote: On Fri, May 2, 2008 at 5:38 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Fri, May 2, 2008 at 6:24 PM, Keith Goodman [EMAIL PROTECTED] wrote: How can I make this function faster? It removes the i-th row and column from an array. Why do you want to do that? Single linkage clustering; x is the distance matrix. -- +++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ [EMAIL PROTECTED] +++
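A minimal sketch of the index-map idea (hypothetical names; X is the full distance matrix and is never resized):

import numpy as np

n = 6
X = np.random.rand(n, n)
X = X + X.T                  # symmetric toy distances
idxmap = list(range(n))      # indices of the clusters still active

# distance between active clusters i and j
i, j = 1, 3
d = X[idxmap[i], idxmap[j]]

# 'removing' cluster j is just a list deletion -- no matrix copying
del idxmap[j]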
Re: [Numpy-discussion] Faster
You could also try complete linkage, where you merge two clusters based on the farthest distance between points in the two clusters instead of the smallest. This will tend to produce clusters of equal size (which isn't always ideal, either). However, it also uses sufficient statistics, so it will be trivial to change your code to use that merge criterion if you want to try it. --Hoyt On Sat, May 3, 2008 at 5:31 PM, Keith Goodman [EMAIL PROTECTED] wrote: On Sat, May 3, 2008 at 5:05 PM, Christopher Barker [EMAIL PROTECTED] wrote: Robert Kern wrote: I can get a ~20% improvement with the following:

In [9]: def mycut(x, i):
   ...:     A = x[:i,:i]
   ...:     B = x[:i,i+1:]
   ...:     C = x[i+1:,:i]
   ...:     D = x[i+1:,i+1:]
   ...:     return hstack([vstack([A,C]), vstack([B,D])])

Might it be a touch faster to build the final array first, then fill it:

def mycut(x, i):
    r, c = x.shape
    out = np.empty((r-1, c-1), dtype=x.dtype)
    out[:i,:i] = x[:i,:i]
    out[:i,i:] = x[:i,i+1:]
    out[i:,:i] = x[i+1:,:i]
    out[i:,i:] = x[i+1:,i+1:]
    return out

Totally untested. That should save the creation of two temporaries.

Initializing the array makes sense. And it is super fast:

    timeit mycut(x, 6)
    100 loops, best of 3: 7.48 ms per loop
    timeit mycut2(x, 6)
    1000 loops, best of 3: 1.5 ms per loop

The time it takes to cluster went from about 1.9 seconds to 0.7 seconds! Thank you. When I run the single linkage clustering on my data I get one big cluster and a bunch of tiny clusters, so I need to try a different linkage method. Average linkage sounds good, but it sounds hard to code. -- +++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ [EMAIL PROTECTED] +++
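Average linkage is less painful than it sounds if you track cluster sizes and use the standard Lance-Williams update: after merging clusters i and j, the distance from any other cluster k to the merged cluster is (n_i*d(k,i) + n_j*d(k,j)) / (n_i + n_j). A minimal sketch (hypothetical names; D is the current distance matrix, sizes holds cluster sizes):

import numpy as np

def merge_average(D, sizes, i, j):
    # distances from every cluster to the merged cluster, stored in slot i
    ni, nj = sizes[i], sizes[j]
    d_new = (ni * D[i, :] + nj * D[j, :]) / (ni + nj)
    D[i, :] = d_new
    D[:, i] = d_new
    D[i, i] = 0.0
    sizes[i] = ni + nj
    # slot j is now dead; drop it, e.g. via an index map as sketched earlier
    return D, sizes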
Re: [Numpy-discussion] types as functions convert 1 elm arrays to scalars
To be honest, this doesn't seem justifiable. Where it got me is interfacing with C code that expected a 1d array, and I was calling it with arrays of varying length. I was using this to ensure the proper typing; however, when the array was length 1, the program crashed... Should I file a bug report? --Hoyt On Mon, Apr 28, 2008 at 11:51 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Tue, Apr 29, 2008 at 12:28 AM, Hoyt Koepke [EMAIL PROTECTED] wrote: Hello, I have a quick question that I'm hoping will improve my numpy understanding. I noticed some behavior when using float64 to convert a matrix type that I didn't expect:

In [35]: b1 = array([1.0])
In [36]: float64(b1)
Out[36]: 1.0
In [37]: b2 = array([1.0, 2.0])
In [38]: float64(b2)
Out[38]: array([ 1.,  2.])

I didn't expect calling float64 to convert b1 to a scalar. It seems like an inconsistency. I assume this is intentional, as someone would have noticed it a long time ago if not, so could someone explain the reasoning behind it? (or point me to a source that will help?)

It's inconsistent and looks like a bug:

In [4]: float32(array([[[1]]]))
Out[4]: array([[[ 1.]]], dtype=float32)
In [5]: float64(array([[[1]]]))
Out[5]: 1.0

Float64 is a bit special because it starts as the python float. Maybe Travis can say what the differences are. Chuck -- +++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ [EMAIL PROTECTED] +++
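A side note on the use case described (ensuring dtype before passing to C without collapsing length-1 arrays to scalars): np.asarray sidesteps the inconsistency, since it always returns an array of the original shape. A small sketch:

import numpy as np

b1 = np.array([1.0])
a = np.asarray(b1, dtype=np.float64)   # stays shape (1,), never becomes a scalar
assert a.shape == (1,)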
Re: [Numpy-discussion] Distribution Functions Change Behavior
I may not understand what you are asking, Rich, but I'm not sure I agree with Alan. A Gaussian fit to data x should fit exactly as well as a Gaussian fit to a*x, a > 0, just with a variance a^2 times the original. The only way this would not be true is if: 1. you are not fitting the variance, but only the mean, or 2. there's some numerical issue (like some of the data being represented as integers, etc.). Don't know if it could be one of those issues... --Hoyt On Mon, Apr 28, 2008 at 7:18 AM, Alan G Isaac [EMAIL PROTECTED] wrote: Hi Rich, If your data is truncated at zero, it is not Gaussian (drawn from a normal). You will notice this when you shrink the range of values (unless the variance is tiny). Cheers, Alan Isaac -- +++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ [EMAIL PROTECTED] +++
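A quick numerical check of the scaling claim (a sketch; the parameter values are arbitrary):

import numpy as np

x = np.random.normal(5.0, 2.0, 100000)
a = 3.0
# the fitted mean scales by a, and the fitted std by a (so variance by a**2)
print(np.mean(a * x) / np.mean(x))   # ~ 3.0
print(np.std(a * x) / np.std(x))     # ~ 3.0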
Re: [Numpy-discussion] Distribution Functions Change Behavior
Wait, I think I see what Alan is saying. When you use a Gaussian approximation on truncated data, the accuracy of the approximation depends strongly on where in the interval the mean is. If it's near the edges, the results will be worse. The width of the interval, though, is a separate factor. --Hoyt On Mon, Apr 28, 2008 at 8:29 AM, Hoyt Koepke [EMAIL PROTECTED] wrote: I may not understand what you are asking, Rich, but I'm not sure I agree with Alan. A Gaussian fit to data x should fit exactly as well as a Gaussian fit to a*x, a > 0, just with a variance a^2 times the original. The only way this would not be true is if: 1. you are not fitting the variance, but only the mean, or 2. there's some numerical issue (like some of the data being represented as integers, etc.). Don't know if it could be one of those issues... --Hoyt On Mon, Apr 28, 2008 at 7:18 AM, Alan G Isaac [EMAIL PROTECTED] wrote: Hi Rich, If your data is truncated at zero, it is not Gaussian (drawn from a normal). You will notice this when you shrink the range of values (unless the variance is tiny). Cheers, Alan Isaac -- +++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ [EMAIL PROTECTED] +++
Re: [Numpy-discussion] Generating Bell Curves (was: Using normal() )
Another suggestion from machine learning stuff to throw into the mix: a soft step function that we use often is y = e^(ax) / (1 + e^(ax)). It has the nice property that the result y is always in (0,1). If you invert this, you get x = -(1/a)*log(1/y - 1), i.e. x = (1/a)*log(y/(1-y)); this maps (0,1) to the whole real line, and the a parameter controls how sharp that mapping is. Now use that as the input to a Gaussian, and you can get a soft truncation at (0,1). Additionally, you can do this twice to have more control over the shape of the resulting distribution. I suspect the resulting kurtosis/skew factors would be calculable. This has the advantage of giving you a calculable pdf (just normalize the resulting distribution using the inverse determinant-of-Jacobian factor) without too much hassle. Furthermore, it should be easy to fit the parameters to data without too much difficulty (though I haven't tried). Just a thought, though I've never worked with all this much, so I can't say for sure how well it would work. --Hoyt On Fri, Apr 25, 2008 at 4:39 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Fri, Apr 25, 2008 at 1:25 PM, Rich Shepard [EMAIL PROTECTED] wrote: On Fri, 25 Apr 2008, Charles R Harris wrote: You can use something like f(x) = (1-x**2)**2, which has inflection points and vanishes at +/- 1. Any of the B-splines will also do the trick. Chuck, thank you. I need to make some time to understand the B-splines to use them appropriately. Unfortunately, my mathematical statistics learning was many years in the past ... but we had moved ahead of writing on clay tablets by that time. Not needing to retain that knowledge for many years means it was replaced by more pressing current knowledge. The B-splines do look promising, though. Here's a B-spline approximation to a Gaussian: http://www.doc.ic.ac.uk/~dfg/AndysSplineTutorial/BSplines.html Chuck -- +++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ [EMAIL PROTECTED] +++
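The construction described above is essentially a logit-normal distribution, and sampling from it is one line: draw a Gaussian and push it through the soft step. A minimal sketch (the parameter values are arbitrary):

import numpy as np

a, mu, sigma, n = 2.0, 0.0, 1.0, 10000
z = np.random.normal(mu, sigma, n)    # Gaussian on the whole real line
y = 1.0 / (1.0 + np.exp(-a * z))      # squashed softly into (0, 1)

assert (y > 0).all() and (y < 1).all()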
Re: [Numpy-discussion] access ndarray in C++
Just to add something from personal experience in case it's useful... I write a lot of code that works on numpy arrays that goes between python and c++ (too used to OO to stick with pure c). I've been using scipy.weave to interface to blitz++ arrays in my c++ code. The blitz++ library has been wonderful in my experience -- it supports arrays up to 11 dimensions, slicing, etc. -- essentially a very useful subset of numpy's operations. It's also nice because it can interface without copying any data; it just provides a wrapper class that sits on a numpy array. The library is included as part of scipy. I recently wrote a function that automatically generates wrappers for a c++ function that accepts blitz++ arrays which I recently posted to the scipy list. You just specify function names and the list of argument types (nparrays included) and it generates the wrapper code. Quite easy. It's pertinent to the discussion here, so I'll send it again. Included is a full working example with setup.py. let me know if it helps/is confusing/doesn't work/etc. --Hoyt On Wed, Apr 23, 2008 at 12:30 PM, Andreas Klöckner [EMAIL PROTECTED] wrote: On Mittwoch 23 April 2008, Christopher Barker wrote: NOTE: Most folks now think that the pain of writing extensions completely by hand is not worth it -- it's just too easy to make reference counting mistakes, etc. Most folks are now using one of: Cython (or Pyrex) SWIG ctypes IMO, all of these deal better with C than they do with C++. There is also a number of more C++-affine solutions: - Boost Python [1]. Especially if you want usable C++ integration. (ie. more than basic templates, etc.) - sip [2]. Used for PyQt. Andreas [1] http://www.boost.org/doc/libs/1_35_0/libs/python/doc/index.html [2] http://www.riverbankcomputing.co.uk/sip/index.php -- +++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ [EMAIL PROTECTED] +++ [attachment: weave_wrapper.tar.gz]
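For reference, the basic pattern for handing a numpy array to C++ through scipy.weave's blitz converters looks roughly like this (a sketch of the general technique, not the attached wrapper generator; weave is part of the scipy of that era):

import numpy as np
from scipy import weave
from scipy.weave import converters

x = np.arange(12.0).reshape(3, 4)
nr, nc = x.shape

# inside the C++ snippet, 'x' is a blitz::Array<double,2> view -- no data copied
code = """
double total = 0.0;
for (int i = 0; i < nr; ++i)
    for (int j = 0; j < nc; ++j)
        total += x(i, j);
return_val = total;
"""
s = weave.inline(code, ['x', 'nr', 'nc'], type_converters=converters.blitz)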
Re: [Numpy-discussion] #734: interactive docstring search (lookfor)
My personal 2 cents is that we should put it in the most easy-to-find place. I wish I had the proposed lookfor function at my fingertips when I started to learn numpy/scipy a year ago, coming from matlab. Like many others (I assume), I just jumped in and learned what numpy could do by trial and error, and I never noticed numpy.lib.utils.info. Thus my personal vote (though I understand this may not be the best and there are arguments against it) is to put a lookfor or info function in the top-level namespace, with an obvious name, so newbies could easily run across it and have a more enjoyable numpy learning experience. That's just my humble opinion ... --Hoyt On Thu, Apr 10, 2008 at 2:42 PM, Pauli Virtanen [EMAIL PROTECTED] wrote: Travis E. Oliphant kirjoitti: Pauli Virtanen wrote: [clip] I think this is a good idea: full-namespace docstring search à la Matlab lookfor. I wrote a quick implementation for numpy here: http://scipy.org/scipy/numpy/ticket/734 Cool. I started scipy.misc.info a long time ago to try and do this. I didn't advertise it well enough ;-) scipy.misc.info('eigvals') I'm happy to move this functionality into numpy (but it might be better in IPython). Aha! I was wondering if something this simple was already present, but I didn't notice that numpy.lib.utils.info also accepted strings as parameters. Btw, the info function is already in numpy, and the code is duplicated in scipy.misc. However, info is a bit different from lookfor in that it finds documentation by the exact function name, not by looking into the docstrings. So, should info and lookfor be combined, or kept separate? -- Pauli Virtanen -- +++ Hoyt Koepke UBC Department of Computer Science http://www.cs.ubc.ca/~hoytak/ [EMAIL PROTECTED] +++
Re: [Numpy-discussion] #734: interactive docstring search (lookfor)
In [1]: import numpy
In [2]: numpy.info('eigvals')
*** Found in numpy.linalg ***
eigvals(a)

Fair enough. Don't know why I missed that; probably relied too much on Google search :-) Having it as part of IPython does make sense. --Hoyt
Re: [Numpy-discussion] Quikest way to create a symetric (diagonal???) matrix ?
If the rest of the matrix is already zeros and memory isn't a problem, you could just use

    A_sym = A + A.T - diag(diag(A))

If memory were an issue, I'd suggest weave.inline (if that's a viable option) or pyrex to do the loop, which would be about as fast as you could get. --Hoyt On Wed, Mar 26, 2008 at 7:22 AM, Alexandre Fayolle [EMAIL PROTECTED] wrote: On Wed, Mar 26, 2008 at 09:48:02AM -0400, Pierre GM wrote: All, what's the quickest way to create a diagonal matrix? I already have the elements above the main diagonal. Of course, I could use loops:

    m = 5
    z = numpy.arange(m*m).reshape(m,m)
    for k in range(m):
        for j in range(k+1, m):
            z[j,k] = z[k,j]

But I was looking for something more efficient. From your code, you certainly meant symmetric and not diagonal. Maybe you can speed things up a bit by assigning slices:

    for k in range(m):
        z[k:, k] = z[k, k:]

-- Alexandre Fayolle LOGILAB, Paris (France) Formations Python, Zope, Plone, Debian: http://www.logilab.fr/formations Développement logiciel sur mesure: http://www.logilab.fr/services Informatique scientifique: http://www.logilab.fr/science
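Another pure-numpy option worth noting (a sketch; it assumes, as above, that the strict lower triangle is zero): mirror the strict upper triangle with np.triu, which avoids the diag(diag(A)) correction:

import numpy as np

A = np.triu(np.arange(25.0).reshape(5, 5))   # upper-triangular example data
A_sym = np.triu(A) + np.triu(A, 1).T         # reflect everything above the diagonal

assert (A_sym == A_sym.T).all()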
Re: [Numpy-discussion] dunno what array operation I'm looking for...
Try

    result = A[1:] - A[:-1]

--Hoyt On Fri, Mar 21, 2008 at 7:43 PM, Chris Withers [EMAIL PROTECTED] wrote: Hi All, Say I have an array like:

    measurements = array([100, 109, 115, 117])

What do I do to it to get:

    array([9, 6, 2])

Is the following really the best way?

    >>> result = []
    >>> for i in range(1, len(measurements)):
    ...     result.append(measurements[i] - measurements[i-1])
    ...
    >>> array(result)
    array([9, 6, 2])

cheers, Chris -- Simplistix - Content Management, Zope Python Consulting - http://www.simplistix.co.uk
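Worth noting, assuming a reasonably recent numpy: np.diff does exactly this, and generalizes to higher-order differences and other axes:

import numpy as np

measurements = np.array([100, 109, 115, 117])
print(np.diff(measurements))   # -> [9 6 2]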
[Numpy-discussion] numpy.random.RandomState threadsafe?
This should be a really quick question. Is a RandomState object thread safe? I'm wanting to use a common RandomState object in a multithreaded program, and I need to know if it's necessary to protect it with a lock (which wouldn't be difficult). Thanks! --Hoyt
Re: [Numpy-discussion] numpy.random.RandomState threadsafe?
Okay, thanks! I won't be using the multivariate_normal function in my code, so this should work fine. --Hoyt On Tue, Mar 11, 2008 at 3:00 PM, Robert Kern [EMAIL PROTECTED] wrote: On Tue, Mar 11, 2008 at 4:18 PM, Hoyt Koepke [EMAIL PROTECTED] wrote: This should be a really quick question. Is a RandomState object thread safe? I'm wanting to use a common RandomState object in a multithreaded program, and I need to know if it's necessary to protect it with a lock (which wouldn't be difficult). For nearly all of the methods, yes, they should be. RandomState is implemented in C (using Pyrex), and the GIL is acquired before calling any C functions. The caveat here is that the method multivariate_normal() calls back out to Python-implemented functions. It is possible that the GIL gets released during that call and that another thread can pick up execution then. However, even this should not be a problem as far as safety goes; no internal state is read or changed after the external call. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
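If one did want the lock anyway (for instance, to keep draws from different threads from interleaving mid-call), a minimal sketch would be:

import threading
import numpy as np

rs = np.random.RandomState(0)
rs_lock = threading.Lock()

def draw(n):
    # serialize access to the shared RandomState
    with rs_lock:
        return rs.standard_normal(n)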
Re: [Numpy-discussion] Optimize speed of for loop using numpy
I would definitely suggest using scipy's weave.inline for this. It seems like this particular function can be translated into C code really easily, which would give you a HUGE speed up. Look at some of the examples in scipy/weave/examples to see how to do this. The numpy book also has a section on it. One of the reasons I left matlab and never looked back is how easy it is to interweave bits of compiled C code for loops like this. --Hoyt On Mon, Feb 25, 2008 at 6:32 PM, Trond Kristiansen [EMAIL PROTECTED] wrote: Hi again. I have attached the function that the FOR loop is part of as a python file. What I am trying to do is to create a set of functions that will read the output files (NetCDF) from running the ROMS model (an ocean model). The output file is organized in xi (x-direction), eta (y-direction), and s (z-direction), where the s-values are vertical layers and not depths. This particular function (z_slice) will find the closest upper and lower s-layers for a given depth in meters (e.g. -100). Then values from the two selected layers will be interpolated to create a new layer at the selected depth (-100). The problem is that the s-layers follow the bathymetry, and a particular s-layer will therefore sometimes be above and sometimes below the selected depth that we want to interpolate to. That's why I need a quick script that searches all of the layers and finds the upper and lower layers for a given depth value (which is negative). The z_r is a 3D array (s, eta, xi) that is created using the netcdf module. The main goal of this set of functions is to move away from using matlab, but also to speed things up. The sliced data array will be plotted using GMT or pyNGL. Thanks for helping me. Cheers, Trond On 2/25/08 9:15 PM, Robert Kern [EMAIL PROTECTED] wrote: On Mon, Feb 25, 2008 at 8:08 PM, Trond Kristiansen [EMAIL PROTECTED] wrote: Hi all. This is my first email to the discussion group. I have spent two days trying to get a particular loop to speed up, and the best result I got was this: Can you try to repost this in such a way that the indentation is preserved? Feel free to attach it as a text file. Also, can you describe at a higher level what it is you are trying to accomplish and what the arrays mean?
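Since z_r is monotonic in s at each horizontal point, the bracketing-layer search itself can also be vectorized in pure numpy before reaching for weave. A hedged sketch (the shapes and toy data are made up; it assumes depth is negative and z_r increases along axis 0 from the bottom layer toward the surface):

import numpy as np

s, eta, xi = 30, 60, 80
z_r = np.sort(-500.0 * np.random.rand(s, eta, xi), axis=0)   # toy layer depths
depth = -100.0

# number of s-layers lying strictly below the target depth at each (eta, xi) point
kk = (z_r < depth).sum(axis=0)
kk = np.clip(kk, 1, s - 1)          # guard points where depth falls outside the column

rows = np.arange(eta)[:, None]
cols = np.arange(xi)[None, :]
z_below = z_r[kk - 1, rows, cols]   # closest layer below the target depth
z_above = z_r[kk, rows, cols]       # closest layer at or above it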