[Numpy-discussion] ANN: PyCPX 0.01, a numpy/cython wrapper for CPlex.

2011-09-18 Thread Hoyt Koepke
Hello,

I'm pleased to announce the first release of a wrapper I wrote for
IBM's CPlex Optimizer Suite. It focuses on ease of use and seamless
integration with numpy, and allows one to naturally specify
constrained linear and quadratic optimization problems over real,
boolean, and integer variables.  The wrapper is written in cython/C++
for speed and released under the LGPL license.

For documentation and examples, please see the website:
http://www.stat.washington.edu/~hoytak/code/pycpx/index.html.

Please send me any comments, questions and suggestions!  I am quite
open to feedback.

Thanks,
--Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Linear assignment problem: Jonker-Volgenant algorithm

2011-05-17 Thread Hoyt Koepke
 Well, the hungarian algorithm has a theoretical upper bound of O(n^3),
 with n being the number of nodes, which is pretty much the best you
 can do if you have a dense graph and make no assumptions on
 capacities.

 OK, your input is making my motivation to fight with Jonker-Volgenant go
 down. I'll see with the other people involved if we still target
 Jonker-Volgenant, or if we stick with the Hungarian algorithm, in which
 case the problem is solved.

I'd go that route. From my (somewhat anecdotal) observations, a well-coded
version of the Hungarian algorithm will beat moderately well-coded
implementations of other algorithms on most common types of problems;
the data structures in the other algorithms are harder to get right.
Also, those algorithms often tend to show their strengths only on
sufficiently large problems.  I'd guess that for a speedup, your time
would be better spent profiling and maybe cythonizing your current
implementation of the Hungarian algorithm (also note that there are a
few BSD/MIT-licensed C++ implementations floating around; check the
wikipedia page).


 Hope that answers your questions :-).

 It does. Thanks a lot, it is very useful to have expert knowledge
 available.

Well, I don't know if I'm an expert in this area, but I rub
shoulders with a few and I'm glad to be of help.

-- Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Linear assignment problem: Jonker-Volgenant algorithm

2011-05-16 Thread Hoyt Koepke
On Mon, May 16, 2011 at 12:18 AM, Gael Varoquaux
gael.varoqu...@normalesup.org wrote:
 Following a suggestion by Joseph, I am trying to implement the
 Jonker-Volgenant algorithm for best linear assignment in Python, using
 numpy. Unsurprisingly, it is proving time-costly. I cannot afford to spend
 too much time on this, as it is not for a problem of mine, but for
 scikits.learn. Thus I was wondering if someone had a BSD-licensed Python
 version of the algorithm that he would be willing to share.

There are a number available.  The lemon graph library
(http://lemon.cs.elte.hu/trac/lemon) has solvers for this problem and
has python bindings.  It's under the boost license (is that ok?).   It
might be a bit heavyweight for this, but it's great software.

-- Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Linear assignment problem: Jonker-Volgenant algorithm

2011-05-16 Thread Hoyt Koepke
In this case, lemon exists only as C++ header files, with no compiled
code.  A few cython functions should make it pretty simple.  However,
yeah, it is a bit heavyweight.

 Thanks for the suggestion. Unfortunately, I do not want to add compiled
 code (or an external dependency) to the scikit for this solver, as it is
 a fairly minor step for us. In particular, chances are that it will never
 be a computational bottleneck on our problems.

If that's the case, I suggest just using the Hungarian algorithm.
It's a little slower on large graphs, but it still is pretty quick.
And there are tons of implementations out there, as it's often a
standard coding project in undergrad algorithms courses.
http://en.wikipedia.org/wiki/Hungarian_algorithm has links to a few.
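
For what it's worth, here's a minimal sketch of solving a tiny assignment
problem with SciPy's linear_sum_assignment (a Hungarian-style solver that
was added to scipy.optimize long after this thread, so treat it as a
pointer rather than what was available at the time):

import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j] = cost of assigning worker i to job j
cost = np.array([[4, 1, 3],
                 [2, 0, 5],
                 [3, 2, 2]])

row_ind, col_ind = linear_sum_assignment(cost)
print(list(zip(row_ind, col_ind)))       # [(0, 1), (1, 0), (2, 2)]
print(cost[row_ind, col_ind].sum())      # minimal total cost: 5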

-- Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Linear assignment problem: Jonker-Volgenant algorithm

2011-05-16 Thread Hoyt Koepke
 I might go that way: I already have pure-Python code that implements it
 and that I have been using for a year or so.

Fair enough -- though you'd probably get a big speed up moving to cython.

 It's a little slower on large graphs, but it still is pretty quick.

 Can you put any numbers on the difference, or rule of thumbs? Is there an
 algorithmic complexity difference? If so, what kind of dependency are we
 talking about? Is it only in the prefactors?

Well, the hungarian algorithm has a theoretical upper bound of O(n^3),
with n being the number of nodes, which is pretty much the best you
can do if you have a dense graph and make no assumptions on
capacities.  If you have a sparse graph, other algorithms can exploit
that structure a bit more efficiently (e.g. lemon's implementation is
O(nm log n), with m being the number of edges).  Other algorithms can
use capacities and include the capacity bounds as part of the
theoretic performance of the algorithms, e.g.
http://arxiv.org/abs/1105.1569.
However, all these bounds are generally very much upper bounds and
actual practical performance varies greatly.  The hungarian algorithm
is popular mainly because it's the best one that can be coded up
efficiently in a few hundred lines of code.

Hope that answers your questions :-).

--Hoyt



+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Compiling numpy using icc gets missing library error

2011-03-27 Thread Hoyt Koepke
 Thanks for the thorough description of getting everything to work.

You're welcome.  I'm glad people find it helpful :-).

 -ipo -- okay for linker / c++, but not elsewhere.  For C, it causes
 the long_double_representation() function in setup_common() to fail
 (I'll note this on the corresponding ticket), and causes f2py to fail
 to generate correct wrappers.

 Would it be enough to put a comment in the intelccompiler.py code not
 to use the -ipo flag, or do you think there's something to fix here?
 Doesn't seem to be a new problem by the way:
 http://cens.ioc.ee/pipermail/f2py-users/2006-January/001229.html

Well, to be honest, it's debatable how this should be fixed, if at
all.  The issue is that the intermediate object files and archives are
in their own format -- hence the need for xiar instead of ar -- and I
suspect that that's what's messing up f2py and the
long_double_representation() stuff (I don't actually know how either
really works, so I could be wrong here).  Reverse engineering those
formats doesn't make sense.  Now it's possible that there may be a
better way of determining long_double_representation(), but I don't
know.  IMO it's not worth it.

 -O3 -- okay for C++ / fortran, but not C.  For C, it causes einsum to hang.

 -O3 is the default optimization level, so this is a bug I guess.
 There's also another report in #1378 that -O3 doesn't work with numpy
 1.4.0 (which does not have einsum). Should it be lowered to -O2 by
 default?

I'm not sure what's going on here.  I think it is likely a bug in icc
brought out by the really low-level code in einsum.  However, the
documentation said that the extra optimizations enabled by -O3 are
likely not going to help except in large, predictable loops where more
aggressive loop transformations are doable and help.  Thus my vote is
to put in O3 for C++ and fortran, but O2 on C (because of this bug).
However, if the einsum bug gets fixed, nice! (BTW, I think it may also
be possible to change per-file optimization settings with Intel-specific
pragmas, but I don't know for sure.  If so, that could be a
workaround.)

 Fortran: -static -DMKL_LP64 -mkl -xHOST -O3 -fPIC  -fp-model strict
 C:     -static -DMKL_LP64 -mkl -xHOST -O2 -fno-alias -fPIC -fp-model strict
 C++: -static -DMKL_LP64 -mkl -xHOST -O3 -fno-alias -fPIC -fp-model strict
 Linker: -DMKL_LP64 -mkl -xHOST -O2 -fno-alias -fPIC -fp-model strict
 -openmp -lpthread

 I'm not sure which of those flags would be appropriate as a default in
 distutils, perhaps only fp-model-strict? If you could help put
 together a patch for numpy.distutils, that would be very helpful I
 think. The rest of your description could be put at
 http://scipy.org/Installing_SciPy/Linux.

-fp-model strict and -fno-alias; the latter is analogous to the
-fno-strict-aliasing gcc flag required by python.  The -static flag is
probably optional.  -xHOST is the same as gcc's -march=native, so given
the nature of the distutils system it probably isn't appropriate as a
default.

I'll try to get a patch together for the distutils.  However, I'd want
someone to review it since I'm not that confident in my knowledge of
the distutils code.  I can also try to turn this into a more complete
description for the wiki.

-- Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Compiling numpy using icc gets missing library error

2011-03-25 Thread Hoyt Koepke
Okay, so here's a follow up on my progress.  Apologies in advance for
the long email here, but I'd like to be thorough about this before I
forget.

For the sake of completeness, here's my setup.  I'm running python
2.7.1 compiled from source with icc.  I'm running ubuntu 10.10 on one
of intel's new processors (a i7-2600).  The goal is to compile numpy
and scipy both with intel's compiler and intel's mkl.

I finally got numpy to compile with icc / ifort with pretty much all
of the tests passing.  It's a bit of work, partly cause I was trying
to be an optimization junky, but I thought I'd share my discoveries.
Scipy also compiles, but with some errors (which are likely due to me
not configuring f2py correctly).

First, I wanted to compile things with intel's interprocedural
optimization enabled, and that seems to work, but only if -O2 is used
for the compiling stage and -O1 is used for the linking stage. If -O3
is given for the compiling stage, then the einsum test goes into some
sort of infinite loop and hangs.  If -O2 or -O3 are given for the
linker, then there are random other segfaults (I forget where).
However, with these optimization levels, things are stable.  Also, if
I turn off -ipo, then -O3 works fine for compiling. I'm not sure if
this reflects bugs in the flags I'm passing to the intel compiler or
in icc/ifort itself.

Second, to use -ipo, it's critical that xiar is used instead of ar to
create object archives.  This needed to be changed in
fcompiler/intel.py and intelccompiler.py. I've attached a diff of
these files that gives working options for me.  I don't know if these
options are set in the correct place or not, but perhaps they would be
helpful:

The essence of it is the following (from intelccompiler.py)

linker_flags = '-O1 -ipo -openmp -lpthread -fno-alias -xHOST -fPIC '
compiler_opt_flags = '-static -ipo -xHOST -O2 -fPIC -DMKL_LP64 -mkl -wd188 -g -fno-alias '
icc_run_string = 'icc ' + compiler_opt_flags
icpc_run_string = 'icpc ' + compiler_opt_flags
linker_run_string = 'icc ' + linker_flags + ' -shared '

with the rest of this diff setting these options.  In this case, the
-openmp and -lpthread are required for linking with the threaded layer
of the MKL.  This could possibly be ripped out of there.  Also, the
-fno-alias is critical for the c compiler -- random segfaults and
memory corruptions occur without it.  The -DMKL_LP64 is to ensure
proper linking with the lp64 (32 bit indices) part of mkl, instead of
the ilp64 (64 bit indices).  The latter isn't supported by the
lapack_lite module -- things compile, but don't work. -mkl may or may
not help things.

For the fortran side, this was the compiler string:

compiler_opt_flags = '-static -ipo -xHOST -fPIC -DMKL_LP64 -mkl -wd188 -g -fno-alias -O3'

Here you don't need the -fno-alias and -O3 seems to work.

Third, it was a bit of a pain to figure out how to get the
linking/detection done correctly, as somehow order matters, and it was
easy to get undefined symbols, runtime errors, etc. Very annoying.  In
the end, my site.cfg file looked like this:

[DEFAULT]
library_dirs=/usr/intel/current/mkl/lib/intel64
include_dirs=/usr/intel/current/mkl/include
mkl_libs = mkl_rt, mkl_core, mkl_intel_thread, mkl_intel_lp64
blas_libs = mkl_blas95_lp64
lapack_libs = mkl_lapack95_lp64

[lapack_opt]
library_dirs=/usr/intel/current/mkl/lib/intel64
include_dirs=/usr/intel/current/mkl/include/intel64/lp64
libraries = mkl_lapack95_lp64

[blas_opt]
library_dirs = /usr/intel/current/mkl/lib/intel64
include_dirs = /usr/intel/current/mkl/include/intel64/lp64
libraries = mkl_blas95_lp64

where /usr/intel/current/ points to my intel install location.  It's
critical that the mkl_libs are given in that order.  I didn't find
another combination that worked.

Finally, I attached my bash setup script for environment variables.  I
don't know how much of a role those play in things, but I had them in
place when things started working, so I should put them here.

Now, on to scipy.  With all these options in place, scipy compiles
fine.  However, there are two problems, and these don't seem to go
away at any optimization level. I'm looking for suggestions.  I'm
guessing it's some sort of configuration error.

1) The CloughTocher2DInterpolator segfaults every time it's called to
interpolate values.  I couldn't manage to track it down -- it's in the
cython code somewhere -- but I can give more details next time.  I've
disabled it for now.

2) f2py isn't getting the interfaces right.  When I run the test
suite, I get about 250 errors, all of the form:

ValueError: failed to create intent(cache|hide)|optional array-- must
have defined dimensions but got (5,5,)

and so on, with different tuples on the end.

Other than these errors, everything seemed to work great.  What might
I be doing wrong there?

Thanks!
-- Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy

[Numpy-discussion] ANN: TreeDict 0.12, a lightweight library for hierarchical parameter/data management

2011-03-25 Thread Hoyt Koepke
Hello,

I thought the numpy/scipy community might be particularly interested
in this library, so here goes:

I'm pleased to announce a release of the next version of TreeDict, a
dictionary-like, hierarchical python container to simplify the
bookkeeping surrounding parameters, variables and data. It aims to be
fast, lightweight, intuitive, feature-rich and stable.  While intended
for general python coding, it includes a number of features
particularly useful for scientific programming. It is similar in basic
functionality to MATLAB structures in terms of concise syntax and
implicit branch creation, which, I've found, makes organizing
parameters/data much easier than with other methods. In addition,
TreeDict implements all the methods of regular dictionaries, supports
pickling, fast non-intersecting hashing for efficient cache lookups,
and manipulations of the tree structure, plus a system for
forward-referencing branches to make lists of parameters more readable;
it is BSD-licensed and comes with a full suite of unit tests.  More
info can be found at
http://www.stat.washington.edu/~hoytak/code/treedict/index.html.
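
For a feel of the syntax, here is a rough, untested sketch based on the
description above (exact method names may differ from the released API):

from treedict import TreeDict

p = TreeDict()

# Implicit branch creation, MATLAB-struct style:
p.run.name = "experiment_1"
p.model.sigma = 0.25
p.model.prior.type = "gaussian"

print(p.model.sigma)     # attribute-style access
# The tree also supports the regular dictionary methods and the pickling
# mentioned above.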

Questions or comments or feedback are welcome.

Thanks!
--Hoyt

P.S. One thought regarding scipy -- since this library is fairly
similar to matlab structs, it may be easy to integrate this with the
current matlab io routines to translate things between the two more
easily.  In this direction, I've added options to two methods,
fromdict and convertTo, to allow conversion to/from the existing python
representations of the matlab objects.  Discussion/comments on this
point are welcome.



+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Compiling numpy using icc gets missing library error

2011-03-25 Thread Hoyt Koepke
Okay, last update.  I finally have got everything to work.

It turns out the problems that I had earlier with f2py were due to
intel's -ipo flag.  So the only place this flag works is with the C++
code, not fortran or c.

Also, I forgot to mention -- the qhull_a.h header has a workaround for
some aspect of intel's compiler that is no longer needed and in fact
causes an error.  It's for a macro that simply suppresses
unused variable warnings.  In my opinion, it could be removed, as it's
only used in two places, and scipy spits out enough warnings that this
is hardly an issue.  Thus my change was around line 102 in qhull_a.h.
Replace

#if defined(__INTEL_COMPILER) && !defined(QHULL_OS_WIN)
template <typename T>
inline void qhullUnused(T x) { (void)x; }
#  define QHULL_UNUSED(x) qhullUnused(x);
#else
#  define QHULL_UNUSED(x) (void)x;
#endif

with

#define QHULL_UNUSED(x)

Also, I could still not get the CloughTocher2DInterpolator to not
segfault.  Thus I had to disable it by raising an exception in the
init method.  With this in place, everything compiles and the unit
tests pretty much all run, with 5 failures mostly due to numerical
accuracy stuff and 9 errors due to the interpolator.

In summary, my final environment variables that give the flags for
compiling stuff are:

export FLAGS='-xHOST -static -fPIC -g -fltconsistency'

export CFLAGS="$FLAGS -O2 -fno-alias"
export CPPFLAGS="$FLAGS -fno-alias -ipo -O3"
export CXXFLAGS="$CPPFLAGS"

export FFLAGS="$FLAGS -O3"
export F77FLAGS="$FFLAGS"
export F90FLAGS="$FFLAGS"

export LDFLAGS="-xHOST -O1 -openmp -lpthread -fPIC"

And the arguments given to the fortran compiler in fcompiler/intel.py are:

compiler_opt_flags = '-static -xHOST -fPIC -DMKL_LP64 -mkl -g -O3'

I'd be happy to answer any more questions about the process as needed.
 Now, back to my real work.

-- Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Compiling numpy using icc gets missing library error

2011-03-25 Thread Hoyt Koepke
Okay, even better news.  After a little bit of probing around, it
turns out that some of the errors, namely the one in the interpolator,
were due to intel's handling of floating point values at higher
optimization levels.  Thus the two rather important flags, if you wish
to avoid difficult-to-explain segfaults, are

-fp-model strict  (disables unsafe fp optimizations)
-fno-alias  (C & C++, not fortran)

Now here's a summary of the other errors:

-ipo -- okay for linker / c++, but not elsewhere.  For C, it causes
the long_double_representation() function in setup_common() to fail
(I'll note this on the corresponding ticket), and causes f2py to fail
to generate correct wrappers.

-O3 -- okay for C++ / fortran, but not C.  For C, it causes einsum to hang.

Thus the highest optimization levels I could find in which everything
compiled and ran were:

Fortran: -static -DMKL_LP64 -mkl -xHOST -O3 -fPIC  -fp-model strict
C: -static -DMKL_LP64 -mkl -xHOST -O2 -fno-alias -fPIC -fp-model strict
C++: -static -DMKL_LP64 -mkl -xHOST -O3 -fno-alias -fPIC -fp-model strict
Linker: -DMKL_LP64 -mkl -xHOST -O2 -fno-alias -fPIC -fp-model strict
-openmp -lpthread

Enjoy,
--Hoyt


On Fri, Mar 25, 2011 at 2:30 PM, Pauli Virtanen p...@iki.fi wrote:
 On Fri, 25 Mar 2011 13:46:41 -0700, Hoyt Koepke wrote:
 [clip]
 Also, I could still not get the CloughTocher2DInterpolator to not
 segfault.

 Backtrace would be useful here. It's probably best to recompile with
 -O0 and some debug flags enabled in the compiler to get something
 reasonable out.

 --
 Pauli Virtanen

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




-- 


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Compiling numpy using icc gets missing library error

2011-03-22 Thread Hoyt Koepke
/libpthread.so.0 (0x7f5915b1)
libc.so.6 => /lib/libc.so.6 (0x7f591578c000)
libdl.so.2 => /lib/libdl.so.2 (0x7f5915588000)
libutil.so.1 => /lib/libutil.so.1 (0x7f5915385000)
/lib64/ld-linux-x86-64.so.2 (0x7f591770c000)


I also attached the build log here (which is the same for the first
error up to the crash).  Any suggestions of where to look or where
something could have gone wrong?

Thanks!
-- Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++


build.log.gz
Description: GNU Zip compressed data


lines_dump.p.gz
Description: GNU Zip compressed data
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] ndarray special methods should raise NotImplemented instead of TypeError

2011-03-13 Thread Hoyt Koepke
Apologies for cross posting -- I posted this to the scipy developer
list a few days ago, since it's really a development question, but
after getting no response I reread my question and realized it was a
bit unclear and also was more appropriate for numpy.  So here goes.

Currently, the numerical special methods of the ndarray class, like
__add__, __gt__, __mul__, etc., raise a TypeError when they encounter
a class or type they don't understand.  I do not think this is the
correct behavior. Instead, they should return NotImplemented.

Here's the logic behind this.  Classes can also define right operators
that correspond to each of the above methods -- e.g. __radd__, __le__,
__rmul__.  However, these are only called under three conditions (see
http://docs.python.org/reference/datamodel.html#object.__radd__):

1. When the left object doesn't implement the left operator.
2. When the left object's operator returns NotImplemented.
3. When the right object's type is a subclass of the left object's
type.  In this case, the right operator method is tried first.

In my use case, I've been working on implementing a matrix-like class
of symbolic objects that have to interact seamlessly with numpy
arrays.  It's a wrapper for CPlex's Concert library, which effectively
presents variables in an optimization problem as symbolic C++ objects
that can be used to build the model.  I'm trying to wrap this to work
with numpy (I'd be happy to share my code when it's done) and the
current behavior is causing a bit of a headache.

Consider working with A + X, where A is an ndarray and X is an
instance of my class that represents my symbolic objects, and let's
suppose my matrix class implements __radd__ sensibly.  Obviously, 1.
doesn't apply here.  2. doesn't work either, since ndarray raises a
TypeError instead of returning NotImplemented, which blows things
apart.  So the only option I have left is 3.

However, it isn't possible for me to use the underlying array
machinery provided by ndarray, as the python class simply wraps other
C++ containers.  Thus, if I subclass ndarray, the underlying array is
meaningless -- while I do implement slicing and various other
array-like methods, the only reason I'm subclassing ndarray is (3).
The big problem with this is that many of the attributes and methods
of ndarray don't make sense for my class, so I have to cover them up
with dummy methods that simply raise exceptions.  Doable, but far from
ideal.

For now, I'm doing (3) to get it to work, but it would be really nice
if ndarray would return NotImplemented instead of raising a TypeError.
Is there any reason / use case justifying the current behavior?  Would
it be possible/easy to change it?

Thanks!
-- Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ndarray special methods should raise NotImplemented instead of TypeError

2011-03-13 Thread Hoyt Koepke
 You can also define __array_priority__  in your class. If your class isn't a
 subclass, then numpy will always call your class methods whether or not your
 class is on the right or left. Sounds like you don't want to subclass ndarray
 here so that should work.

Excellent!  This sounds like exactly what I was hoping for.  I assume
in my case, I just need __array_priority__ > 10 to gain priority over
a matrix (from reading
http://docs.scipy.org/doc/numpy/reference/arrays.classes.html).  Is
that correct?
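
Concretely, I'm picturing something like the following untested sketch
(15.0 is just an arbitrary value above np.matrix's 10.0):

import numpy as np

class SymbolicMatrix(object):
    # Higher than ndarray (0.0) and np.matrix (10.0), so numpy should defer
    # to our reflected operators instead of raising a TypeError.
    __array_priority__ = 15.0

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        return "(%s + <array>)" % self.name

    def __radd__(self, other):
        return "(<array> + %s)" % self.name

A = np.arange(6).reshape(2, 3)
X = SymbolicMatrix("X")

print(A + X)   # ndarray yields to X.__radd__ because of the higher priority
print(X + A)   # ordinary Python dispatch calls X.__add__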

Thanks!

--Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Slicing slower than matrix multiplication?

2009-12-11 Thread Hoyt Koepke
 One thing to note is that dot uses optimized atlas if available, which
 makes it quite faster than equivalent operations you would do using
 purely numpy. I doubt that's the reason here, since the arrays are
 small, but that's something to keep in mind when performances matter:
 use dot wherever possible, it is generally faster than prod/sum,

This is quite true; I once had a very large matrix  (600 x 200,000)
that I needed to normalize.  Using .sum( ) and /= took about 30
minutes.  When I switched to using dot( ) to do the same operation
(matrix multiplication with a vector of 1's, then turning that into a
diagonal matrix and using dot() again to normalize it), it dropped the
computation time down to about 2 minutes.   Most of the gain was
likely due to ATLAS using all the cores and numpy only using 1, but I
was still impressed.
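
As a rough sketch of the trick (with the shapes shrunk so it runs in no
time; the speed difference only shows up at much larger sizes):

import numpy as np

A = np.random.rand(600, 2000)

# Elementwise approach: row sums via .sum(), then division.
row_sums = A.sum(axis=1)
A1 = A / row_sums[:, np.newaxis]

# BLAS-backed approach: row sums as a matrix-vector product with a vector
# of ones, then scale each row via a diagonal matrix and a second dot().
row_sums2 = np.dot(A, np.ones(A.shape[1]))
A2 = np.dot(np.diag(1.0 / row_sums2), A)

assert np.allclose(A1, A2)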

--Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Faster way to generate a rotation matrix?

2009-03-03 Thread Hoyt Koepke
Hello,

 def rotation(theta, R = np.zeros((3,3))):
     cx, cy, cz = np.cos(theta)
     sx, sy, sz = np.sin(theta)
     R.flat = (cx*cz - sx*cy*sz, cx*sz + sx*cy*cz, sx*sy,
               -sx*cz - cx*cy*sz, -sx*sz + cx*cy*cz, cx*sy,
               sy*sz, -sy*cz, cy)
     return R

 Pretty evil looking ;) but still wouldn't mind somehow getting it faster

I would definitely encourage you to check out cython.  I have to write
lots of numerically intensive stuff in my python code, and I tend to
cythonize it a lot.  In cython, the above would be (something like):

import numpy as np
from numpy cimport ndarray

cdef extern from "math.h":
    double cos(double)
    double sin(double)

def rotation(ndarray[double] theta, ndarray[double, ndim=2] R = np.zeros((3,3))):
    cdef double cx = cos(theta[0]), cy = cos(theta[1]), cz = cos(theta[2])
    cdef double sx = sin(theta[0]), sy = sin(theta[1]), sz = sin(theta[2])

    R[0,0] = cx*cz - sx*cy*sz
    R[0,1] = cx*sz + sx*cy*cz
    R[0,2] = sx*sy
    ...
    R[2,2] = cy

    return R

And that will probably be orders of magnitude faster than what you
currently have, as everything but the function call and the return
statement would become C code.  Compilers these days are very good at
optimizing that kind of thing too.
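
To try it out quickly, one option (a sketch; it assumes cython is
installed and that the function above lives in a file called
rotation_cy.pyx, a name I'm making up here) is pyximport, which compiles
the .pyx on import:

import numpy as np
import pyximport

# Make numpy's headers available to the generated C code.
pyximport.install(setup_args={"include_dirs": np.get_include()})

from rotation_cy import rotation
print(rotation(np.array([0.1, 0.2, 0.3])))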

--Hoyt


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] how to get the corresponding eigenvector for a specific eigen value?

2009-02-08 Thread Hoyt Koepke
 I have tried with linalg.solve(a, b), where I put a as the
 Matrix A - eigen value* unit matrix, and b as the zero matrix. But the
 solution returned is a zero matrix, which I really find disappointing.

So if you're trying to solve (A - \lambda I) x = b, try appending an
extra row of all ones to your (A - \lambda I) matrix; the product of
this row with x is the sum of the elements of your eigenvector.  Append
a 1 to the end of your b.  This will exclude x = 0 as a valid solution,
as you'll then require that \sum_i x_i = 1.  You'll probably want to
renormalize properly afterwards.

Quick and dirty, but should work.
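
A rough numpy sketch of that trick (the augmented system is no longer
square, so lstsq stands in for solve; the matrix and eigenvalue here are
just a toy example):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam = 3.0                          # a known eigenvalue of A

M = A - lam * np.eye(2)
M_aug = np.vstack([M, np.ones(2)])     # extra row of all ones
b_aug = np.array([0.0, 0.0, 1.0])      # zero right-hand side, plus the appended 1

x = np.linalg.lstsq(M_aug, b_aug, rcond=None)[0]
x /= np.linalg.norm(x)                 # renormalize afterwards
print(x)                               # approx [0.7071, 0.7071]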

--Hoyt


-- 


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] glibc error

2009-01-24 Thread Hoyt Koepke
Not sure what the problem is in your case, but I had the same error a
while back.  For some reason, ATLAS thought it should be compiled as
64 bit but numpy as 32 bit.  It went away after I passed the -b 32
flag to configure in ATLAS (I think).  Thus that's one thing to check.
Also, make sure you're using ATLAS 3.8.x, as I had problems with the
3.9 series (don't know if that would be an issue still).

--Hoyt

On Sat, Jan 24, 2009 at 8:47 PM, Gideon Simpson
simp...@math.toronto.edu wrote:
 Having built an up to date lapack and ATLAS against gcc 4.3.2, I tried
 installing numpy 1.2.1 on Python 2.5.4.  When testing I get:

 Python 2.5.4 (r254:67916, Jan 24 2009, 00:27:20)
 [GCC 4.3.2] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
   import numpy
   numpy.test()
 Running unit tests for numpy
 NumPy version 1.2.1
 NumPy is installed in /usr/local/nonsystem/simpson/lib/python2.5/site-
 packages/numpy
 Python version 2.5.4 (r254:67916, Jan 24 2009, 00:27:20) [GCC 4.3.2]
 nose version 0.10.4
 ..F...K... [test progress output truncated]
 *** glibc detected *** python: free(): invalid next size (fast):
 0x1196b550 ***


 I then have to kill python to get control again.
 -gideon

 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion




-- 


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] glibc memory corruption when running numpy.test()

2008-11-20 Thread Hoyt Koepke
Hi,

Sorry; my first message wasn't under 40 KB with the attachments, so
here's the same message but with the log files at
http://www.stat.washington.edu/~hoytak/logs.tar.bz2.


 Which ones ?

Sorry; ATLAS = 3.9.4 and lapack=3.2.  I'll give 3.8.2 a shot per your advice.

 You should not do that, it won't work as you would expect. It is a good
 rule to assume that you should never set the *FLAGS variable unless you
 really know what you are doing.

Fair enough.  In my case I was having some issues with 32 bit and 64
bit mismatches (I think that fftw defaulted to 32 bit), so I set the
flags variables.  I also wanted to get the extra few percent of
performance by using the tuning flags.  I'll back off a bit before
playing with them again, though.

 First, can you try without any blas/lapack (Do BLAS=None LAPACK=None
 ATLAS=None python setup.py ) ?

This now works in the sense that it doesn't hang.  I still get a
number of test failures, however (build + test logs attached).

Thanks a lot for the help!

--Hoyt



+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ [EMAIL PROTECTED]
++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] glibc memory corruption when running numpy.test()

2008-11-20 Thread Hoyt Koepke
Hi,

 I honestly don't think those flags matter much in the case of
 numpy/scipy. In particular, using SSE and co automatically is simply
 impossible in numpy case, since the C code is very generic (non-aligned
 - non contiguous items) and the compiler has no way to know at compile
 time which cases are contiguous.

Good to know.  I'll try to suppress my desire to optimize and not care
about them :-).

 Those errors seem linked to the flags you have been using. Some errors
 are really strange (4 vs 8 bytes types), but I don't see how it could be
 explained by a mismatch of 32 vs 64 bits machine code (to the best of my
 knowledge, you can't mix 32 and 64 bits machine code in one binary).
 Maybe a compiler bug when using -march flag.

 Please try building numpy w/o BLAS/LAPACK and w/o compiler flags first, to
 test that the bare configuration does work, and that the problems are
 not due to some bugs in your toolchain/OS/etc... The test suite should
 run without any failure in this case; then, we can work on the
 BLAS/LAPACK thing,

I believe the logs I attached (or rather linked to) don't involve
atlas or lapack or any compiler flags.  I agree that they are strange
and I may have something weird floating around.  It's getting late
here, so I'll double check everything in the morning and may try to
run gcc's test suite to verify that isn't the problem.

Thanks again!

--Hoyt

 cheers,

 David

 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion




-- 


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ [EMAIL PROTECTED]
++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] glibc memory corruption when running numpy.test()

2008-11-19 Thread Hoyt Koepke
Hello,

Sorry about this bug report -- I know how much we programmers like
these kinds of bugs :-/.

In testing the latest svn version of numpy (6083), I get a memory
corruption error:

test_csingle (test_linalg.TestSolve) ... ok
test_double (test_linalg.TestSolve) ... ok
test_empty (test_linalg.TestSolve) ... ok
Check that matrix type is preserved. ... ok
Check that matrix type is preserved. ... ok
test_nonarray (test_linalg.TestSolve) ... ok
test_single (test_linalg.TestSolve) ... ok
Ticket #652 ... *** glibc detected *** python: free(): invalid next
size (fast): 0x02c80ef0 ***

... and it hangs.

I don't get any errors in my installation.  The system is a 64bit
intel xeon server running debian unstable, compiler is gcc-4.3.2

This *doesn't* happen with version 1.2.1.  I do get some other test
failures with that version, which I can post if you all think they are
related, but they don't seem to be at first glance.

I've attached the full log.  If there is anything more you want me to
do with this, I'd be happy to.

Thanks!
--Hoyt



+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ [EMAIL PROTECTED]
++


numpy_test.log.gz
Description: GNU Zip compressed data
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] glibc memory corruption when running numpy.test()

2008-11-19 Thread Hoyt Koepke
Sorry;

I also added '-fPIC' to the compile flags, though it may work without
having it there.  I just got errors related to not having it and so
threw it at everything...

--Hoyt



On Wed, Nov 19, 2008 at 11:41 PM, Hoyt Koepke [EMAIL PROTECTED] wrote:
 Attached.

 I had a bunch of issues getting things to install with lapack and
 ATLAS.  In the end I had to specify the following environment
 variables (+ appropriate command line options to ATLAS & lapack) to
 get it to work.  If there's an easier way, let me know.


 export FLAGS='-march=core2 -mtune=core2 -m64'
 export CFLAGS=$FLAGS
 export CPPFLAGS=$FLAGS
 export CXXFLAGS=$FLAGS
 export FFLAGS=$FLAGS
 export F77FLAGS=$FLAGS
 export LDFLAGS=$FLAGS


 When compiling numpy, I had to manually add '-shared' to LDFLAGS to
 get it to work.


 Thanks!
 --Hoyt



 On Wed, Nov 19, 2008 at 11:33 PM, David Cournapeau
 [EMAIL PROTECTED] wrote:
 On Wed, 2008-11-19 at 23:23 -0800, Hoyt Koepke wrote:
 Hello,

 Sorry about this bug report -- I know how much we programmers like
 these kinds of bugs :-/.

 In testing the latest svn version of numpy (6083), I get a memory
 corruption error:

 test_csingle (test_linalg.TestSolve) ... ok
 test_double (test_linalg.TestSolve) ... ok
 test_empty (test_linalg.TestSolve) ... ok
 Check that matrix type is preserved. ... ok
 Check that matrix type is preserved. ... ok
 test_nonarray (test_linalg.TestSolve) ... ok
 test_single (test_linalg.TestSolve) ... ok
 Ticket #652 ... *** glibc detected *** python: free(): invalid next
 size (fast): 0x02c80ef0 ***

 ... and it hangs.

 I don't get any errors in my installation.  The system is a 64bit
 intel xeon server running debian unstable, compiler is gcc-4.3.2

 Could you give us the complete build log (e.g. when building from scratch:
 remove the build directory and give us the log of python setup.py build
 > build.log)?

 You have tens of failures which are more likely due to a build problem,

 David

 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion




 --

 
 + Hoyt Koepke
 + University of Washington Department of Statistics
 + http://www.stat.washington.edu/~hoytak/
 + [EMAIL PROTECTED]
 ++




-- 


+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ [EMAIL PROTECTED]
++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy, swig and TNT-Arrays

2008-11-11 Thread Hoyt Koepke
Hi Rolf,

Just curious -- have you considered using the blitz++ library
(http://www.oonumerics.org/blitz/)?  There seems to be a lot of
overlap in terms of functionality.  If you use blitz++, it's largely
included in scipy as part of weave.  Additionally, I already have code
that generates wrappers to functions taking such arrays using weave.
If blitz++ would work, I'll send them to you.

-- Hoyt

On Tue, Nov 11, 2008 at 2:28 AM, Rolf Wester
[EMAIL PROTECTED] wrote:
 Charles R Harris wrote:
 On Tue, Nov 11, 2008 at 2:19 AM, Rolf Wester
 [EMAIL PROTECTED]wrote:

 Charles R Harris wrote:
 On Tue, Nov 11, 2008 at 1:24 AM, Rolf Wester
 [EMAIL PROTECTED]wrote:

 Hi all,

 I would like to wrap some C++ classes that use TNT-Arrays. Is it
 possible to pass numpy arrays to C++ functions that expect TNT-Arrays as
 function parameter? Does anybody know how the wrappers could be
 generated using swig? I would be very appreciative for any help.

 With kind regards

 IIRC, TNT does vectors and matrices, they have constructors, and they are
 contiguous. I think you can make wrappers, but it isn't going to be
 anything straightforward unless you can reuse the memory from a numpy
 array, and I don't recall that that sort of constructor is available.

 Is TNT still active? It looked pretty dead last time I looked several
 years ago.

 Chuck



 

 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion
 TNT has constructors like:

 TNT::Array1D<double>(int n, double * data)

 which do not allocate a new C-array but that use data as their
 data-array.


 I don't think there is any easy way to do what you want without writing some
 code somewhere along the line. You can expose the C++ functions and TNT to
 python, but to use numpy arrays you will need some way to get the data back
 and forth between TNT arrays and numpy arrays. I suspect you will end up
 just copying data into TNT arrays, calling your function,  and then copying
 data back out of the result. Cython might be an alternative to swig for
 that.

 It would help to have a better idea of what you want to do. Do you just want
 to wrap an existing bunch of functions that use TNT?

 Chuck



 

 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion

 It's my own code so I have the choice how to do it. Until now I used the
 typemaps defined in numpy.i, so I had either to use the 1-dimensional
 arrays even in case of multidimensional data or to copy the data. I
 wondered whether there is a more elegant way of using numpy arrays on the
 python side and TNT::Arrays on the C++ side without having to
 explicitly write extra code.


 Rolf

 --
 
 # Dr. Rolf Wester
 # Fraunhofer Institut f. Lasertechnik
 # Steinbachstrasse 15, D-52074 Aachen, Germany.
 # Tel: + 49 (0) 241 8906 401, Fax: +49 (0) 241 8906 121
 # EMail: [EMAIL PROTECTED]
 # WWW:   http://www.ilt.fraunhofer.de
 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion




-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] problem installing numpy using scons

2008-10-25 Thread Hoyt Koepke
Excellent; thank you -- I will test it out soon.

-- Hoyt

On Sat, Oct 25, 2008 at 2:57 AM, David Cournapeau [EMAIL PROTECTED] wrote:
 On Sun, Oct 19, 2008 at 7:57 PM, David Cournapeau [EMAIL PROTECTED] wrote:

 The trunk requires some changes which are not released yet. It will be
 in the next release of numscons (0.9.3).

 Which has just been released.

 David
 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] problem installing numpy using scons

2008-10-18 Thread Hoyt Koepke
Hello,

I'm trying to install the latest numpy using setupscons.py, but it
gives me an error that I can't track down.  This is with the latest numpy
revision (5946) and the latest numscons from the 0.9.3 branch.

Here's the last part of output:

is bootstrapping ? True
Executing scons command (pkg is numpy.core): /usr/local2/bin/python
/usr/local2/lib/python2.5/site-packages/numscons-0.9.3dev-py2.5.egg/numscons/scons-local/scons.py
-f numpy/core/SConstruct -I. scons_tool_path= src_dir=numpy/core
pkg_name=numpy.core log_level=50
distutils_libdir=../../../../build/lib.linux-x86_64-2.5 cc_opt=gcc
cc_opt_path=/usr/bin f77_opt=gfortran f77_opt_path=/usr/bin
cxx_opt=g++ cxx_opt_path=/usr/bin
include_bootstrap=../../../../numpy/core/include silent=0
bootstrapping=1
scons: Reading SConscript files ...
Checking for C header file Python.h... yes
Checking size of short ... yes
Checking size of int ... yes
Checking size of long ... yes
Checking size of float ... yes
Checking size of double ... yes
Checking size of long double ... yes
Checking size of Py_intptr_t ... yes
Checking whether PY_LONG_LONG is declared... yes
Checking size of PY_LONG_LONG ... yes
Checking whether CHAR_BIT is declared... yes
Checking whether PRIdPTR is declared... yes
Checking if math lib [] is usable for numpy ...  No !
Checking for C library m... yes
Checking if math lib ['m'] is usable for numpy ...  Yes !
AttributeError: SConfBase instance has no attribute 'CheckFuncsAtOnce':
  File /usr/local2/src/numpy/numpy/core/SConstruct, line 2:
GetInitEnvironment(ARGUMENTS).DistutilsSConscript('SConscript')
  File 
/usr/local2/lib/python2.5/site-packages/numscons-0.9.3dev-py2.5.egg/numscons/core/numpyenv.py,
line 108:
build_dir = '$build_dir', src_dir = '$src_dir')
  File 
/usr/local2/lib/python2.5/site-packages/numscons-0.9.3dev-py2.5.egg/numscons/scons-local/scons-local-1.0.1/SCons/Script/SConscript.py,
line 533:
return apply(_SConscript, [self.fs,] + files, subst_kw)
  File 
/usr/local2/lib/python2.5/site-packages/numscons-0.9.3dev-py2.5.egg/numscons/scons-local/scons-local-1.0.1/SCons/Script/SConscript.py,
line 256:
exec _file_ in call_stack[-1].globals
  File /usr/local2/src/numpy/build/scons/numpy/core/SConscript, line 165:
check_funcs(optional_stdfuncs)
  File /usr/local2/src/numpy/build/scons/numpy/core/SConscript, line 154:
st = config.CheckFuncsAtOnce(funcs)
error: Error while executing scons command. See above for more information.
If you think it is a problem in numscons, you can also try executing the scons
command with --log-level option for more detailed output of what numscons is
doing, for example --log-level=0; the lowest the level is, the more detailed
the output it.

I'm trying to install it on an 8-processor, 64-bit Intel machine.  I
could post more details of the installation if needed, but they don't
seem relevant here (except that I've had a lot of issues getting the
regular install script to work with the correct compile parameters,
and so I thought I'd try scons).

If there's something I'm doing wrong, let me know; otherwise I'd be
happy to post a bug report.

--Hoyt
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Slicing without a priori knowledge of dimension

2008-08-06 Thread Hoyt Koepke
Try using slice (python builtin) to create slice objects (what is
created implicitly by :5, 1:20, etc.).  slice takes the same arguments
as range.  A list of these (7 in your case) can then be passed to
A[...] as a tuple.

That's how I would do it, but maybe someone else has a better idea or
can correct me.
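
Untested, but something along these lines (the shape and the free axes 2
and 5 are taken from the question quoted below):

import numpy as np

chi2 = np.ones((3, 4, 5, 6, 7, 8, 9))

def plane(arr, fixed, free_axes):
    # fixed: one index per axis (ignored on the free axes);
    # free_axes: the two axes to keep in full, via slice(None) (i.e. ':').
    index = tuple(slice(None) if ax in free_axes else fixed[ax]
                  for ax in range(arr.ndim))
    return arr[index]

sub = plane(chi2, fixed=(0, 1, 0, 3, 4, 5, 6), free_axes=(2, 5))
print(sub.shape)     # (5, 8)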

--Hoyt


On Wed, Aug 6, 2008 at 4:02 PM, Matthew Czesarski
[EMAIL PROTECTED] wrote:
 Dear list.

 I've got a feeling that what I'm trying to do *should* be easy but at the
 moment I can't find a non-brute-force method.

 I'm working with quite a high-rank matrix; 7 dimensions filled with chi**2
 values. It's form is something like this:

 chi2 = numpy.ones((3,4,5,6,7,8,9))

 What I need to do is slice out certain 2 dimensional grids that I then want
 to plot for confidence estimation; make a nice graph. This is fine: as easy
 as it gets if the 2 dimensions are known. E.g. for the 2nd and 5th axes, one
 could hardcode something like this:

 subChi2 = chi2[ ia, ib, :, id, ie, :, ig ]

 The difficulty is that the user is to state which plane (s)he wants to slice
 out and we can't code the above. The function itself has to convert 2 rank
 numbers into an expression for a slice and I can't currently figure out how
 to do that. There are a huge number of options (well, there are 7*6=42). If
 one could just manually write a colon into a tuple like (2,2,:,2,2,:,2) or
 something even like 2,2,0:len(2ndDim),2,2,0:len(5thDim),2, things would be
 fine. But that doesn't seem to be an option. An alternative would be to set
 up 2 loops and incrementally read out the individual elements of the slice
 to a new numpy.zeros ( ( len(2ndDim), len(5thDim) ) ), but again we seem to
 be diverging from elegance.

 There must be something that I'm missing. Could somebody have a pointer as
 to what it is?

 Thanks in advance,
 Matt


 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion





-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Large symmetrical matrix

2008-06-11 Thread Hoyt Koepke
 The factor of two memory hit is not really the problem, it is the scaling
 O(N^2) which is the limitation

Have you thought about using kd-trees or similar structures?  If you
are only concerned about the minimum distances between points, that is
the ideal data structure.  It would reduce your running time / memory
usage for generating / using the data from O(n^2) to O(n log n), with
an O(log n) lookup time for the closest distance to each point.

Without doing something like that, looking at the distance between
each pair of points is \Omega(n^2).
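
A rough sketch with SciPy's kd-tree (assuming a scipy with scipy.spatial
available):

import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(100000, 3)

tree = cKDTree(points)
# k=2: the closest hit for each query point is the point itself (distance 0),
# so the second column holds the nearest-neighbour distance.
dists, idx = tree.query(points, k=2)
nearest = dists[:, 1]
print(nearest.min(), nearest.mean())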

--Hoyt

-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] segmentation fault

2008-05-28 Thread Hoyt Koepke
In my experience tracking down these sorts of things, if the effect is
delayed and detected by glibc, it almost always means that a few bytes
beyond the end of the data part of an array have been overwritten.
This causes glibc's memory management stuff to crash later on when the
object is deallocated (or something like that).  Of course, I should
say I was doing the overwritting in my own c code...

--Hoyt



On Wed, May 28, 2008 at 8:27 AM, Pauli Virtanen [EMAIL PROTECTED] wrote:
 On Wed, 2008-05-28 at 10:59 -0400, Scott Ransom wrote:
 Hmmm.  Interesting.  I'm on a 64-bit Debian Unstable system with numpy
 1.0.4 and python 2.5.2 and I don't get this:

 In [1]: import numpy as np

 In [2]: np.__version__
 Out[2]: '1.0.4'

 In [3]: def fn():
...: x = np.random.rand(5,2)
...: x.cumsum(None, out=x)
...: return x
...:

 In [4]: fn()
 Out[4]:
 array([[ 0.40329303,  0.45335328],
[ 0.85664631,  0.84798294],
[ 1.71329262,  0.05877989],
[ 2.56127556,  0.99401291],
[ 4.27456818,  0.79275409]])

 Wonder if the 64-bit thing could be the difference?

 Try running fn() again to make the bug bomb out:

 $ python
 Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42)
 [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> imp
 KeyboardInterrupt
 >>> import numpy as np
 >>> def fn():
 ...     x = np.random.rand(5,2)
 ...     x.cumsum(None, out=x)
 ...     return x
 ...
 >>> np.__version__
 '1.0.5.dev5024'
 >>> fn()
 array([[ 0.59654253,  0.12577169],
        [ 0.72231422,  0.30600244],
        [ 1.44462843,  0.35849553],
        [ 1.75063088,  0.56925858],
        [ 3.19525931,  0.77487798]])
 >>> fn()
 *** glibc detected *** python: munmap_chunk(): invalid pointer:
 0x08439d28 ***

 --
 Pauli Virtanen


 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion




-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Quick Question about Optimization

2008-05-20 Thread Hoyt Koepke
 The time has now been shaved down to ~9 seconds with this suggestion
 from the original 13-14s, with the inclusing of Eric Firing's
 suggestions.  This is without scipy.weave, which at the moment I can't
 get to work for all lines, and when I just replace one of them
 sucessfully it seems to run more slowly, I assume because it is
 converting data back and forth.

There's a nontrivial constant time cost to using scipy.weave.blitz,
but it isn't copying the data. Thus it will slow you down on smaller
arrays, but you'll notice quite a speed-up on much larger ones.  I
should have mentioned that earlier -- I assumed your arrays were
really large.

 Are there any major pitfalls to be aware of?  It sounds like if I do:
 f = a[n,:] I get a reference, but if I did something like g = a[n,:]*2
 I would get a copy.

Well, if you do f = a[n, :], you would get a view, another object that
shares the data in memory with a but is a separate object.
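
A quick illustration:

import numpy as np

a = np.arange(12).reshape(3, 4)

f = a[1, :]          # a view: shares memory with a
f[0] = 99
print(a[1, 0])       # 99 -- the change shows up in a

g = a[1, :] * 2      # the multiplication allocates a new array
g[0] = -1
print(a[1, 0])       # still 99 -- a is untouched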

 If anyone has any clues on why scipy.weave is blowing
 (http://pastebin.com/m79699c04) using the code I attached, I wouldn't
 mind knowing.  Most of the times I've attempted using weave, I've been
 a little baffled about why things aren't working.  I also don't have a
 sense for whether all numpy functions should be weavable or if it's
 just general array operations that can be put through weave.

Sorry I didn't get back to you earlier on this -- I was a bit busy
yesterday.  It looks like weave.blitz isn't working on your second
line because you're not explicitly putting slices in some of the
dimensions.  In numpy, v[0:2] works for 1-, 2-, 3-, or 4-dimensional
arrays, but in blitz a 2d array has to be written v[0:2,:] and a 3d
array v[0:2,:,:].  It's a bit more picky.  I think that's the problem
with your second line
-- try replacing v[:] with v[0,:] and theta[1-curidx] with
theta[1-curidx, :]. (I may have missed some others.)

weave.blitz is currently limited to just array operations... it
doesn't really support the numpy functions.
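
For example, roughly (untested):

import numpy as np
import scipy.weave as wv

a = np.zeros((4, 3))
b = np.ones((4, 3))

# numpy is happy with an implicit trailing axis:
a[0:2] = 2.0 * b[0:2]

# weave.blitz wants every axis spelled out explicitly:
wv.blitz("a[0:2, :] = 2.0 * b[0:2, :]")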

Hope that helps a little

-- Hoyt

-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Quick Question about Optimization

2008-05-19 Thread Hoyt Koepke
for n in range(0, time_milliseconds):
    self.u = self.expfac_m * self.prev_u + \
             (1 - self.expfac_m) * self.aff_input[n, :]
    self.v = self.u + self.sigma * \
             np.random.standard_normal(size=(1, self.naff))
    self.theta = self.expfac_theta * self.prev_theta - \
                 (1 - self.expfac_theta)

    idx_spk = np.where(self.v >= self.theta)

    self.S[n, idx_spk] = 1
    self.theta[idx_spk] = self.theta[idx_spk] + self.b

    self.prev_u = self.u
    self.prev_theta = self.theta

Copying elements into array objects that already exist will always be
faster than creating a new object with separate data.  However, in
this case, you don't need to do any copying or creation if you use a
flip-flopping index to handle keeping track of the previous. If I drop
the selfs, you can translate the above into (untested):


curidx = 0  # previous row is 1 - curidx

u = np.empty( (2, naff) )
v = np.empty( naff )
theta = np.empty( (2, naff) )

stdnormrvs = np.random.standard_normal(size=(time_milliseconds, naff) )

for n in xrange(time_milliseconds):
    u[curidx, :] = expfac_m * u[1-curidx, :] + (1-expfac_m) * aff_input[n,:]
    v[:] = u[curidx, :] + sigma * stdnormrvs[n, :]
    theta[curidx, :] = expfac_theta * theta[1-curidx, :] - (1-expfac_theta)

    idx_spk = np.where(v >= theta[curidx, :])
    S[n, idx_spk] = 1
    theta[curidx, idx_spk] += b

    # Flip to make the current row the previous one
    curidx = 1 - curidx

This should give you a substantial speedup.  Also, I have to say that
this is begging for weave.blitz, which compiles such statements using
templated c++ code to avoid temporaries.  It doesn't work on all
systems, but if it does in your case, here's what your code might look
like:

import scipy.weave as wv

curidx = 0

u = np.empty( (2, naff) )
v = np.empty( (1, naff) )
theta = np.empty( (2, naff) )

stdnormrvs = np.random.standard_normal(size=(time_milliseconds, naff) )

for n in xrange(time_milliseconds):
    wv.blitz("u[curidx, :] = expfac_m * u[1-curidx, :] + "
             "(1-expfac_m) * aff_input[n, :]")
    wv.blitz("v[:] = u[curidx, :] + sigma * stdnormrvs[n, :]")
    wv.blitz("theta[curidx, :] = expfac_theta * theta[1-curidx, :] - "
             "(1-expfac_theta)")

    idx_spk = np.where(v >= theta[curidx, :])
    S[n, idx_spk] = 1
    theta[curidx, idx_spk] += b

    # Flip to make the current row the previous one
    curidx = 1 - curidx



-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++


On Mon, May 19, 2008 at 11:08 AM, James Snyder [EMAIL PROTECTED] wrote:
 Hi -

 First off, I know that optimization is evil, and I should make sure
 that everything works as expected prior to bothering with squeezing
 out extra performance, but the situation is that this particular block
 of code works, but it is about half as fast with numpy as in matlab,
 and I'm wondering if there's a better approach than what I'm doing.

 I have a chunk of code, below, that generally iterates over 2000
 iterations, and the vectors that are being worked on at a given step
 generally have ~14000 elements in them.

 In matlab, doing pretty much exactly the same thing takes about 6-7
 seconds, but always around 13-14 seconds with numpy on the same
 machine.  I've gotten this on both Linux and Mac OS X.

 self.aff_input has the bulk of the data in it (2000x14000 array), and
 the various steps are for computing the state of some afferent neurons
 (if there's any interest, here's a paper that includes the model:
 Brandman, R. and Nelson ME (2002) A simple model of long-term spike
 train regularization. Neural Computation 14, 1575-1597.)

 I've imported numpy as np.

 Is there anything in practice here that could be done to speed this
 up?  I'm looking more for general numpy usage tips that I can use
 while writing further code, and not so much for things that would be
 obscure or difficult to maintain in the future.

 Also, the results of this are a binary array; I'm wondering if
 there's anything more compact for storing it than using 8 bits to
 represent each single bit.  I've poked around, but I haven't come up
 with any clean and unhackish ideas :-)

 Thanks!

 I can provide the rest of the code if needed, but it's basically just
 filling some vectors with random and empty data and initializing a few
 things.

for n in range(0, time_milliseconds):
    self.u = self.expfac_m * self.prev_u + (1-self.expfac_m) * self.aff_input[n,:]
    self.v = self.u + self.sigma * np.random.standard_normal(size=(1, self.naff))
    self.theta = self.expfac_theta * self.prev_theta - (1-self.expfac_theta)

    idx_spk = np.where(self.v >= self.theta)

    self.S[n, idx_spk] = 1
    self.theta[idx_spk] = self.theta[idx_spk] + self.b

    self.prev_u = self.u
    self.prev_theta = self.theta

 --
 James Snyder
 Biomedical Engineering
 Northwestern University
 [EMAIL PROTECTED]
 ___
 Numpy-discussion

Re: [Numpy-discussion] Quick Question about Optimization

2008-05-19 Thread Hoyt Koepke
On Mon, May 19, 2008 at 12:53 PM, Robin [EMAIL PROTECTED] wrote:
 Hi,

 I think my understanding is somehow incomplete... It's not clear to me
 why (simplified case)

 a[curidx,:] = scalar * a[1-curidx,:]
 should be faster than
 a = scalar * b

 In both cases I thought the scalar multiplication results in a new
 array (new memory allocated), and then the difference between copying
 that result into the existing array u[curidx,:] or reassigning the
 reference u to that result should be very similar?

 If anything I would have thought the direct assignment would be
 quicker, since then there is no copying.

 What am I missing?

Actually, I think you are correct.  My bad.  I was mainly thinking in
terms of weave.blitz, where it would make a difference, then
translating back...

--Hoyt

+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Faster

2008-05-04 Thread Hoyt Koepke
  and then update the others only if the row you're updating has that
  minimum value in it.  Then, when scanning for the min dist, you only
  need to scan O(n) rows.

Sorry, let me clarify -- update the entries corresponding to entries
in the row you're updating if they are the same as the minimum
distance; in that case you need to rescan the row.  Also, you only
need to scan O(n) entries in your cached array.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Faster

2008-05-03 Thread Hoyt Koepke
You know, for linkage clustering and BHC, I've found it a lot easier
to work with an intermediate 1d map of indices and never resize the
distance matrix.  I then just remove one element from this map at each
iteration, which is a LOT faster than removing a column and a row from
a matrix.  If idxmap were the map, you would be working with
X[idxmap[i], idxmap[j]] instead of X[i, j].  Also, you could even use
a Python list in this case, as they are a lot better for deletions
than an array.
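
Roughly like this -- a toy sketch, not the actual clustering code:

    import numpy as np

    n = 6
    X = np.random.rand(n, n)
    X = (X + X.T) / 2.0          # stand-in for the symmetric distance matrix

    idxmap = list(range(n))      # one entry per surviving cluster

    del idxmap[2]                # "remove" cluster 2; X itself is never resized

    i, j = 0, 3                  # distance between the i-th and j-th survivors
    d = X[idxmap[i], idxmap[j]]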

--Hoyt

On Fri, May 2, 2008 at 5:47 PM, Keith Goodman [EMAIL PROTECTED] wrote:

 On Fri, May 2, 2008 at 5:38 PM, Charles R Harris
  [EMAIL PROTECTED] wrote:
   On Fri, May 2, 2008 at 6:24 PM, Keith Goodman [EMAIL PROTECTED] wrote:
How can I make this function faster? It removes the i-th row and
column from an array.
   
  
   Why do you want to do that?

  Single linkage clustering; x is the distance matrix.


 ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion




-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Faster

2008-05-03 Thread Hoyt Koepke
You could also try complete linkage, where you merge two clusters
based on the farthest distance between points in the two clusters
instead of the smallest.  This will tend to produce clusters of
roughly equal size (which isn't always ideal, either).  However, it
also uses sufficient statistics, so it will be trivial to change your
code to use that merge criterion if you want to try it.
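
The criterion itself is just a max in place of a min -- a minimal sketch,
assuming D is the full distance matrix and clusters are kept as lists of
row indices:

    import numpy as np

    def single_linkage_dist(D, ci, cj):
        # distance between the closest pair of points in clusters ci and cj
        return D[np.ix_(ci, cj)].min()

    def complete_linkage_dist(D, ci, cj):
        # distance between the farthest pair of points in clusters ci and cj
        return D[np.ix_(ci, cj)].max()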

--Hoyt





On Sat, May 3, 2008 at 5:31 PM, Keith Goodman [EMAIL PROTECTED] wrote:
 On Sat, May 3, 2008 at 5:05 PM, Christopher Barker
  [EMAIL PROTECTED] wrote:

  Robert Kern wrote:
 I can get a ~20% improvement with the following:
  
  
  In [9]: def mycut(x, i):
     ...:     A = x[:i,:i]
     ...:     B = x[:i,i+1:]
     ...:     C = x[i+1:,:i]
     ...:     D = x[i+1:,i+1:]
     ...:     return hstack([vstack([A,C]),vstack([B,D])])
  
  Might it be a touch faster to build the final array first, then fill it:

  def mycut(x, i):
      r, c = x.shape
      out = np.empty((r-1, c-1), dtype=x.dtype)
      out[:i,:i] = x[:i,:i]
      out[:i,i:] = x[:i,i+1:]
      out[i:,:i] = x[i+1:,:i]
      out[i:,i:] = x[i+1:,i+1:]
      return out

  totally untested.

  That should save the creation of two temporaries.

  Initializing the array makes sense. And it is super fast:

   timeit mycut(x, 6)
  100 loops, best of 3: 7.48 ms per loop
   timeit mycut2(x, 6)
  1000 loops, best of 3: 1.5 ms per loop

  The time it takes to cluster went from about 1.9 seconds to 0.7
  seconds! Thank you.

  When I run the single linkage clustering on my data I get one big
  cluster and a bunch of tiny clusters. So I need to try a different
  linkage method. Average linkage sounds good, but it sounds hard to
  code.


 ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion




-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] types as functions convert 1 elm arrays to scalars

2008-05-01 Thread Hoyt Koepke
To be honest, this doesn't seem justifiable.

Where it got me is interfacing with C code that expected a 1d array,
and I was calling it with arrays of varying length.  I was using
float64() to ensure the proper typing; however, when the array had
length 1, the program crashed...
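
(For what it's worth, the workaround I've ended up with is to convert with
asarray instead, which keeps length-1 arrays as arrays -- a small sketch:)

    import numpy as np

    b1 = np.array([1.0])
    np.float64(b1)                      # gives back the scalar 1.0
    np.asarray(b1, dtype=np.float64)    # stays array([ 1.]), whatever the length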

Should I file a bug report?

--Hoyt

On Mon, Apr 28, 2008 at 11:51 PM, Charles R Harris
[EMAIL PROTECTED] wrote:



 On Tue, Apr 29, 2008 at 12:28 AM, Hoyt Koepke [EMAIL PROTECTED] wrote:
  Hello,
 
  I have a quick question that I'm hoping will improve my numpy
  understanding.  I noticed some behavior when using float64 to convert
  a matrix type that I didn't expect:
 
 
  In [35]: b1 = array([1.0])
 
  In [36]: float64(b1)
  Out[36]: 1.0
 
  In [37]: b2 = array([1.0, 2.0])
 
  In [38]: float64(b2)
  Out[38]: array([ 1.,  2.])
 
 
  I didn't expect calling float64 would convert b1 to a scalar. Seems
  like an inconsistency.  I assume this is intentional, as someone would
  have noticed it a long time ago if not, so could someone explain the
  reasoning behind it?  (or point me to a source that will help?)
 

 It's inconsistent and looks like a bug:

 In [4]: float32(array([[[1]]]))
 Out[4]: array([[[ 1.]]], dtype=float32)

 In [5]: float64(array([[[1]]]))
 Out[5]: 1.0

 Float64 is a bit special because it starts as the python float. Maybe Travis
 can say what the differences are.

 Chuck



 ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion





-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Distribution Functions Change Behavior

2008-04-28 Thread Hoyt Koepke
I may not understand what you are asking, Rich, but I'm not sure I
agree with Alan.  A Gaussian fit to data x should fit exactly as well
as one fit to ax, a > 0, just with a variance a^2 times the original.
The only way this would not be true is if:

1. You are not fitting the variance, but only the mean
2. There's some numerical issue (like some of the data are represented
as integers, etc.)

Don't know if it could be one of those issues...
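
A quick numerical sanity check of the scaling claim (just a sketch):

    import numpy as np

    rng = np.random.RandomState(0)
    x = rng.normal(loc=3.0, scale=2.0, size=100000)
    a = 10.0
    # the fitted mean scales by a and the std by a, so the fit is just as good
    assert np.allclose(np.mean(a * x), a * np.mean(x))
    assert np.allclose(np.std(a * x), a * np.std(x))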

--Hoyt

On Mon, Apr 28, 2008 at 7:18 AM, Alan G Isaac [EMAIL PROTECTED] wrote:
 Hi Rich,

  If your data is truncated at zero, it is not Gaussian (drawn
  from a normal).  You will notice this when you shrink the
  range of values (unless the variance is tiny).

  Cheers,
  Alan Isaac





  ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion




-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Distribution Functions Change Behavior

2008-04-28 Thread Hoyt Koepke
Wait, I think I see what Alan is saying.  When you use a Gaussian
approximation on truncated data, the accuracy of the approximation is
very dependent on where in the interval the mean is.  If it's near the
edges, the results will be worse.  The width of the interval, though,
is a separate factor.
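
A rough numerical illustration of that (a sketch with made-up parameters):

    import numpy as np

    rng = np.random.RandomState(0)
    for mu in (0.5, 5.0):           # mean near the truncation edge vs. far away
        x = rng.normal(mu, 1.0, size=200000)
        x = x[x > 0]                # truncate at zero, as in the original data
        # near the edge the truncated sample mean sits well above mu;
        # far from the edge it recovers mu almost exactly
        print("mu=%g  truncated sample mean=%g" % (mu, x.mean()))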

--Hoyt

On Mon, Apr 28, 2008 at 8:29 AM, Hoyt Koepke [EMAIL PROTECTED] wrote:
 I may not understand what you are asking, Rich, but I'm not sure I
   agree with Alan.  A Gaussian fit to data x should fit exactly as well
   as one fit to ax, a > 0, just with a variance a^2 times the original.
   The only way this would not be true is if:

  1. You are not fitting the variance, but only the mean
  2. There's some numerical issue (like some of the data are represented
  as integers, etc.)

  Don't know if it could be one of those issues...

  --Hoyt



  On Mon, Apr 28, 2008 at 7:18 AM, Alan G Isaac [EMAIL PROTECTED] wrote:
   Hi Rich,
  
If your data is truncated at zero, it is not Gaussian (drawn
from a normal).  You will notice this when you shrink the
range of values (unless the variance is tiny).
  
Cheers,
Alan Isaac
  
  
  
  
  
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
  



  --
  +++
  Hoyt Koepke
  UBC Department of Computer Science
  http://www.cs.ubc.ca/~hoytak/
  [EMAIL PROTECTED]
  +++




-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Generating Bell Curves (was: Using normal() )

2008-04-25 Thread Hoyt Koepke
Another suggestion from machine learning stuff to throw into the mix:

A soft step function that we use often is y = e^(ax) / (1 + e^(ax)).
It has the nice property that the result y is always in (0,1).  If you
invert this, you get x = -(1/a)*log(1/y - 1); this maps (0,1) to the
whole real line, and the a parameter controls how sharp that mapping
is.  Now use that as the input to a Gaussian, and you get a soft
truncation on (0,1).

Additionally, you can do this twice to have more control over the
shape of the resulting distribution.  I suspect the resulting
kurtosis/skew factors would be calculable.

This has the advantage of giving you a calculable pdf (just normalize
the resulting distribution using the inverse det-of-Jacobian factor)
without too much hassle.  Furthermore, it should be easy to fit the
parameters to data (though I haven't tried).
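
To make that concrete, here's a rough sketch of the resulting density on
(0,1); the function and parameter names are mine, not from any library:

    import numpy as np
    from scipy.stats import norm

    def soft_truncated_pdf(y, mu=0.0, sigma=1.0, a=1.0):
        # map (0,1) to the real line with the inverse soft step, evaluate the
        # Gaussian there, and multiply by |dx/dy| so the density integrates to 1
        x = np.log(y / (1.0 - y)) / a            # same as -(1/a)*log(1/y - 1)
        jacobian = 1.0 / (a * y * (1.0 - y))
        return norm.pdf(x, loc=mu, scale=sigma) * jacobian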

Just a thought, though I've never worked with all this much so I can't
say for sure how well it would work.

--Hoyt


On Fri, Apr 25, 2008 at 4:39 PM, Charles R Harris
[EMAIL PROTECTED] wrote:



 On Fri, Apr 25, 2008 at 1:25 PM, Rich Shepard [EMAIL PROTECTED]
 wrote:
 
  On Fri, 25 Apr 2008, Charles R Harris wrote:
 
   You can use something like f(x) = (1-x**2)**2 , which has inflection
   points and vanishes at +/- 1. Any of the B-splines will also do the
 trick.
 
  Chuck,
 
Thank you. I need to make some time to understand the B-splines to use
  them appropriately. Unfortunately, my mathematical statistics learning was
  many years in the past ... but we had moved ahead of writing on clay
 tablets
  by that time. Not needing to retain that knowledge for many years means it
  was replaced by more pressing current knowledge. The B-splines do look
  promising, though.
 
 
 
 

 Here's a B-spline approximation to a Gaussian:
 http://www.doc.ic.ac.uk/~dfg/AndysSplineTutorial/BSplines.html

 Chuck



 ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion





-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] access ndarray in C++

2008-04-23 Thread Hoyt Koepke
Just to add something from personal experience in case it's useful...

I write a lot of code that works on numpy arrays and goes back and
forth between Python and C++ (I'm too used to OO to stick with pure
C).  I've been using scipy.weave to interface to blitz++ arrays in my
C++ code.  The blitz++ library has been wonderful in my experience --
it supports arrays up to 11 dimensions, slicing, etc. -- essentially a
very useful subset of numpy's operations.  It's also nice because it
can interface without copying any data; it just provides a wrapper
class that sits on a numpy array.  The library is included as part of
scipy.

I recently wrote a function that automatically generates wrappers for
C++ functions that accept blitz++ arrays, which I posted to the scipy
list a little while ago.  You just specify function names and the list
of argument types (nparrays included) and it generates the wrapper
code.  Quite easy.  It's pertinent to the discussion here, so I'll
send it again.  Included is a full working example with setup.py.  Let
me know if it helps/is confusing/doesn't work/etc.

--Hoyt





On Wed, Apr 23, 2008 at 12:30 PM, Andreas Klöckner
[EMAIL PROTECTED] wrote:
 On Mittwoch 23 April 2008, Christopher Barker wrote:
   NOTE:
   Most folks now think that the pain of writing extensions completely by
   hand is not worth it -- it's just too easy to make reference counting
   mistakes, etc. Most folks are now using one of:
  
   Cython (or Pyrex)
   SWIG
   ctypes

  IMO, all of these deal better with C than they do with C++. There is also a
  number of more C++-affine solutions:

  - Boost Python [1]. Especially if you want usable C++ integration. (ie. more
  than basic templates, etc.)

  - sip [2]. Used for PyQt.

  Andreas

  [1] http://www.boost.org/doc/libs/1_35_0/libs/python/doc/index.html
  [2] http://www.riverbankcomputing.co.uk/sip/index.php

 ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion





-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++


weave_wrapper.tar.gz
Description: GNU Zip compressed data
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] #734: interactive docstring search (lookfor)

2008-04-10 Thread Hoyt Koepke
My personal 2 cents is that we should put it in the most easy-to-find
place.  I wish I had the proposed lookfor function at my fingertips
when I started to learn numpy/scipy a year ago, coming from matlab.
Like many others (I assume), I just jumped in, learned what numpy
could do by trial and error, and never noticed numpy.lib.utils.info.
Thus my personal vote (though I understand this may not be the best
choice and there are arguments against it) is to put a lookfor or info
function in the top-level namespace, with an obvious name, so newbies
could easily run across it and have a more enjoyable numpy learning
experience.

That's just my humble opinion ...

--Hoyt

On Thu, Apr 10, 2008 at 2:42 PM, Pauli Virtanen [EMAIL PROTECTED] wrote:
 Travis E. Oliphant kirjoitti:
   Pauli Virtanen wrote:
  [clip]

  I think this is a good idea: full-namespace docstring search à la Matlab
   lookfor. I wrote a quick implementation for numpy here:
   http://scipy.org/scipy/numpy/ticket/734
  
  
   Cool.  I started scipy.misc.info a long time ago to try and do this.   I
   didn't advertise it well enough ;-)
  
   scipy.misc.info('eigvals')
  
   I'm happy to move this functionality into numpy (but it might be better
   in IPython).

  Aha! I was wondering if something this simple was already present, but I
  didn't notice that numpy.lib.utils.info also accepted strings as
  parameters. Btw, the info function is already in numpy and the code is
  duplicated in scipy.misc.

  However, info is a bit different from lookfor in that it finds
  documentation by the exact function name, not by looking into the
  docstrings.

  So, should info and lookfor be combined, or kept separate?

  --
  Pauli Virtanen


 ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion




-- 
+++
Hoyt Koepke
UBC Department of Computer Science
http://www.cs.ubc.ca/~hoytak/
[EMAIL PROTECTED]
+++
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] #734: interactive docstring search (lookfor)

2008-04-10 Thread Hoyt Koepke
  In [1]: import numpy

  In [2]: numpy.info('eigvals')
  *** Found in numpy.linalg ***
   eigvals(a)

Fair enough.  I don't know why I missed that; I probably relied too
much on Google search :-)

Having it as part of IPython does make sense.

--Hoyt
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Quickest way to create a symmetric (diagonal???) matrix ?

2008-03-26 Thread Hoyt Koepke
If the rest of the matrix is already zeros and memory wasn't a
problem, you could just use

A_sym = A + A.T - diag(diag(A))

If memory was an issue, I'd suggest weave.inline (if that's a viable
option) or pyrex to do the loop, which would be about as fast as you
could get.
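
For example (a small sketch, assuming the lower triangle really is all zeros):

    import numpy as np

    m = 5
    A = np.triu(np.arange(1.0, m*m + 1).reshape(m, m))  # upper triangle only
    A_sym = A + A.T - np.diag(np.diag(A))
    assert np.allclose(A_sym, A_sym.T)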

--Hoyt



On Wed, Mar 26, 2008 at 7:22 AM, Alexandre Fayolle
[EMAIL PROTECTED] wrote:
 On Wed, Mar 26, 2008 at 09:48:02AM -0400, Pierre GM wrote:
   All,
   What's the quickest way to create a diagonal matrix? I already have the
   elements above the main diagonal. Of course, I could use loops:

   m = 5
   z = numpy.arange(m*m).reshape(m,m)
   for k in range(m):
       for j in range(k+1, m):
           z[j,k] = z[k,j]

   But I was looking for something more efficient.

  From your code, you certainly meant symmetric and not diagonal.

  Maybe you can speed up things a bit by assigning slices:

   for k in range(m):
  ... z[k:, k] = z[k, k:]



  --
  Alexandre Fayolle  LOGILAB, Paris (France)
  Formations Python, Zope, Plone, Debian:  http://www.logilab.fr/formations
  Développement logiciel sur mesure:   http://www.logilab.fr/services
  Informatique scientifique:   http://www.logilab.fr/science


 ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] dunno what array operation I'm looking for...

2008-03-21 Thread Hoyt Koepke
Try

result = A[1:] - A[:-1]
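
(numpy also has this built in as np.diff, which does exactly that:)

    import numpy as np

    measurements = np.array([100, 109, 115, 117])
    np.diff(measurements)      # -> array([9, 6, 2])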

--Hoyt

On Fri, Mar 21, 2008 at 7:43 PM, Chris Withers [EMAIL PROTECTED] wrote:
 Hi All,

  Say I have an array like:

measurements = array([100,109,115,117])

  What do I do to it to get:

  array([9, 6, 2])

  Is the following really the best way?

 >>> result = []
 >>> for i in range(1,len(measurements)):
 ...   result.append(measurements[i]-measurements[i-1])
 ...
 >>> array(result)
 array([9, 6, 2])

  cheers,

  Chris

  --
  Simplistix - Content Management, Zope  Python Consulting
 - http://www.simplistix.co.uk
  ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy.random.RandomState threadsafe?

2008-03-11 Thread Hoyt Koepke
This should be a really quick question.  Is a RandomState object
thread safe?  I'm wanting to use a common RandomState object in a
multithreaded program, and I need to know if it's necessary to protect
it with a lock (which wouldn't be difficult).
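
For reference, the kind of lock wrapper I have in mind is roughly this
(an untested sketch covering just the one method I need):

    import threading
    import numpy as np

    class LockedRandomState(object):
        # serialize access to a shared RandomState across threads
        def __init__(self, seed=None):
            self._rs = np.random.RandomState(seed)
            self._lock = threading.Lock()

        def standard_normal(self, size=None):
            self._lock.acquire()
            try:
                return self._rs.standard_normal(size)
            finally:
                self._lock.release()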

Thanks!
--Hoyt
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.random.RandomState threadsafe?

2008-03-11 Thread Hoyt Koepke
Okay, thanks!  I won't be using the multivariate_normal function in my
code, so this should work fine.

--Hoyt

On Tue, Mar 11, 2008 at 3:00 PM, Robert Kern [EMAIL PROTECTED] wrote:

 On Tue, Mar 11, 2008 at 4:18 PM, Hoyt Koepke [EMAIL PROTECTED] wrote:
   This should be a really quick question.  Is a RandomState object
thread safe?  I'm wanting to use a common RandomState object in a
multithreaded program, and I need to know if it's necessary to protect
it with a lock (which wouldn't be difficult).

  For nearly all of the methods, yes, they should be. RandomState is
  implemented in C (using Pyrex) and the GIL is acquired before calling
  any C functions. The caveat here is that the method
  multivariate_normal() calls back out to Python-implemented functions.
  It is possible that the GIL gets released during that call and that
  another thread can pick up execution then. However, even this should
  not be a problem as far as safety goes; no internal state is read or
  changed after the external call.

  --
  Robert Kern

  I have come to believe that the whole world is an enigma, a harmless
  enigma that is made terrible by our own mad attempt to interpret it as
  though it had an underlying truth.
   -- Umberto Eco
  ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Optimize speed of for loop using numpy

2008-02-25 Thread Hoyt Koepke
I would definitely suggest using scipy's weave.inline for this.  It
seems like this particular function can be translated into C code
really easily, which would give you a HUGE speed up.  Look at some of
the examples in scipy/weave/examples to see how to do this.  The numpy
book also has a section on it.

One of the reasons I've left matlab and never looked back is how easy
it is to interweave bits of compiled C code for loops like this.
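
Just to show the flavor, a toy weave.inline example -- an untested sketch,
assuming scipy.weave and a C++ compiler are set up; the real loop would of
course be more involved:

    import numpy as np
    from scipy import weave
    from scipy.weave import converters

    def inline_sum(a):
        # sum a 1d float array in C++, using blitz indexing for 'a'
        n = int(a.shape[0])
        code = """
        double s = 0.0;
        for (int i = 0; i < n; ++i)
            s += a(i);
        return_val = s;
        """
        return weave.inline(code, ['a', 'n'], type_converters=converters.blitz)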

--Hoyt

On Mon, Feb 25, 2008 at 6:32 PM, Trond Kristiansen [EMAIL PROTECTED] wrote:
 Hi again.

  I have attached the function that the FOR loop is part of as a python file.
  What I am trying to do is to create a set of functions that will read the
  output files (NetCDF) from running the ROMS model (ocean model). The output
  file is organized in xi (x-direction), eta (y-direction), and s
  (z-direction) where the s-values are vertical layers and not depth. This
  particular function (z_slice) will find the closest upper and lower s-layer
  for a given depth in meters (e.g. -100). Then values from the two selected
  layers will be interpolated to create a new layer at the selected depth
  (-100). The problem is that the s-layers follow the bathymetry and a
  particular s-layer will therefore sometimes be above and sometimes below the
  selected depth that we want to interpolate to. That's why I need a quick
  script that searches all of the layers and finds the upper and lower layers
  for a given depth value (which is negative). The z_r is a 3D array
  (s,eta,xi) that is created using the netcdf module.

  The main goal of this set of functions is to move away from using matlab,
  but also to speed things up. The sliced data array will be plotted using GMT
  or pyNGL.

  Thanks for helping me. Cheers, Trond




  On 2/25/08 9:15 PM, Robert Kern [EMAIL PROTECTED] wrote:

   On Mon, Feb 25, 2008 at 8:08 PM, Trond Kristiansen [EMAIL PROTECTED] 
 wrote:
  
  
Hi all.
This is my first email to the discussion group. I have spent two days 
 trying
to get a particular loop to speed up, and the best result I got was this:
  
   Can you try to repost this in such a way that the indentation is
   preserved? Feel free to attach it as a text file. Also, can you
   describe at a higher level what it is you are trying to accomplish and
   what the arrays mean?


 ___
  Numpy-discussion mailing list
  Numpy-discussion@scipy.org
  http://projects.scipy.org/mailman/listinfo/numpy-discussion


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion