Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Nathan Bell
On Thu, Feb 12, 2009 at 8:19 AM, Michael Abshoff
 wrote:
>
> Not even close. The current generation peaks at around 1.2 TFlops single
> precision, 280 GFlops double precision for ATI's hardware. The main
> problem with those numbers is that the memory on the graphics card
> cannot feed the data fast enough into the GPU to achieve theoretical
> peak. So those hundreds of GFlops are pure marketing :)
>

If your application is memory bandwidth limited, then yes, you're not
likely to see 100s of GFlops anytime soon.  However, compute-limited
applications can and do achieve 100s of GFlops on GPUs.  Basic
operations like FFTs and (level 3) BLAS are compute limited, as are
the following applications:
http://www.ks.uiuc.edu/Research/gpu/
http://www.dam.brown.edu/scicomp/scg-media/report_files/BrownSC-2008-27.pdf

> So in reality you might get anywhere from 20% to 60% (if you are lucky)
> locally before accounting for transfers from main memory to GPU memory
> and so on. Given that recent Intel CPUs give you about 7 to 11 GFlops
> double precision per core, and libraries like ATLAS give you that
> performance today without the need to jump through hoops, these numbers
> start to look a lot less impressive.

You neglect to mention that CPUs, which have roughly 1/10th the memory
bandwidth of high-end GPUs, are memory bound on the very same
problems.  You will not see 7 to 11 GFlops on a memory-bound CPU code
for the same reason you argue that GPUs don't achieve 100s of GFlops
on memory-bound GPU codes.

In severely memory-bound applications like sparse matrix-vector
multiplication (i.e. A*x for sparse A) the best performance you can
expect is ~10 GFlops on the GPU and ~1 GFlop on the CPU (in double
precision).  We discuss this problem in the following tech report:
http://forums.nvidia.com/index.php?showtopic=83825
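
A back-of-envelope sketch of the bandwidth argument (the ~100 GB/s and
~10 GB/s figures below are assumed, roughly 2008-era numbers, not taken
from the post): CSR SpMV does 2 flops per nonzero but must stream at
least 12 bytes per nonzero from memory.

bytes_per_nnz = 8 + 4   # one double value + one int32 column index (ignoring x/y traffic)
flops_per_nnz = 2       # one multiply + one add per nonzero

for name, bandwidth in [('GPU', 100e9), ('CPU', 10e9)]:  # assumed bytes/s
    bound = bandwidth / bytes_per_nnz * flops_per_nnz / 1e9
    print('%s bandwidth-bound peak: ~%.0f GFlops' % (name, bound))

# prints ~17 GFlops (GPU) and ~2 GFlops (CPU); counting the extra traffic
# for the x and y vectors pushes these toward the ~10 and ~1 figures above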

It's true that host<->device transfers can be a bottleneck.  In many
cases, the solution is to simply leave the data resident on the GPU.
For instance, you could imagine a variant of ndarray that held a
pointer to a device array.  Of course this requires that the other
expensive parts of your algorithm also execute on the GPU so you're
not shuttling data over the PCIe bus all the time.
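
A toy sketch of that idea (pure Python, no real GPU involved; the
DeviceArray class below is hypothetical and merely counts simulated
host<->device transfers to show that chained operations incur none):

import numpy as np

class DeviceArray(object):
    transfers = 0   # counts simulated trips over the PCIe bus

    def __init__(self, dev_buffer):
        self._dev = dev_buffer               # stand-in for a device pointer

    @classmethod
    def from_host(cls, host_array):
        cls.transfers += 1                   # one host -> device copy
        return cls(np.asarray(host_array, dtype=float).copy())

    def __mul__(self, other):
        # the product never leaves the "device"
        return DeviceArray(self._dev * other._dev)

    def to_host(self):
        DeviceArray.transfers += 1           # one device -> host copy
        return self._dev.copy()

x = DeviceArray.from_host([1.0, 2.0, 3.0])
y = DeviceArray.from_host([4.0, 5.0, 6.0])
z = x * y * y                                # no transfers here
print(z.to_host(), DeviceArray.transfers)    # [ 16.  50. 108.] 3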


Full Disclosure: I'm a researcher at NVIDIA

-- 
Nathan Bell wnb...@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] GMRES internal variables

2009-02-16 Thread Nathan Bell
On Mon, Feb 16, 2009 at 10:49 PM,   wrote:
>
> I'm trying to run multiple instances of GMRES at the same time (one inside
> another actually, used inside of the preconditioner routine) but am
> running in to some problems. My guess is there is a single set of internal
> variables associated with GMRES and when I initiate a new GMRES inside a
> currently running GMRES, the old GMRES data gets the boot.
>
> Is this correct, and if so any suggestions on how to fix this?
>

This recently came up on SciPy-User:
http://thread.gmane.org/gmane.comp.python.scientific.user/19197/focus=19206

One solution:  PyAMG's GMRES implementation (pyamg.krylov.gmres)
should be a drop-in replacement for SciPy's gmres().  Unlike the CG
code mentioned above, you can't use gmres.py without some other
compiled components (i.e. you'll need the whole pyamg package).

We should have this resolved in the next scipy release (either 0.7.x or 0.8).

-- 
Nathan Bell wnb...@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Call for testing: full blas/lapack builds of numpy on windows 64 bits

2009-02-27 Thread Nathan Bell
On Fri, Feb 27, 2009 at 12:31 PM, David Cournapeau
 wrote:
> Hi,
>
>    That's a call for testing for 64-bit Windows users out there:
> please try the following binary with the test suite:
>
> http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy/numpy-1.3.0.dev6517.win-amd64-py2.6.exe
>
> python -c "import numpy; numpy.test()"
>

Running on Vista 64-bit.


C:\Python26>python -c "import numpy; numpy.test()"
Running unit tests for numpy
C:\Python26\lib\site-packages\nose\proxy.py:93: SyntaxWarning:
assertion is always true, perhaps remove parentheses?
  assert (test is self.test
NumPy version 1.3.0.dev6517
NumPy is installed in C:\Python26\lib\site-packages\numpy
Python version 2.6.1 (r26:67517, Dec  4 2008, 16:59:09) [MSC v.1500 64
bit (AMD64)]
nose version 0.10.4
Forcing DISTUTILS_USE_SDK=1



Ran 1787 tests in 8.916s

OK (KNOWNFAIL=6, SKIP=1)


Mission accomplished?

-- 
Nathan Bell wnb...@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Call for testing: full blas/lapack builds of numpy on windows 64 bits

2009-02-27 Thread Nathan Bell
On Fri, Feb 27, 2009 at 2:33 PM, David Cournapeau  wrote:
>
> Great, thanks. Do you have VS installed ? Did you install python for
> all users (I would guess so, but I am not yet clear on all the details
> on that matter).
>

I do not have VS installed.  I just downloaded the official Python
2.6.1 msi installer and blindly clicked 'next' a few times.  I then
installed nose from the source .tar.gz.

Python was installed to C:\Python26\, which I assume means all users.

-- 
Nathan Bell wnb...@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] arrray/matrix nonzero() type

2009-05-27 Thread Nathan Bell
On Wed, May 27, 2009 at 3:26 PM, Nicolas Rougier
 wrote:
>
> Hi again,
>
> I  have a problem with the nonzero() function for matrix.
>
> The following test program:
>
> import numpy, scipy.sparse
>
> Z = numpy.zeros((10,10))
>
> i = Z.nonzero()
> print i
> Zc = scipy.sparse.coo_matrix((Z[i],i))
>
> Z = numpy.matrix(Z)
> i = Z.nonzero()
> print i
> Zc = scipy.sparse.coo_matrix((Z[i],i))
>
>
> Is that the intended behavior ? How can I use nonzero with matrix to
> build the coo one ?
>

Even simpler, just do
Zc = scipy.sparse.coo_matrix(Z)

As of SciPy 0.7, all the sparse matrix constructors accept dense
matrices and array-like objects.

The problem with the matrix case is that Z[i] is rank-2 when a rank-1
array is expected.
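
For example (a sketch, with Z and i as in your script):

import numpy, scipy.sparse

Z = numpy.matrix(numpy.eye(3))

# simplest: let the constructor do the dense -> sparse conversion
Zc = scipy.sparse.coo_matrix(Z)

# or flatten the rank-2 pieces back to rank-1 before using them
i = Z.nonzero()
row = numpy.asarray(i[0]).ravel()
col = numpy.asarray(i[1]).ravel()
data = numpy.asarray(Z[i]).ravel()
Zc2 = scipy.sparse.coo_matrix((data, (row, col)))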

-- 
Nathan Bell wnb...@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] sparse matrix dot product

2009-05-28 Thread Nathan Bell
On Thu, May 28, 2009 at 10:14 AM, Nicolas Rougier
 wrote:
>
> Obviously, the last computation is not a dot product, but I got no
> warning at all. Is that the expected behavior ?
>

Sparse matrices make no attempt to work with numpy functions like
dot(), so I'm not sure what is happening there.

>
> By the way, I wrote a speed benchmark for dot product using the
> different flavors of sparse matrices and I wonder if it should go
> somewhere in the documentation (in any case, if anyone is interested, I
> can post the benchmark program and results).
>

Sparse dot products will ultimately map to sparse matrix
multiplication, so I'd imagine your best bet is to use A.T * B (for
column matrices A and B in csc_matrix format).
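
A small sketch of that approach:

import numpy as np
from scipy.sparse import csc_matrix

# sparse "column vectors" stored as n x 1 csc matrices
a = csc_matrix(np.array([[1.0], [0.0], [2.0]]))
b = csc_matrix(np.array([[3.0], [4.0], [5.0]]))

# a.T * b is a 1x1 sparse matrix whose single entry is the dot product
dot = (a.T * b).toarray()[0, 0]
print(dot)   # -> 13.0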

-- 
Nathan Bell wnb...@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Rasterizing points onto an array

2009-05-31 Thread Nathan Bell
On Sun, May 31, 2009 at 8:39 PM, Thomas Robitaille
 wrote:
> Hi,
>
> I have a set of n points with real coordinates between 0 and 1, given
> by two numpy arrays x and y, with a value at each point represented by
> a third array z. I am trying to then rasterize the points onto a grid
> of size npix*npix. So I can start by converting x and y to integer
> pixel coordinates ix and iy. But my question is, is there an efficient
> way to add z[i] to the pixel given by (xi[i],yi[i])? Below is what I
> am doing at the moment, but the for loop becomes very inefficient for
> large n. I would imagine that there is a way to do this without using
> a loop?
>
> ---
>
> import numpy as np
>
> n = 1000
> x = np.random.random(n)
> y = np.random.random(n)
> z = np.random.random(n)
>
> npix = 100
> ix = np.array(x*float(npix),int)
> iy = np.array(y*float(npix),int)
>
> image = np.zeros((npix,npix))
> for i in range(len(ix)):
>     image[ix[i],iy[i]] = image[ix[i],iy[i]] + z[i]
>

There might be a better way, but histogram2d() seems like a good fit:

import numpy as np

n = 100
x = np.random.random(n)
y = np.random.random(n)
z = np.random.random(n)

npix = 100

bins = np.linspace(0, 1.0, npix + 1)
image = np.histogram2d(x, y, bins=bins, weights=z)[0]

-- 
Nathan Bell wnb...@gmail.com
http://www.wnbell.com/


Re: [Numpy-discussion] Scipy 0.6.0 to 0.7.0, sparse matrix change

2009-06-16 Thread Nathan Bell
On Mon, Jun 15, 2009 at 6:47 AM, Fadhley Salim
 wrote:
>
> I'm trying to track down a numerical discrepancy in our project. We
> noticed that a certain set of results are different having upgraded from
> scipy 0.6.0 to 0.7.0.
>
> The following item from the Scipy change-log is our current number-one
> suspect. Could anybody who knows suggest what was actually involved in
> the change which I have highlighted with stars below?
>
> [...]
>
> The handling of diagonals in the ``spdiags`` function has been changed.
> It now agrees with the MATLAB(TM) function of the same name.
>
> *** Numerous efficiency improvements to format conversions and sparse
> matrix arithmetic have been made.  Finally, this release contains
> numerous bugfixes. ***
>

Can you elaborate on why you think sparse matrices may be the culprit?
 None of the changes between 0.6 and 0.7 should produce different
numerical results (beyond standard floating point margins).

--
Nathan Bell wnb...@gmail.com
http://www.wnbell.com/


Re: [Numpy-discussion] Uncomfortable with matrix change

2008-05-09 Thread Nathan Bell
On Fri, May 9, 2008 at 9:07 AM, Alan G Isaac <[EMAIL PROTECTED]> wrote:
>
> Point of information: it looks like Nathan already made the
> needed fixes, and the changes made were in my opinion not at
> all obscure and indeed were rather minor.  (Which does not
> deny they were needed.)
>

That's correct, the necessary changes to scipy.sparse were not very substantial.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Uncomfortable with matrix change

2008-05-09 Thread Nathan Bell
On Fri, May 9, 2008 at 10:28 AM, Alan G Isaac <[EMAIL PROTECTED]> wrote:
>
> Since I mentioned Nathan's changes, I wish to clarify
> something.  I have no idea what Nathan's views are, but as
> I recall them, it looked to me that his changes would be
> robust to backing out.

That should be true.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Uncomfortable with matrix change

2008-05-09 Thread Nathan Bell
On Fri, May 9, 2008 at 9:56 AM, Charles R Harris
<[EMAIL PROTECTED]> wrote:
>
> Of course, if Nathan has already made the changes we will drive him crazy if
> we back them out now

This shouldn't be a problem, scipy.sparse should work with either

Thanks for your concern though :)

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Uncomfortable with matrix change

2008-05-09 Thread Nathan Bell
On Fri, May 9, 2008 at 12:31 PM, Alan G Isaac <[EMAIL PROTECTED]> wrote:
>
> That's how we got here in the first place, I think.
> Trading off problems in current behavior vs. the possibility
> that other code (like Nathan's) might rely on that bad behavior.
> Uncomfortable either way.

Well, I think we've established that the possibility of breaking
people's code is 1 :)

> One piece of good news: Nathan's fixes were easy,
> so one way forward is not looking too rocky.

True, but scipy.sparse makes fairly limited use of matrix and I have
386 unittests to tell me what broke.  End-users might spend
considerably longer sorting out the problem, particularly if they
don't know what they're looking for.  Personally, I would not have
thought a 1.0 -> 1.1 version bump would break something like this.
Yes, we can put caveats in the release notes, but how many
numpy.matrix users read those?

Some aspects of the change are still broken.  It's probably annoying
to end users when they pass a matrix into a function and get an
ndarray back out.  Now matrix indexing is itself guilty of this
offense.  Furthermore, some users probably *do* rely on the lack of
dimension reduction because that's one of the few differences between
the matrix and ndarray.

Alan, I don't fundamentally disagree with your positions on the
deficiencies/quirks of matrices in numpy.  However, it's completely
inappropriate to plug one hole while creating others, especially in a
minor release.   I suspect that if we surveyed end-users we'd find
that "my code still works" is a much higher priority than "A[0][0] now
does what I expect".

IMO scalar indexing should raise a warning (as you first suggested) in
the 1.1 release.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Uncomfortable with matrix change

2008-05-10 Thread Nathan Bell
On Sat, May 10, 2008 at 3:05 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
>
> I don't expect my opinion to prevail, but the point is that we do not
> even have enough consensus to agree on a recommendation to go in the
> DeprecationWarning. Alas.
>

Would you object to raising a general Warning with a message like the following?

"matrix indexing of the form x[0] is ambiguous, consider the explicit
format x[0,:]"


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] searchsorted() and memory cache

2008-05-13 Thread Nathan Bell
On Tue, May 13, 2008 at 6:59 PM, Andrew Straw <[EMAIL PROTECTED]> wrote:
>  easier and still blazingly fast compared to the binary search
>  implemented in searchsorted() given today's cached memory architectures.

Andrew, I looked at your code and I don't quite understand something.
Why are you looking up single values?

Is it not the case that the overhead of several Python calls
completely dominates the actual cost of performing a binary search
(e.g. in C code)?  For instance, the 'v' argument of searchsorted(a,v)
can be an array:

from scipy import *
haystack = rand(1e7)
needles = haystack[:1].copy()
haystack.sort()
timeit searchsorted(haystack, needles)

100 loops, best of 3: 12.7 ms per loop

Which seems to be much faster than:

timeit for k in needles: x = searchsorted(haystack,k)

10 loops, best of 3: 43.8 ms per loop


The other thing to consider is that a reasonably smart CPU cache
manager may retain the first few levels that are explored in a binary
search tree.  Clearly you could speed this up by increasing the
branching factor of the tree, or using a fast index into the large
array.  However, I think that these effects are being masked by Python
overhead in your tests.

I whipped up a weave implementation of searchsorted() that uses the
STL.  It clocks in at 9.72ms per loop, so I think NumPy's
searchsorted() is fairly good.


import scipy
import scipy.weave

def searchsorted2(a, v):
    N_a = len(a)
    N_v = len(v)
    indices = scipy.empty(N_v, dtype='intc')

    code = """
    for(int i = 0; i < N_v; i++){
        indices(i) = lower_bound(&a(0), &a(N_a), v(i)) - &a(0);
    }
    """

    err = scipy.weave.inline(code,
                             ['a', 'v', 'N_a', 'N_v', 'indices'],
                             type_converters=scipy.weave.converters.blitz,
                             compiler='gcc',
                             support_code='#include <algorithm>')

    return indices


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] NumPtr vs NumPy.i to access C

2008-05-17 Thread Nathan Bell
On Sat, May 17, 2008 at 5:55 PM, Jose Martin <[EMAIL PROTECTED]> wrote:
>
> Hi, I'd like to access a C function from python, and the function takes 
> input/output arrays. I'd probably use SWIG to do the interface to the C code. 
> I found 2 options:
> -NumPtr module, to access Numeric arrays as pointers
> http://www.penzilla.net/tutorials/python/numptr/
> - numpy.i, a SWIG interface file for NumPy that defines typemaps
> http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/swig/doc/numpy_swig.html
>
> I'm not sure if there is significant differences between the 2 options 
> (besides using either NumPy or Numeric). Does numpy.i interface file use 
> pointers to access NumPy arrays? or does it make a copy of the array to pass 
> it to/from the C function?
>
> I'm new to programming and I'd like to make sure of this. I need to use in C 
> very large arrays frequently, so I want to avoid making copies of it, because 
> speed will be an important factor.
>

I'm fairly confident that numpy.i *will not* copy an array unless it
is necessary to do so.  Nice contiguous arrays should be passed
through efficiently.

I use SWIG extensively in scipy.sparse:
http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools

One nice aspect of SWIG is that you can have it automatically dispatch
the right function.  For instance, if you have
foo(float * arr)  and foo(double * arr)
then you can use SWIG typemaps to pick the correct function to use.

You can also make SWIG upcast to the appropriate types.  For example,
if in Python you passed an int array and a double array to:
   foo(double * arr1, double * arr2)
then you can have SWIG automatically upcast the int array to double
and delete it before returning to Python.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] NumPtr vs NumPy.i to access C

2008-05-17 Thread Nathan Bell
On Sat, May 17, 2008 at 6:42 PM, Dag Sverre Seljebotn
<[EMAIL PROTECTED]> wrote:
>
> Cython is a different approach from SWIG (see
> http://wiki.cython.org/WrappingCorCpp; in particular SWIG uses more layers
> of indirection).
>

From the link:
"[SWIG] Can wrap almost any C and C++ code, including templates etc.
Disadvantage is that it produces a C file, this compiles to .so, but
then it also produces a Python wrapper on top of this .so file. So
it's messy and it's slow. Also SWIG is not only targeting Python, but
other languages as well."

I really wish that people didn't spread FUD about SWIG.  For
reference, here's the "messy and slow" Python wrapper for a function
in scipy.sparse:

52  def expandptr(*args):
53  """expandptr(int n_row, int Ap, int Bi)"""
54  return _csr.expandptr(*args)

From:  
http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools/csr.py#L52

I understand that scipy.sparse does not use all features of SWIG
(which may produce uglier wrappers), but I think the above case is a
fair example of what to expect when wrapping typical C/C++ libraries.

More disingenuous FUD here: http://www.sagemath.org/doc/html/prog/node36.html

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] NumPtr vs NumPy.i to access C

2008-05-17 Thread Nathan Bell
On Sat, May 17, 2008 at 7:48 PM, Robert Kern <[EMAIL PROTECTED]> wrote:
> For the purposes to which SWIG was applied in that case, the findings
> are accurate.

IMO it's deliberately misleading.  The following three layers are
spurious and have no analog on the Cython stack:
   Python code to provide a clean interface
   Handcode C++ Integer class
   GMP's C++ Interface

A more honest comparison would list 3 layers for SWIG vs. 2 layers for Cython.

I don't have a hard time believing that Cython is a better choice for
fine-grained access to C/C++ code.  However contrived scenarios like
the above don't inspire my confidence either.  I have yet to see a
benchmark that reveals the claimed benefits of Cython.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] NumPtr vs NumPy.i to access C

2008-05-17 Thread Nathan Bell
On Sat, May 17, 2008 at 9:30 PM, Brian Granger <[EMAIL PROTECTED]> wrote:
>
> Please correct any new errors I have introduced.
>

Thanks Brian, I think that's a fair representation.

Minor typo "course grained" -> "coarse-grained"

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Nathan Bell
On Thu, May 22, 2008 at 2:16 PM, Andrea Gavana <[EMAIL PROTECTED]> wrote:
> By the way, about the solution Francesc posted:
>
> xyzReq = (xCent >= xMin) & (xCent <= xMax) &  \
>(yCent >= yMin) & (yCent <= yMax) &  \
>(zCent >= zMin) & (zCent <= zMax)
>

You could implement this with inplace operations to save memory:
xyzReq = (xCent >= xMin)
xyzReq &= (xCent <= xMax)
xyzReq &= (yCent >= yMin)
xyzReq &= (yCent <= yMax)
xyzReq &= (zCent >= zMin)
xyzReq &= (zCent <= zMax)


> Do you think is there any chance that a C extension (or something
> similar) could be faster? Or something else using weave? I understand
> that this solution is already highly optimized as it uses the power of
> numpy with the logic operations in Python, but I was wondering if I
> can make it any faster

A C implementation would certainly be faster, perhaps 5x faster, due
to short-circuiting the AND operations and the fact that you'd only
pass over the data once.

OTOH I'd be very surprised if this is the slowest part of your application.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] ufunc oddities

2008-05-24 Thread Nathan Bell
On Sat, May 24, 2008 at 10:24 PM, Charles R Harris
<[EMAIL PROTECTED]> wrote:
>
>
> Shouldn't that be the other way round? If you want integers, do
> x.sum(dtype=int). Ints don't sum in float64 by default.
>

The default behavior (x.sum() -> int) is more useful than (x.sum() ->
bool) since x.any() already exists.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] ufunc oddities

2008-05-24 Thread Nathan Bell
On Sat, May 24, 2008 at 10:32 PM, Charles R Harris
<[EMAIL PROTECTED]> wrote:
>
> But the accumulator is of the same kind unless the kind is boolean, in which
> case it is integer. Clear as a bell.
>

I believe the rule is that any integer type smaller than the machine
word size is effectively upcast.

IMO this is desirable since small types are very likely to overflow.
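
For instance (a quick illustration; the exact accumulator type is
platform dependent):

import numpy as np

x = np.array([100, 100], dtype=np.int8)  # 200 would overflow int8
print(x.sum(), x.sum().dtype)            # e.g. 200 int64: upcast to a machine word

b = np.array([True, True, True])
print(b.sum(), b.sum().dtype)            # booleans accumulate as integers: 3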

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/



Re: [Numpy-discussion] ufunc oddities

2008-05-24 Thread Nathan Bell
On Sat, May 24, 2008 at 10:40 PM, Charles R Harris
<[EMAIL PROTECTED]> wrote:
>
> The question is consistency. A programmer should just have to remember a few
> simple rules, not a host of special cases. It makes things easier to learn
> and the code easier to understand because the intent is always made clear.
> Designing to whatever happens to be convenient at the moment leads to a mess
> and trying to document all the oddities is a PITA.
>

Sometimes "do what I expect" is better than rigid consistency.  I
would argue that avoiding common overflow cases is more important than
preserving the dtype when summing.

Anyway, the point is moot.  There's no way to change x.sum() without
breaking lots of code.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] ufunc oddities

2008-05-24 Thread Nathan Bell
On Sat, May 24, 2008 at 10:48 PM, Charles R Harris
<[EMAIL PROTECTED]> wrote:
>
> You can't overflow in modular arithmetic, which is how numpy is supposed to
> work. Try
>
> In [51]: x
> Out[51]: array([2147483647, 2147483647])
>
> In [52]: x.sum()
> Out[52]: -2
>

I would call that an overflow.

Have you considered that other people might have a different notion of
"how numpy is supposed to work"?

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Binary ufuncs: minimum

2008-05-27 Thread Nathan Bell
On Tue, May 27, 2008 at 5:07 PM, Christopher Barker
<[EMAIL PROTECTED]> wrote:
>
> Sure, it could be fixed in this case by promoting to a larger type, but
> it's going to fail at the largest integer anyway, and I don't think any
> expects abs() to return a new type.
>

I think he was advocating using the corresponding unsigned type, not a
larger type.

e.g.
abs(int8) -> uint8
abs(int64) -> uint64
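
A quick illustration of the motivation (the uint8 return is the
proposal, not current behavior):

import numpy as np

x = np.array([-128], dtype=np.int8)
print(np.abs(x))         # [-128]: +128 is not representable in int8, so abs overflows
print(np.abs(x).dtype)   # int8: current behavior preserves the type

# under the proposal the result would live in uint8, where +128 fits:
print(np.abs(x.astype(np.int16)).astype(np.uint8))   # [128]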

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Binary ufuncs: minimum

2008-05-27 Thread Nathan Bell
On Tue, May 27, 2008 at 5:39 PM, Christopher Barker
<[EMAIL PROTECTED]> wrote:
>
> I'm not so sure. I know I wouldn't expect to get a different type back
> with a call to abs(). Do we really want to change that expectation just
> for the case of MIN_INT?
>
> While everyone is going to want an unsigned value when calling abs(),
> who knows if they might want to use negative numbers later? Like:
>
> x = abs(x)
> x *= -1
>
> Now what do we get/want?

IMO abs() returning non-negative numbers is a more fundamental
property.  In-place operations on integer arrays are somewhat
dangerous, and best left to more sophisticated users anyway.


Interestingly, MATLAB (v7.5.0) takes a different approach:

>> A = int8([ -128, 1])
A =
  -128    1
>> abs(A)
ans =
   127    1
>> -A
ans =
   127   -1


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Binary ufuncs: minimum

2008-05-27 Thread Nathan Bell
On Tue, May 27, 2008 at 7:08 PM, Christopher Barker
<[EMAIL PROTECTED]> wrote:
>
> Exactly. However, while I'd like to guarantee that abs(x) >= 0, the
> truth is that numpy is "close to the metal" in a lot of ways, and anyone
> should know that the arithmetic of integers near max and minimum values
> is fraught with danger.
>

It would be a mistake to assume that many/most NumPy users know the
oddities of two's complement signed integer representations.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] New documentation web application

2008-05-29 Thread Nathan Bell
On Thu, May 29, 2008 at 5:28 PM, Stéfan van der Walt <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> The NumPy documentation project has taken another leap forward!  Pauli
> Virtanen has, in a week of superhuman coding, produced a web
> application that enhances the work-flow and editing experience of
> NumPy docstrings on the web.
>
> Unfortunately, this means that those of you who signed up before will
> have to create new accounts at
>
> http://sd-2116.dedibox.fr/pydocweb

Neat!  I really like the layout.  The red format warnings are a nice touch:
http://sd-2116.dedibox.fr/pydocweb/doc/numpy.core.umath.exp/

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] get range of numpy type

2008-06-03 Thread Nathan Bell
On Wed, Jun 4, 2008 at 12:16 AM, Christopher Burns <[EMAIL PROTECTED]> wrote:
> Is there a way to get the range of a numpy type?  I'd like to clamp a
> parameter to be within the range of a numpy type, np.uint8, np.uint32...
>
> Something like:
> if x > max_value_of(np.uint8):
>x = max_value_of(np.uint8)

That kind of information is available via numpy.finfo() and numpy.iinfo():

In [12]: finfo('d').max
Out[12]: 1.7976931348623157e+308

In [13]: iinfo('i').max
Out[13]: 2147483647

In [14]: iinfo(uint8).max
Out[14]: 255
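
So the clamping could look like (a sketch):

import numpy as np

x = 300
x = min(x, np.iinfo(np.uint8).max)   # -> 255

# or, for arrays, clamp both ends at once:
a = np.array([-5, 100, 300])
info = np.iinfo(np.uint8)
print(np.clip(a, info.min, info.max))   # [  0 100 255]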


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] A memory problem: why does mmap come up in numpy.inner?

2008-06-04 Thread Nathan Bell
On Wed, Jun 4, 2008 at 9:50 PM, Dan Yamins <[EMAIL PROTECTED]> wrote:
>>
>> In [3]: numpy.dtype(numpy.uintp).itemsize
>> Out[3]: 4
>>
>> which is the size in bytes of the integer needed to hold a pointer. The
>> output above is for 32 bit python/numpy.
>>
>
> Check, the answer is 4, as you got for the 32-bit.   What would the answer
> be on a 64-bit architecture?  Why is this diagnostic?

It would be 8 on a 64-bit architecture (with a 64-bit binary):  8
bytes = 64 bits, 4 bytes = 32 bits.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Switching to nose test framework (was: NumpyTest problem)

2008-06-08 Thread Nathan Bell
On Sun, Jun 8, 2008 at 6:15 AM, Alan McIntyre <[EMAIL PROTECTED]> wrote:
> Right now there's "if __name__ == '__main__'" boilerplate at the end
> of every test module; nose can find and run tests without that being
> there.  I wanted to get the list's take on removing that--it just
> means that if you want to run the tests in one module, you have to do
> "nosetests numpy/blah/test_foo.py" instead of "python
> numpy/blah/test_foo.py".  Is there a reason not to remove that
> boilerplate since we've already decided to use nose?
>

When making frequent changes to test_foo.py, it's often nice to run
test_foo.py directly, rather than installing the whole package and
then testing via nose.

I would leave the decision up to the maintainers of the individual
submodules.  Personally, I find it useful to have.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Syntax differences

2008-06-20 Thread Nathan Bell
On Fri, Jun 20, 2008 at 4:28 AM, izak marais <[EMAIL PROTECTED]> wrote:
> Hi
>
> Is there a reason why the syntax is rand(n,n), but zeros((n,n)) to create an
> n*n array? If not, perhaps this could be cleaned up?
>

SciPy's rand() is just a convenience wrapper:


>>> help(rand)
Help on built-in function rand:

rand(...)
Return an array of the given dimensions which is initialized to
random numbers from a uniform distribution in the range [0,1).

rand(d0, d1, ..., dn) -> random values

Note:  This is a convenience function. If you want an
interface that takes a tuple as the first argument
use numpy.random.random_sample(shape_tuple).


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Sparse Matrices in Numpy -- (with eigenvalue algorithms if possible)

2008-06-23 Thread Nathan Bell
On Mon, Jun 23, 2008 at 3:36 PM, Dan Yamins <[EMAIL PROTECTED]> wrote:
> I'm wondering if there is a currently-available Sparse-Matrix package for
> numpy?  If so, how do I get it?And, if there is a good sparse-matrix
> package, does it include an eigenvalue-computation algorithm?  How would a
> sparse-matrix package interact with something like numpy.linalg.eig, or for
> that matter any of the other numpy modules?
>

The next version of SciPy will include two sparse eigensolvers: ARPACK
and LOBPCG

http://scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/eigen

If you're familiar with MATLAB's eigs(), then you'll find ARPACK easy to use.
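
For reference, a sketch against the interface as it was eventually
released (the names scipy.sparse.diags and scipy.sparse.linalg.eigsh
are from modern SciPy, not the SVN layout linked above):

import numpy as np
import scipy.sparse
import scipy.sparse.linalg

# 1-D Poisson matrix: the standard symmetric tridiagonal test problem
n = 100
A = scipy.sparse.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format='csr')

# six eigenvalues of smallest magnitude, like MATLAB's eigs(A, 6, 'sm')
vals, vecs = scipy.sparse.linalg.eigsh(A, k=6, which='SM')
print(np.sort(vals))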


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Sparse Matrices in Numpy -- (with eigenvalue algorithms if possible)

2008-06-23 Thread Nathan Bell
On Mon, Jun 23, 2008 at 10:23 PM, Dan Yamins <[EMAIL PROTECTED]> wrote:

>
> 1) Does your comment imply that the current version of scipy does _not_ have
> an eigensolver? (I installed it but can't find an eigensolver in it ... )

That's correct, SciPy 0.6 does not have a sparse eigensolver.

>
> 2) How would I go about installing the ARPACK and/or LOBPCG routines that do
> exist now?  Or is it too early and they're not quite ready yet?
>

You will need to install the development version of SciPy from SVN:
http://www.scipy.org/Download#head-d0ec9f4cfbf6ed3029a2c409b6d9899586eee0e3
http://www.scipy.org/Installing_SciPy

I can confirm that both solvers work.

> 3) Finally: (and this is slightly off topic) -- I've just installed SciPy
> for the first time.  In the readme.txt for the download it mentioned
> something about having LAPACK and ATLAS installed.  However, on the scipy
> install page (for OSX, http://www.scipy.org/Installing_SciPy/Mac_OS_X), it
> doesn't mention anything about LAPACK and ATLAS -- and the instructions
> there that I followed worked without any need for those packages.Do I
> need them?  Or are they only necessary for making certain routines faster?
>

I don't know about OSX specifically, but my understanding is that you
can build NumPy and SciPy without those libraries.  Performance in
certain dense linear algebra operations will be slower, but
scipy.sparse will be unchanged.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] "import numpy" is slow

2008-07-31 Thread Nathan Bell
On Thu, Jul 31, 2008 at 7:31 AM, Hanni Ali <[EMAIL PROTECTED]> wrote:
>
> I would just like to highlight an alternate use of numpy, as opposed to
> interactive use. We have a cluster of machines which process tasks on an
> individual basis, where a master task may spawn 600 slave tasks to be
> processed. These tasks are spread across the cluster and processed as
> scripts in an individual python thread. Although reducing the process time
> by 300 seconds for the master task is only about a 1.5% speedup (total time
> can be in excess of 24000s), we process a large number of these tasks in
> any given year and every little helps!
>

There are other components of NumPy/SciPy that are more worthy of
optimization.  Given that programmer time is a scarce resource, it's
more sensible to direct our efforts towards making the other 98.5% of
the computation faster.

/law of diminishing returns

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] "import numpy" is slow

2008-07-31 Thread Nathan Bell
On Thu, Jul 31, 2008 at 5:36 AM, Andrew Dalke <[EMAIL PROTECTED]> wrote:
>
> The user base for numpy might be .. 10,000 people?  100,000 people?
> Let's go with the latter, and assume that, with command-line scripts,
> CGI scripts, and the other programs that people write in order to
> help do research, numpy is started on average 10 times a day.
>
> 100,000 people * 10 times / day * 0.1 seconds per startup
>= almost 28 people-hours spent each day waiting for numpy to start.
>
> I'm willing to spend a few days to achieve that.
>
>
> Perhaps there's fewer people than I'm estimating.  OTOH, perhaps
> there are more imports of numpy per day.  An order of magnitude less
> time is still a couple of hours each day as the world waits to import
> all of the numpy libraries.
>
> If on average people import numpy 10 times a day and it could be made
> 0.1 seconds faster then that's 1 second per person per day.  If it
> takes on average 5 minutes to learn to import the module directly and
> the onus is all on numpy, then after 1 year of use the efficiency has
> made up for it, and the benefits continue to grow.
>

Just think of the savings that could be achieved if all 2.1 million
Walmart employees were outfitted with colostomy bags.

   0.5 hours / day for bathroom breaks * 2,100,000 employees * 365
days/year * $7/hour = $2,682,750,000/year

Granted, I'm probably not the first to run these numbers.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Renaming MaskedArray._basedict

2008-08-03 Thread Nathan Bell
On Sun, Aug 3, 2008 at 1:29 PM, Pierre GM <[EMAIL PROTECTED]> wrote:
>
> I intend to put a DeprecationWarning at the beginning of numpy.ma.core. A
> side-effect is that this warning is issued each time numpy.ma is imported,
> which clutters the output of the tests.
> Am I missing something ?

Can you trap it in __getattr__ instead?  For instance:
http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/csr.py#L87
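
A minimal sketch of that suggestion (the attribute names are assumed
for illustration, not checked against the numpy.ma source):

import warnings

class MaskedArray(object):
    def __init__(self):
        self._optinfo = {}           # the new attribute name (assumed)

    @property
    def _basedict(self):             # old name kept as a warning property
        warnings.warn("_basedict is deprecated, use _optinfo instead",
                      DeprecationWarning, stacklevel=2)
        return self._optinfo

a = MaskedArray()
a._basedict   # warns here, on use, rather than at import time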


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] generating float sequences

2008-08-22 Thread Nathan Bell
On Fri, Aug 22, 2008 at 9:35 PM, Chris Fonnesbeck <[EMAIL PROTECTED]> wrote:
> Is there any prospect for easily getting ranges of floats in numpy, rather 
> than
> just integers? In R, for example, you can specify a decimal value for the step
> size. It would be great to be able to do the following:
>
> arange(0, 100, 0.1)
>

It appears to be great already :)

In other words, arange(0, 100, 0.1) does exactly what you want.
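
One caveat worth noting (not in the original exchange): with a float
step, rounding decides how close the last point gets to the endpoint,
so linspace is often the safer spelling when the exact count and
endpoints matter:

import numpy as np

a = np.arange(0, 100, 0.1)
print(len(a), a[-1])        # usually 1000 points, ending near 99.9

b = np.linspace(0, 99.9, 1000)
print(len(b), b[-1])        # exactly 1000 points, ending at exactly 99.9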

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Dealing with types in extension modules

2008-09-13 Thread Nathan Bell
On Wed, Sep 10, 2008 at 11:59 PM, Lane Brooks <[EMAIL PROTECTED]> wrote:
>
> When writing an numpy extension module, what is the preferred way to
> deal with the all the possible types an ndarray can have?
>
> I have some data processing functions I need to implement and they need
> to be generic and work for all the possible numerical dtypes.  I do not
> want to have to re-implement the same C-code for all the possible types,
> so the way I approached it was to use a C++ template function to
> implement the processing.  Then I have a dispatching function that
> checks the type of the input ndarray and calls the correct template.  Is
> there a better way?
>

In scipy.sparse there is a 'sparsetools' extension module that uses
C++ templates to support 14 (or so) numpy types.  In sparsetools, SWIG
typemaps are used to dispatch the correct template based on the numpy
data type.  For instance, a typemap decides to call foo<float> if the
array has type PyArray_FLOAT.

SWIG can be a little overwhelming, but it automates dispatching and
can even do automatic upcasting for you.  For instance, if your C++
function is foo(double, double) and you pass it (float,int) then SWIG
can upcast both arguments to double before passing them into foo().
Also, you can force arrays to be C-contiguous (as opposed to strided
or Fortran order) and have the correct endianness.

Basically, you can use SWIG to tame the input numpy arrays.

The sparsetools module is here:
http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools

The file numpy.i does most of the heavy lifiting.  For each sparse
matrix format, for instance 'csr', there is a separate SWIG file
('csr.i') that instantiates each function in the corresponding header
file ('csr.h').  The output of 'swig -c++ -python csr.i' is the C++
source of the extension module ('csr_wrap.cxx').  If you look at one of these
(massive!) files you can see how SWIG dispatches the appropriate
function.

Also, in order to support complex types (e.g. complex128), there is a
wrapper in complex_ops.h that overloads the standard arithmetic
operators for complex numbers.  Then complex types are handled exactly
as above.

http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools/complex_ops.h

Depending on your problem, SWIG may be overkill.  OTOH you may be able
to get everything to need from the sparsetools source code.  Feel free
to pillage it as you require :)

Should you go the SWIG path, I can help explain some of the more cryptic parts.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] 1.2.0rc2 tagged! --PLEASE TEST--

2008-09-15 Thread Nathan Bell
On Sat, Sep 13, 2008 at 7:00 PM, Jarrod Millman <[EMAIL PROTECTED]> wrote:
>
> Please test this release ASAP and let us know if there are any
> problems.  If there are no show stoppers, this will become the
> 1.2.0 release.
>

Looks OK on Ubuntu 8.04 x86_64.  Thanks for your continued work!

$ python -c 'from numpy import test; test()'
Running unit tests for numpy
NumPy version 1.2.0rc2
NumPy is installed in /usr/lib/python2.5/site-packages/numpy
Python version 2.5.2 (r252:60911, Jul 31 2008, 17:31:22) [GCC 4.2.3
(Ubuntu 4.2.3-2ubuntu7)]
nose version 0.10.3
...K..
 

--
Ran 1726 tests in 8.813s

OK (KNOWNFAIL=1)


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


[Numpy-discussion] distutils chooses wrong arch flags

2008-09-16 Thread Nathan Bell
Could someone with a better knowledge of distutils look over the
following SciPy ticket:
http://scipy.org/scipy/scipy/ticket/738

Short version: distutils compiles with -march=pentium-m on a machine
that can't execute SSE2 instructions.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] array indices and dictionary

2008-09-21 Thread Nathan Bell
On Sun, Sep 21, 2008 at 1:28 PM, Dinesh B Vadhia
<[EMAIL PROTECTED]> wrote:
>
> But, I want to pick up the column indices of the non-zero elements in each row.
>

http://www.scipy.org/Numpy_Example_List_With_Doc#nonzero
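
For example:

import numpy as np

A = np.array([[1, 0, 2],
              [0, 0, 3],
              [4, 5, 0]])

rows, cols = A.nonzero()           # paired row/column indices of the nonzeros
for i in range(A.shape[0]):
    print(i, cols[rows == i])      # column indices of the nonzeros in row i
# 0 [0 2]
# 1 [2]
# 2 [0 1]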

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Minimum distance between 2 paths in 3D

2008-09-27 Thread Nathan Bell
On Sat, Sep 27, 2008 at 11:18 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
>
> I think a kd-tree implementation would be a valuable addition to
> scipy, perhaps in a submodule scipy.spatial that might eventually
> contain other spatial data structures and algorithms. What do you
> think? Should we have one? Should it be based on Sturla Molden's code,
> if the license permits? I am willing to contribute one, if not.

+1

If you're implementing one, I would highly recommend the
"left-balanced" kd-tree.
http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/2535/pdf/imm2535.pdf

Here's one implementation (undetermined license):
http://www2.imm.dtu.dk/~jab/software.html#kdtrees

I have a C++ implementation based on the one above that does a few
more operations (like nearest-n neighbors) that you're welcome to use.
 The use of nth_element() in the STL makes left-balancing fairly
straightforward.

FWIW I also have a pure python implementation here:
http://code.google.com/p/pydec/source/browse/trunk/pydec/math/kd_tree.py

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] nonuniform scatter operations

2008-09-27 Thread Nathan Bell
On Sun, Sep 28, 2008 at 12:34 AM, Geoffrey Irving <[EMAIL PROTECTED]> wrote:
>
> Is there an efficient way to implement a nonuniform gather operation
> in numpy?  Specifically, I want to do something like
>
> n,m = 100,1000
> X = random.uniform(size=n)
> K = random.randint(n, size=m)
> Y = random.uniform(size=m)
>
> for k,y in zip(K,Y):
>X[k] += y
>
> but I want it to be fast.  The naive attempt "X[K] += Y" does not
> work, since the slice assumes the indices don't repeat.
>

I don't know of a numpy solution, but in scipy you could use a sparse
matrix to perform the operation.  I think the following does what you
want.

from scipy.sparse import coo_matrix
X += asarray(coo_matrix((Y, (K, zeros(m, dtype=int))), shape=(n, 1)).sum(axis=1)).ravel()

This reduces to a simple C++ loop, so speed should be good:
http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools/coo.h#L139
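
For readers with a newer NumPy: the same unbuffered scatter-add later
landed in numpy itself as np.add.at (well after this thread), so the
loop can now be written directly as:

import numpy as np

n, m = 100, 1000
X = np.random.uniform(size=n)
K = np.random.randint(n, size=m)
Y = np.random.uniform(size=m)

np.add.at(X, K, Y)   # repeated indices in K accumulate, unlike X[K] += Y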

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Are there command similar as Matlab find command?

2008-09-29 Thread Nathan Bell
On Mon, Sep 29, 2008 at 5:32 PM, frank wang <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am trying to find a command in numpy or python that performs a similar
> function to Matlab's find command. It will return the indexes of an array
> that satisfy a condition. So far I have not found anything.
>

If you're familiar with MATLAB, look here:
http://www.scipy.org/NumPy_for_Matlab_Users

In the table you'll find the following equivalence:
 find(a>0.5) <-> where(a>0.5)
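
For example:

import numpy as np

a = np.array([0.1, 0.7, 0.3, 0.9])
print(np.where(a > 0.5)[0])   # [1 3], the indices MATLAB's find() would return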

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] nonuniform scatter operations

2008-09-29 Thread Nathan Bell
On Sun, Sep 28, 2008 at 4:15 PM, Geoffrey Irving <[EMAIL PROTECTED]> wrote:
>
> Thanks.  That works great.  A slightly cleaner version is
>
> X += coo_matrix((Y, (K, zeros_like(K)))).sum(axis=1)
>
> The next question is: is there a similar way that generalizes to the
> case where X is n by 3 and Y is m by 3 (besides the obvious loop over
> range(3), that is)?
>

You could flatten the arrays and make a single matrix that implemented
the operation.  I'd stick with the loop over range(3) though, it's
more readable and likely to be as fast or faster than flattening the
arrays yourself.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] What is the sign of nan?

2008-09-29 Thread Nathan Bell
On Tue, Sep 30, 2008 at 1:20 AM, Robert Kern <[EMAIL PROTECTED]> wrote:
>
> F.9.9.2 The fmax functions
> 1 If just one argument is a NaN, the fmax functions return the other
> argument (if both arguments are NaNs, the functions return a NaN).
> 2 The body of the fmax function might be
> {return (isgreaterequal(x, y) ||
>  isnan(y)) ? x : y; }
>
> If we want to follow C99 semantics rather than our own
> NaN-always-propagates semantics, then we should do this instead.
>

+1 for NaN-always-propagates since we have explicit variants for the
alternative semantics.

Users are more likely to remember that "NaNs always propagate" than
"as stated in the C99 standard...".

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Nathan Bell
On Tue, Sep 30, 2008 at 6:10 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
>
> Well, the problem with this is that you often want to provide a
> distance upper bound as well as a number of nearest neighbors. For

This use case is also important in scattered data interpolation, so we
definitely want to support it.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Proposal: scipy.spatial

2008-09-30 Thread Nathan Bell
On Wed, Oct 1, 2008 at 1:26 AM, Gael Varoquaux
<[EMAIL PROTECTED]> wrote:
>
> Absolutely. I just think k should default to None, when
> distance_upper_bound is specified. k=None could be interpreted as k=1
> when distance_uppper_bound is not specified.
>

Why not expose the various possibilities through different names?

# nearest k points (possibly fewer)
query_nearest(pt, k=1)

# all points within given distance
query_sphere(pt, distance)

#nearest k points within given distance (possibly fewer)
query(pt, k, distance)

Few people will use the last form, but it's useful nevertheless.
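
For reference, the interface that eventually shipped in scipy.spatial
folds the first and third forms into one method; a sketch against that
released API:

import numpy as np
from scipy.spatial import KDTree

pts = np.random.rand(1000, 3)
tree = KDTree(pts)
pt = np.array([0.5, 0.5, 0.5])

d, i = tree.query(pt, k=1)                              # nearest neighbor
d5, i5 = tree.query(pt, k=5, distance_upper_bound=0.2)  # k nearest within a radius
                                                        # (missing ones return inf)
near = tree.query_ball_point(pt, r=0.2)                 # all points within the radius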

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] Problem building the Numpy-Swig tests

2008-10-03 Thread Nathan Bell
On Fri, Oct 3, 2008 at 11:21 AM, Michel Dupront
<[EMAIL PROTECTED]> wrote:
>
> I was using swig 1.3.24.
> I installed the latest swig version, 1.3.36, and now it is working fine!
> And it makes me very very happy!!!
>

SWIG often has that effect on people :)

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/
Satisfied SWIG customer


Re: [Numpy-discussion] docs.scipy.org -- new site for the documentation marathon

2008-10-26 Thread Nathan Bell
On Sun, Oct 26, 2008 at 10:13 AM, Pauli Virtanen <[EMAIL PROTECTED]> wrote:
>
> Now, the role of docs.scipy.org warrants discussion, because on the one
> hand, the domain "docs.scipy.org" looks very official, and on the other
> hand, "scipy.org/Documentation" claims to be the place for official
> documentation. What do you think: should we use the current front page
> of docs.scipy.org, shifting the focus and entry point of documentation
> to the Sphinx-generated pages, or still keep the focus on the Moin wiki
> at scipy.org?
>
> I'd like to see a move towards a full dedicated documentation site -- I
> believe using Sphinx is the way to go in the future, and documentation
> generated by it does have a reassuring and clear visual appearance.
> Also, making the Sphinx documentation more prominent could help people
> to focus documentation efforts, and write more of it :)
>

Great work, I definitely like the look of the Sphinx documentation.

As a concrete example of your question, I've been working on some
scipy.sparse documentation [1], and it seems like it will be fairly
lengthy when completed.  Would it be appropriate to merge it into the
Sphinx documentation for scipy.sparse [2], or should the Sphinx docs
be more concise?

[1] http://www.scipy.org/SciPyPackages/Sparse
[2] http://docs.scipy.org/doc/scipy/reference/sparse.html

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] n-dimensional array indexing question

2008-11-08 Thread Nathan Bell
On Sat, Nov 8, 2008 at 5:35 PM, dmitrey <[EMAIL PROTECTED]> wrote:
> hi all,
> I have array A, A.ndim = n, and 1-dimensional array B of length n.
> How can I get the element of A with coords B[0],...,B[n-1]?
> i.e. A[B[0], B[1], ..., B[n-1])
>
> A, B, n are not known till execution time, and can have unpredictable
> lengths (still n is usually small, no more than 4-5).
>
> I have tried via ix_ but haven't succeeded yet.
>

There's probably a better way, but tuple(B) works:

In [1]: from numpy import *

In [2]: A = arange(8).reshape(2,2,2)

In [3]: A
Out[3]:
array([[[0, 1],
[2, 3]],

   [[4, 5],
[6, 7]]])

In [4]: B = array([0,1,0])

In [5]: A[tuple(B)]
Out[5]: 2


-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/