Re: [Numpy-discussion] Closing some tickets.
Hi,

Is there official support for MSVC 2005? Last time I tried to compile Python with it, it couldn't build extensions. If MSVC 2005 is not officially supported, at least by Python itself, I'm not sure Numpy can support it either.

Matthieu

2008/5/22 Charles R Harris [EMAIL PROTECTED]:
> All,
>
> Can we close ticket #117 and add Pearu's comment to the FAQ?
> http://projects.scipy.org/scipy/numpy/ticket/117
>
> Can someone with MSVC 2005 check if we can close ticket #164?
> http://projects.scipy.org/scipy/numpy/ticket/164
>
> Chuck

--
French PhD student
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Re: [Numpy-discussion] Closing some tickets.
Matthieu Brucher wrote:
> Hi, Is there official support for MSVC 2005? Last time I tried to compile
> Python with it, it couldn't build extensions. If MSVC 2005 is not
> officially supported, at least by Python itself, I'm not sure Numpy can
> support it either.

Python 2.5 used 2003 (VS 7? I am always confused by their version numbering scheme), and Python 2.6/3.0 will use 2008 (VS 9?) AFAIK. I don't think it makes sense to support a compiler which is not used for official binaries and is already superseded by a newer version.

cheers,

David
Re: [Numpy-discussion] first recarray steps
Anne Archibald wrote:
> 2008/5/21 Vincent Schut [EMAIL PROTECTED]:
>> Christopher Barker wrote:
>>> Also, if your image data is rgb, usually that's a (width, height, 3)
>>> array: rgbrgbrgbrgb... in memory. If you have a (3, width, height)
>>> array, then that's rrr... Some image libs may give you that, I'm not
>>> sure.
>>
>> My data is. In fact, this is a simplification of my situation; I'm
>> processing satellite data, which usually has more (and other) bands
>> than just rgb. But the data is definitely in shape (bands, y, x).
>
> You may find your life becomes easier if you transpose the data in
> memory. This can make a big difference to efficiency. Years ago I was
> working with enormous (by the standards of the day) MATLAB files on
> disk, storing complex data. The way (that version of) MATLAB represented
> complex data was the way you describe: matrix of real parts, matrix of
> imaginary parts. This meant that to draw a single pixel, the disk needed
> to seek twice... Depending on what sort of operations you're doing,
> transposing your data so that each pixel is all in one place may improve
> cache coherency as well as making the use of record arrays possible.
>
> Anne

Anne, thanks for the thoughts. In most cases you'll probably be right. In this case, however, it won't give me much (if any) speedup, maybe even a slowdown. Satellite images are often stored on disk in a band-sequential manner. The library I use for IO is GDAL, a highly optimized C library for reading/writing almost any kind of satellite data format, which also features an internal caching mechanism. And it gives me my data as (y, x, bands).

I'm not reading single pixels anyway. The amounts of data I have to process (enormous, even by the standards of today ;-)) require me to do this in chunks, in parallel, even on different cores/CPUs/computers. Every chunk usually is (chunkYSize, chunkXSize, allBands), with xsize and ysize not so small (think 64^2 to 1024^2), so that pretty much eliminates any performance issues regarding the data on disk. Furthermore, having to process on multiple computers forces me to keep my data on networked storage. The latency and transfer rate of the network will probably eliminate any small speedup from my drive doing fewer seeks...

Now for the recarray part: that would indeed ease my life a bit :) However, having to transpose the data in memory on every read and write does not sound very attractive. It would waste cycles and memory, and be asking for bugs. I can live without recarrays, for sure. I only hoped they might make my life a bit easier and my code a bit more readable, without too much effort. Well, apparently they won't... I'll just go on like I did before this little exercise.

Thanks all for the input.

Cheers,
Vincent.
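For reference, a minimal sketch of the transpose-plus-recarray idea discussed above, assuming four hypothetical float32 bands (the field names are invented; the record view only works once each pixel's values are adjacent in memory):

    import numpy as np

    bands, ny, nx = 4, 512, 512
    data = np.random.rand(bands, ny, nx).astype(np.float32)   # (bands, y, x)

    # Reorder so each pixel's band values sit next to each other in memory.
    pixels = np.ascontiguousarray(data.transpose(1, 2, 0))    # (y, x, bands)

    # View each pixel as a single record, one field per band.
    rec = pixels.view(dtype=[('b1', 'f4'), ('b2', 'f4'),
                             ('b3', 'f4'), ('b4', 'f4')]).reshape(ny, nx)
    print(rec['b2'].shape)   # (512, 512)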
Re: [Numpy-discussion] distance_matrix: how to speed up?
Emanuele Olivetti wrote:
> snip
> This solution is super-fast, stable and uses little memory. It is based
> on the fact that:
>
>     (x-y)^2*w = x*x*w - 2*x*y*w + y*y*w
>
> For size1=size2=dimensions=1000 it requires ~0.6 sec. to compute on my
> dual core duo. It is two orders of magnitude faster than my previous
> solution, but 1-2 orders of magnitude slower than using C with
> weave.inline. Definitely good enough for me.
>
> Emanuele

Reading this thread, I remembered having tried scipy's sandbox.rbf (radial basis function) to interpolate a pretty large, multidimensional dataset, to fill in the missing data points. This, however, soon failed with out-of-memory errors which, if I remember correctly, came from the pretty straightforward distance calculation between the different data points that is used in this package. Being no math wonder, I assumed that there simply was no simple way to calculate distances without using much memory, and ended my rbf experiments.

To make a long story short: correct me if I am wrong, but might it be an idea to use the above solution in scipy.sandbox.rbf?

Vincent.
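For reference, here is one way the identity Emanuele quotes could be written with NumPy broadcasting; the function name and argument layout are my own, not scipy's:

    import numpy as np

    def weighted_sq_distance(x, y, w):
        """Pairwise weighted squared distances between the rows of
        x (n, d) and y (m, d), using
            (x-y)^2*w = x*x*w - 2*x*y*w + y*y*w
        so that no (n, m, d) intermediate array is ever built."""
        xw = x * w                                        # (n, d)
        d2 = (xw * x).sum(axis=1)[:, np.newaxis] \
             + (y * y * w).sum(axis=1)[np.newaxis, :] \
             - 2.0 * np.dot(xw, y.T)                      # (n, m)
        return np.maximum(d2, 0.0)   # clip tiny negatives from round-off

    d2 = weighted_sq_distance(np.random.rand(1000, 3),
                              np.random.rand(2000, 3),
                              np.ones(3))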
Re: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008
On Wed, 2008-05-21 at 10:08 +0200, Stéfan van der Walt wrote:
[clip]
> This will parse better (as the line with the semicolon is bold, the next
> lines are not). Also, would it be possible to put function and
> next_function in double back-ticks, so that they are referenced, like
> modules? That way they might be clickable in an html version of the
> documentation.

When generating the reference guide, I parse all the numpy docstrings and re-generate a document enhanced with Sphinx markup. In this document, functions in the See Also clause are clickable. I have support for two formats:

    See Also
    --------
    function_a, function_b, function_c
    function_d : relation to current function

Don't worry if it doesn't look perfect on the wiki; the reference guide will be rendered correctly.

Should the function names in the See Also section also include the namespace prefix, i.e.

    numpy.function_a
    numpy.function_b

Or should we assume "from numpy import *" or "import numpy as np"? I think it'd be useful to clarify this in the documentation standard and in example.py, also for the Examples section. (Btw, Docstrings/Example appears to be out-of-date wrt. this.)

Pauli
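As an illustration of the two See Also forms, here is a made-up docstring fragment (the function and its relatives are invented for the example):

    def clip_lower(a, amin):
        """Clip the values of `a` from below by `amin`.

        See Also
        --------
        clip, minimum, maximum
        clip_upper : hypothetical counterpart that clips from above
        """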
Re: [Numpy-discussion] Numpy and scipy icons ?
Stéfan van der Walt wrote:
> Hi David
>
> The icons are attached to the scipy.org main page as .png's. Travis
> Vaught drew the vector versions, IIRC.

I can find logos for the scipy conference, the bug icon, etc., but no plain scipy icon, nor any numpy icon. Would it be possible to put the icon (vector version) somewhere in subversion? It would be useful for, say, installers, etc.

cheers,

David
[Numpy-discussion] Numpy and scipy icons ?
Hi,

Where can I find the numpy and scipy icons, preferably in a vector format?

cheers,

David
Re: [Numpy-discussion] Numpy and scipy icons ?
Hi David

The icons are attached to the scipy.org main page as .png's. Travis Vaught drew the vector versions, IIRC.

Regards
Stéfan

2008/5/22 David Cournapeau [EMAIL PROTECTED]:
> Hi,
>
> Where can I find the numpy and scipy icons, preferably in a vector
> format?
>
> cheers,
>
> David
Re: [Numpy-discussion] distance_matrix: how to speed up?
On May 22, 2008, at 9:45 AM, Vincent Schut wrote:
> Reading this thread, I remembered having tried scipy's sandbox.rbf
> (radial basis function) to interpolate a pretty large, multidimensional
> dataset, to fill in the missing data points. This, however, soon failed
> with out-of-memory errors which, if I remember correctly, came from the
> pretty straightforward distance calculation between the different data
> points that is used in this package.
>
> To make a long story short: correct me if I am wrong, but might it be an
> idea to use the above solution in scipy.sandbox.rbf?

Yes, this would be a very good substitution. Not only does it use less memory, but in my quick tests it is about as fast or faster. Really, though, both are pretty quick. There will still be memory limitations, but you only need to store a matrix of (N, M) instead of (NDIM, N, M), so for many dimensions there will be big memory improvements -- probably only small improvements for 3 dimensions or less.

I'm not sure where rbf lives anymore -- it's not in scikits. I have my own version (parts of which got folded into the old scipy.sandbox version) that I would be willing to share if there is interest.

Really, though, the rbf toolbox will not be limited by the memory of the distance matrix. Later on, you need to do a large linear algebra 'solve', like this:

    r = norm(x, x)   # the distances between all of the ND points to each other
    A = psi(r)       # where psi is some divergent function, often the
                     # multiquadratic function: sqrt((self.epsilon*r)**2 + 1)
    coefs = linalg.solve(A, data)  # where data has the length of x, one data
                                   # point for each spatial point

    # to find the interpolated data points at xi
    ri = norm(xi, x)
    Ai = psi(ri)
    di = dot(Ai, coefs)

All in all, it is the 'linalg.solve' that kills you.

-Rob

Rob Hetland, Associate Professor
Dept. of Oceanography, Texas A&M University
http://pong.tamu.edu/~rob
phone: 979-458-0096, fax: 979-845-6331
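A self-contained sketch of the steps Rob describes, using the multiquadric psi and a memory-friendly distance computation like the one quoted earlier in the thread (the helper names are illustrative, not scipy's API):

    import numpy as np

    def norm(a, b):
        """Pairwise Euclidean distances between rows of a (n, d) and b (m, d)."""
        d2 = (a * a).sum(1)[:, None] + (b * b).sum(1)[None, :] \
             - 2.0 * np.dot(a, b.T)
        return np.sqrt(np.maximum(d2, 0.0))

    def psi(r, epsilon=1.0):
        """Multiquadric radial basis function."""
        return np.sqrt((epsilon * r) ** 2 + 1.0)

    x = np.random.rand(200, 3)     # known point locations
    data = np.random.rand(200)     # one value per known point
    xi = np.random.rand(50, 3)     # points to interpolate to

    A = psi(norm(x, x))                    # (200, 200)
    coefs = np.linalg.solve(A, data)       # this solve dominates the cost
    di = np.dot(psi(norm(xi, x)), coefs)   # interpolated values at xi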
Re: [Numpy-discussion] distance_matrix: how to speed up?
Rob Hetland wrote:
> On May 22, 2008, at 9:45 AM, Vincent Schut wrote:
> snip
> Really, though, the rbf toolbox will not be limited by the memory of the
> distance matrix. Later on, you need to do a large linear algebra
> 'solve' [...]
>
> All in all, it is the 'linalg.solve' that kills you.

Ah, indeed, my memory was faulty, I'm afraid. It was in this phase that it halted, not in the distance calculations.

Vincent.
Re: [Numpy-discussion] Numpy and scipy icons ?
On Thu, May 22, 2008 at 4:14 AM, David Cournapeau [EMAIL PROTECTED] wrote:
> Stéfan van der Walt wrote:
>> Hi David
>>
>> The icons are attached to the scipy.org main page as .png's. Travis
>> Vaught drew the vector versions, IIRC.
>
> I can find logos for the scipy conference, the bug icon, etc., but no
> plain scipy icon, nor any numpy icon. Would it be possible to put the
> icon (vector version) somewhere in subversion? It would be useful for,
> say, installers, etc.

Travis Vaught is the man: http://article.gmane.org/gmane.comp.python.numeric.general/18495

Chuck
[Numpy-discussion] Multiple Boolean Operations
Hi All,

I am building some 3D grids for visualization starting from a much bigger grid. I build these grids by satisfying certain conditions on the x, y, z coordinates of their cells: up to now I was using VTK to perform this operation, but VTK is slow as a turtle, so I thought to use numpy to get the cells I am interested in.

Basically, for every cell I have the coordinates of its center point (centroid), named xCent, yCent and zCent. These values are stored in numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent, yCent and zCent with 10,000 values in them). What I'd like to do is:

    # Filter cells which do not satisfy Z requirements:
    zReq = zMin <= zCent <= zMax

    # After that, filter cells which do not satisfy Y requirements,
    # but apply this filter only on cells which satisfy the above condition:
    yReq = yMin <= yCent <= yMax

    # After that, filter cells which do not satisfy X requirements,
    # but apply this filter only on cells which satisfy the 2 above conditions:
    xReq = xMin <= xCent <= xMax

I'd like to end up with a vector of indices which tells me which are the cells in the original grid that satisfy all 3 conditions. I know that something like this:

    zReq = zMin <= zCent <= zMax

cannot be done directly in numpy, as the first comparison executed returns a vector of booleans. Also, if I do something like:

    zReq1 = numpy.nonzero(zCent <= zMax)
    zReq2 = numpy.nonzero(zCent[zReq1] >= zMin)

I lose the original indices of the grid, as in the second statement zCent[zReq1] no longer has the size of the original grid; it has already been filtered down.

Is there anything I could try in numpy to get what I am looking for? Sorry if the description is not very clear :-D

Thank you very much for your suggestions.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Andrea

2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
> I am building some 3D grids for visualization starting from a much
> bigger grid. [...] These values are stored in numpy arrays (i.e., if I
> have 10,000 cells, I have 3 vectors xCent, yCent and zCent with 10,000
> values in them).

You clearly have a large dataset, otherwise speed wouldn't have been a concern to you. You can do your operation in one pass over the data, and I'd suggest you try doing that with Cython or Ctypes. If you need an example of how to access data using those methods, let me know.

Of course, it *can* be done using NumPy (maybe not in one pass), but thinking in terms of for-loops is sometimes easier, and immediately takes you to a highly optimised execution time.

Cheers
Stéfan
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Stefan & All,

On Thu, May 22, 2008 at 12:29 PM, Stéfan van der Walt wrote:
> You clearly have a large dataset, otherwise speed wouldn't have been a
> concern to you. You can do your operation in one pass over the data, and
> I'd suggest you try doing that with Cython or Ctypes. If you need an
> example of how to access data using those methods, let me know.
>
> Of course, it *can* be done using NumPy (maybe not in one pass), but
> thinking in terms of for-loops is sometimes easier, and immediately
> takes you to a highly optimised execution time.

First of all, thank you for your answer. I know next to nothing about Cython and very little about Ctypes, but it would be nice to have an example of how to use them to speed up the operations.

Actually, I don't really know if my dataset counts as large: I normally work with xCent, yCent and zCent vectors of about 100,000-300,000 elements. However, all the other operations I do with numpy on these vectors are pretty fast (reshaping, re-casting, min(), max() and so on), so I believe a pure numpy solution might perform well enough for my needs. But I am really no expert in numpy, so please forgive any mistakes I'm making :-D.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
Re: [Numpy-discussion] Multiple Boolean Operations
On Thursday 22 May 2008, Andrea Gavana wrote:
> Hi All,
>
> I am building some 3D grids for visualization starting from a much
> bigger grid. [...] I'd like to end up with a vector of indices which
> tells me which are the cells in the original grid that satisfy all 3
> conditions.
>
> Is there anything I could try in numpy to get what I am looking for?
> Sorry if the description is not very clear :-D

I don't know if this is what you want, but you can get the boolean arrays separately, take their intersection, and finally get the interesting values (by using fancy indexing) or coordinates (by using .nonzero()). Here is an example:

    In [105]: a = numpy.arange(10,20)

    In [106]: c1 = (a >= 13) & (a <= 17)

    In [107]: c2 = (a >= 14) & (a <= 18)

    In [109]: all = c1 & c2

    In [110]: a[all]
    Out[110]: array([14, 15, 16, 17])   # the values

    In [111]: all.nonzero()
    Out[111]: (array([4, 5, 6, 7]),)    # the coordinates

Hope that helps,

--
Francesc Alted
Re: [Numpy-discussion] Multiple Boolean Operations
On Thu, 22 May 2008, Andrea Gavana apparently wrote:
> # Filter cells which do not satisfy Z requirements:
> zReq = zMin <= zCent <= zMax

This seems to raise a question: should numpy arrays support this standard Python idiom?

Cheers,
Alan Isaac
Re: [Numpy-discussion] Multiple Boolean Operations
2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
> Hi All,
>
> I am building some 3D grids for visualization starting from a much
> bigger grid. [...] I'd like to end up with a vector of indices which
> tells me which are the cells in the original grid that satisfy all 3
> conditions.
>
> Is there anything I could try in numpy to get what I am looking for?

How about (as a pure numpy solution):

    valid = (z >= zMin) & (z <= zMax)
    valid[valid] = (y[valid] >= yMin) & (y[valid] <= yMax)
    valid[valid] = (x[valid] >= xMin) & (x[valid] <= xMax)
    inds = valid.nonzero()

?

--
AJC McMorland, PhD candidate
Physiology, University of Auckland
(Nearly) post-doctoral research fellow
Neurobiology, University of Pittsburgh
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Francesc & All,

On Thu, May 22, 2008 at 1:04 PM, Francesc Alted wrote:
> I don't know if this is what you want, but you can get the boolean
> arrays separately, take their intersection, and finally get the
> interesting values (by using fancy indexing) or coordinates (by using
> .nonzero()).

Thank you for this suggestion! I had forgotten that this worked in numpy :-( . I have written a couple of small functions to test your method and mine (hopefully I did it correctly for both). On my computer (Toshiba notebook, 2.00 GHz, Windows XP SP2, 1 GB RAM, Python 2.5, numpy 1.0.3.1), your solution is about 30 times faster than mine (implemented back when I didn't know about multiple boolean operations in numpy).

This is my code:

    # Begin Code

    import numpy
    from timeit import Timer

    # Number of cells in my original grid
    nCells = 150000

    # Define some constraints for X, Y, Z
    xMin, xMax = 250.0, 700.0
    yMin, yMax = 1000.0, 1900.0
    zMin, zMax = 120.0, 300.0

    # Generate random centroids for the cells
    xCent = 1000.0*numpy.random.rand(nCells)
    yCent = 2500.0*numpy.random.rand(nCells)
    zCent = 400.0*numpy.random.rand(nCells)


    def MultipleBoolean1():
        """ Andrea's solution, slow :-( . """

        xReq_1 = numpy.nonzero(xCent >= xMin)
        xReq_2 = numpy.nonzero(xCent <= xMax)

        yReq_1 = numpy.nonzero(yCent >= yMin)
        yReq_2 = numpy.nonzero(yCent <= yMax)

        zReq_1 = numpy.nonzero(zCent >= zMin)
        zReq_2 = numpy.nonzero(zCent <= zMax)

        xReq = numpy.intersect1d_nu(xReq_1, xReq_2)
        yReq = numpy.intersect1d_nu(yReq_1, yReq_2)
        zReq = numpy.intersect1d_nu(zReq_1, zReq_2)

        xyReq = numpy.intersect1d_nu(xReq, yReq)
        xyzReq = numpy.intersect1d_nu(xyReq, zReq)


    def MultipleBoolean2():
        """ Francesc's solution, much faster :-) . """

        xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
                 (yCent >= yMin) & (yCent <= yMax) & \
                 (zCent >= zMin) & (zCent <= zMax)

        xyzReq = numpy.nonzero(xyzReq)[0]


    if __name__ == "__main__":

        trial = 10

        t = Timer("MultipleBoolean1()", "from __main__ import MultipleBoolean1")
        print "\n\nAndrea's Solution: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean2()", "from __main__ import MultipleBoolean2")
        print "Francesc's Solution: %0.8g Seconds/Trial\n" % (t.timeit(number=trial)/trial)

    # End Code

And I get this timing on my PC:

    Andrea's Solution:   0.34946193 Seconds/Trial
    Francesc's Solution: 0.011288139 Seconds/Trial

If I implemented everything correctly, this is an amazing improvement. Thank you to everyone who provided suggestions, and thanks to the list :-D

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
Re: [Numpy-discussion] ANN: NumPy/SciPy Documentation Marathon 2008
Stéfan van der Walt wrote:
> It looks like there is significant interest in using np instead of numpy
> in the examples (i.e. we expect the user to do "import numpy as np"
> before trying code snippets). Would anybody who objects to using np
> raise it now, so that we can bury this issue?
>
> Regards
> Stéfan
>
> 2008/5/22 Rob Hetland [EMAIL PROTECTED]:
>> On May 22, 2008, at 11:37 AM, Pauli Virtanen wrote:
>>> Or should we assume "from numpy import *" or "import numpy as np"?
>>
>> Although a good case could probably be made for all three (*, np,
>> numpy), I think that if "import numpy as np" is to be put forward as
>> the standard coding style, the examples should use this as well.
>>
>> -Rob

Hi,

I prefer using 'import numpy' over 'import numpy as np'. But as long as each example has 'import numpy as np' included, then I have no objections. The main reason for this is that the block of code can then easily be copied and pasted to run as a complete entity. Also, this type of implicit assumption often gets missed, because such assumptions are often far from the example (missed in web searches) or overlooked because the reader doesn't realize that part was important.

Regards,
Bruce
[Numpy-discussion] Different attributes for NumPy types
Hi,

Is it a bug if different NumPy types have different attributes?

Based on prior discussion, 'complex', 'float' and 'int' are Python types and the others are NumPy types. Consequently 'complex', 'float' and 'int' do not inherit from NumPy. However, an element from an array created using dtype=numpy.float has the numpy.float64 type, so this is really more a documentation issue than an implementation issue.

Also, different NumPy types have different attributes; for example, 'float64' contains attributes (e.g. __coerce__) that are not present in 'float32' and 'float128' (these two have the same attributes). This can cause attribute errors in somewhat contrived examples that are probably unlikely to appear in practice, because of the casting involved in array creation. The 'uint' types all seem to have the same attributes, so they do not have these issues.

    import numpy
    len(dir(float))          # 47
    len(dir(numpy.float))    # 47
    len(dir(numpy.float32))  # 131
    len(dir(numpy.float64))  # 135
    len(dir(numpy.float128)) # 131

    len(dir(int))            # 54
    len(dir(numpy.int))      # 54
    len(dir(numpy.int0))     # 135
    len(dir(numpy.int16))    # 132
    len(dir(numpy.int32))    # 132
    len(dir(numpy.int64))    # 135
    len(dir(numpy.int8))     # 132

    print (numpy.float64(1234).size)  # 1
    print (numpy.float(1234).size)
    '''
    prints error:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'float' object has no attribute 'size'
    '''

Regards,
Bruce
[Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution
Hi,

I was just looking around at the new numpy documentation and got an xhtml parsing error on this page (with Firefox):

http://mentat.za.net/numpy/refguide/random.xhtml#index-29351

The offending line contains

    $X pprox prod_{i=1}^{k}{x^{lpha_i-1}_i}$

in the docstring of the dirichlet distribution. The corresponding line in the source at
http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/mtrand/mtrand.pyx
is

    .. math:: X \\approx \\prod_{i=1}^{k}{x^{\\alpha_i-1}_i}

(I have no idea why it seems not to parse \\a correctly.)

While looking for this, I found that the Dirichlet distribution is missing from the new Docstring Wiki,
http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random

Then I saw that dirichlet is also missing from __all__ in
http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/info.py

As a consequence, numpy.lookfor does not find dirichlet:

    >>> numpy.lookfor('dirichlet')
    Search results for 'dirichlet'
    ------------------------------

With "import numpy.random", dir(numpy.random) contains dirichlet, but numpy.random.__all__ does not.

To me this seems to be a documentation bug.

Josef
Re: [Numpy-discussion] Different attributes for NumPy types
Bruce Southey wrote:
> Hi,
> Is it a bug if different NumPy types have different attributes?

I don't think so, other than perhaps we should not have the Python types in the numpy namespace. numpy.float is just __builtin__.float, which is a Python type, not a NumPy data-type object. numpy.float64 inherits from numpy.float, however.

-Travis
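A quick interactive check of the inheritance Travis describes (against a NumPy of this era; numpy.float was removed from the namespace much later):

    >>> import numpy
    >>> issubclass(numpy.float64, float)
    True
    >>> issubclass(numpy.float32, float)   # no Python base class here
    False
    >>> numpy.float is float               # just the builtin, re-exported
    True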
Re: [Numpy-discussion] Multiple Boolean Operations
Alan G Isaac wrote:
> On Thu, 22 May 2008, Andrea Gavana apparently wrote:
>> # Filter cells which do not satisfy Z requirements:
>> zReq = zMin <= zCent <= zMax
>
> This seems to raise a question: should numpy arrays support this
> standard Python idiom?

It would be nice, but alas it requires a significant change to Python first to give us the hooks to modify. (We need the 'and' and 'or' operations to return vectors instead of just numbers as they do now.) There is a PEP to allow this, but it has not received much TLC as of late. The difficulty in the implementation is supporting short-circuited evaluation.

-Travis
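A small illustration of the limitation: the chained form implicitly calls 'and' on two boolean arrays, whose truth value is ambiguous, so numpy raises an error; the element-wise & form used throughout this thread is the workaround (the sample values here are made up):

    import numpy as np

    zCent = np.array([50.0, 150.0, 250.0, 350.0])
    zMin, zMax = 120.0, 300.0

    try:
        zReq = zMin <= zCent <= zMax     # expands to 'and' between arrays
    except ValueError as err:
        print(err)                       # truth value of an array is ambiguous

    zReq = (zCent >= zMin) & (zCent <= zMax)   # element-wise instead
    print(np.nonzero(zReq)[0])                 # indices in range: [1 2]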
[Numpy-discussion] Fancier indexing
After poking around for a bit, I was wondering if there is a faster method for the following:

    # Array of index values 0..n
    items = numpy.array([0,3,2,1,4,2], dtype=int)

    # Count the number of occurrences of each index
    counts = numpy.zeros(5, dtype=int)
    for i in items:
        counts[i] += 1

In my real code, 'items' contains up to a million values and this loop will be in a performance-critical area of code. If there is no simple solution, I can trivially code this using the C API.

Thanks,
-Kevin
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs [EMAIL PROTECTED] wrote:
> After poking around for a bit, I was wondering if there is a faster
> method for the following:
> [...]
> In my real code, 'items' contains up to a million values and this loop
> will be in a performance-critical area of code.

How big is n? If it is much smaller than a million, then loop over that instead.
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 12:08 PM, Keith Goodman [EMAIL PROTECTED] wrote:
> How big is n? If it is much smaller than a million, then loop over that
> instead.

n is always relatively small, but I'd rather not do:

    for i in range(n):
        counts[i] = (items == i).sum()

If that is the best alternative, I'd just bite the bullet and code this in C.

Thanks,
-Kevin
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 9:08 AM, Keith Goodman [EMAIL PROTECTED] wrote:
> On Thu, May 22, 2008 at 8:59 AM, Kevin Jacobs [EMAIL PROTECTED] wrote:
>> In my real code, 'items' contains up to a million values and this loop
>> will be in a performance-critical area of code.
>
> How big is n? If it is much smaller than a million, then loop over that
> instead.

Or how about using a list instead:

    >>> items = [0,3,2,1,4,2]
    >>> uitems = frozenset(items)
    >>> count = [items.count(i) for i in uitems]
    >>> count
    [1, 1, 2, 1, 1]
Re: [Numpy-discussion] Fancier indexing
You're just trying to do this... correct?

    >>> import numpy
    >>> items = numpy.array([0,3,2,1,4,2], dtype=int)
    >>> unique = numpy.unique(items)
    >>> unique
    array([0, 1, 2, 3, 4])
    >>> counts = numpy.histogram(items, unique)
    >>> counts
    (array([1, 1, 2, 1, 1]), array([0, 1, 2, 3, 4]))
    >>> counts[0]
    array([1, 1, 2, 1, 1])

On Thu, May 22, 2008 at 9:08 AM, Keith Goodman [EMAIL PROTECTED] wrote:
> How big is n? If it is much smaller than a million, then loop over that
> instead.
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 9:15 AM, Keith Goodman [EMAIL PROTECTED] wrote:
> Or how about using a list instead:
>
>     >>> items = [0,3,2,1,4,2]
>     >>> uitems = frozenset(items)
>     >>> count = [items.count(i) for i in uitems]
>     >>> count
>     [1, 1, 2, 1, 1]

Oh, I see -- so uitems should be range(n).
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 4:59 PM, Kevin Jacobs [EMAIL PROTECTED] wrote:
> After poking around for a bit, I was wondering if there is a faster
> method for the following:
> [...]
> In my real code, 'items' contains up to a million values and this loop
> will be in a performance-critical area of code.

I would use bincount:

    count = bincount(items)

should be all you need:

    In [192]: items = [0,3,2,1,4,2]

    In [193]: bincount(items)
    Out[193]: array([1, 1, 2, 1, 1])

    In [194]: bincount?
    Type:           builtin_function_or_method
    Base Class:     <type 'builtin_function_or_method'>
    String Form:    <built-in function bincount>
    Namespace:      Interactive
    Docstring:
        bincount(x, weights=None)

        Return the number of occurrences of each value in x.

        x must be a list of non-negative integers. The output, b[i],
        represents the number of times that i is found in x. If weights
        is specified, every occurrence of i at a position p contributes
        weights[p] instead of 1.

        See also: histogram, digitize, unique.

Robin
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 9:22 AM, Robin [EMAIL PROTECTED] wrote:
> I would use bincount:
>
>     count = bincount(items)
>
> should be all you need.

I guess bincount is a *little* faster:

    >> items = mp.random.randint(0, 100, (100,))
    >> timeit mp.bincount(items)
    100 loops, best of 3: 4.05 ms per loop

    >> items = items.tolist()
    >> timeit [items.count(i) for i in range(100)]
    10 loops, best of 3: 2.91 s per loop
Re: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution
On May 22, 11:11 am, joep [EMAIL PROTECTED] wrote:
> Hi,
> When looking for this, I found that the Dirichlet distribution is
> missing from the new Docstring Wiki,
> http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random

Actually, a search on the wiki finds dirichlet at
http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/mtrand/dirichlet

I found random/mtrand only through the search; it doesn't seem to be linked from anywhere.

Is it intentional that functions that are imported inside numpy might have the same docstring assigned to several different wiki pages, and might get edited on different pages? Since all distributions (except for dirichlet) are included in numpy.random.__all__, these distributions show up on two different pages, e.g.

http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/poisson
and
http://sd-2116.dedibox.fr/doc/Docstrings/numpy/random/mtrand/poisson

So, except for the strange parsing of the dirichlet docstring, this is a problem with numpy: numpy.random.__all__ as defined in numpy/random/info.py does not expose dirichlet.

Josef
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Andrea

2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
>> You clearly have a large dataset, otherwise speed wouldn't have been a
>> concern to you. You can do your operation in one pass over the data,
>> and I'd suggest you try doing that with Cython or Ctypes.
>
> First of all, thank you for your answer. I know next to nothing about
> Cython and very little about Ctypes, but it would be nice to have an
> example of how to use them to speed up the operations.
>
> Actually, I don't really know if my dataset counts as large: I normally
> work with xCent, yCent and zCent vectors of about 100,000-300,000
> elements.

Just to clarify things in my mind: is VTK *that* slow? I find that surprising, since it is written in C or C++.

Regards
Stéfan
[Numpy-discussion] osX leopard linker setting
Hi,

does anybody know the linker settings for Python C modules on OS X? I have the original Xcode 3 tools installed -- gcc 4.0.1. I use '-bundle -flat_namespace' for linking now, but I get:

    ld: can't insert lazy pointers, __dyld section not found for inferred architecture ppc

Does anybody know of flags that work for the linker?

Thank you,
Thomas
Re: [Numpy-discussion] osX leopard linker setting
By the way, what's a lazy pointer, anyway?

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Thomas Hrabe
Sent: Thu 22.05.2008 11:22
To: numpy-discussion@scipy.org
Subject: [Numpy-discussion] osX leopard linker setting

> Hi,
>
> does anybody know the linker settings for Python C modules on OS X?
> [...]
> Does anybody know of flags that work for the linker?
Re: [Numpy-discussion] osX leopard linker setting
On Thu, May 22, 2008 at 1:22 PM, Thomas Hrabe [EMAIL PROTECTED] wrote:
> Hi,
>
> does anybody know the linker settings for Python C modules on OS X? I
> have the original Xcode 3 tools installed -- gcc 4.0.1.

Just use distutils (or numpy.distutils). It will take care of the linker flags for you. If you really can't use distutils for some reason, take a look at the flags that are added for modules that do build with distutils.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution
On May 22, 1:30 pm, Pauli Virtanen [EMAIL PROTECTED] wrote:
> On Thu, 2008-05-22 at 09:51 -0700, joep wrote:
> It is not intentional. And for the majority of cases this does not
> happen, and I can fix this for numpy.random.mtrand. Thanks for
> reporting.

I was looking some more at the __all__ statements and trying to figure out the system/idea behind the imports and the exposure of functions at different places. I did not find any other full duplication like mtrand so far.

However, when I do a search on the DocWiki, for example for arccos (or log, log10, exp, tan, ...), I see it 9 times, and it is not clear which ones refer to the same docstring, where several imports of the same function are picked up separately, and which ones refer to actually different functions in the source.

numpy.lookfor('arccos') yields 3 results, with 3 different docstrings; the other 6 might be duplicates.

http://sd-2116.dedibox.fr/doc/Docstrings/numpy/lib/scimath/arccos has the most informative docstring. In numpy it is exposed as ``numpy.emath.arccos``.

A recommendation for docstring editing might be to check for duplicates, copy docstrings when the function is (almost) duplicated or triplicated in the numpy source, and possibly cross-link the different versions. When I start from the DocWiki front page, I seem to be able to follow links to only one version of any docstring, but any search leads to the multiple exposures of the same function.

Josef
[Numpy-discussion] workaround for searchsorted with strings?
Hello --

I see from this thread:

http://article.gmane.org/gmane.comp.python.numeric.general/18746/

that searchsorted does not work correctly with strings. Is there a workaround, though, that I can use with 1.0.4 until there is a new official numpy release that includes the fix mentioned in the reference above? Using the latest SVN version is not an option for me.

My understanding was that searchsorted works OK if the strings are all the same data type, but that does not appear to be the case:

    p x.searchsorted(y) with x=array(['0', '1', '2', '12'])
                        and  y=array(['0', '0', '2', '3', '123'])
    array([0, 0, 0, 2, 0])

    p x.astype(y.dtype).searchsorted(y)
    array([0, 0, 2, 4, 2])

I understand that the first call to searchsorted fails because y has type S3 and x has type S2. But it seems that changing the type of x still produces incorrect (albeit different) results. Is there something similar I can do to make this work for now?

Thanks very much.
-Lewis
Re: [Numpy-discussion] workaround for searchsorted with strings?
Oh sorry, my example was dumb; never mind. It looks like this way does work after all. Can someone please confirm for me, though, that the workaround I am using (just changing to the wider string type) is reliable?

Thanks, and sorry for the noise.
-Lewis
Re: [Numpy-discussion] workaround for searchsorted with strings?
On Thu, May 22, 2008 at 12:29 PM, Lewis Hyatt [EMAIL PROTECTED] wrote:
> My understanding was that searchsorted works OK if the strings are all
> the same data type, but that does not appear to be the case:
>
>     x = array(['0', '1', '2', '12'])
>     y = array(['0', '0', '2', '3', '123'])
>     x.searchsorted(y)
>     array([0, 0, 0, 2, 0])

The x array is not sorted. Try

    x = array(['0', '1', '12', '2'])
[Numpy-discussion] buglet: Dirichlet missing in numpy.random.__all__ as defined in numpy/random/info.py
The Dirichlet distribution is missing in __all__ in
http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/random/info.py

As a consequence, numpy.lookfor does not find dirichlet:

    >>> numpy.lookfor('dirichlet')
    Search results for 'dirichlet'
    ------------------------------

With "import numpy.random", dir(numpy.random) contains dirichlet, but numpy.random.__all__ does not.

Looks like a tiny bug.

Josef

(This is kind of a duplicate email, but I didn't want it to get lost in the DocWiki discussion.)
Re: [Numpy-discussion] Multiple Boolean Operations
On Thu, May 22, 2008 at 12:26 PM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
> Just to clarify things in my mind: is VTK *that* slow? I find that
> surprising, since it is written in C or C++.

Performance can depend more on the design of the code than on the implementation language. There are several places in VTK which are slower than they strictly could be, because VTK exposes data primarily through abstract interfaces and only sometimes exposes the underlying data structures for faster processing. Quite sensibly, they implement the general form first.

It's much the same with parts of numpy. The iterator abstraction lets you work on arbitrarily strided arrays, but for contiguous arrays, just using the pointer lets you, and the compiler, optimize your code more.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
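For what it's worth, the strided/contiguous distinction Robert mentions is visible from Python (a sketch; the fast pointer path itself lives in C):

    import numpy as np

    a = np.arange(12.0).reshape(3, 4)   # C-contiguous float64
    print(a.strides)                    # (32, 8): bytes to step per axis
    b = a[:, ::2]                       # a view: same buffer, larger stride
    print(b.strides)                    # (32, 16)
    print(b.flags['C_CONTIGUOUS'])      # False: needs the general iterator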
Re: [Numpy-discussion] new numpy docs, missing function and parse error - dirichlet distribution
On Thu, 2008-05-22 at 11:28 -0700, joep wrote:
[clip]
> However, when I do a search on the DocWiki, for example for arccos (or
> log, log10, exp, tan, ...), I see it 9 times, and it is not clear which
> ones refer to the same docstring, where several imports of the same
> function are picked up separately, and which ones refer to actually
> different functions in the source.
[clip]
> A recommendation for docstring editing might be to check for duplicates
> and copy docstrings if the function is duplicated or triplicated in the
> numpy source, and possibly cross-link the different versions.

This is a problem with the tool's handling of extension objects and Pyrex-generated classes, and the editors shouldn't have to concern themselves with it. I'll fix it and remove any unedited duplicates from the wiki.

Pauli
Re: [Numpy-discussion] workaround for searchsorted with strings?
On Thu, May 22, 2008 at 12:36 PM, Lewis Hyatt [EMAIL PROTECTED] wrote:
> Oh sorry, my example was dumb; never mind. It looks like this way does
> work after all. Can someone please confirm for me, though, that the
> workaround I am using (just changing to the wider string type) is
> reliable?

You can still have problems, because the numpy strings will be padded out with zeros, and the string compare in 1.0.4 doesn't handle zeros correctly. This might cause some problems.

Chuck
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 10:24 AM, Travis E. Oliphant [EMAIL PROTECTED] wrote:
> Bruce Southey wrote:
>> Hi,
>> Is it a bug if different NumPy types have different attributes?
>
> I don't think so, other than perhaps we should not have the Python types
> in the numpy namespace. numpy.float is just __builtin__.float, which is
> a Python type, not a NumPy data-type object. numpy.float64 inherits from
> numpy.float, however.

And I believe this is the cause of the difference between the attributes of numpy.float32/numpy.float128 and numpy.float64. Same deal with int0 and int64 on your presumably 64-bit platform.

--
Robert Kern
[Numpy-discussion] C API
All,

I added a function to array_api_order.txt, and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should have only one list instead of the current two. This looks to require some mods to the build system. What do folks think?

Chuck
Re: [Numpy-discussion] Multiple Boolean Operations
Hi All,

On Thu, May 22, 2008 at 7:46 PM, Robert Kern wrote:
> Performance can depend more on the design of the code than on the
> implementation language. There are several places in VTK which are
> slower than they strictly could be, because VTK exposes data primarily
> through abstract interfaces and only sometimes exposes the underlying
> data structures for faster processing. Quite sensibly, they implement
> the general form first.

Yes, Robert is perfectly right. VTK is quite handy in most situations, but in this case I had to recursively apply 3 thresholds (one each for X, Y and Z), and the threshold construction (initialization) and execution were much slower than my (sloppy) numpy result. Compared to the solution Francesc posted, the VTK approach simply disappears.

By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)

    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized, as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster: on my PC, the algorithm runs in 0.01 seconds, more or less, for 150,000 cells, but today I encountered a case in which I had 10,800 sub-grids... and 10800*0.01 is close to 2 minutes :-(

Otherwise, I will try to implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension does not kill the execution time.

Thank you very much for your suggestions.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
Re: [Numpy-discussion] Different attributes for NumPy types
Hi,

Thanks very much for the confirmation.

Bruce

On Thu, May 22, 2008 at 2:09 PM, Robert Kern [EMAIL PROTECTED] wrote:
> And I believe this is the cause of the difference between the attributes
> of numpy.float32/numpy.float128 and numpy.float64. Same deal with int0
> and int64 on your presumably 64-bit platform.
Re: [Numpy-discussion] Multiple Boolean Operations
Andrea Gavana wrote: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)
    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? yep -- if I've got this right, the above creates 7 temporary arrays. Creating that many and pushing the data in and out of memory can be pretty slow for large arrays. In C, C++, Cython or Fortran, you can just do one loop, and one output array. It should be much faster for the big arrays. Otherwise, I will try and implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension is not killing the execution time. nope, that's one function call for the whole thing, negligible. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception [EMAIL PROTECTED] ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
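One hedged way to approach the single-pass ideal without leaving NumPy (a sketch of my own, not code from this thread) is to route each comparison into a preallocated boolean buffer via the ufuncs' output argument, so only two scratch arrays ever exist; xCent, yCent, zCent and the bounds are assumed defined as above:

    import numpy

    n = len(xCent)
    mask = numpy.empty(n, dtype=bool)   # final result, accumulated in place
    tmp = numpy.empty(n, dtype=bool)    # reusable scratch buffer

    numpy.greater_equal(xCent, xMin, mask)   # mask = xCent >= xMin
    numpy.less_equal(xCent, xMax, tmp)
    mask &= tmp
    numpy.greater_equal(yCent, yMin, tmp)
    mask &= tmp
    numpy.less_equal(yCent, yMax, tmp)
    mask &= tmp
    numpy.greater_equal(zCent, zMin, tmp)
    mask &= tmp
    numpy.less_equal(zCent, zMax, tmp)
    mask &= tmp

This still makes six passes over the data, but allocates no fresh temporaries beyond the two buffers.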
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 1:32 PM, Bruce Southey [EMAIL PROTECTED] wrote: Hi, Thanks very much for the confirmation. Bruce On Thu, May 22, 2008 at 2:09 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 10:24 AM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Bruce Southey wrote: Hi, Is it a bug if different NumPy types have different attributes? I don't think so, other than perhaps we should not have the Python types in the numpy namespace. numpy.float is just __builtin__.float, which is a Python type, not a NumPy data-type object. numpy.float64 inherits from numpy.float, however. And I believe this is the cause of the difference between the attributes of numpy.float32/numpy.float128 and numpy.float64. Same deal with int0 and int64 on your presumably 64-bit platform. It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
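For reference, a quick sketch of the two idioms Robert recommends (the array here is just an example):

    import numpy

    x = numpy.array([[1]])
    a = x.astype(numpy.float32)                # always returns an array of the requested dtype
    b = numpy.asarray(x, dtype=numpy.float64)  # same, and avoids a copy when the dtype already matches

Both forms preserve the array's shape, unlike calling the scalar type directly.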
Re: [Numpy-discussion] Multiple Boolean Operations
On Thu, May 22, 2008 at 2:16 PM, Andrea Gavana [EMAIL PROTECTED] wrote: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)

You could implement this with inplace operations to save memory:

    xyzReq = (xCent >= xMin)
    xyzReq &= (xCent <= xMax)
    xyzReq &= (yCent >= yMin)
    xyzReq &= (yCent <= yMax)
    xyzReq &= (zCent >= zMin)
    xyzReq &= (zCent <= zMax)

Do you think there is any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster A C implementation would certainly be faster, perhaps 5x faster, due to short-circuiting the AND operations and the fact that you'd only pass over the data once. OTOH I'd be very surprised if this is the slowest part of your application. -- Nathan Bell [EMAIL PROTECTED] http://graphics.cs.uiuc.edu/~wnbell/ ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Andrea 2008/5/22 Andrea Gavana [EMAIL PROTECTED]: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)
    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster: on my PC, the algorithm runs in 0.01 seconds, more or less, for 150,000 cells, but today I encountered a case in which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-( Otherwise, I will try and implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension is not killing the execution time. I wrote a quick proof of concept (no guarantees). You can find it here (download using bzr, http://bazaar-vcs.org, or just grab the files with your web browser): https://code.launchpad.net/~stefanv/+junk/xyz

1. Install Cython if you haven't already
2. Run "python setup.py build_ext -i" to build the C extension
3. Use the code, e.g.,

    import xyz
    out = xyz.filter(array([1.0, 2.0, 3.0]), 2, 5,
                     array([2.0, 4.0, 6.0]), 2, 4,
                     array([-1.0, -2.0, -4.0]), -3, -2)

In the above case, out is [False, True, False]. Regards Stéfan ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
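For readers who don't want to build the extension, here is a pure-Python reference for what xyz.filter appears to compute -- an inclusive bounds test on each coordinate, ANDed together. This is inferred from the usage example above, not taken from Stéfan's source:

    import numpy

    def filter_py(x, xmin, xmax, y, ymin, ymax, z, zmin, zmax):
        # True wherever every coordinate lies inside its [min, max] interval
        return ((x >= xmin) & (x <= xmax) &
                (y >= ymin) & (y <= ymax) &
                (z >= zmin) & (z <= zmax))

    out = filter_py(numpy.array([1.0, 2.0, 3.0]), 2, 5,
                    numpy.array([2.0, 4.0, 6.0]), 2, 4,
                    numpy.array([-1.0, -2.0, -4.0]), -3, -2)
    # out -> array([False,  True, False]), matching the example above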
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 2:59 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco So, should these return an error if the argument is an ndarray object, a list or similar? Otherwise, int, float and string arguments would be okay under the assumption that people would like variable-precision scalars. Bruce ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. -Travis ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge the lists. At the beginning of 1.2, of course. Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Multiple Boolean Operations
Stéfan van der Walt wrote: I wrote a quick proof of concept (no guarantees). Thanks for the example -- I like how Cython understands ndarrays! It looks like this code would break if x, y, and z are not C-contiguous -- should there be a check for that? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception [EMAIL PROTECTED] ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
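If contiguity is a concern, one defensive pattern (a sketch, not part of Stéfan's code) is to normalize the inputs before handing them to the extension; numpy.ascontiguousarray copies only when necessary:

    import numpy

    def as_c_contig(a):
        # returns a C-contiguous float64 version of a; no copy if already so
        return numpy.ascontiguousarray(a, dtype=numpy.float64)

    # hypothetical usage with the extension from this thread:
    # out = xyz.filter(as_c_contig(x), xmin, xmax, ...)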
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Chris and All, On Thu, May 22, 2008 at 8:40 PM, Christopher Barker wrote: Andrea Gavana wrote: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)
    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? yep -- if I've got this right, the above creates 7 temporary arrays. Creating that many and pushing the data in and out of memory can be pretty slow for large arrays. In C, C++, Cython or Fortran, you can just do one loop, and one output array. It should be much faster for the big arrays. Well, I have implemented it in 2 ways in Fortran, and actually the Fortran solutions are slower than the numpy one (2 and 3 times slower respectively). I attach the source code of the timing code and the 5 implementations I have at the moment (I have included Nathan's implementation, which is as fast as Francesc's one but has the advantage of saving memory). The timings I get on my home PC are:

    Andrea's Solution:   0.42807561 Seconds/Trial
    Francesc's Solution: 0.018297884 Seconds/Trial
    Fortran Solution 1:  0.035862072 Seconds/Trial
    Fortran Solution 2:  0.029822338 Seconds/Trial
    Nathan's Solution:   0.018930507 Seconds/Trial

Maybe my fortran coding is sloppy, but I don't really know fortran well enough to implement it better... Thank you so much to everybody for your suggestions so far :-D Andrea. Imagination Is The Only Weapon In The War Against Reality. http://xoomer.alice.it/infinity77/

    # Begin Code

    import numpy
    from timeit import Timer

    # FORTRAN modules
    from MultipleBoolean3 import multipleboolean3
    from MultipleBoolean4 import multipleboolean4

    # Number of cells in my original grid
    nCells = 150000

    # Define some constraints for X, Y, Z
    xMin, xMax = 250.0, 700.0
    yMin, yMax = 1000.0, 1900.0
    zMin, zMax = 120.0, 300.0

    # Generate random centroids for the cells
    xCent = 1000.0*numpy.random.rand(nCells)
    yCent = 2500.0*numpy.random.rand(nCells)
    zCent = 400.0*numpy.random.rand(nCells)


    def MultipleBoolean1():
        """Andrea's solution, slow :-( ."""

        xReq_1 = numpy.nonzero(xCent >= xMin)
        xReq_2 = numpy.nonzero(xCent <= xMax)

        yReq_1 = numpy.nonzero(yCent >= yMin)
        yReq_2 = numpy.nonzero(yCent <= yMax)

        zReq_1 = numpy.nonzero(zCent >= zMin)
        zReq_2 = numpy.nonzero(zCent <= zMax)

        xReq = numpy.intersect1d_nu(xReq_1, xReq_2)
        yReq = numpy.intersect1d_nu(yReq_1, yReq_2)
        zReq = numpy.intersect1d_nu(zReq_1, zReq_2)

        xyReq = numpy.intersect1d_nu(xReq, yReq)
        xyzReq = numpy.intersect1d_nu(xyReq, zReq)


    def MultipleBoolean2():
        """Francesc's solution, much faster :-) ."""

        xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
                 (yCent >= yMin) & (yCent <= yMax) & \
                 (zCent >= zMin) & (zCent <= zMax)
        xyzReq = numpy.nonzero(xyzReq)[0]


    def MultipleBoolean3():
        xyzReq = multipleboolean3(xCent, yCent, zCent, xMin, xMax,
                                  yMin, yMax, zMin, zMax, nCells)
        xyzReq = numpy.nonzero(xyzReq)[0]


    def MultipleBoolean4():
        xyzReq = multipleboolean4(xCent, yCent, zCent, xMin, xMax,
                                  yMin, yMax, zMin, zMax, nCells)
        xyzReq = numpy.nonzero(xyzReq)[0]


    def MultipleBoolean5():
        xyzReq = (xCent >= xMin)
        xyzReq &= (xCent <= xMax)
        xyzReq &= (yCent >= yMin)
        xyzReq &= (yCent <= yMax)
        xyzReq &= (zCent >= zMin)
        xyzReq &= (zCent <= zMax)
        xyzReq = numpy.nonzero(xyzReq)[0]


    if __name__ == "__main__":

        trial = 10

        t = Timer("MultipleBoolean1()", "from __main__ import MultipleBoolean1")
        print "\n\nAndrea's Solution: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean2()", "from __main__ import MultipleBoolean2")
        print "Francesc's Solution: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean3()", "from __main__ import MultipleBoolean3")
        print "Fortran Solution 1: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean4()", "from __main__ import MultipleBoolean4")
        print "Fortran Solution 2: %0.8g Seconds/Trial" % (t.timeit(number=trial)/trial)

        t = Timer("MultipleBoolean5()", "from __main__ import MultipleBoolean5")
        print "Nathan's Solution: %0.8g Seconds/Trial\n" % (t.timeit(number=trial)/trial)

    # End Code

MultipleBoolean3.f90 Description: Binary data MultipleBoolean4.f90 Description: Binary data ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Stefan, On Thu, May 22, 2008 at 10:23 PM, Stéfan van der Walt wrote: Hi Andrea 2008/5/22 Andrea Gavana [EMAIL PROTECTED]: By the way, about the solution Francesc posted:

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)
    xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something similar) could be faster? Or something else using weave? I understand that this solution is already highly optimized as it uses the power of numpy with the logic operations in Python, but I was wondering if I can make it any faster: on my PC, the algorithm runs in 0.01 seconds, more or less, for 150,000 cells, but today I encountered a case in which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-( Otherwise, I will try and implement it in Fortran and wrap it with f2py, assuming I am able to do it correctly and the overhead of calling an external extension is not killing the execution time. I wrote a quick proof of concept (no guarantees). You can find it here (download using bzr, http://bazaar-vcs.org, or just grab the files with your web browser): https://code.launchpad.net/~stefanv/+junk/xyz

1. Install Cython if you haven't already
2. Run "python setup.py build_ext -i" to build the C extension
3. Use the code, e.g.,

    import xyz
    out = xyz.filter(array([1.0, 2.0, 3.0]), 2, 5,
                     array([2.0, 4.0, 6.0]), 2, 4,
                     array([-1.0, -2.0, -4.0]), -3, -2)

In the above case, out is [False, True, False]. Thank you very much for this! I am going to try it and time it, comparing it with the other implementations. I think I need to study a bit your code as I know almost nothing about Cython :-D Thank you! Andrea. Imagination Is The Only Weapon In The War Against Reality. http://xoomer.alice.it/infinity77/ ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 4:25 PM, Bruce Southey [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:59 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). So, should these return an error if the argument is an ndarray object, a list or similar? I think it was originally put in as a feature, but given the inconsistency and the long-standing alternatives, I would deprecate its use for converting array dtypes. But that's just my opinion. Otherwise, int, float and string arguments would be okay under the assumption that people would like variable-precision scalars. Yes. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Different attributes for NumPy types
On Thu, May 22, 2008 at 5:07 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 4:25 PM, Bruce Southey [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:59 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). So, should these return an error if the argument is an ndarray object, a list or similar? I think it was originally put in as a feature, but given the inconsistency and the long-standing alternatives, I would deprecate its use for converting array dtypes. But that's just my opinion. I agree. Having too many ways to do things just makes for headaches. Should we schedule a deprecation for anything other than scalars and strings? Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
On Thu, May 22, 2008 at 3:55 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge the lists. At the beginning of 1.2, of course. This doesn't look too hard to do. How about a unified NUMPY_API list? Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Multiple Boolean Operations
Hi Andrea 2008/5/23 Andrea Gavana [EMAIL PROTECTED]: Thank you very much for this! I am going to try it and time it, comparing it with the other implementations. I think I need to study a bit your code as I know almost nothing about Cython :-D That won't be necessary -- the Fortran implementation is guaranteed to win! Just to make sure, I timed it anyway (on somewhat larger arrays):

    Francesc's Solution: 0.062797403 Seconds/Trial
    Fortran Solution 1:  0.050316906 Seconds/Trial
    Fortran Solution 2:  0.052595496 Seconds/Trial
    Nathan's Solution:   0.055562282 Seconds/Trial
    Cython Solution:     0.06250751 Seconds/Trial

Nathan's version runs over the data 6 times, and still does better than the Pyrex version. I don't know why! But, hey, this algorithm is parallelisable! Wait, no, it's bedtime. Regards Stéfan ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
Charles R Harris wrote: On Thu, May 22, 2008 at 3:55 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge the lists. At the beginning of 1.2, of course. This doesn't look too hard to do. How about a unified NUMPY_API list? That's fine with me. I can't remember why there were 2 separate lists. -Travis ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Different attributes for NumPy types
Charles R Harris wrote: On Thu, May 22, 2008 at 5:07 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 4:25 PM, Bruce Southey [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:59 PM, Robert Kern [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 2:46 PM, Charles R Harris [EMAIL PROTECTED] wrote: It also leads to various inconsistencies:

    In [1]: float32(array([[1]]))
    Out[1]: array([[ 1.]], dtype=float32)

    In [2]: float64(array([[1]]))
    Out[2]: 1.0

Okay, so don't do that. Always use x.astype(dtype) or asarray(x, dtype). So, should these return an error if the argument is an ndarray object, a list or similar? I think it was originally put in as a feature, but given the inconsistency and the long-standing alternatives, I would deprecate its use for converting array dtypes. But that's just my opinion. I agree. Having too many ways to do things just makes for headaches. Should we schedule a deprecation for anything other than scalars and strings? I don't have a strong opinion either way. -Travis ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] triangular matrix fill
I have a question on filling a lower triangular matrix using numpy. This is essentially having two loops where the inner loop's upper limit is the outer loop's current index. In the inner loop I have a vector being multiplied by a constant set in the outer loop. For a matrix N*N in size, the C code is:

    for(i = 0; i < N; ++i){
        for(j = 0; j < i; ++j){
            Matrix[i*N + j] = V1[i] * V2[j];
        }
    }

Thanks Tom ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] C API
On Thu, May 22, 2008 at 6:36 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: On Thu, May 22, 2008 at 3:55 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 3:34 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Charles R Harris wrote: All, I added a function to array_api_order.txt and apparently this changed the order of the pointers in the API, which caused ctypes to segfault until I removed the build directory and did a complete rebuild. It seems to me that if we really want to make adding these API functions safe, then we should only have one list instead of the current two. This looks to require some mods to the build system. What do folks think? Yes, or a simple solution is to only append to one of the lists. At the very least, we should mark the array_api_order as not appendable. That doesn't work unless I change the tag from OBJECT_API to MULTIARRAY_API. Do these tags really matter? Maybe we should just replace them with API and merge the lists. At the beginning of 1.2, of course. This doesn't look too hard to do. How about a unified NUMPY_API list? That's fine with me. I can't remember why there were 2 separate lists. OK. Another question: why do __ufunc_api.h and __multiarray_api.h have double underscore prefixes? Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] triangular matrix fill
On Thu, May 22, 2008 at 7:19 PM, Tom Waite [EMAIL PROTECTED] wrote: I have a question on filling a lower triangular matrix using numpy. This is essentially having two loops where the inner loop's upper limit is the outer loop's current index. In the inner loop I have a vector being multiplied by a constant set in the outer loop. For a matrix N*N in size, the C code is:

    for(i = 0; i < N; ++i){
        for(j = 0; j < i; ++j){
            Matrix[i*N + j] = V1[i] * V2[j];
        }
    }

You can use numpy.outer(V1,V2) and just ignore everything on and above the diagonal.

    In [1]: x = arange(3)

    In [2]: y = arange(3,6)

    In [3]: outer(x,y)
    Out[3]:
    array([[ 0,  0,  0],
           [ 3,  4,  5],
           [ 6,  8, 10]])

You can mask the upper part if you want:

    In [16]: outer(x,y)*fromfunction(lambda i,j: i>j, (3,3))
    Out[16]:
    array([[0, 0, 0],
           [3, 0, 0],
           [6, 8, 0]])

Or you could use fromfunction directly. Chuck ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] triangular matrix fill
On Thu, May 22, 2008 at 9:07 PM, Charles R Harris [EMAIL PROTECTED] wrote: On Thu, May 22, 2008 at 7:19 PM, Tom Waite [EMAIL PROTECTED] wrote: I have a question on filling a lower triangular matrix using numpy. This is essentially having two loops where the inner loop's upper limit is the outer loop's current index. In the inner loop I have a vector being multiplied by a constant set in the outer loop. For a matrix N*N in size, the C code is:

    for(i = 0; i < N; ++i){
        for(j = 0; j < i; ++j){
            Matrix[i*N + j] = V1[i] * V2[j];
        }
    }

You can use numpy.outer(V1,V2) and just ignore everything on and above the diagonal.

    In [1]: x = arange(3)

    In [2]: y = arange(3,6)

    In [3]: outer(x,y)
    Out[3]:
    array([[ 0,  0,  0],
           [ 3,  4,  5],
           [ 6,  8, 10]])

You can mask the upper part if you want:

    In [16]: outer(x,y)*fromfunction(lambda i,j: i>j, (3,3))
    Out[16]:
    array([[0, 0, 0],
           [3, 0, 0],
           [6, 8, 0]])

Or you could use fromfunction directly. Or numpy.tril(). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
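Combining the two suggestions, a sketch of Tom's strictly-lower-triangular fill (the k=-1 argument to numpy.tril keeps only the elements below the diagonal, matching the C loop's j < i condition; V1 and V2 here are just example vectors):

    import numpy

    V1 = numpy.arange(4.0)        # example data
    V2 = numpy.arange(4.0, 8.0)

    # Matrix[i, j] = V1[i] * V2[j] for j < i, and 0 elsewhere
    Matrix = numpy.tril(numpy.outer(V1, V2), -1)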