Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-25 Thread Andrea Gavana
Hi Stefan & All,

On Sat, May 24, 2008 at 8:11 PM, Stéfan van der Walt wrote:
 Hi Andrea

 2008/5/24 Andrea Gavana [EMAIL PROTECTED]:
   Number Of Cells: 5
 ------------------------------------------------------------------------
 | Rank |       Method Name       | Execution Time | Relative Slowness |
 ------------------------------------------------------------------------
 |   1  | NumPy 5 (Andrea)        |   0.00562803   |      1.00000      |
 |   2  | NumPy 1 (Francesc)      |   0.00563365   |      1.00100      |
 |   3  | NumPy 2 (Nathan)        |   0.00577322   |      1.02580      |
 |   4  | NumPy 4 (Nathan-Vector) |   0.00580577   |      1.03158      |
 |   5  | Fortran 6 (James)       |   0.00660514   |      1.17361      |
 |   6  | Fortran 3 (Alex)        |   0.00709856   |      1.26129      |
 |   7  | Fortran 5 (James)       |   0.00748726   |      1.33035      |
 |   8  | Fortran 2 (Mine)        |   0.00748816   |      1.33051      |
 |   9  | Fortran 1 (Mine)        |   0.00775906   |      1.37864      |
 |  10  | Fortran 4 (Michael)     |   0.00777685   |      1.38181      |
 |  11  | NumPy 3 (Peter)         |   0.01253662   |      2.22753      |
 |  12  | Cython (Stefan)         |   0.01597804   |      2.83901      |
 ------------------------------------------------------------------------

 When you bench the Cython code, you'll have to take out the Python
 calls (for checking dtype etc.), otherwise you're comparing apples and
 oranges.  After I tweaked it, it ran roughly the same time as
 Francesc's version.  But like I mentioned before, the Fortran results
 should trump all, so what is going on here?

I thought I had removed the Python checks from the Cython code (if you
look at the attached files), but maybe I haven't removed them all...
about Fortran, I have no idea: I have 6 different implementations in
Fortran, and they are all slower than the pure NumPy ones. I don't
know if I can optimize them further (I have asked on a Fortran
newsgroup too, but no faster solution has arisen). I am not even sure
whether the default f2py compiler options are already at maximum
optimization for Fortran. Does anyone know if this is true? Maybe
Pearu can shed some light on this issue...

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-25 Thread Stéfan van der Walt
Hi Andrea

2008/5/25 Andrea Gavana [EMAIL PROTECTED]:
 When you bench the Cython code, you'll have to take out the Python
 calls (for checking dtype etc.), otherwise you're comparing apples and
 oranges.  After I tweaked it, it ran roughly the same time as
 Francesc's version.  But like I mentioned before, the Fortran results
 should trump all, so what is going on here?

 I thought I had removed the Python checks from the Cython code (if you
 look at the attached files), but maybe I haven't removed them all...
 about Fortran, I have no idea: I have 6 different implementations in
 Fortran, and they are all slower than the pure NumPy ones. I don't
 know if I can optimize them further (I have asked on a Fortran
 newsgroup too, but no faster solution has arisen). I am not even sure
 whether the default f2py compiler options are already at maximum
 optimization for Fortran. Does anyone know if this is true? Maybe
 Pearu can shed some light on this issue...

Here are the timings on my machine.  You were right -- you did remove
the type checks in the Cython code.  It turns out the bottleneck was
the "for i in range" loop.  I was under the impression that loop was
correctly optimised; I'll have a chat with the Cython developers.  If
I change it to "for i in xrange" or "for i from 0 <= i < n" the speed
improves a lot.

I used gfortran 4.2.1, and as you can see below, the Fortran
implementation beat all the others.

------------------------------------------------------------------------
   Number Of Cells: 30
------------------------------------------------------------------------
| Rank |       Method Name       | Execution Time | Relative Slowness |
------------------------------------------------------------------------
|   1  | Fortran 5 (James)       |   0.01854820   |      1.00000      |
|   2  | Fortran 6 (James)       |   0.01882849   |      1.01511      |
|   3  | Fortran 1 (Mine)        |   0.01917751   |      1.03393      |
|   4  | Fortran 4 (Michael)     |   0.01927021   |      1.03893      |
|   5  | Fortran 2 (Mine)        |   0.01937311   |      1.04447      |
|   6  | NumPy 4 (Nathan-Vector) |   0.02008982   |      1.08311      |
|   7  | NumPy 2 (Nathan)        |   0.02046990   |      1.10361      |
|   8  | NumPy 5 (Andrea)        |   0.02108521   |      1.13678      |
|   9  | NumPy 1 (Francesc)      |   0.02211959   |      1.19255      |
|  10  | Cython (Stefan)         |   0.02235680   |      1.20533      |
|  11  | Fortran 3 (Alex)        |   0.02486629   |      1.34063      |
|  12  | NumPy 3 (Peter)         |   0.05020461   |      2.70671      |
------------------------------------------------------------------------

Regards
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-25 Thread Andrea Gavana
Hi Stefan & All,

On Sun, May 25, 2008 at 8:59 PM, Stéfan van der Walt wrote:
 Hi Andrea

 2008/5/25 Andrea Gavana [EMAIL PROTECTED]:
 When you bench the Cython code, you'll have to take out the Python
 calls (for checking dtype etc.), otherwise you're comparing apples and
 oranges.  After I tweaked it, it ran roughly the same time as
 Francesc's version.  But like I mentioned before, the Fortran results
 should trump all, so what is going on here?

 I thought I had removed the Python checks from the Cython code (if you
 look at the attached files), but maybe I haven't removed them all...
 about Fortran, I have no idea: I have 6 different implementations in
 Fortran, and they are all slower than the pure NumPy ones. I don't
 know if I can optimize them further (I have asked on a Fortran
 newsgroup too, but no faster solution has arisen). I am not even sure
 whether the default f2py compiler options are already at maximum
 optimization for Fortran. Does anyone know if this is true? Maybe
 Pearu can shed some light on this issue...

 Here are the timings on my machine.  You were right -- you did remove
 the type checks in the Cython code.  It turns out the bottleneck was
 the "for i in range" loop.  I was under the impression that loop was
 correctly optimised; I'll have a chat with the Cython developers.  If
 I change it to "for i in xrange" or "for i from 0 <= i < n" the speed
 improves a lot.

 I used gfortran 4.2.1, and as you can see below, the Fortran
 implementation beat all the others.

 ------------------------------------------------------------------------
    Number Of Cells: 30
 ------------------------------------------------------------------------
 | Rank |       Method Name       | Execution Time | Relative Slowness |
 ------------------------------------------------------------------------
 |   1  | Fortran 5 (James)       |   0.01854820   |      1.00000      |
 |   2  | Fortran 6 (James)       |   0.01882849   |      1.01511      |
 |   3  | Fortran 1 (Mine)        |   0.01917751   |      1.03393      |
 |   4  | Fortran 4 (Michael)     |   0.01927021   |      1.03893      |
 |   5  | Fortran 2 (Mine)        |   0.01937311   |      1.04447      |
 |   6  | NumPy 4 (Nathan-Vector) |   0.02008982   |      1.08311      |
 |   7  | NumPy 2 (Nathan)        |   0.02046990   |      1.10361      |
 |   8  | NumPy 5 (Andrea)        |   0.02108521   |      1.13678      |
 |   9  | NumPy 1 (Francesc)      |   0.02211959   |      1.19255      |
 |  10  | Cython (Stefan)         |   0.02235680   |      1.20533      |
 |  11  | Fortran 3 (Alex)        |   0.02486629   |      1.34063      |
 |  12  | NumPy 3 (Peter)         |   0.05020461   |      2.70671      |
 ------------------------------------------------------------------------

Thank you for the tests you have run. I have run mine with Compaq
Visual Fortran 6.6 on Windows XP, and I assume I cannot use gfortran
with MS Visual Studio 2003 (which is the compiler Python is built
with). So the only option I have is to try Intel Visual Fortran, which
I will do tomorrow, or stick with the numpy implementation, as it looks
like there is nothing more I can do (even with sorted arrays or using
another approach, and I can't think of anything else) to speed up my
problem.

A big thank you to everyone in this list, you have been very kind and helpful.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-24 Thread Andrea Gavana
Hi All,

On Fri, May 23, 2008 at 4:50 PM, Andrea Gavana wrote:
 Hi Peter & All,

 On Fri, May 23, 2008 at 4:16 PM, Peter Creasey wrote:
 Hi Andrea,

 2008/5/23  Andrea Gavana [EMAIL PROTECTED]:
 And so on. The problem with this approach is that I lose the original
 indices for which I want all the inequality tests to succeed:

 To have the original indices you just need to re-index your indices, as it 
 were

 idx = flatnonzero(xCent >= xMin)
 idx = idx[flatnonzero(xCent[idx] <= xMax)]
 idx = idx[flatnonzero(yCent[idx] >= yMin)]
 idx = idx[flatnonzero(yCent[idx] <= yMax)]
 ...
 (I haven't tested this code, apologies for bugs)

 However, there is a performance penalty for doing all this re-indexing
 (I once fell afoul of this), and if these conditions mostly evaluate
 to True you can often be better off with one of the solutions already
 suggested.

 Thank you for your answer. I have tried your suggestion, and the
 performance is more or less comparable with the other NumPy
 implementations (yours is roughly 1.2 times slower than the others),
 but I do gain some advantage when the subgrids are very small (i.e.,
 most of the values in the first array are already False). I'll go and
 implement your solution when I have many small subgrids in my model.

I have added a few more implementations for this issue. One thing that
struck me was that I could actually use sorted arrays for my xCent,
yCent and zCent vectors, so I came up with a solution which uses
numpy.searchsorted, but the speed is more or less the same as without
sorting. The specific function is this:

def MultipleBoolean13():
    """ Numpy solution 5 (Andrea). """

    searchsorted = numpy.searchsorted

    indxStart, indxEnd = searchsorted(xCentS, [xMin, xMax])
    indyStart, indyEnd = searchsorted(yCentS, [yMin, yMax])
    indzStart, indzEnd = searchsorted(zCentS, [zMin, zMax])

    xInd = numpy.zeros((nCells), dtype=numpy.bool)
    yInd = xInd.copy()
    zInd = xInd.copy()

    xInd[xIndices[indxStart:indxEnd]] = True
    yInd[yIndices[indyStart:indyEnd]] = True
    zInd[zIndices[indzStart:indzEnd]] = True

    xyzReq = numpy.nonzero(xInd & yInd & zInd)[0]
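
The sorted arrays and index maps used above (xCentS, xIndices and so on)
are prepared beforehand; a minimal sketch of that preparation step,
assuming a plain argsort (my reconstruction -- the attached files are
not shown in this archive), would be:

xIndices = numpy.argsort(xCent)
yIndices = numpy.argsort(yCent)
zIndices = numpy.argsort(zCent)

# Sorted copies, so that xCentS[i] == xCent[xIndices[i]], and so on.
xCentS = xCent[xIndices]
yCentS = yCent[yIndices]
zCentS = zCent[zIndices]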


Where xCentS, yCentS and zCentS are the sorted arrays. I don't see any
easy way to optimize this method, so I'd like to know if it is
possible to code it better than I did. I have done some testing and
timing of all the solutions we have come up with until now (12
implementations), and this is what I get (please notice the nice work
of ASCII art :-D :-D ):

*******************
* SUMMARY RESULTS *
*******************

------------------------------------------------------------------------
   Number Of Cells: 5
------------------------------------------------------------------------
| Rank |       Method Name       | Execution Time | Relative Slowness |
------------------------------------------------------------------------
|   1  | NumPy 5 (Andrea)        |   0.00562803   |      1.00000      |
|   2  | NumPy 1 (Francesc)      |   0.00563365   |      1.00100      |
|   3  | NumPy 2 (Nathan)        |   0.00577322   |      1.02580      |
|   4  | NumPy 4 (Nathan-Vector) |   0.00580577   |      1.03158      |
|   5  | Fortran 6 (James)       |   0.00660514   |      1.17361      |
|   6  | Fortran 3 (Alex)        |   0.00709856   |      1.26129      |
|   7  | Fortran 5 (James)       |   0.00748726   |      1.33035      |
|   8  | Fortran 2 (Mine)        |   0.00748816   |      1.33051      |
|   9  | Fortran 1 (Mine)        |   0.00775906   |      1.37864      |
|  10  | Fortran 4 (Michael)     |   0.00777685   |      1.38181      |
|  11  | NumPy 3 (Peter)         |   0.01253662   |      2.22753      |
|  12  | Cython (Stefan)         |   0.01597804   |      2.83901      |
------------------------------------------------------------------------

------------------------------------------------------------------------
   Number Of Cells: 10
------------------------------------------------------------------------
| Rank |       Method Name       | Execution Time | Relative Slowness |
------------------------------------------------------------------------
|   1  | NumPy 5 (Andrea)        |   0.01080372   |      1.00000      |
|   2  | NumPy 2 (Nathan)        |   0.01109147   |      1.02663      |
|   3  | NumPy 1 (Francesc)      |   0.01114189   |      1.03130      |
|   4  | NumPy 4 (Nathan-Vector) |   0.01214118   |      1.12380      |
|   5  | Fortran 6 (James)       |   0.01351264   |      1.25074      |
|   6  | Fortran 5 (James)       |   0.01368450   |      1.26665      |
|   7  | Fortran 3 (Alex)        |   0.01373010   |      1.27087      |
|   8  | Fortran 2 (Mine)        |   0.01415306   |      1.31002      |
|   9  | Fortran 1 (Mine)        |   0.01425558   |      1.31951      |
|  10  | Fortran 4 (Michael)     |   0.01443192   |      1.33583      |
|  11  | NumPy 3 (Peter)         |   0.02463268   |      2.28002      |
|  12  | Cython (Stefan)         |   0.04298108   |      3.97836      |
------------------------------------------------------------------------

-
   Number Of Cells: 15

Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-24 Thread Stéfan van der Walt
Hi Andrea

2008/5/24 Andrea Gavana [EMAIL PROTECTED]:
   Number Of Cells: 5
 ------------------------------------------------------------------------
 | Rank |       Method Name       | Execution Time | Relative Slowness |
 ------------------------------------------------------------------------
 |   1  | NumPy 5 (Andrea)        |   0.00562803   |      1.00000      |
 |   2  | NumPy 1 (Francesc)      |   0.00563365   |      1.00100      |
 |   3  | NumPy 2 (Nathan)        |   0.00577322   |      1.02580      |
 |   4  | NumPy 4 (Nathan-Vector) |   0.00580577   |      1.03158      |
 |   5  | Fortran 6 (James)       |   0.00660514   |      1.17361      |
 |   6  | Fortran 3 (Alex)        |   0.00709856   |      1.26129      |
 |   7  | Fortran 5 (James)       |   0.00748726   |      1.33035      |
 |   8  | Fortran 2 (Mine)        |   0.00748816   |      1.33051      |
 |   9  | Fortran 1 (Mine)        |   0.00775906   |      1.37864      |
 |  10  | Fortran 4 (Michael)     |   0.00777685   |      1.38181      |
 |  11  | NumPy 3 (Peter)         |   0.01253662   |      2.22753      |
 |  12  | Cython (Stefan)         |   0.01597804   |      2.83901      |
 ------------------------------------------------------------------------

When you bench the Cython code, you'll have to take out the Python
calls (for checking dtype etc.), otherwise you're comparing apples and
oranges.  After I tweaked it, it ran roughly the same time as
Francesc's version.  But like I mentioned before, the Fortran results
should trump all, so what is going on here?

Regards
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-23 Thread Andrea Gavana
Hi Stefan & All,

On Fri, May 23, 2008 at 1:02 AM, Stéfan van der Walt wrote:
 Hi Andrea

 2008/5/23 Andrea Gavana [EMAIL PROTECTED]:
 Thank you very much for this! I am going to try it and time it,
 comparing it with the other implementations. I think I need to study a
 bit your code as I know almost nothing about Cython :-D

 That won't be necessary -- the Fortran-implementation is guaranteed to win!

 Just to make sure, I timed it anyway (on somewhat larger arrays):

 Francesc's Solution: 0.062797403 Seconds/Trial
 Fortran Solution 1: 0.050316906 Seconds/Trial
 Fortran Solution 2: 0.052595496 Seconds/Trial
 Nathan's Solution: 0.055562282 Seconds/Trial
 Cython Solution: 0.06250751 Seconds/Trial

 Nathan's version runs over the data 6 times, and still does better
 than the Pyrex version.  I don't know why!

 But, hey, this algorithm is parallelisable!  Wait, no, it's bedtime.

Thank you so much for testing, and thanks to the list for the kind
help and suggestions. In any case, after all these different
implementations, I think the only way to make it faster is to
progressively reduce the size of the vectors on which I make the
inequality tests, while somehow keeping the original vector indices.
Let me explain with an example:

# step1 will be a vector with nCells elements, filled with True
# or False values
step1 = xCent >= xMin

# step2 will be a vector with nonzero(step1) cells, filled
# with True or False values
step2 = xCent[step1] <= xMax

# step3 will be a vector with nonzero(step2) cells, filled
# with True or False values
step3 = yCent[step2] >= yMin

And so on. The problem with this approach is that I lose the original
indices for which I want all the inequality tests to succeed: for
example, consider the following:

>>> xMin, xMax = 3, 7
>>> xCent = numpy.arange(10)
>>> step1 = xCent >= xMin
>>> numpy.nonzero(step1)[0]
array([3, 4, 5, 6, 7, 8, 9])

>>> step2 = xCent[step1] <= xMax
>>> numpy.nonzero(step2)[0]
array([0, 1, 2, 3, 4])

These are no longer the indices of the original vector, as I shrunk
xCent down to xCent[step1], and the real indices should be:

>>> realStep = (xCent >= xMin) & (xCent <= xMax)
>>> numpy.nonzero(realStep)[0]
array([3, 4, 5, 6, 7])

So, now the question is: if I iteratively shrink down the vectors as I
did before, is there any way to get back the original indices for
which all the conditions are satisfied? Sorry if this looks like a
dummy/noob question; it's just that I am not sure how to implement it.

Thank you very much for your help.

 Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-23 Thread Peter Creasey
Hi Andrea,

2008/5/23  Andrea Gavana [EMAIL PROTECTED]:
 And so on. The problem with this approach is that I lose the original
 indices for which I want all the inequality tests to succeed:

To have the original indices you just need to re-index your indices, as it were

idx = flatnonzero(xCent >= xMin)
idx = idx[flatnonzero(xCent[idx] <= xMax)]
idx = idx[flatnonzero(yCent[idx] >= yMin)]
idx = idx[flatnonzero(yCent[idx] <= yMax)]
...
(I haven't tested this code, apologies for bugs)

However, there is a performance penalty for doing all this re-indexing
(I once fell afoul of this), and if these conditions mostly evaluate
to True you can often be better off with one of the solutions already
suggested.
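
As a quick, hedged check of the re-indexing idea on toy values of my own
(not part of the original message):

from numpy import arange, flatnonzero

xCent = arange(10)
xMin, xMax = 3, 7

idx = flatnonzero(xCent >= xMin)
idx = idx[flatnonzero(xCent[idx] <= xMax)]
print idx   # -> [3 4 5 6 7], i.e. indices into the original xCent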

Regards,
Peter
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-23 Thread Andrea Gavana
Hi Peter & All,

On Fri, May 23, 2008 at 4:16 PM, Peter Creasey wrote:
 Hi Andrea,

 2008/5/23  Andrea Gavana [EMAIL PROTECTED]:
 And so on. The problem with this approach is that I lose the original
 indices for which I want all the inequality tests to succeed:

 To have the original indices you just need to re-index your indices, as it 
 were

 idx = flatnonzero(xCent >= xMin)
 idx = idx[flatnonzero(xCent[idx] <= xMax)]
 idx = idx[flatnonzero(yCent[idx] >= yMin)]
 idx = idx[flatnonzero(yCent[idx] <= yMax)]
 ...
 (I haven't tested this code, apologies for bugs)

 However, there is a performance penalty for doing all this re-indexing
 (I once fell afoul of this), and if these conditions mostly evaluate
 to True you can often be better off with one of the solutions already
 suggested.

Thank you for your answer. I have tried your suggestion, and the
performance is more or less comparable with the other NumPy
implementations (yours is roughly 1.2 times slower than the others),
but I do gain some advantage when the subgrids are very small (i.e.,
most of the values in the first array are already False). I'll go and
implement your solution when I have many small subgrids in my model.

Thank you!

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Stéfan van der Walt
Hi Andrea

2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
I am building some 3D grids for visualization starting from a much
 bigger grid. I build these grids by satisfying certain conditions on
 x, y, z coordinates of their cells: up to now I was using VTK to
 perform this operation, but VTK is slow as a turtle, so I thought to
 use numpy to get the cells I am interested in.
 Basically, for every cell I have the coordinates of its center point
 (centroids), named xCent, yCent and zCent. These values are stored in
 numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent,
 yCent and zCent with 10,000 values in them). What I'd like to do is:

You clearly have a large dataset, otherwise speed wouldn't have been a
concern to you.  You can do your operation in one pass over the data,
and I'd suggest you try doing that with Cython or Ctypes.  If you need
an example on how to access data using those methods, let me know.

Of course, it *can* be done using NumPy (maybe not in one pass), but
thinking in terms of for-loops is sometimes easier, and immediately
takes you to a highly optimised execution time.

Cheers
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Andrea Gavana
Hi Stefan & All,

On Thu, May 22, 2008 at 12:29 PM, Stéfan van der Walt wrote:
 Hi Andrea

 2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
I am building some 3D grids for visualization starting from a much
 bigger grid. I build these grids by satisfying certain conditions on
 x, y, z coordinates of their cells: up to now I was using VTK to
 perform this operation, but VTK is slow as a turtle, so I thought to
 use numpy to get the cells I am interested in.
 Basically, for every cell I have the coordinates of its center point
 (centroids), named xCent, yCent and zCent. These values are stored in
 numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent,
 yCent and zCent with 10,000 values in them). What I'd like to do is:

 You clearly have a large dataset, otherwise speed wouldn't have been a
 concern to you.  You can do your operation in one pass over the data,
 and I'd suggest you try doing that with Cython or Ctypes.  If you need
 an example on how to access data using those methods, let me know.

 Of course, it *can* be done using NumPy (maybe not in one pass), but
 thinking in terms of for-loops is sometimes easier, and immediately
 takes you to a highly optimised execution time.

First of all, thank you for your answer. I know next to nothing about
Cython and very little about Ctypes, but it would be nice to have an
example on how to use them to speed up the operations. Actually, I
don't really know if my dataset is large, as I normally work with
xCent, yCent and zCent vectors of about 100,000-300,000 elements.
However, all the other operations I do with numpy on these vectors are
pretty fast (reshaping, re-casting, min(), max() and so on). So I
believe that a pure numpy solution might also perform well enough for
my needs: but I am really no expert in numpy, so please forgive any
mistakes I make :-D.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Francesc Alted
On Thursday 22 May 2008, Andrea Gavana wrote:
 Hi All,

 I am building some 3D grids for visualization starting from a
 much bigger grid. I build these grids by satisfying certain
 conditions on x, y, z coordinates of their cells: up to now I was
 using VTK to perform this operation, but VTK is slow as a turtle, so
 I thought to use numpy to get the cells I am interested in.
 Basically, for every cell I have the coordinates of its center point
 (centroids), named xCent, yCent and zCent. These values are stored in
 numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent,
 yCent and zCent with 10,000 values in them). What I'd like to do is:

 # Filter cells which do not satisfy Z requirements:
 zReq = zMin <= zCent <= zMax

 # After that, filter cells which do not satisfy Y requirements,
 # but apply this filter only on cells who satisfy the above
 # condition:

 yReq = yMin <= yCent <= yMax

 # After that, filter cells which do not satisfy X requirements,
 # but apply this filter only on cells who satisfy the 2 above
 # conditions:

 xReq = xMin <= xCent <= xMax

 I'd like to end up with a vector of indices which tells me which are
 the cells in the original grid that satisfy all 3 conditions. I know
 that something like this:

 zReq = zMin <= zCent <= zMax

 Can not be done directly in numpy, as the first statement executed
 returns a vector of boolean. Also, if I do something like:

 zReq1 = numpy.nonzero(zCent <= zMax)
 zReq2 = numpy.nonzero(zCent[zReq1] >= zMin)

 I lose the original indices of the grid, as in the second statement
 zCent[zReq1] has no more the size of the original grid but it has
 already been filtered out.

 Is there anything I could try in numpy to get what I am looking for?
 Sorry if the description is not very clear :-D

 Thank you very much for your suggestions.

I don't know if this is what you want, but you can get the boolean 
arrays separately, do the intersection and finally get the interesting 
values (by using fancy indexing) or coordinates (by using .nonzero()).  
Here is an example:

In [105]: a = numpy.arange(10,20)

In [106]: c1 = (a >= 13) & (a <= 17)

In [107]: c2 = (a >= 14) & (a <= 18)

In [109]: all = c1 & c2

In [110]: a[all]
Out[110]: array([14, 15, 16, 17])   # the values

In [111]: all.nonzero()
Out[111]: (array([4, 5, 6, 7]),)    # the coordinates

Hope that helps,

-- 
Francesc Alted
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Alan G Isaac
On Thu, 22 May 2008, Andrea Gavana apparently wrote:
 # Filter cells which do not satisfy Z requirements:
 zReq = zMin <= zCent <= zMax

This seems to raise a question:
should numpy arrays support this standard Python idiom?

Cheers,
Alan Isaac



___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Angus McMorland
2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
 Hi All,

I am building some 3D grids for visualization starting from a much
 bigger grid. I build these grids by satisfying certain conditions on
 x, y, z coordinates of their cells: up to now I was using VTK to
 perform this operation, but VTK is slow as a turtle, so I thought to
 use numpy to get the cells I am interested in.
 Basically, for every cell I have the coordinates of its center point
 (centroids), named xCent, yCent and zCent. These values are stored in
 numpy arrays (i.e., if I have 10,000 cells, I have 3 vectors xCent,
 yCent and zCent with 10,000 values in them). What I'd like to do is:

 # Filter cells which do not satisfy Z requirements:
 zReq = zMin <= zCent <= zMax

 # After that, filter cells which do not satisfy Y requirements,
 # but apply this filter only on cells who satisfy the above condition:

 yReq = yMin <= yCent <= yMax

 # After that, filter cells which do not satisfy X requirements,
 # but apply this filter only on cells who satisfy the 2 above conditions:

 xReq = xMin <= xCent <= xMax

 I'd like to end up with a vector of indices which tells me which are
 the cells in the original grid that satisfy all 3 conditions. I know
 that something like this:

 zReq = zMin <= zCent <= zMax

 Can not be done directly in numpy, as the first statement executed
 returns a vector of boolean. Also, if I do something like:

 zReq1 = numpy.nonzero(zCent <= zMax)
 zReq2 = numpy.nonzero(zCent[zReq1] >= zMin)

 I lose the original indices of the grid, as in the second statement
 zCent[zReq1] has no more the size of the original grid but it has
 already been filtered out.

 Is there anything I could try in numpy to get what I am looking for?
 Sorry if the description is not very clear :-D

 Thank you very much for your suggestions.

How about (as a pure numpy solution):

valid = (z >= zMin) & (z <= zMax)
valid[valid] = (y[valid] >= yMin) & (y[valid] <= yMax)
valid[valid] = (x[valid] >= xMin) & (x[valid] <= xMax)
inds = valid.nonzero()

?
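
A quick, hedged check of this masking trick on toy values of my own,
mirroring the pattern above:

from numpy import arange

x = arange(10)
y = arange(10, 20)
xMin, xMax = 3, 7
yMin, yMax = 14, 18

valid = (x >= xMin) & (x <= xMax)
# Refine the mask in place, testing y only where x already passed:
valid[valid] = (y[valid] >= yMin) & (y[valid] <= yMax)
print valid.nonzero()[0]   # -> [4 5 6 7], still indices into the full arrays
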
-- 
AJC McMorland, PhD candidate
Physiology, University of Auckland

(Nearly) post-doctoral research fellow
Neurobiology, University of Pittsburgh
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Andrea Gavana
Hi Francesc & All,

On Thu, May 22, 2008 at 1:04 PM, Francesc Alted wrote:
 I don't know if this is what you want, but you can get the boolean
 arrays separately, do the intersection and finally get the interesting
 values (by using fancy indexing) or coordinates (by using .nonzero()).
 Here is an example:

 In [105]: a = numpy.arange(10,20)

 In [106]: c1 = (a >= 13) & (a <= 17)

 In [107]: c2 = (a >= 14) & (a <= 18)

 In [109]: all = c1 & c2

 In [110]: a[all]
 Out[110]: array([14, 15, 16, 17])   # the values

 In [111]: all.nonzero()
 Out[111]: (array([4, 5, 6, 7]),)    # the coordinates

Thank you for this suggestion! I had forgotten that this worked in
numpy :-( . I have written a couple of small functions to test your
method and my method (hopefully I did it correctly for both). On my
computer (Toshiba Notebook 2.00 GHz, Windows XP SP2, 1GB RAM, Python
2.5, numpy 1.0.3.1), your solution is about 30 times faster than mine
(implemented when I didn't know about multiple boolean operations in
numpy).

This is my code:

# Begin Code

import numpy
from timeit import Timer

# Number of cells in my original grid
nCells = 15

# Define some constraints for X, Y, Z
xMin, xMax = 250.0, 700.0
yMin, yMax = 1000.0, 1900.0
zMin, zMax = 120.0, 300.0

# Generate random centroids for the cells
xCent = 1000.0*numpy.random.rand(nCells)
yCent = 2500.0*numpy.random.rand(nCells)
zCent = 400.0*numpy.random.rand(nCells)


def MultipleBoolean1():
    """ Andrea's solution, slow :-( . """

    xReq_1 = numpy.nonzero(xCent >= xMin)
    xReq_2 = numpy.nonzero(xCent <= xMax)

    yReq_1 = numpy.nonzero(yCent >= yMin)
    yReq_2 = numpy.nonzero(yCent <= yMax)

    zReq_1 = numpy.nonzero(zCent >= zMin)
    zReq_2 = numpy.nonzero(zCent <= zMax)

    xReq = numpy.intersect1d_nu(xReq_1, xReq_2)
    yReq = numpy.intersect1d_nu(yReq_1, yReq_2)
    zReq = numpy.intersect1d_nu(zReq_1, zReq_2)

    xyReq = numpy.intersect1d_nu(xReq, yReq)
    xyzReq = numpy.intersect1d_nu(xyReq, zReq)


def MultipleBoolean2():
    """ Francesc's solution, much faster :-) . """

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)

    xyzReq = numpy.nonzero(xyzReq)[0]


if __name__ == "__main__":

    trial = 10

    t = Timer("MultipleBoolean1()", "from __main__ import MultipleBoolean1")
    print "\n\nAndrea's Solution: %0.8g Seconds/Trial"%(t.timeit(number=trial)/trial)

    t = Timer("MultipleBoolean2()", "from __main__ import MultipleBoolean2")
    print "Francesc's Solution: %0.8g Seconds/Trial\n"%(t.timeit(number=trial)/trial)


# End Code


And I get these timings on my PC:

Andrea's Solution: 0.34946193 Seconds/Trial
Francesc's Solution: 0.011288139 Seconds/Trial

If I implemented everything correctly, this is an amazing improvement.
Thank you to everyone who provided suggestions, and thanks to the list
:-D

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Travis E. Oliphant
Alan G Isaac wrote:
 On Thu, 22 May 2008, Andrea Gavana apparently wrote:
   
 # Filter cells which do not satisfy Z requirements:
 zReq = zMin <= zCent <= zMax
 

 This seems to raise a question:
 should numpy arrays support this standard Python idiom?
   
It would be nice, but alas it requires a significant change to Python 
first to give us the hooks to modify.   (We need the 'and' and 'or' 
operations to return vectors instead of just numbers as they do 
now). There is a PEP to allow this, but it has not received much TLC 
as of late.   The difficulty in the implementation is supporting 
short-circuited evaluation. 
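
For illustration only (toy values of my own, not from the original
message), this is the behaviour being discussed: the chained form needs
a single truth value for the implicit 'and', so it raises an error for
arrays, while the element-wise operators work:

import numpy

zCent = numpy.array([100.0, 150.0, 250.0, 350.0])
zMin, zMax = 120.0, 300.0

try:
    zReq = zMin <= zCent <= zMax     # implicit 'and' -> bool(array) -> error
except ValueError, err:
    print err                        # truth value of an array is ambiguous

zReq = (zCent >= zMin) & (zCent <= zMax)   # element-wise equivalent
print numpy.nonzero(zReq)[0]               # -> [1 2]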

-Travis



___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Stéfan van der Walt
Hi Andrea

2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
 You clearly have a large dataset, otherwise speed wouldn't have been a
 concern to you.  You can do your operation in one pass over the data,
 and I'd suggest you try doing that with Cython or Ctypes.  If you need
 an example on how to access data using those methods, let me know.

 Of course, it *can* be done using NumPy (maybe not in one pass), but
 thinking in terms of for-loops is sometimes easier, and immediately
 takes you to a highly optimised execution time.

 First of all, thank you for your answer. I know next to nothing about
 Cython and very little about Ctypes, but it would be nice to have an
 example on how to use them to speed up the operations. Actually, I
 don't really know if my dataset is large, as I work normally with
 xCent, yCent and zCent vectors of about 100,000-300,000 elements in
 them.

Just to clarify things in my mind: is VTK *that* slow?  I find that
surprising, since it is written in C or C++.

Regards
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Robert Kern
On Thu, May 22, 2008 at 12:26 PM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
 Just to clarify things in my mind: is VTK *that* slow?  I find that
 surprising, since it is written in C or C++.

Performance can depend more on the design of the code than the
implementation language. There are several places in VTK which are
slower than they strictly could be because VTK exposes data primarily
through abstract interfaces and only sometimes exposes the underlying
data structures for faster processing. Quite sensibly, they implement the
general form first.

It's much the same with parts of numpy. The iterator abstraction lets
you work on arbitrarily strided arrays, but for contiguous arrays,
just using the pointer lets you, and the compiler, optimize your code
more.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Andrea Gavana
Hi All,

On Thu, May 22, 2008 at 7:46 PM, Robert Kern  wrote:
 On Thu, May 22, 2008 at 12:26 PM, Stéfan van der Walt [EMAIL PROTECTED] 
 wrote:
 Just to clarify things in my mind: is VTK *that* slow?  I find that
 surprising, since it is written in C or C++.

 Performance can depend more on the design of the code than the
 implementation language. There are several places in VTK which are
 slower than they strictly could be because VTK exposes data primarily
 through abstract interfaces and only sometimes exposes the underlying
 data structures for faster processing. Quite sensibly, they implement the
 general form first.

Yes, Robert is perfectly right. VTK is quite handy in most situations,
but in this case I had to recursively apply 3 thresholds
(each one for X, Y and Z respectively) and the threshold construction
(initialization) and its execution were much slower than my (sloppy)
numpy result. Compared to the solution Francesc posted, the VTK
approach simply disappears.
By the way, about the solution Francesc posted:

xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
         (yCent >= yMin) & (yCent <= yMax) & \
         (zCent >= zMin) & (zCent <= zMax)

xyzReq = numpy.nonzero(xyzReq)[0]

Do you think there is any chance that a C extension (or something
similar) could be faster? Or something else using weave? I understand
that this solution is already highly optimized, as it uses the power of
numpy with the logic operations in Python, but I was wondering if I
can make it any faster: on my PC, the algorithm runs in 0.01 seconds,
more or less, for 150,000 cells, but today I encountered a case in
which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-(
Otherwise, I will try and implement it in Fortran and wrap it with
f2py, assuming I am able to do it correctly and the overhead of
calling an external extension is not killing the execution time.

Thank you very much for your suggestions.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Christopher Barker
Andrea Gavana wrote:
 By the way, about the solution Francesc posted:
 
 xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
          (yCent >= yMin) & (yCent <= yMax) & \
          (zCent >= zMin) & (zCent <= zMax)
 
 xyzReq = numpy.nonzero(xyzReq)[0]
 
 Do you think there is any chance that a C extension (or something
 similar) could be faster?

yep -- if I've got this right, the above creates 7 temporary arrays.
Creating that many and pushing the data in and out of memory can be
pretty slow for large arrays.

In C, C++, Cython or Fortran, you can just do one loop, and one output 
array. It should be much faster for the big arrays.
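
For concreteness, a rough Python rendering of that single-pass idea
(a sketch only, with names borrowed from earlier in the thread: as a
pure-Python loop it will of course be far slower than the vectorised
version; the point is the access pattern a C, Cython or Fortran loop
would use):

import numpy

def filter_one_pass(xCent, yCent, zCent, xMin, xMax, yMin, yMax, zMin, zMax):
    # One pass over the data, one output array.
    n = len(xCent)
    out = numpy.zeros(n, dtype=bool)
    for i in xrange(n):
        # Scalar comparisons, so the tests short-circuit: as soon as one
        # bound fails, the remaining comparisons for this cell are skipped.
        out[i] = (xMin <= xCent[i] <= xMax and
                  yMin <= yCent[i] <= yMax and
                  zMin <= zCent[i] <= zMax)
    return out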

 Otherwise, I will try and implement it in Fortran and wrap it with
 f2py, assuming I am able to do it correctly and the overhead of
 calling an external extension is not killing the execution time.

nope, that's one function call for the whole thing, negligible.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Nathan Bell
On Thu, May 22, 2008 at 2:16 PM, Andrea Gavana [EMAIL PROTECTED] wrote:
 By the way, about the solution Francesc posted:

 xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
          (yCent >= yMin) & (yCent <= yMax) & \
          (zCent >= zMin) & (zCent <= zMax)


You could implement this with inplace operations to save memory:
xyzReq  = (xCent >= xMin)
xyzReq &= (xCent <= xMax)
xyzReq &= (yCent >= yMin)
xyzReq &= (yCent <= yMax)
xyzReq &= (zCent >= zMin)
xyzReq &= (zCent <= zMax)


 Do you think there is any chance that a C extension (or something
 similar) could be faster? Or something else using weave? I understand
 that this solution is already highly optimized as it uses the power of
 numpy with the logic operations in Python, but I was wondering if I
 can make it any faster

A C implementation would certainly be faster, perhaps 5x faster, due
to short-circuiting the AND operations and the fact that you'd only
pass over the data once.

OTOH I'd be very surprised if this is the slowest part of your application.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Stéfan van der Walt
Hi Andrea

2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
 By the way, about the solution Francesc posted:

 xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
          (yCent >= yMin) & (yCent <= yMax) & \
          (zCent >= zMin) & (zCent <= zMax)

 xyzReq = numpy.nonzero(xyzReq)[0]

 Do you think there is any chance that a C extension (or something
 similar) could be faster? Or something else using weave? I understand
 that this solution is already highly optimized as it uses the power of
 numpy with the logic operations in Python, but I was wondering if I
 can make it any faster: on my PC, the algorithm runs in 0.01 seconds,
 more or less, for 150,000 cells, but today I encountered a case in
 which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-(
 Otherwise, I will try and implement it in Fortran and wrap it with
 f2py, assuming I am able to do it correctly and the overhead of
 calling an external extension is not killing the execution time.

I wrote a quick proof of concept (no guarantees).  You can find it
here (download using bzr, http://bazaar-vcs.org, or just grab the
files with your web browser):

https://code.launchpad.net/~stefanv/+junk/xyz

1. Install Cython if you haven't already
2. Run "python setup.py build_ext -i" to build the C extension
3. Use the code, e.g.,

import xyz
from numpy import array

out = xyz.filter(array([1.0, 2.0, 3.0]), 2, 5,
                 array([2.0, 4.0, 6.0]), 2, 4,
                 array([-1.0, -2.0, -4.0]), -3, -2)

In the above case, out is [False, True, False].

Regards
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Christopher Barker
Stéfan van der Walt wrote:
 I wrote a quick proof of concept (no guarantees). 

Thanks for the example -- I like how Cython understands ndarrays!

It looks like this code would break if x, y, and z are not C-contiguous -- 
should there be a check for that?

-Chris


 here (download using bzr, http://bazaar-vcs.org, or just grab the
 files with your web browser):
 
 https://code.launchpad.net/~stefanv/+junk/xyz

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Andrea Gavana
Hi Chris and All,

On Thu, May 22, 2008 at 8:40 PM, Christopher Barker wrote:
 Andrea Gavana wrote:
 By the way, about the solution Francesc posted:

 xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
          (yCent >= yMin) & (yCent <= yMax) & \
          (zCent >= zMin) & (zCent <= zMax)

 xyzReq = numpy.nonzero(xyzReq)[0]

 Do you think there is any chance that a C extension (or something
 similar) could be faster?

 yep -- if I've got this right, the above creates 7 temporary arrays.
 Creating that many and pushing the data in and out of memory can be
 pretty slow for large arrays.

 In C, C++, Cython or Fortran, you can just do one loop, and one output
 array. It should be much faster for the big arrays.

Well, I have implemented it in 2 ways in Fortran, and actually the
Fortran solutions are slower than the numpy one (2 and 3 times slower
respectively). I attach the source code of the timing code and the 5
implementations I have at the moment (I have included Nathan's
implementation, which is as fast as Francesc's one but it has the
advantage of saving memory).

The timing I get on my home PC are:

Andrea's Solution: 0.42807561 Seconds/Trial
Francesc's Solution: 0.018297884 Seconds/Trial
Fortran Solution 1: 0.035862072 Seconds/Trial
Fortran Solution 2: 0.029822338 Seconds/Trial
Nathan's Solution: 0.018930507 Seconds/Trial

Maybe my Fortran coding is sloppy, but I don't really know Fortran well
enough to implement it better...

Thank you so much to everybody for your suggestions so far :-D

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
# Begin Code

import numpy
from timeit import Timer

# FORTRAN modules
from MultipleBoolean3 import multipleboolean3
from MultipleBoolean4 import multipleboolean4

# Number of cells in my original grid
nCells = 15

# Define some constraints for X, Y, Z
xMin, xMax = 250.0, 700.0
yMin, yMax = 1000.0, 1900.0
zMin, zMax = 120.0, 300.0

# Generate random centroids for the cells
xCent = 1000.0*numpy.random.rand(nCells)
yCent = 2500.0*numpy.random.rand(nCells)
zCent = 400.0*numpy.random.rand(nCells)


def MultipleBoolean1():
    """ Andrea's solution, slow :-( . """

    xReq_1 = numpy.nonzero(xCent >= xMin)
    xReq_2 = numpy.nonzero(xCent <= xMax)

    yReq_1 = numpy.nonzero(yCent >= yMin)
    yReq_2 = numpy.nonzero(yCent <= yMax)

    zReq_1 = numpy.nonzero(zCent >= zMin)
    zReq_2 = numpy.nonzero(zCent <= zMax)

    xReq = numpy.intersect1d_nu(xReq_1, xReq_2)
    yReq = numpy.intersect1d_nu(yReq_1, yReq_2)
    zReq = numpy.intersect1d_nu(zReq_1, zReq_2)

    xyReq = numpy.intersect1d_nu(xReq, yReq)
    xyzReq = numpy.intersect1d_nu(xyReq, zReq)


def MultipleBoolean2():
    """ Francesc's solution, much faster :-) . """

    xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
             (yCent >= yMin) & (yCent <= yMax) & \
             (zCent >= zMin) & (zCent <= zMax)

    xyzReq = numpy.nonzero(xyzReq)[0]


def MultipleBoolean3():
    """ Fortran solution 1 (wrapped with f2py, see MultipleBoolean3.f90). """

    xyzReq = multipleboolean3(xCent, yCent, zCent, xMin, xMax, yMin, yMax,
                              zMin, zMax, nCells)
    xyzReq = numpy.nonzero(xyzReq)[0]


def MultipleBoolean4():
    """ Fortran solution 2 (wrapped with f2py, see MultipleBoolean4.f90). """

    xyzReq = multipleboolean4(xCent, yCent, zCent, xMin, xMax, yMin, yMax,
                              zMin, zMax, nCells)
    xyzReq = numpy.nonzero(xyzReq)[0]


def MultipleBoolean5():
    """ Nathan's solution, in-place operations to save memory. """

    xyzReq  = (xCent >= xMin)
    xyzReq &= (xCent <= xMax)
    xyzReq &= (yCent >= yMin)
    xyzReq &= (yCent <= yMax)
    xyzReq &= (zCent >= zMin)
    xyzReq &= (zCent <= zMax)

    xyzReq = numpy.nonzero(xyzReq)[0]


if __name__ == "__main__":

    trial = 10

    t = Timer("MultipleBoolean1()", "from __main__ import MultipleBoolean1")
    print "\n\nAndrea's Solution: %0.8g Seconds/Trial"%(t.timeit(number=trial)/trial)

    t = Timer("MultipleBoolean2()", "from __main__ import MultipleBoolean2")
    print "Francesc's Solution: %0.8g Seconds/Trial"%(t.timeit(number=trial)/trial)

    t = Timer("MultipleBoolean3()", "from __main__ import MultipleBoolean3")
    print "Fortran Solution 1: %0.8g Seconds/Trial"%(t.timeit(number=trial)/trial)

    t = Timer("MultipleBoolean4()", "from __main__ import MultipleBoolean4")
    print "Fortran Solution 2: %0.8g Seconds/Trial"%(t.timeit(number=trial)/trial)

    t = Timer("MultipleBoolean5()", "from __main__ import MultipleBoolean5")
    print "Nathan's Solution: %0.8g Seconds/Trial\n"%(t.timeit(number=trial)/trial)

# End Code


MultipleBoolean3.f90
Description: Binary data


MultipleBoolean4.f90
Description: Binary data
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Andrea Gavana
Hi Stefan,

On Thu, May 22, 2008 at 10:23 PM, Stéfan van der Walt wrote:
 Hi Andrea

 2008/5/22 Andrea Gavana [EMAIL PROTECTED]:
 By the way, about the solution Francesc posted:

 xyzReq = (xCent >= xMin) & (xCent <= xMax) & \
          (yCent >= yMin) & (yCent <= yMax) & \
          (zCent >= zMin) & (zCent <= zMax)

 xyzReq = numpy.nonzero(xyzReq)[0]

 Do you think there is any chance that a C extension (or something
 similar) could be faster? Or something else using weave? I understand
 that this solution is already highly optimized as it uses the power of
 numpy with the logic operations in Python, but I was wondering if I
 can make it any faster: on my PC, the algorithm runs in 0.01 seconds,
 more or less, for 150,000 cells, but today I encountered a case in
 which I had 10800 sub-grids... 10800*0.01 is close to 2 minutes :-(
 Otherwise, I will try and implement it in Fortran and wrap it with
 f2py, assuming I am able to do it correctly and the overhead of
 calling an external extension is not killing the execution time.

 I wrote a quick proof of concept (no guarantees).  You can find it
 here (download using bzr, http://bazaar-vcs.org, or just grab the
 files with your web browser):

 https://code.launchpad.net/~stefanv/+junk/xyz

 1. Install Cython if you haven't already
 2. Run python setup.py build_ext -i to build the C extension
 3. Use the code, e.g.,

import xyz
out = xyz.filter(array([1.0, 2.0, 3.0]), 2, 5,
 array([2.0, 4.0, 6.0]), 2, 4,
 array([-1.0, -2.0, -4.0]), -3, -2)

 In the above case, out is [False, True, False].

Thank you very much for this! I am going to try it and time it,
comparing it with the other implementations. I think I need to study a
bit your code as I know almost nothing about Cython :-D

Thank you!

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Multiple Boolean Operations

2008-05-22 Thread Stéfan van der Walt
Hi Andrea

2008/5/23 Andrea Gavana [EMAIL PROTECTED]:
 Thank you very much for this! I am going to try it and time it,
 comparing it with the other implementations. I think I need to study a
 bit your code as I know almost nothing about Cython :-D

That won't be necessary -- the Fortran-implementation is guaranteed to win!

Just to make sure, I timed it anyway (on somewhat larger arrays):

Francesc's Solution: 0.062797403 Seconds/Trial
Fortran Solution 1: 0.050316906 Seconds/Trial
Fortran Solution 2: 0.052595496 Seconds/Trial
Nathan's Solution: 0.055562282 Seconds/Trial
Cython Solution: 0.06250751 Seconds/Trial

Nathan's version runs over the data 6 times, and still does better
than the Pyrex version.  I don't know why!

But, hey, this algorithm is parallelisable!  Wait, no, it's bedtime.

Regards
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion