Re: [Numpy-discussion] Broadcasting question

2008-05-02 Thread Anne Archibald
2008/5/2 Chris.Barker [EMAIL PROTECTED]:
 Hi all,

  I have an n X m X 3 array, and an n X m array. I want to assign the
  values in the n X m array to all three of the slices in the bigger array:

  A1 = np.zeros((5,4,3))
  A2 = np.ones((5,4))
  A1[:,:,0] = A2
  A1[:,:,1] = A2
  A1[:,:,2] = A2

  However, it seems I should be able to broadcast that, so I don't have to
  repeat myself, but I can't figure out how.

All you need is a newaxis on A2:
In [2]: A = np.zeros((3,4))

In [3]: B = np.ones(3)

In [4]: A[:,:] = B[:,np.newaxis]

In [5]: A
Out[5]:
array([[ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.]])
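
For the original n X m X 3 case the same idea applies; a minimal sketch:

import numpy as np

A1 = np.zeros((5, 4, 3))
A2 = np.ones((5, 4))
A1[:, :, :] = A2[:, :, np.newaxis]   # (5,4,1) broadcasts across the last axis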


Anne


[Numpy-discussion] FW: HELP: PROBLEMS WITH VPYTHON, NUMPY AND PY2EXE PART 2

2008-05-02 Thread andres felipe sierra echeverry

Could anyone help me generate a working executable for bounce.py with py2exe?

Thank you

bounce.py:

from visual import *

from numpy import *


ball = sphere(pos=(0,4,0), color=color.red)

example = zeros((10,10),dtype=float)

print (example)



setup.py:

from distutils.core import setup
import py2exe



opts = {
    'py2exe': {'packages': ['numarray']}
}

setup(windows=[{'script': 'bounce.py'}], options=opts)
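
One thing worth checking: bounce.py imports numpy, not numarray, so the
py2exe options presumably need to name 'numpy' (and perhaps 'visual');
both package names here are an untested assumption:

from distutils.core import setup
import py2exe

# assumption: list the packages that bounce.py actually imports
opts = {'py2exe': {'packages': ['numpy', 'visual']}}

setup(windows=[{'script': 'bounce.py'}], options=opts)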




_
Discover the new Windows Vista
http://search.msn.com/results.aspx?q=windows+vistamkt=en-USform=QBRE___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Broadcasting question

2008-05-02 Thread Christopher Barker

Charles R Harris wrote:
 A1 += A2[:,:,newaxis] is one way.

Exactly what I was looking for -- thanks Charles (and Anne)

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] a possible way to implement a plugin system

2008-05-02 Thread Lisandro Dalcin
On 5/1/08, David Cournapeau [EMAIL PROTECTED] wrote:
 On Wed, 2008-04-30 at 16:44 -0300, Lisandro Dalcin wrote:
   David, in order to make clear what I was proposing to you in a previous
   mail regarding implementing plugin systems for numpy, please take a
   look at the attached tarball.


 Thanks for looking at this Lisandro.

  The problem I see with the approach of struct vs direct function
  pointers is of ABI compatibility: it is easy to mess things up with
  structures.

That's a very valid point.

 There is the advantage of using only one dlsym (or
  equivalent) with the struct, which may be much faster than using
  hundreds of dlsym for each function. Linux does not seem to have
  problems with that, but mac os X for example felt slow when I tried
  doing this for several thousand functions. I did not go really far on
  that, though. I don't really see why using a struct is cleaner, though.
  That's really the same thing (function pointers); in both cases the
  namespace pollution will be the same, and in both cases there will be a
  need to generate source code.

Perhaps cleaner is not the right word. I actually believe it is far
more portable, given the oddities of dlopening on different
platforms.

 Concerning the loading mechanism, I don't understand the point of using
  PyCObject_Import. By quickly looking at the code of the function, the
  plugin needs to be a python module in that case, which is not needed in
  our case; I don't like the fact that it is not documented either.

Well, that is the same mechanism NumPy uses for exporting its C API to
extension modules. Look at the generated header '__multiarray_api.h',
in particular the function '_import_array()', in your numpy installation.
This is the standard, portable, and (as the Python docs say) recommended
way to expose C APIs from extension modules.

  The code for dlopen/dlclose is really short. For each platform, it is
  like 50 lines of code, and we have better control over what we can do
  (which may be needed; for example, you may want to load symbols from a
  dll A, but the symbols are only in dll B, which dll A depends on; that's
  a problem that does not happen for python extensions, I think; I don't
  really know, to be honest).

Please note that all my concerns about recommending you not to use
dlopen are just to save you from future headaches! See for yourself
row 17 of the table at
http://www.fortran-2000.com/ArnaudRecipes/sharedlib.html . And that
table does not cover stuff like the RTLD_GLOBAL flag to dlopen (or
equivalent).


-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


[Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Rich Shepard
   When I last visited I was given excellent advice about Gaussian and other
bell-shaped curves. Upon further reflection I realized that the Gaussian
curves will not do; the curve does need to have y=0.0 at each end.

   I tried to apply a Beta distribution, but I cannot correlate the alpha and
beta parameters with the curve characteristics that I have.

   What will work (I call it a pi curve) is a matched pair of sigmoid curves,
the ascending curve on the left and the descending curve on the right. Using
the Boltzmann function for these I can calculate and plot each individually,
but I'm having difficulty in appending the x and y values across the entire
range. This is where I would appreciate your assistance.

   The curve parameters passed to the function are the left end, right end,
and midpoint. The inflection point (where y = 0.5) is half-way between the
ends and the midpoint.

   What I have in the function is this:

def piCurve(ll, lr, mid):
  flexL = (mid - ll)/2.0
  flexR = (lr - mid)/2.0
  tau = (mid - ll)/10.0

  x = []
  y = []

  xL = nx.arange(ll,mid,0.1)
  for i in xL:
    x.append(xL[i])
  yL = 1.0 / (1.0 + nx.exp(-(xL-flexL)/tau))
  for j in yL:
    y.append(yL[j])

  xR = nx.arange(mid,lr,0.1)
  for i in xR:
    x.append(xR[i])
  yR = 1 - (1.0 / (1.0 + nx.exp(-(xR-flexR)/tau)))
  for j in yR:
    y.append(yR[j])

  appData.plotX = x
  appData.plotY = y

Python complains about adding to the list:

 yL = 1.0 / (1.0 + nx.exp(-(x-flexL)/tau))
TypeError: unsupported operand type(s) for -: 'list' and 'float'

   What is the appropriate way to generate two sigmoid curves so that the x
values range from the left end to the right end and the y values rise from
0.0 to 1.0 at the midpoint, then lower to 0.0 again?

Rich



Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Angus McMorland
2008/5/2 Rich Shepard [EMAIL PROTECTED]:
When I last visited I was given excellent advice about Gaussian and other
  bell-shaped curves. Upon further reflection I realized that the Gaussian
  curves will not do; the curve does need to have y=0.0 at each end.

I tried to apply a Beta distribution, but I cannot correlate the alpha and
  beta parameters with the curve characteristics that I have.

What will work (I call it a pi curve) is a matched pair of sigmoid curves,
  the ascending curve on the left and the descending curve on the right. Using
  the Boltzmann function for these I can calculate and plot each individually,
  but I'm having difficulty in appending the x and y values across the entire
  range. This is where I would appreciate your assistance.

The curve parameters passed to the function are the left end, right end,
  and midpoint. The inflection point (where y = 0.5) is half-way between the
  ends and the midpoint.

What I have in the function is this:

 def piCurve(ll, lr, mid):
   flexL = (mid - ll)/2.0
   flexR = (lr - mid)/2.0
   tau = (mid - ll)/10.0

   x = []
   y = []

   xL = nx.arange(ll,mid,0.1)
   for i in xL:
     x.append(xL[i])
   yL = 1.0 / (1.0 + nx.exp(-(xL-flexL)/tau))
   for j in yL:
     y.append(yL[j])

   xR = nx.arange(mid,lr,0.1)
   for i in xR:
     x.append(xR[i])
   yR = 1 - (1.0 / (1.0 + nx.exp(-(xR-flexR)/tau)))
   for j in yR:
     y.append(yR[j])

   appData.plotX = x
   appData.plotY = y

  Python complains about adding to the list:

  yL = 1.0 / (1.0 + nx.exp(-(x-flexL)/tau))
  TypeError: unsupported operand type(s) for -: 'list' and 'float'

What is the appropriate way to generate two sigmoid curves so that the x
  values range from the left end to the right end and the y values rise from
  0.0 to 1.0 at the midpoint, then lower to 0.0 again?

How about multiplying two Boltzmann terms together, à la:

f(x) = 1/(1+exp(-(x-flex1)/tau1)) * 1/(1+exp((x-flex2)/tau2))

You'll find if your two flexion points get too close together, the
peak will drop below the maximum for each individual curve, but the
transition will be continuous.
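
A quick numeric sketch of that product (flexion points and taus picked
arbitrarily, not from the original problem):

import numpy as np

flex1, flex2, tau1, tau2 = 25.0, 75.0, 5.0, 5.0
x = np.linspace(0.0, 100.0, 201)
y = 1.0 / (1.0 + np.exp(-(x - flex1) / tau1)) \
    * 1.0 / (1.0 + np.exp((x - flex2) / tau2))
print y[0], y.max(), y[-1]   # ~0 at the ends, ~1 near the middle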

Angus.
-- 
AJC McMorland, PhD candidate
Physiology, University of Auckland

(Nearly) post-doctoral research fellow
Neurobiology, University of Pittsburgh


Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Rich Shepard
On Fri, 2 May 2008, Angus McMorland wrote:

 How about multiplying two Boltzmann terms together, ala:

 f(x) = 1/(1+exp(-(x-flex1)/tau1)) * 1/(1+exp((x-flex2)/tau2))

 You'll find if your two flexion points get too close together, the peak
 will drop below the maximum for each individual curve, but the transition
 will be continuous.

Angus,

   With an x range from 0.0-100.0 (and the flexion points at 25.0 and 75.0),
the above formula provides a nice bell-shaped curve from x=0.0 to x=50.0,
and a maximum y of only 0.25 rather than 2.0.

   Modifying the above so the second term is subtracted from 1 before the
multiplication, or by negating the exponent in the second term, yields only
the first half: the ascending 'S' curve from 0-50.
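
If the peak height is the main issue, one pragmatic fix is to rescale the
product by its maximum; a sketch, again with arbitrary parameters:

import numpy as np

x = np.linspace(0.0, 100.0, 201)
y = 1.0 / (1.0 + np.exp(-(x - 25.0) / 5.0)) / (1.0 + np.exp((x - 75.0) / 5.0))
y = y / y.max()   # force the peak to exactly 1.0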

Thanks,

Rich

-- 
Richard B. Shepard, Ph.D.   |  IntegrityCredibility
Applied Ecosystem Services, Inc.|Innovation
http://www.appl-ecosys.com Voice: 503-667-4517  Fax: 503-667-8863


Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Christopher Barker
Rich,

this could use some serious vectorization/numpyification! Poke around 
the scipy Wiki and whatever other tutorials you can find -- you'll be 
glad you did. A hint:

When you are writing a loop like:
 for i in xL:
   x.append(xL[i])

You should be doing array operations!

Specifics:

   xL = nx.arange(ll,mid,0.1)

# you've now created an array of your X values

   for i in xL:
       x.append(xL[i])
# this dumps them into a list - why not keep them an array?
# if you really need a list (you don't here), then you can use:
xL.tolist()

That's why you get this error:
TypeError: unsupported operand type(s) for -: 'list' and 'float'

you're trying to do array operations on a list.

Also, you probably want np.linspace rather than arange; it's a better
option for floats.

Here is my version:
import numpy as np

ll, lr, mid = (-10, 10, 0)

flexL = (mid - ll)/2.0
flexR = (lr - mid)/2.0
tau = (mid - ll)/10.0

xL = np.linspace(ll, mid, 5)
yL = 1.0 / (1.0 + np.exp(-(xL-flexL)/tau))

print xL
print yL

xR = np.linspace(mid, lr, 5)
yR = 1 - (1.0 / (1.0 + np.exp(-(xR-flexR)/tau)))

print xR
print yR

# now put them together:
x = np.hstack((xL, xR[1:])) # don't want to duplicate the midpoint
y = np.hstack((yL, yR[1:]))

print x
print y

Though it doesn't look like the numbers are right.

Also, you don't need to create the separate left and right arrays and put
them together; slicing gives a view, so you could do:

numpoints = 11 # should be an odd number to get the midpoint
x = np.linspace(ll, lr, numpoints)
y = np.zeros_like(x)

xL = x[:numpoints/2]
yL = y[:numpoints/2]

yL[:] = 1.0 / (1.0 + np.exp(-(xL-flexL)/tau))

you could also use something like:

np.where(x < 0, ...)

left as an exercise for the reader...
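
One way that exercise might look (a sketch reusing the same formulas, which,
as noted above, may still need tuning):

import numpy as np

ll, lr, mid = -10, 10, 0
flexL = (mid - ll)/2.0
flexR = (lr - mid)/2.0
tau = (mid - ll)/10.0

x = np.linspace(ll, lr, 11)
y = np.where(x < mid,
             1.0 / (1.0 + np.exp(-(x - flexL)/tau)),     # ascending half
             1 - 1.0 / (1.0 + np.exp(-(x - flexR)/tau))) # descending half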

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Rich Shepard
On Fri, 2 May 2008, Christopher Barker wrote:

 this could use some serious vectorization/numpyification! Poke around the
 scipy Wiki and whatever other tutorials you can find -- you'll be glad you
 did. A hint:

 When you are writing a loop like:
for i in xL:
  x.append(xL[i])

 You should be doing array operations!

Chris,

   Good suggestions. I need to append the two numpy arrays so I return all x
values in a single list and all y values in another list. I'll read the
numpy book again.

 That's why you get this error:
 TypeError: unsupported operand type(s) for -: 'list' and 'float'

 you're trying to do array operations on a list.

   Ah, so!

 Also, you probably want np.linspace, rather than arange, it's a better
 option for floats.

   arange() has worked just fine in other functions.

 # now put them together:
 x = np.hstack((xL, xR[1:])) # don't want to duplicate the midpoint
 y = np.hstack((yL, yR[1:]))

 Also, you don't need to create the separate left and right arraysand put
 them together, slicing gives a view, so you could do:

   I assume that there's a way to mate the two curves in a single equation.
I've not been able to find what that is.

Thanks,

Rich


[Numpy-discussion] [IT] Maintenance and scheduled downtime this evening and weekend

2008-05-02 Thread Peter Wang
Hi everyone,

This evening and this weekend, we will be doing a major overhaul of  
Enthought's internal network infrastructure.  We will be cleaning up a  
large amount of legacy structure and transitioning to a more  
maintainable, better documented configuration.

We have planned the work so that externally-facing servers will  
experience a minimum of downtime.  In the event of unforeseen  
difficulties that cause outages to extend beyond the times given  
below, we will update the network status page, located at
http://dr0.enthought.com/status/.  This page will remain available and
be unaffected by any network outages.

Downtimes:
Friday May 2, 2008, 8pm - 10pm Central time
 (= 9pm - 11pm Eastern time)
 (= 1am - 3am UTC Sat. May 3)

Saturday May 3, 2008, 10am - 11am Central time
 (= 11am - 12 noon Eastern time)
 (= 3pm - 4pm UTC)

To reach us during the downtime, please use the contact information  
provided on the network status page.  Please let me know if you have  
any questions or concerns.

We will send out another email once the outage is complete.  Thanks  
for your patience!


-Peter


Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Anne Archibald
2008/5/2 Rich Shepard [EMAIL PROTECTED]:

What will work (I call it a pi curve) is a matched pair of sigmoid curves,
  the ascending curve on the left and the descending curve on the right. Using
  the Boltzmann function for these I can calculate and plot each individually,
  but I'm having difficulty in appending the x and y values across the entire
  range. This is where I would appreciate your assistance.

It's better not to work point-by-point, appending things, when working
with numpy. Ideally you could find a formula which just produced the
right curve, and then you'd apply it to the input vector and get the
output vector all at once. For example for a Gaussian (which you're
not using, I know), you'd write a function something like

def f(x):
    return np.exp(-x**2)

and then call it like:

f(np.linspace(-1,1,1000)).

This is efficient and clear. It does not seem to me that the logistic
function, 1/(1+np.exp(x)), does quite what you want. How about using
the cosine?

def f(left, right, x):
    scaled_x = (x-(right+left)/2)/((right-left)/2)
    return (1+np.cos(np.pi * scaled_x))/2

exactly zero at both endpoints, exactly one at the midpoint,
inflection points midway between, where the value is 1/2. If you want
to normalize it so that the area underneath is one, that's easy to do.

More generally, the trick of producing a scaled_x as above lets you
move any function anywhere you like.

If you want the peak not to be at the midpoint of the interval, you
will need to do something a little more clever, perhaps choosing a
different function, scaling the x values with a quadratic polynomial
so that (left, middle, right) becomes (-1,0,1), or using a piecewise
function.
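
To make that concrete, a self-contained sketch (endpoints picked arbitrarily):

import numpy as np

def f(left, right, x):
    scaled_x = (x - (right + left)/2.0) / ((right - left)/2.0)  # [left, right] -> [-1, 1]
    return (1 + np.cos(np.pi * scaled_x)) / 2

x = np.linspace(0.0, 100.0, 101)
y = f(0.0, 100.0, x)
print y[0], y[50], y[-1]   # ~0.0, 1.0, ~0.0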

Good luck,
Anne


Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Rich Shepard
On Fri, 2 May 2008, Anne Archibald wrote:

 It's better not to work point-by-point, appending things, when working
 with numpy. Ideally you could find a formula which just produced the right
 curve, and then you'd apply it to the input vector and get the output
 vector all at once.

Anne,

   That's been my goal. :-)

 How about using the cosine?

 def f(left, right, x):
scaled_x = (x-(right+left)/2)/((right-left)/2)
 return (1+np.cos(np.pi * scaled_x))/2

 exactly zero at both endpoints, exactly one at the midpoint,
 inflection points midway between, where the value is 1/2. If you want
 to normalize it so that the area underneath is one, that's easy to do.

 More generally, the trick of producing a scaled_x as above lets you
 move any function anywhere you like.

   This looks like a pragmatic solution. When I print scaled_x (using left =
0.0 and right = 100.0), the values range from -1.0 to +0.998. So, I need to
figure out the scale_x that sets the end points at 0 and 100.

Thanks very much,

Rich


Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Anne Archibald
2008/5/2 Rich Shepard [EMAIL PROTECTED]:
 On Fri, 2 May 2008, Anne Archibald wrote:

   It's better not to work point-by-point, appending things, when working
   with numpy. Ideally you could find a formula which just produced the right
   curve, and then you'd apply it to the input vector and get the output
   vector all at once.

  Anne,

That's been my goal. :-)


   How about using the cosine?
  
   def f(left, right, x):
  scaled_x = (x-(right+left)/2)/((right-left)/2)
   return (1+np.cos(np.pi * scaled_x))/2
  
   exactly zero at both endpoints, exactly one at the midpoint,
   inflection points midway between, where the value is 1/2. If you want
   to normalize it so that the area underneath is one, that's easy to do.

   More generally, the trick of producing a scaled_x as above lets you
   move any function anywhere you like.

This looks like a pragmatic solution. When I print scaled_x (using left =
  0.0 and right = 100.0), the values range from -1.0 to +0.998. So, I need to
  figure out the scale_x that sets the end points at 0 and 100.

No, no. You *want* scaled_x to range from -1 to 1. (The 0.998 is
because you didn't include the endpoint, 100.) The function I gave,
(1+np.cos(np.pi * scaled_x))/2, takes [-1, 1] to a nice
bump-shaped function.  If you feed in numbers from 0 to 100 as x, they
get transformed to scaled_x, and you feed them to the function,
getting a result that goes from 0 at x=0 to 1 at x=50 to 0 at x=100.
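
A sketch of that endpoint difference:

import numpy as np

x1 = np.arange(0.0, 100.0, 0.1)      # stops one step short of 100.0
x2 = np.linspace(0.0, 100.0, 1001)   # includes the endpoint
print x1[-1], x2[-1]                 # 99.9 vs 100.0; (99.9 - 50)/50 gives the 0.998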

Anne


Re: [Numpy-discussion] [SciPy-dev] [IT] Maintenance and scheduled downtime this evening and weekend

2008-05-02 Thread Damian Eads
Will this affect SVN and Trac access?

Thanks!

Damian

Peter Wang wrote:
 Hi everyone,
 
 This evening and this weekend, we will be doing a major overhaul of  
 Enthought's internal network infrastructure.  We will be cleaning up a  
 large amount of legacy structure and transitioning to a more  
 maintainable, better documented configuration.



Re: [Numpy-discussion] [SciPy-dev] [IT] Maintenance and scheduled downtime this evening and weekend

2008-05-02 Thread Peter Wang
On May 2, 2008, at 5:45 PM, Damian Eads wrote:
 Will this affect SVN and Trac access?
 Thanks!
 Damian

Yes, this affects svn, trac, web, mailing lists... everything, because
we will be working on the underlying network infrastructure.


-Peter




Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Christopher Barker
Anne Archibald wrote:
 2008/5/2 Rich Shepard [EMAIL PROTECTED]:
 No, no. You *want* scaled_x to range from -1 to 1. 

Why not just scale to -pi to pi right there?

 (The 0.998 is because you didn't include the endpoint, 100.)

Which is why you want linspace, rather than arange. Really, trust me on 
this!

If the cosine curve isn't right for you, a little creative use of np.where
would let you do what you want in maybe another line or two of code.
Something like:

y = np.where( x < 0 , Leftfun(x), Rightfun(x) )

or just compute one side and then flip it to make the other side:

y = np.zeros_like(x)
y[center:] = fun(x[center:])
y[:center] = y[center+1:][::-1] # don't want the center point twice

if it's symmetric anyway.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] Combining Sigmoid Curves

2008-05-02 Thread Rich Shepard
On Fri, 2 May 2008, Christopher Barker wrote:

 Why not just scale to -pi to pi right there?

   Dunno, Chris. As I wrote to Anne (including a couple of files and the
resulting plot), it's been almost three decades since I dealt with the math
underlying distribution functions.

 Which is why you want linspace, rather than arange. Really, trust me on
 this!

   I see the difference in the book. Didn't know about linspace(), but adding
True brings the end point to 100.0

 If the cosine curve isn't right for you, a little create use of np.where
 would let you do what you want in maybe another line or two of code.
 Something like:

 y = np.where( x < 0 , Leftfun(x), Rightfun(x) )

 or just compute one side and then flip it to make the other side:

 y = np.zeros_like(x)
 y[center:] = fun(x[center:])
 y[:center] = y[center+1:][::-1] # don't want the center point twice

 if it's symmetric anyway.

   I'll look into this over the weekend (after upgrading my machines to
Slackware-12.1). I can get nice sigmoid curves with the Boltzmann function.
This thread started when I asked how to combine the two into a single curve
with one set of x,y points.

Much appreciated,

Rich

-- 
Richard B. Shepard, Ph.D.   |  IntegrityCredibility
Applied Ecosystem Services, Inc.|Innovation
http://www.appl-ecosys.com Voice: 503-667-4517  Fax: 503-667-8863


[Numpy-discussion] Ruby's NMatrix and NVector

2008-05-02 Thread Travis E. Oliphant

http://narray.rubyforge.org/matrix-e.html

It seems they've implemented some of what Tim is looking for, in
particular.  Perhaps there is information to be gleaned from what they
are doing.  It looks promising.

-Travis




[Numpy-discussion] Faster

2008-05-02 Thread Keith Goodman
How can I make this function faster? It removes the i-th row and
column from an array.

def cut(x, i):
    idx = range(x.shape[0])
    idx.remove(i)
    y = x[idx,:]
    y = y[:,idx]
    return y

 import numpy as np
 x = np.random.rand(500,500)
 timeit cut(x, 100)
100 loops, best of 3: 16.8 ms per loop
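
For comparison, a sketch using np.delete, which does the same thing in one
expression (untimed here, so no speed claim):

import numpy as np

def cut_delete(x, i):
    # drop row i, then column i (builds two intermediate copies)
    return np.delete(np.delete(x, i, axis=0), i, axis=1)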


Re: [Numpy-discussion] Faster

2008-05-02 Thread Keith Goodman
On Fri, May 2, 2008 at 5:38 PM, Charles R Harris
[EMAIL PROTECTED] wrote:
 On Fri, May 2, 2008 at 6:24 PM, Keith Goodman [EMAIL PROTECTED] wrote:
  How can I make this function faster? It removes the i-th row and
  column from an array.
 

 Why do you want to do that?

Single linkage clustering; x is the distance matrix.


Re: [Numpy-discussion] Faster

2008-05-02 Thread Keith Goodman
On Fri, May 2, 2008 at 5:47 PM, Keith Goodman [EMAIL PROTECTED] wrote:

 On Fri, May 2, 2008 at 5:38 PM, Charles R Harris
  [EMAIL PROTECTED] wrote:
   On Fri, May 2, 2008 at 6:24 PM, Keith Goodman [EMAIL PROTECTED] wrote:
How can I make this function faster? It removes the i-th row and
column from an array.
   
  
   Why do you want to do that?

  Single linkage clustering; x is the distance matrix.

Here's the full code if you are interested. I haven't used it yet
other than running the test and test2 so it may be full of bugs.

import time

import numpy as np

class Cluster:
    """Single linkage hierarchical clustering"""

    def __init__(self, dist, label=None, debug=False):
        """
        dist   Distance matrix, NxN numpy array
        label  Labels of each row of the distance matrix, list of N items,
               default is range(N)
        """
        assert dist.shape[0] == dist.shape[1], 'dist must be square (nxn)'
        assert (np.abs(dist - dist.T) < 1e-8).all(), 'dist must be symmetric'
        if label is None:
            label = range(dist.shape[0])
        assert dist.shape[0] == len(label), 'dist and label must match in size'
        self.c = [[[z] for z in label]]
        self.label = label
        self.dist = dist
        self.debug = debug

    def run(self):
        for level in xrange(len(self.label) - 1):
            i, j = self.min_dist()
            self.join(i, j)

    def join(self, i, j):

        assert i != j, 'Cannot combine a cluster with itself'

        # Join labels
        new = list(self.c[-1])
        new[i] = new[i] + new[j]
        new.pop(j)
        self.c.append(new)

        # Join distance matrix
        self.dist[:,i] = self.dist[:,[i,j]].min(1)
        self.dist[i,:] = self.dist[:,i]
        idx = range(self.dist.shape[1])
        idx.remove(j)
        self.dist = self.dist[:,idx]
        self.dist = self.dist[idx,:]

        # Debug output
        if self.debug:
            print
            print len(self.c) - 1
            print 'Clusters'
            print self.c[-1]
            print 'Distance'
            print self.dist

    def min_dist(self):
        dist = self.dist + 1e10 * np.eye(self.dist.shape[0])
        i, j = np.where(dist == dist.min())
        return i[0], j[0]


def test():
    # Example from
    # home.dei.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html
    label = ['BA', 'FI', 'MI', 'NA', 'RM', 'TO']
    dist = np.array([[0,    662,  877,  255,  412,  996],
                     [662,  0,    295,  468,  268,  400],
                     [877,  295,  0,    754,  564,  138],
                     [255,  468,  754,  0,    219,  869],
                     [412,  268,  564,  219,  0,    669],
                     [996,  400,  138,  869,  669,  0  ]])
    clust = Cluster(dist, label, debug=True)
    clust.run()

def test2(n):
    x = np.random.rand(n,n)
    x = x + x.T
    c = Cluster(x)
    t1 = time.time()
    c.run()
    t2 = time.time()
    print 'n = %d took %0.2f seconds' % (n, t2-t1)


Re: [Numpy-discussion] Faster

2008-05-02 Thread Robert Kern
On Fri, May 2, 2008 at 7:24 PM, Keith Goodman [EMAIL PROTECTED] wrote:
 How can I make this function faster? It removes the i-th row and
  column from an array.

  def cut(x, i):
      idx = range(x.shape[0])
      idx.remove(i)
      y = x[idx,:]
      y = y[:,idx]
      return y

   import numpy as np
   x = np.random.rand(500,500)
   timeit cut(x, 100)
  100 loops, best of 3: 16.8 ms per loop

I can get a ~20% improvement with the following:

In [8]: %timeit cut(x, 100)
10 loops, best of 3: 21.6 ms per loop

In [9]: def mycut(x, i):
   ...: A = x[:i,:i]
   ...: B = x[:i,i+1:]
   ...: C = x[i+1:,:i]
   ...: D = x[i+1:,i+1:]
   ...: return hstack([vstack([A,C]),vstack([B,D])])
   ...:

In [10]: %timeit mycut(x, 100)
10 loops, best of 3: 17.3 ms per loop

The hstack(vstack, vstack) seems to be somewhat better than
vstack(hstack, hstack), at least for these sizes.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco


Re: [Numpy-discussion] Faster

2008-05-02 Thread Keith Goodman
On Fri, May 2, 2008 at 6:05 PM, Robert Kern [EMAIL PROTECTED] wrote:
 On Fri, May 2, 2008 at 7:24 PM, Keith Goodman [EMAIL PROTECTED] wrote:


  How can I make this function faster? It removes the i-th row and
column from an array.
  
def cut(x, i):
   idx = range(x.shape[0])
   idx.remove(i)
   y = x[idx,:]
   y = y[:,idx]
   return y
  
 import numpy as np
 x = np.random.rand(500,500)
 timeit cut(x, 100)
100 loops, best of 3: 16.8 ms per loop

  I can get a ~20% improvement with the following:

  In [8]: %timeit cut(x, 100)
  10 loops, best of 3: 21.6 ms per loop

  In [9]: def mycut(x, i):
...: A = x[:i,:i]
...: B = x[:i,i+1:]
...: C = x[i+1:,:i]
...: D = x[i+1:,i+1:]
...: return hstack([vstack([A,C]),vstack([B,D])])
...:

  In [10]: %timeit mycut(x, 100)
  10 loops, best of 3: 17.3 ms per loop

  The hstack(vstack, vstack) seems to be somewhat better than
  vstack(hstack, hstack), at least for these sizes.

Wow. I never would have come up with that. And I probably never will.

Original code:
 cluster.test2(500)
n = 500 took 5.28 seconds

Your improvement:
 cluster.test2(500)
n = 500 took 3.52 seconds

Much more than a 20% improvement when used in the larger program. Thank you.


Re: [Numpy-discussion] Faster

2008-05-02 Thread Charles R Harris
On Fri, May 2, 2008 at 7:16 PM, Keith Goodman [EMAIL PROTECTED] wrote:

 On Fri, May 2, 2008 at 6:05 PM, Robert Kern [EMAIL PROTECTED] wrote:
  On Fri, May 2, 2008 at 7:24 PM, Keith Goodman [EMAIL PROTECTED]
 wrote:
 
 
   How can I make this function faster? It removes the i-th row and
   column from an array.

   def cut(x, i):
       idx = range(x.shape[0])
       idx.remove(i)
       y = x[idx,:]
       y = y[:,idx]
       return y
   
  import numpy as np
  x = np.random.rand(500,500)
  timeit cut(x, 100)
 100 loops, best of 3: 16.8 ms per loop
 
   I can get a ~20% improvement with the following:
 
   In [8]: %timeit cut(x, 100)
   10 loops, best of 3: 21.6 ms per loop
 
   In [9]: def mycut(x, i):
 ...: A = x[:i,:i]
 ...: B = x[:i,i+1:]
 ...: C = x[i+1:,:i]
 ...: D = x[i+1:,i+1:]
 ...: return hstack([vstack([A,C]),vstack([B,D])])
 ...:
 
   In [10]: %timeit mycut(x, 100)
   10 loops, best of 3: 17.3 ms per loop
 
   The hstack(vstack, vstack) seems to be somewhat better than
   vstack(hstack, hstack), at least for these sizes.

 Wow. I never would have come up with that. And I probably never will.

 Original code:
  cluster.test2(500)
 n = 500 took 5.28 seconds

 Your improvement:
  cluster.test2(500)
 n = 500 took 3.52 seconds

 Much more than a 20% improvement when used in the larger program. Thank
 you.


Isn't the lengthy part finding the distance between clusters?  I can think
of several ways to do that, but I think you will get a real speedup by doing
that in c or c++. I have a module made in boost python that holds clusters
and returns a list of lists containing their elements. Clusters are joined
by joining any two elements, one from each. It wouldn't take much to add a
distance function, but you could use the list of indices in each cluster to
pull a subset out of the distance matrix and then find the minimum in that.
This also reminds me of Huffman codes.
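
A sketch of pulling a between-cluster block out of the distance matrix,
where idx1 and idx2 are hypothetical lists of cluster member indices:

import numpy as np

dist = np.random.rand(6, 6)
idx1, idx2 = [0, 3], [1, 4]
sub = dist[np.ix_(idx1, idx2)]   # |idx1| x |idx2| block of pairwise distances
d = sub.min()                    # single-linkage distance between the clusters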

Chuck


Re: [Numpy-discussion] Faster

2008-05-02 Thread Keith Goodman
On Fri, May 2, 2008 at 6:29 PM, Charles R Harris
[EMAIL PROTECTED] wrote:
 Isn't the lengthy part finding the distance between clusters?  I can think
 of several ways to do that, but I think you will get a real speedup by doing
 that in c or c++. I have a module made in boost python that holds clusters
 and returns a list of lists containing their elements. Clusters are joined
 by joining any two elements, one from each. It wouldn't take much to add a
 distance function, but you could use the list of indices in each cluster to
 pull a subset out of the distance matrix and then find the minimum in that.
 This also reminds me of Huffman codes.

You're right. Finding the distance is slow. Is there any way to speed
up the function below? It returns the row and column indices of the
min value of the NxN array x.

def dist(x):
    x = x + 1e10 * np.eye(x.shape[0])
    i, j = np.where(x == x.min())
    return i[0], j[0]

 x = np.random.rand(500,500)
 timeit dist(x)
100 loops, best of 3: 14.1 ms per loop

If the clustering gives me useful results, I'll ask you about your
boost code. I'll also take a look at Damian Eads's scipy-cluster.


Re: [Numpy-discussion] Faster

2008-05-02 Thread Charles R Harris
On Fri, May 2, 2008 at 8:02 PM, Keith Goodman [EMAIL PROTECTED] wrote:

 On Fri, May 2, 2008 at 6:29 PM, Charles R Harris
 [EMAIL PROTECTED] wrote:
  Isn't the lengthy part finding the distance between clusters?  I can think
  of several ways to do that, but I think you will get a real speedup by doing
  that in c or c++. I have a module made in boost python that holds clusters
  and returns a list of lists containing their elements. Clusters are joined
  by joining any two elements, one from each. It wouldn't take much to add a
  distance function, but you could use the list of indices in each cluster to
  pull a subset out of the distance matrix and then find the minimum in that.
  This also reminds me of Huffman codes.

 You're right. Finding the distance is slow. Is there any way to speed
 up the function below? It returns the row and column indices of the
 min value of the NxN array x.

  def dist(x):
      x = x + 1e10 * np.eye(x.shape[0])


x += diag(ones(x.shape[0]) * 1e10)

would be faster.


      i, j = np.where(x == x.min())
      return i[0], j[0]


i = x.argmin()
j = i % x.shape[0]
i = i / x.shape[0]
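
np.unravel_index does the same index arithmetic in one call (a sketch):

import numpy as np

x = np.random.rand(500, 500)
i, j = np.unravel_index(x.argmin(), x.shape)   # row, column of the minimum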

But I wouldn't worry about speed yet if you are just trying things out.

Chuck


Re: [Numpy-discussion] Faster

2008-05-02 Thread Robert Kern
On Fri, May 2, 2008 at 9:02 PM, Keith Goodman [EMAIL PROTECTED] wrote:
 On Fri, May 2, 2008 at 6:29 PM, Charles R Harris
 [EMAIL PROTECTED] wrote:
  Isn't the lengthy part finding the distance between clusters?  I can think
  of several ways to do that, but I think you will get a real speedup by doing
  that in c or c++. I have a module made in boost python that holds clusters
  and returns a list of lists containing their elements. Clusters are joined
  by joining any two elements, one from each. It wouldn't take much to add a
  distance function, but you could use the list of indices in each cluster to
  pull a subset out of the distance matrix and then find the minimum in that.
  This also reminds me of Huffman codes.

  You're right. Finding the distance is slow. Is there any way to speed
  up the function below? It returns the row and column indices of the
  min value of the NxN array x.

  def dist(x):
      x = x + 1e10 * np.eye(x.shape[0])
      i, j = np.where(x == x.min())
      return i[0], j[0]

Assuming x is contiguous and you can modify x in-place:


In [1]: from numpy import *

In [2]: def dist(x):
   ...:x = x + 1e10 * eye(x.shape[0])
   ...:i, j = where(x == x.min())
   ...:return i[0], j[0]
   ...:

In [3]: def symmdist(N):
   ...: x = random.rand(N, N)
   ...: x = x + x.T
   ...: x.flat[::N+1] = 0
   ...: return x
   ...:

In [4]: symmdist(5)
Out[4]:
array([[ 0.        ,  0.87508654,  1.11691704,  0.80366071,  0.57966808],
       [ 0.87508654,  0.        ,  1.5521685 ,  1.74010886,  0.52156877],
       [ 1.11691704,  1.5521685 ,  0.        ,  1.22725396,  1.04101992],
       [ 0.80366071,  1.74010886,  1.22725396,  0.        ,  1.94577965],
       [ 0.57966808,  0.52156877,  1.04101992,  1.94577965,  0.        ]])

In [5]: def kerndist(x):
   ...: N = x.shape[0]
   ...: x.flat[::N+1] = x.max()
   ...: ij = argmin(x.flat)
   ...: i, j = divmod(ij, N)
   ...: return i, j
   ...:

In [10]: x = symmdist(500)

In [15]: %timeit dist(x)
10 loops, best of 3: 19.9 ms per loop

In [16]: %timeit kerndist(x)
100 loops, best of 3: 4.38 ms per loop
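
One caveat: kerndist overwrites the diagonal in place, so the zero diagonal
is gone after the first call. A non-destructive variant (a sketch):

import numpy as np

def kerndist_copy(x):
    N = x.shape[0]
    d = x.copy()                  # leave the caller's matrix intact
    d.flat[::N + 1] = d.max()     # mask the diagonal
    return divmod(d.argmin(), N)  # row, column of the off-diagonal minimum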

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco


Re: [Numpy-discussion] Faster

2008-05-02 Thread Charles R Harris
On Fri, May 2, 2008 at 8:02 PM, Keith Goodman [EMAIL PROTECTED] wrote:

 On Fri, May 2, 2008 at 6:29 PM, Charles R Harris
 [EMAIL PROTECTED] wrote:
  Isn't the lengthy part finding the distance between clusters?  I can think
  of several ways to do that, but I think you will get a real speedup by doing
  that in c or c++. I have a module made in boost python that holds clusters
  and returns a list of lists containing their elements. Clusters are joined
  by joining any two elements, one from each. It wouldn't take much to add a
  distance function, but you could use the list of indices in each cluster to
  pull a subset out of the distance matrix and then find the minimum in that.
  This also reminds me of Huffman codes.

 You're right. Finding the distance is slow. Is there any way to speed
 up the function below? It returns the row and column indices of the
 min value of the NxN array x.

 def dist(x):
     x = x + 1e10 * np.eye(x.shape[0])
     i, j = np.where(x == x.min())
     return i[0], j[0]

  x = np.random.rand(500,500)
  timeit dist(x)
 100 loops, best of 3: 14.1 ms per loop

 If the clustering gives me useful results, I'll ask you about your
 boost code. I'll also take a look at Damian Eads's scipy-cluster.


That package looks nice. I think your time would be better spent learning
how to use it than in rolling your own routines.

Chuck


Re: [Numpy-discussion] Faster

2008-05-02 Thread Robert Kern
On Fri, May 2, 2008 at 10:48 PM, Keith Goodman [EMAIL PROTECTED] wrote:
 On Fri, May 2, 2008 at 7:25 PM, Robert Kern [EMAIL PROTECTED] wrote:
In [5]: def kerndist(x):
  ...: N = x.shape[0]
  ...: x.flat[::N+1] = x.max()
  ...: ij = argmin(x.flat)
  ...: i, j = divmod(ij, N)
  ...: return i, j

  I replaced

  ij = argmin(x.flat)

  with

  x.argmin()

  (they're the same in this context, right?) for a slight speed up. Now
  I'm down to 1.9 seconds.

Yes, they're the same. I forgot that .argmin() flattened by default.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco