Re: Computing correlations with SciPy

2006-03-19 Thread tkpmep
Tested it and it works like a charm! Thank you very much for fixing
this. Not knowing what an SVN is, I simply copied the code into the
appropriate library files and it works perfectly well.

May I suggest a simple enhancement: modify corrcoef so that if it is
fed two one-dimensional arrays, it returns a scalar. cov does something
similar for covariances: if you feed it just one vector, it returns a
scalar, and if you feed it two, it returns the covariance matrix, i.e.:

>>> x = [1, 2, 3, 4, 5]
>>> z = [5, 4, 3, 2, 1]
>>> scipy.cov(x,z)
array([[ 2.5, -2.5],
       [-2.5,  2.5]])
>>> scipy.cov(x)
2.5

I suspect that the majority of users use corrcoef to obtain a point
estimate of the correlation of two vectors, and relatively few will
estimate a full correlation matrix, as this method tends not to be
robust to the presence of noise and/or errors in the data.
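
A minimal sketch of the scalar-returning behaviour suggested above,
assuming a current NumPy install. The name corr_scalar is hypothetical
and is not part of NumPy or SciPy; it simply picks the off-diagonal
entry out of the 2x2 matrix that numpy.corrcoef returns.

```python
import numpy as np


def corr_scalar(x, y):
    """Return the correlation of two 1-D sequences as a plain float.

    Hypothetical helper for illustration; not a NumPy or SciPy function.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or y.ndim != 1:
        raise ValueError("corr_scalar expects two 1-D sequences")
    # np.corrcoef(x, y) is a 2x2 matrix; entry [0, 1] is corr(x, y).
    return float(np.corrcoef(x, y)[0, 1])


x = [1, 2, 3, 4, 5]
z = [5, 4, 3, 2, 1]
r = corr_scalar(x, z)
print(r)  # close to -1 for these perfectly anti-correlated inputs
```

Wrapping corrcoef rather than changing it keeps the matrix-returning
behaviour available for callers that rely on it.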

Thomas Philips

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Computing correlations with SciPy

2006-03-17 Thread Travis Oliphant
[EMAIL PROTECTED] wrote:
> I want to compute the correlation between two sequences X and Y, and
> tried using SciPy to do so without success. Here's what I have, how
> can I correct it?
> 

This was a bug in NumPy (inherited from Numeric actually).  The fix is 
in SVN of NumPy.

Here are the new versions of those functions that should work as you 
wish (again, these are in SVN, but perhaps you have a binary install).

These functions belong in /numpy/lib/function_base.py



def cov(m,y=None, rowvar=1, bias=0):
    """Estimate the covariance matrix.

    If m is a vector, return the variance.  For matrices return the
    covariance matrix.

    If y is given it is treated as an additional (set of)
    variable(s).

    Normalization is by (N-1) where N is the number of observations
    (unbiased estimate).  If bias is 1 then normalization is by N.

    If rowvar is non-zero (default), then each row is a variable with
    observations in the columns, otherwise each column
    is a variable and the observations are in the rows.
    """

    X = asarray(m,ndmin=2)
    if X.shape[0] == 1:
        rowvar = 1
    if rowvar:
        axis = 0
        tup = (slice(None),newaxis)
    else:
        axis = 1
        tup = (newaxis, slice(None))


    if y is not None:
        y = asarray(y,ndmin=2)
        X = concatenate((X,y),axis)

    X -= X.mean(axis=1-axis)[tup]
    if rowvar:
        N = X.shape[1]
    else:
        N = X.shape[0]

    if bias:
        fact = N*1.0
    else:
        fact = N-1.0

    if not rowvar:
        return (dot(X.transpose(), X.conj()) / fact).squeeze()
    else:
        return (dot(X,X.transpose().conj())/fact).squeeze()

def corrcoef(x, y=None, rowvar=1, bias=0):
    """The correlation coefficients
    """
    c = cov(x, y, rowvar, bias)
    try:
        d = diag(c)
    except ValueError: # scalar covariance
        return 1
    return c/sqrt(multiply.outer(d,d))
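
For readers without the SVN tree, the same steps (centre each variable,
form the outer product, divide by N-1, then normalise by the diagonal)
can be sketched against a current NumPy as below. This is an
illustration under assumptions, not the code in function_base.py: the
names cov_sketch and corrcoef_sketch are hypothetical, only the
two-vector, rows-as-variables case is handled, and the data is cast to
float up front so the in-place centring is well defined for integer
input.

```python
import numpy as np


def cov_sketch(x, y, bias=False):
    """Covariance matrix of two equal-length 1-D sequences.

    Hypothetical helper; each input becomes one row (one variable).
    """
    X = np.array([x, y], dtype=float)       # shape (2, N)
    X -= X.mean(axis=1, keepdims=True)      # subtract each row's mean
    n = X.shape[1]                          # number of observations
    fact = float(n) if bias else n - 1.0    # N (biased) or N-1 (unbiased)
    return X @ X.conj().T / fact


def corrcoef_sketch(x, y):
    """Correlation matrix obtained by normalising cov_sketch's output."""
    c = cov_sketch(x, y)
    d = np.diag(c)                          # the two variances
    return c / np.sqrt(np.multiply.outer(d, d))


x = [1, 2, 3, 4, 5]
z = [5, 4, 3, 2, 1]
c = cov_sketch(x, z)
r = corrcoef_sketch(x, z)
print(c)
print(r)
```

For this input the results should agree with numpy.cov(x, z) and
numpy.corrcoef(x, z).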



Re: Computing correlations with SciPy

2006-03-16 Thread John Hunter
> "tkpmep" == tkpmep  <[EMAIL PROTECTED]> writes:

tkpmep> I want to compute the correlation between two sequences X
tkpmep> and Y, and tried using SciPy to do so without success.
tkpmep> Here's what I have, how can I correct it?

tkpmep> >>> X = [1, 2, 3, 4, 5]
tkpmep> >>> Y = [5, 4, 3, 2, 1]
tkpmep> >>> import scipy
tkpmep> >>> scipy.corrcoef(X,Y)
tkpmep> Traceback (most recent call last):
tkpmep>   File " input>", line 1, in ?
tkpmep>   File "C:\Python24\Lib\site-packages\numpy\lib\function_base.py", line 671, in corrcoef
tkpmep>     d = diag(c)
tkpmep>   File "C:\Python24\Lib\site-packages\numpy\lib\twodim_base.py", line 80, in diag
tkpmep>     raise ValueError, "Input must be 1- or 2-d."
tkpmep> ValueError: Input must be 1- or 2-d.


Hmm, this may be a bug in scipy.  matplotlib also defines a corrcoef
function, which you may want to use until this problem gets sorted out:

In [9]: matplotlib.mlab.corrcoef(X,Y)

In [10]: X = [1, 2, 3, 4, 5]

In [11]: Y = [5, 4, 3, 2, 1]

In [12]: matplotlib.mlab.corrcoef(X,Y)
Out[12]:
array([[ 1., -1.],
       [-1.,  1.]])
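
The off-diagonal -1 in that matrix can be checked by hand without any
array library. The sketch below is a plain-Python Pearson correlation,
written for Python 3 (true division); the function name pearson is
hypothetical, and it assumes both sequences have the same length and
non-zero variance.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mx = sum(xs) / n                        # mean of xs
    my = sum(ys) / n                        # mean of ys
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    # The 1/(N-1) factors in the covariance and variances cancel here.
    return sxy / math.sqrt(sxx * syy)


X = [1, 2, 3, 4, 5]
Y = [5, 4, 3, 2, 1]
print(pearson(X, Y))
```

For X and Y as above, the sums are sxy = -10 and sxx = syy = 10, which
gives a correlation of -1, matching the matrix entry.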




Re: Computing correlations with SciPy

2006-03-16 Thread Felipe Almeida Lessa
On Thu, 2006-03-16 at 07:49 -0800, [EMAIL PROTECTED] wrote:
> I want to compute the correlation between two sequences X and Y, and
> tried using SciPy to do so without success. Here's what I have, how
> can I correct it?

$ python2.4
Python 2.4.2 (#2, Nov 20 2005, 17:04:48)
[GCC 4.0.3 2005 (prerelease) (Debian 4.0.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = [1,2,3,4,5]
>>> y = [5,4,3,2,1]
>>> import scipy
>>> scipy.corrcoef(x, y)
array([[ 1., -1.],
       [-1.,  1.]])
>>> # Looks fine for me...



Computing correlations with SciPy

2006-03-16 Thread tkpmep
I want to compute the correlation between two sequences X and Y, and
tried using SciPy to do so without success. Here's what I have, how
can I correct it?

>>> X = [1, 2, 3, 4, 5]
>>> Y = [5, 4, 3, 2, 1]
>>> import scipy
>>> scipy.corrcoef(X,Y)
Traceback (most recent call last):
  File "", line 1, in ?
  File "C:\Python24\Lib\site-packages\numpy\lib\function_base.py", line 671, in corrcoef
    d = diag(c)
  File "C:\Python24\Lib\site-packages\numpy\lib\twodim_base.py", line 80, in diag
    raise ValueError, "Input must be 1- or 2-d."
ValueError: Input must be 1- or 2-d.
>>> 

Thanks in advance

Thomas Philips
