from:"Colin J. Williams"


  
  
I have a small program which builds random matrices of
increasing order, inverts each matrix and checks the
precision of the product.  At some point, one would expect
operations to fail, when the memory capacity is exceeded.  In
both Python 2.7 and 3.2, matrices of order 3,071 are handled,
but not 6,143.
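
The program itself was not posted. A minimal sketch of such a test, under stated assumptions, might look like this: the imprecision measure (sum of absolute deviations of M.inv(M) from the identity) is a guess, the function names are hypothetical, and the order sequence 2, 5, 11, ... (each order is twice the previous plus one) is read off the output in this message.

```python
from timeit import default_timer  # wall-clock timer

import numpy as np


def imprecision(order, rng):
    """Invert a random matrix and return sum(|M . inv(M) - I|) (assumed measure)."""
    m = rng.random_sample((order, order))
    product = np.dot(m, np.linalg.inv(m))
    return np.abs(product - np.eye(order)).sum()


def run_benchmark(max_order=400, seed=0):
    """Time the check for orders 2, 5, 11, ... up to max_order."""
    rng = np.random.RandomState(seed)
    results = []
    order = 2
    while order <= max_order:
        start = default_timer()
        try:
            err = imprecision(order, rng)
        except MemoryError:
            print("order=%6d Process terminated by a MemoryError" % order)
            break
        results.append((order, err, default_timer() - start))
        order = 2 * order + 1
    return results


if __name__ == "__main__":
    for order, err, seconds in run_benchmark():
        print("order=%5d  error=%.3e  seconds=%.6f" % (order, err, seconds))
```

Running the same script under both interpreters gives timings that can be compared directly.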

Using wall-clock times, with win32, Python 3.2 is slower than
Python 2.7, by a factor of roughly five at the larger orders.  The
profiler indicates a problem in the solver.

Done on a Pentium, with 2.7 GHz processor, 2 GB of RAM and 221
GB of free disk space.  Both Python 3.2.3 and Python 2.7.3 use
numpy 1.6.2.

The results are shown below.

Colin W.

_
2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
order=    2   measure ofimprecision= 0.097   Time elapsed (seconds)=  0.004143
order=    5   measure ofimprecision= 2.207   Time elapsed (seconds)=  0.001514
order=   11   measure ofimprecision= 2.372   Time elapsed (seconds)=  0.001455
order=   23   measure ofimprecision= 3.318   Time elapsed (seconds)=  0.001608
order=   47   measure ofimprecision= 4.257   Time elapsed (seconds)=  0.002339
order=   95   measure ofimprecision= 4.986   Time elapsed (seconds)=  0.005747
order=  191   measure ofimprecision= 5.788   Time elapsed (seconds)=  0.029974
order=  383   measure ofimprecision= 6.765   Time elapsed (seconds)=  0.145339
order=  767   measure ofimprecision= 7.909   Time elapsed (seconds)=  0.841142
order= 1535   measure ofimprecision= 8.532   Time elapsed (seconds)=  5.793630
order= 3071   measure ofimprecision= 9.774   Time elapsed (seconds)= 39.559540
order=  6143 Process terminated by a MemoryError

Above: 2.7.3  Below: Python 3.2.3

bbb_bbb
3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
order=    2   measure ofimprecision= 0.000   Time elapsed (seconds)=  0.113930
order=    5   measure ofimprecision= 1.807   Time elapsed (seconds)=  0.001373
order=   11   measure ofimprecision= 2.395   Time elapsed (seconds)=  0.001468
order=   23   measure ofimprecision= 3.073   Time elapsed (seconds)=  0.001609
order=   47   measure ofimprecision= 5.642   Time elapsed (seconds)=  0.002687
order=   95   measure ofimprecision= 5.745   Time elapsed (seconds)=  0.013510
order=  191   measure ofimprecision= 5.866   Time elapsed (seconds)=  0.061560
order=  383   measure ofimprecision= 7.129   Time elapsed (seconds)=  0.418490
order=  767   measure ofimprecision= 8.240   Time elapsed (seconds)=  3.815713
order= 1535   measure ofimprecision= 8.735   Time elapsed (seconds)= 27.877270
order= 3071   measure ofimprecision= 9.996   Time elapsed (seconds)=212.545610
order=  6143 Process terminated by a MemoryError


  
  

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy


  
  
On 20/03/2013 10:14 AM, Daπid wrote:


  Without much detailed knowledge of the topic, I would expect both
versions to give very similar timings, as it is essentially a call to
an ATLAS function; not much is done in Python.

Given this, maybe the difference is in ATLAS itself. How have you
installed it? 

I know nothing about what goes on behind the scenes.  I am
using the win32 binary package.

Colin W.

  When you compile ATLAS, it will do some machine-specific
optimisation, but if you have installed a binary, chances are that your
version is optimised for a machine quite different from yours. So, two
different installations could have been compiled on different machines,
and one may be better suited to your machine than the other. If you want
to be sure, I would try to compile ATLAS (this may be difficult) or check
the same on a very different machine (like an AMD processor, different
architecture...).



Just for reference, on Linux Python 2.7 64 bits can deal with these
matrices easily.

%timeit mat=np.random.random((6143,6143)); matinv= np.linalg.inv(mat);
res = np.dot(mat, matinv); diff= res-np.eye(6143); print
np.sum(np.abs(diff))
2.41799631031e-05
1.13955868701e-05
3.64338191541e-05
1.13484781021e-05
1 loops, best of 3: 156 s per loop

Intel i5, 4 GB of RAM and SSD. ATLAS installed from Fedora repository
(I don't run heavy stuff on this computer).


Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy


  
  
On 20/03/2013 10:29 AM, Jens Nielsen
  wrote:


  Hi, 


It could also be that they are linked to different
  libs, such as ATLAS and standard BLAS. What is the output of
numpy.show_config() in the two different Python
  versions?


Jens 

  

Thanks for this pointer.
The result for Py2.7:
  
>>> numpy.show_config()
  atlas_threads_info:
    NOT AVAILABLE
  blas_opt_info:
      libraries = ['f77blas', 'cblas', 'atlas']
      library_dirs = ['C:\\local\\lib\\yop\\sse3']
      define_macros = [('NO_ATLAS_INFO', -1)]
      language = c
  atlas_blas_threads_info:
    NOT AVAILABLE
  lapack_opt_info:
      libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
      library_dirs = ['C:\\local\\lib\\yop\\sse3']
      define_macros = [('NO_ATLAS_INFO', -1)]
      language = f77
  atlas_info:
      libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
      library_dirs = ['C:\\local\\lib\\yop\\sse3']
      define_macros = [('NO_ATLAS_INFO', -1)]
      language = f77
  lapack_mkl_info:
    NOT AVAILABLE
  blas_mkl_info:
    NOT AVAILABLE
  atlas_blas_info:
      libraries = ['f77blas', 'cblas', 'atlas']
      library_dirs = ['C:\\local\\lib\\yop\\sse3']
      define_macros = [('NO_ATLAS_INFO', -1)]
      language = c
  mkl_info:
    NOT AVAILABLE
  

The result for 3.2:
  
>>> import numpy
>>> numpy.show_config()
  lapack_info:
    NOT AVAILABLE
  lapack_opt_info:
    NOT AVAILABLE
  blas_info:
    NOT AVAILABLE
  atlas_threads_info:
    NOT AVAILABLE
  blas_src_info:
    NOT AVAILABLE
  atlas_blas_info:
    NOT AVAILABLE
  lapack_src_info:
    NOT AVAILABLE
  atlas_blas_threads_info:
    NOT AVAILABLE
  blas_mkl_info:
    NOT AVAILABLE
  blas_opt_info:
    NOT AVAILABLE
  atlas_info:
    NOT AVAILABLE
  lapack_mkl_info:
    NOT AVAILABLE
  mkl_info:
    NOT AVAILABLE
  

I hope that this helps.

Colin W.
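
The two listings differ in whether an optimised BLAS/LAPACK was found when each NumPy binary was built. One rough way to see the practical effect is to time the same matrix product under each interpreter. The sketch below is only an illustration: the function name and the matrix order are arbitrary choices, not part of the thread.

```python
from timeit import default_timer

import numpy as np


def time_matmul(order=300, repeats=3, seed=0):
    """Return the best wall-clock time (seconds) for one order x order product."""
    rng = np.random.RandomState(seed)
    a = rng.random_sample((order, order))
    b = rng.random_sample((order, order))
    best = float("inf")
    for _ in range(repeats):
        start = default_timer()
        np.dot(a, b)  # dispatched to BLAS when NumPy was built against one
        best = min(best, default_timer() - start)
    return best


if __name__ == "__main__":
    np.show_config()  # prints the build-time BLAS/LAPACK information
    print("best time: %.4f s" % time_matmul())
```

A large gap between two installations on the same machine would be consistent with one of them lacking an optimised library.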
  

  

Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy

On 20/03/2013 10:30 AM, Frédéric Bastien wrote:
 Hi,

 win32 does not mean it is 32-bit Windows. sys.platform always returns
 win32 on both 32-bit and 64-bit Windows, even for a 64-bit Python.

 But that is a good question: is your Python 32 or 64 bits?
32 bits.

Colin W.

 Fred
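
Since sys.platform does not answer the question, the interpreter's word size has to be checked another way. A small sketch using only the standard library (the function name is a hypothetical helper):

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer size of the running interpreter, in bits."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # sys.platform reports the platform family, not the word size.
    print("%s, %d-bit interpreter" % (sys.platform, interpreter_bits()))
    # sys.maxsize exceeds 2**32 only on a 64-bit build.
    print(sys.maxsize > 2 ** 32)
```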


Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy


  
  
On 20/03/2013 11:06 AM, Jens Nielsen
  wrote:


  The python3 version is compiled without
any optimised library and is falling back on a slow version.
Where did you get this installation from?


Jens
  
  

From the SciPy site.

Colin W.
  
  


Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy


  
  
On 20/03/2013 11:12 AM, Frédéric
  Bastien wrote:


  On Wed, Mar 20, 2013 at 11:01 AM, Colin J. Williams
cjwilliam...@gmail.com wrote:

  
On 20/03/2013 10:30 AM, Frédéric Bastien wrote:


  
Hi,

win32 does not mean it is 32-bit Windows. sys.platform always returns
win32 on both 32-bit and 64-bit Windows, even for a 64-bit Python.

But that is a good question, is your python 32 or 64 bits?



32 bits.

  
  
That explains why you have a memory problem while people with a 64-bit
version do not. So if you want to work with bigger inputs, change to a
64-bit Python.

Fred



But my machine is only 32 bit.

Colin W.
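
The memory limit can be put in numbers. A rough sketch, assuming float64 elements (8 bytes each, which is what NumPy's random generators return) and assuming that about five arrays of full size are alive at once (matrix, inverse, product, identity, difference, as in the one-line test quoted earlier in the thread); the helper name is hypothetical:

```python
def dense_matrix_bytes(order, itemsize=8):
    """Bytes needed for one dense order x order matrix."""
    return order * order * itemsize


if __name__ == "__main__":
    for order in (3071, 6143):
        one = dense_matrix_bytes(order)
        # Assumption: five full-size arrays live at the same time.
        print("order=%5d  one array: %6.1f MiB  five arrays: %7.1f MiB"
              % (order, one / 2.0 ** 20, 5 * one / 2.0 ** 20))
```

At order 6,143 a single array needs about 288 MiB, so five of them come to roughly 1.4 GiB before any LAPACK workspace is counted. A 32-bit Windows process has, by default, about 2 GB of user address space, and each array needs one contiguous block of it, which makes a MemoryError at this order plausible regardless of how much disk space is free.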
  
  


[Numpy-discussion] Synonym standards

2012-07-26 Thread Colin J. Williams


  
  
It seems that these standards
have been adopted, which is good:
  

  The following import conventions are used throughout the NumPy
  source and documentation:

  import numpy as np
  import matplotlib as mpl
  import matplotlib.pyplot as plt

Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt 


Is there some similar standard for PyLab?

Thanks,

Colin W.
   
  


Re: [Numpy-discussion] Synonym standards

2012-07-26 Thread Colin J. Williams


--
*From:* Benjamin Root ben.r...@ou.edu
*To:* Discussion of Numerical Python numpy-discussion@scipy.org
*Sent:* 26 July 2012 16:57
*Subject:* Re: [Numpy-discussion] Synonym standards


On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams fn...@ncf.ca wrote:

  It seems that these standards have been adopted, which is good:

 The following import conventions are used throughout the NumPy source and
 documentation:

 import numpy as np
 import matplotlib as mpl
 import matplotlib.pyplot as plt

 Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt

  Is there some similar standard for PyLab?

 Thanks,

 Colin W.



Colin,

Typically, with pylab mode of matplotlib, you do:

from pylab import *

This is essentially equivalent to:

from numpy import *
from matplotlib.pyplot import *

Note that the pylab module is actually a part of matplotlib and is a
shortcut to provide an environment that is very familiar to Matlab users.
Converts are then encouraged to use the imports you mentioned in order to
properly utilize python namespaces.

I hope that helps!
Ben Root

Re: [Numpy-discussion] Synonym standards

2012-07-26 Thread Colin J. Williams


  
  

Thanks Ben,

I would prefer not to use: from xxx import *,

because of the name pollution.

The name convention that I copied above facilitates avoiding
the pollution.

In the same spirit, I've used:
import pylab as plb

I had suspected, but hadn't checked, that pylab contains the
total namespace of numpy and matplotlib,
thanks for confirming this.
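
The name pollution can be made concrete without touching the current namespace. The sketch below is an illustration only (the function name is hypothetical): it runs the star import inside a scratch dictionary and lists the Python builtins that would be rebound to NumPy functions.

```python
import builtins

import numpy as np


def star_import_shadows():
    """Return the builtin names that `from numpy import *` would rebind."""
    namespace = {}
    exec("from numpy import *", namespace)  # star import into a scratch dict
    return sorted(
        name for name in namespace
        if not name.startswith("_")
        and hasattr(builtins, name)
        and namespace[name] is not getattr(builtins, name)
    )


if __name__ == "__main__":
    print(star_import_shadows())
```

With `import numpy as np` the same functions stay behind the `np.` prefix, so `sum` keeps meaning the builtin and `np.sum` the NumPy version.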

Colin W,
  
  


Re: [Numpy-discussion] Rounding to next lowest float

2011-10-11 Thread Colin J. Williams


If you are using integers, why not use Python's Long?

Colin W.

On 11/10/2011 2:00 PM, Matthew Brett wrote:

Hi,

Can anyone think of a clever way to round an integer to the next
lowest integer represented in a particular floating point format?

For example:

In [247]: a = 2**25+3

This is out of range of the continuous integers representable by float32, hence:

In [248]: print a, int(np.float32(a))
33554435 33554436

But I want to round down (floor) the integer in float32.  That is, in
this case I want:


floor_exact(a, np.float32)

33554432

I can break the float into its parts to do it:

https://github.com/matthew-brett/nibabel/blob/f687bfc88d1676a09fc76c968a346bc81e4d0d04/nibabel/floating.py

but that's obviously rather ugly...  Is there a simpler way?  I'm sure
there is and I haven't thought of it...

Best,

Matthew
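
One possible shorter route, offered as a sketch and not as the nibabel implementation linked above: convert to the float type, and if rounding went upwards, step back to the next representable value with np.nextafter. It assumes the integer is within the finite range of the float type; beyond that the conversion gives infinity and int() fails.

```python
import numpy as np


def floor_exact(value, float_type=np.float32):
    """Largest integer <= value that float_type represents exactly.

    Sketch only: assumes value lies within the finite range of float_type.
    """
    f = float_type(value)  # rounds to the nearest representable value
    if int(f) > value:
        # Rounded up, so move one representable step towards -infinity.
        f = np.nextafter(f, float_type(-np.inf))
    return int(f)


if __name__ == "__main__":
    a = 2 ** 25 + 3
    print(a, int(np.float32(a)), floor_exact(a, np.float32))
```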

Re: [Numpy-discussion] non-standard standard deviation

2009-12-06 Thread Colin J. Williams



On 04-Dec-09 10:54 AM, Bruce Southey wrote:
 Hi,
 Basically, all that I see with these arbitrary values is that you are 
 relying on the 'central limit theorem' 
 (http://en.wikipedia.org/wiki/Central_limit_theorem).  Really the 
 issue in using these values is how much statistical bias will you 
 tolerate especially in the impact on usage of that estimate because 
 the usage of variance (such as in statistical tests) tend to be more 
 influenced by bias than the estimate of variance. (Of course, many 
 features rely on asymptotic properties so bias concerns are less 
 apparent in large sample sizes.)

 Obviously the default relies on the developers' background and 
 requirements. There are multiple valid variance estimators in 
 statistics with different denominators like N (maximum likelihood 
 estimator), N-1 (restricted maximum likelihood estimator and certain 
 Bayesian estimators) and Stein's 
 (http://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator). So 
 the current default behavior is valid and documented. Consequently 
 you can not just have one option or different functions (like certain 
 programs), and Numpy's implementation actually allows you to do all these 
 in a single function. So I also see no reason to change, even if I have to 
 add the ddof=1 argument, after all 'Explicit is better than implicit' :-).

 Bruce
Bruce,

I suggest that the Central Limit Theorem is tied in with the Law of 
Large Numbers.

When one has a smallish sample size, what gives the best estimate of the 
variance?  The Bessel Correction provides a rationale, based on 
expectations: (http://en.wikipedia.org/wiki/Bessel%27s_correction).
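
The expectation argument can be checked numerically. A minimal simulation sketch, assuming standard normal data and a sample size of 5 (both arbitrary choices, not taken from the thread): averaged over many samples, the divisor-n variance should come out near (n-1)/n times the true variance, and the divisor-(n-1) variance near the true variance.

```python
import numpy as np


def mean_variance_estimates(n=5, trials=200000, seed=0):
    """Average the ddof=0 and ddof=1 variance estimates over many samples."""
    rng = np.random.RandomState(seed)
    samples = rng.normal(loc=0.0, scale=1.0, size=(trials, n))  # true variance 1
    biased = samples.var(axis=1, ddof=0).mean()     # divisor n
    corrected = samples.var(axis=1, ddof=1).mean()  # divisor n - 1
    return biased, corrected


if __name__ == "__main__":
    print(mean_variance_estimates())
```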

It is difficult to understand the proof of Stein: 
http://en.wikipedia.org/wiki/Proof_of_Stein%27s_example

The symbols used are not clearly stated.  He seems interested in a 
decision rule for the calculation of the mean of a sample and claims 
that his approach is better than the traditional Least Squares approach.

In most cases, the interest is likely to be in the variance, with a view 
to establishing a confidence interval.

In the widely used Analysis of Variance (ANOVA), the degrees of freedom 
are reduced for each mean estimated, see:
http://www.mnstate.edu/wasson/ed602lesson13.htm for the example below:

*Analysis of Variance Table*

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio   p
Between Groups             25.20               2                12.60       5.178   .05
Within Groups              29.20              12                 2.43
Total                      54.40              14




There is a sample of 15 observations, which is divided into three 
groups, depending on the number of hours of therapy.
Thus, the Total degrees of freedom are 15-1 = 14,  the Between Groups 
3-1 = 2 and the Residual is 14 - 2 = 12.
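
The arithmetic behind the table can be reproduced from the two sums of squares. A short sketch (the function and key names are hypothetical):

```python
def anova_table(ss_between, ss_within, n_obs, n_groups):
    """One-way ANOVA bookkeeping from the sums of squares."""
    df_total = n_obs - 1              # one mean estimated overall
    df_between = n_groups - 1         # one mean estimated per group, less one
    df_within = df_total - df_between
    ms_between = ss_between / float(df_between)
    ms_within = ss_within / float(df_within)
    return {
        "df_total": df_total,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


if __name__ == "__main__":
    print(anova_table(25.20, 29.20, n_obs=15, n_groups=3))
```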

Colin W.


Re: [Numpy-discussion] non-standard standard deviation

2009-12-05 Thread Colin J. Williams



On 04-Dec-09 05:21 AM, Pauli Virtanen wrote:
 On Fri, 2009-12-04 at 11:19 +0100, Chris Colbert wrote:

 Why can't the divisor constant just be made an optional kwarg that
 defaults to zero?
  
 It already is an optional kwarg that defaults to zero.

 Cheers,

I suggested that 1 (one) would be a better default but Robert Kern told 
us that it won't happen.

Colin W.

Re: [Numpy-discussion] non-standard standard deviation

2009-12-05 Thread Colin J. Williams



On 04-Dec-09 07:18 AM, yogesh karpate wrote:
 @ Pauli and @ Colin:
 Sorry for the late reply. I was busy with some other assignments.
 # As far as normalization by (n) is concerned, the common assumption
 is that the population is normally distributed and that the population
 size is large enough to fit the normal distribution. But this
 standard deviation, when applied to a small population, tends to be
 too low, therefore it is called biased.
 # The correction known as Bessel's correction is there for the
 small-sample std. deviation, i.e. normalization by (n-1).
 # In Electrical and Electronic Measurements and Instrumentation by
 A.K. Sawhney, in the 1st chapter of the book, Fundamentals of
 Measurements, it is shown that for N=16 the std. deviation
 normalization was (n-1)=15.
 # While I was learning statistics, my course instructor would advise
 taking n=20 for normalization by (n-1).
 # Probability and Statistics in the Schaum's Outline series is good reading.
 Regards
 ~ymk




Yogesh,

Thanks for the Bessel name, I hadn't come across that before.

The Wikipedia reference for the Bessel Correction uses a divisor of n-1: 
http://en.wikipedia.org/wiki/Bessel%27s_correction

Perhaps the simplification for larger n comes from the fact that for 
large n, 1/n is approximately equal to 1/(n-1).

I would suggest C. E. Weatherburn - Mathematical Statistics,  but I 
doubt whether it is still widely available.

Colin W.



Re: [Numpy-discussion] non-standard standard deviation

2009-12-03 Thread Colin J. Williams

Yogesh,

Could you explain the rationale for this choice please?

Colin W.

On 03-Dec-09 00:35 AM, yogesh karpate wrote:
 The thing is that the normalization by (n-1) is done for a number of 
 samples of 20 or 23 (not sure about this number, but sure that it 
 isn't greater than 25), and below that we use normalization by n.
 Regards
 ~ymk




Re: [Numpy-discussion] non-standard standard deviation

2009-11-29 Thread Colin J. Williams



On 29-Nov-09 17:13 PM, Dr. Phillip M. Feldman wrote:
 All of the statistical packages that I am currently using and have used in
 the past (Matlab, Minitab, R, S-plus) calculate standard deviation using the
 sqrt(1/(n-1)) normalization, which gives a result that is unbiased when
 sampling from a normally-distributed population.  NumPy uses the sqrt(1/n)
 normalization.  I'm currently using the following code to calculate standard
 deviations, but would much prefer if this could be fixed in NumPy itself:

 def mystd(x=numpy.array([]), axis=None):
     """This function calculates the standard deviation of the input using the
     definition of standard deviation that gives an unbiased result for
     samples from a normally-distributed population."""
     xd = x - x.mean(axis=axis)
     return numpy.sqrt((xd*xd).sum(axis=axis) / (numpy.size(x, axis=axis) - 1.0))

Anne Archibald has suggested a work-around.  Perhaps ddof could be set 
to 1 by default, as other values are rarely required.

Where the distribution of a variate is not known a priori, then I 
believe that it can be shown that the n-1 divisor provides the best 
estimate of the variance.
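
For reference, the existing ddof argument already gives the n-1 divisor without a custom function. A small sketch with made-up data (the values are illustrative, not from the thread):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

std_n = np.std(x)                  # default ddof=0: divisor n
std_n_minus_1 = np.std(x, ddof=1)  # ddof=1: divisor n - 1

# The same keyword works along an axis of a 2-D array.
row_std = np.std(x.reshape(2, 4), axis=1, ddof=1)

if __name__ == "__main__":
    print(std_n, std_n_minus_1, row_std)
```

Here the squared deviations from the mean (5.0) sum to 32, so the two results are sqrt(32/8) and sqrt(32/7).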

Colin W.

[Numpy-discussion] Resize method

2009-11-23 Thread Colin J. Williams

Access by the interpreter prevents array resizing.

Yes, one can use the function, in place of the method but this appears 
to require copying the whole array.

If one sets b= a, then that reference can be deleted with del b.

Is there any similar technique for the interpreter?

Colin W.

Python 2.6 (r26:66721, Oct  2 2008, 11:35:03) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from numpy import *
>>> a= array(7*[3])
>>> a.resize((3,7))
>>> a
array([[3, 3, 3, 3, 3, 3, 3],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])
>>> a.resize((4,7))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot resize an array that has been referenced or is referencing
another array in this way.  Use the resize function
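
A likely reading of the session above is that echoing `a` at the prompt leaves an extra reference in the interpreter's `_` variable, which trips the reference check on the second resize. Two documented ways around it are sketched below; the function name is hypothetical, and the refcheck=False route is only safe when nothing else shares the array's memory.

```python
import numpy as np


def resize_demo():
    a = np.array(7 * [3])
    # In-place resize; refcheck=False skips the reference-count test.
    # New entries are filled with zeros.
    a.resize((3, 7), refcheck=False)

    # The function form returns a new array (a copy) and pads by
    # repeating the original data, not by adding zeros.
    b = np.resize(np.array(7 * [3]), (2, 7))
    return a, b


if __name__ == "__main__":
    a, b = resize_demo()
    print(a)
    print(b)
```

Note that the two forms fill the added space differently, so they are not interchangeable when the array grows.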
 


Re: [Numpy-discussion] Resize method

2009-11-23 Thread Colin J. Williams


Christopher Barker wrote:
 Colin J. Williams wrote:
 Access by the interpreter prevents array resizing.

 yup -- resize is really fragile for that reason. It really should be 
 used quite sparingly.

 Personally, I think it should probably only be used when wrapped with 
 a higher level layer.

 I've been working on an extendable array class, I call an accumulator 
 (bad name...). The idea is that you can use it to accumulate values 
 when you don't know how big it's going to end up, rather than using a 
 list for this, which is the standard idiom.

 In [2]: import accumulator

 In [3]: a = accumulator.accumulator((1,2,3,4,))

 In [4]: a
 Out[4]: accumulator([1, 2, 3, 4])

 In [5]: a.append(5)

 In [6]: a
 Out[6]: accumulator([1, 2, 3, 4, 5])

 In [8]: a.extend((6,7,8,9))
 In [9]: a
 Out[9]: accumulator([1, 2, 3, 4, 5, 6, 7, 8, 9])


 At the moment, it only supports 1-d arrays, though I'd like to extend 
 it to n-d, probably only allowing growing on the first axis.

 This has been discussed on this list a fair bit, with mixed reviews as 
 to whether there is any point. It's slower than lists in common usage, 
 but has other advantages -- I'd like to see a C version, but don't 
 know if I'll ever have the time for that.

 I've enclosed the code for your viewing pleasure

 -Chris

Thanks for this.  My aim is to extract a row of data from a line in a 
file and append it to an array.  The number of columns is fixed but, at 
the start, the number of rows is unknown.

I think that I have sorted out the resize approach but I need more tests 
before I share it.
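
One common alternative to resizing, sketched here with made-up data standing in for the file: collect the parsed rows in a Python list and convert to an array once at the end, when the number of rows is known.

```python
import numpy as np

# Stand-in for the lines of a file; three columns per line (made-up data).
lines = ["1 2 3", "4 5 6", "7 8 9", "10 11 12"]

rows = []
for line in lines:
    rows.append([float(field) for field in line.split()])

data = np.array(rows)    # one allocation, shape (number of rows, 3)
print(data.shape)
```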

Your accumulator idea is interesting.  Back in 2004, I worked on 
MyMatrix, based on numarray - abandoned when numpy came onto the scene.

One of the capabilities there was an /append/ method, intended to add a 
conforming matrix to the right or below the given matrix.  It was 
probably not efficient but it provided a means of joining together block 
matrices.

The append signature, from a January 2005 backup is here:

  def append(self, other, toRight= False):
    '''
    Return self, with other appended, to the Right or Below,
    default: Below.

    other - a matrix, a list of matrices,
    or objects which can be converted into matrices.
    '''
    assert self.iscontiguous()
    assert self.rank == 2
    if isinstance(other, _n.NumArray):
      ...

Colin W.

Re: [Numpy-discussion] subclassing matrix

2008-01-12 Thread Colin J. Williams

Basilisk96 wrote:
 On Jan 12, 1:36 am, Timothy Hochberg [EMAIL PROTECTED] wrote:
 I believe that you need to look at __array_finalize__ and __array_priority__
 (and there may be one other thing as well, I can't remember; it's late).
 Search for __array_finalize__ and that will probably help get you started.

 
 Well sonovagun!
 I removed the hack.
 Then just by setting __array_priority__ = 20.0 in the class body,
 things are magically working almost as I expect. I say almost
 because of this custom method:
 
     def cross(self, other):
         """Cross product of this vector and another vector"""
         return _N.cross(self, other, axis=0)
 
 That call to numpy.cross returns a numpy.ndarray. Unless I do return
 Vector(_N.cross(self, other, axis=0)), I get problems downstream.
 
 When is __array_finalize__ called? By adding some print traces, I can
 see it's called every time an array is modified in any way i.e.,
 reshaped, transposed, etc., and also during operations like u+v, u-v,
 A*u. But it's not called during the call to numpy.cross. Why?
 
 Cheers,
 -Basilisk96

This may help.  It is based on your initial script.

The Vectors are considered as columns but presented as rows.

This adds a complication which is not resolved.
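
On the quoted question about `numpy.cross`: a minimal sketch, using a hypothetical `Vec` subclass of `ndarray` rather than the `Vector` class in this thread, of how `__array_finalize__` supplies default attributes and how a plain result can be viewed as the subclass again.

```python
import numpy as np

class Vec(np.ndarray):
    """Hypothetical ndarray subclass carrying a Format attribute."""

    def __new__(cls, data):
        # View-casting an ordinary array triggers __array_finalize__.
        return np.asarray(data, dtype=float).view(cls)

    def __array_finalize__(self, obj):
        # Called for explicit construction, view casting and slicing;
        # obj is the array the new instance was derived from (or None).
        self.Format = getattr(obj, "Format", "%+.5f")

u = Vec([1, 0, 0])
v = Vec([0, 1, 0])

# Re-wrap the result explicitly, whatever type np.cross hands back.
w = np.asarray(np.cross(u, v)).view(Vec)
print(type(w).__name__, w.Format, w)
```

Wrapping the result explicitly is the same idea as the quoted `Vector(_N.cross(self, other, axis=0))`, without relying on what a particular function returns.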

Colin W.

#-- vector.py
import numpy as _N
import math as _M
#default tolerance for equality tests
TOL_EQ = 1e-6
#default format for pretty-printing Vector instances
FMT_VECTOR_DEFAULT = "%+.5f"

class Vector(_N.matrix):
    """
    2D/3D vector class that supports numpy matrix operations and more.

    Examples:
    u = Vector([1,2,3])
    v = Vector('3 4 5')
    w = Vector([1, 2])
    """
    def __new__(cls, data="0. 0. 0.", dtype=_N.float64):
        """
        Subclass instance constructor.

        If data is not specified, a zero Vector is constructed.
        The constructor always returns a Vector instance.
        The instance gets a customizable Format attribute, which
        controls the printing precision.
        """
        data= [1, 2, 3]
        ret= _N.matrix(data, dtype)
##        ret = super(Vector, cls).__new__(cls, data, dtype=dtype)

###promote the instance to cls type.
##        ret.__class__ = cls
        assert ret.size in (2, 3), 'Vector must have either two or three components'
        if ret.shape[0] == 1:
            ret = ret.T
        assert ret.shape == (ret.shape[0], 1), 'could not express Vector as a Mx1 matrix'
        if ret.shape[0] == 2:
            ret = _N.vstack((ret, 0.))
        ret.Format = FMT_VECTOR_DEFAULT
        ret=  _N.ndarray.__new__(cls, ret.shape, dtype,
                                 buffer=ret.data)
        return ret

    def __str__(self):
        fmt = getattr(self, "Format", FMT_VECTOR_DEFAULT)
        fmt = ', '.join([fmt]*3)
        return ''.join(["(", fmt, ")"]) % tuple(self.T.tolist()[0])

    def __repr__(self):
        fmt = ', '.join(['%s']*3)
        return ''.join(["%s([", fmt, "])"]) % tuple([self.__class__.__name__] + self.T.tolist()[0])

    def __mul__(self, mult):
        ''' self * multiplicand '''
        if isinstance(mult, _N.matrix):
            return _N.dot(self, mult)
        else:
            raise DataError, 'multiplicand must be a Vector or a matrix'

    def __rmul__(self, mult):
        ''' multiplier * self.__mul__ '''
        if isinstance(mult, _N.matrix):
            return Vector(_N.dot(mult, self))
        else:
            raise DataError, 'multiplier must be a Vector or a matrix'

    # the remaining methods are Vector-specific math operations,
    # including the X,Y,Z properties...

if __name__ == '__main__':
    u = Vector('1 2 3')
    print str(u)
    print repr(u)
    A = _N.matrix('2 0 0; 0 2 0; 0 0 2')
    print A
    p = A * u
    print p
    print  p.__class__
    q= u.T * A
    try:
        print q
    except:
        print "we don't allow for the display of row vectors"
    print q.A, q.T
    print q.__class__
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] subclassing matrix

2008-01-10 Thread Colin J. Williams

Basilisk96 wrote:
 Hello folks,
 
 In the course of a project that involved heavy use of geometry and
 linear algebra, I found it useful to create a Vector subclass of
 numpy.matrix (represented as a column vector in my case).

Why not consider a matrix with a shape of (1, n) as a row vector and
one with (n, 1) as a column vector?

Then you can simply write A * u or u.T * A.

Does this not meet the need?

You could add methods isRowVector and isColumnVector to the Matrix class.
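
A sketch of the two predicates as free functions (the names follow the suggestion above; nothing like them ships with numpy), relying only on the shape of a 2-d `matrix`:

```python
import numpy as np

def isRowVector(m):
    """True when m is 2-d with a single row, i.e. shape (1, n)."""
    return m.ndim == 2 and m.shape[0] == 1

def isColumnVector(m):
    """True when m is 2-d with a single column, i.e. shape (n, 1)."""
    return m.ndim == 2 and m.shape[1] == 1

u = np.matrix([[1.0], [2.0], [3.0]])      # column vector, shape (3, 1)
A = np.matrix("2 0 0; 0 2 0; 0 0 2")

p = A * u          # (3, 3) * (3, 1) -> (3, 1)
q = u.T * A        # (1, 3) * (3, 3) -> (1, 3)
print(isColumnVector(p), isRowVector(q))
```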

Colin W.
 
 I'd like to hear comments about my use of this class promotion
 statement in __new__:
 ret.__class__ = cls
 
 It seems to me that it is hackish to just change an instance's class
 on the fly, so perhaps someone could clue me in on a better practice.
 Here is my reason for doing this:
 Many applications of this code involve operations between instances of
 numpy.matrix and instances of Vector, such as applying a linear-
 operator matrix on a vector. If I omit that class promotion
 statement, then the results of such operations cannot be instantiated
 as Vector types:
 >>> from vector import Vector
 >>> import numpy
 >>> u = Vector('1 2 3')
 >>> A = numpy.matrix('2 0 0; 0 2 0; 0 0 2')
 >>> p = Vector(A * u)
 >>> p.__class__
 <class 'numpy.core.defmatrix.matrix'>
 
 This is undesirable because the calculation result loses the custom
 Vector methods and attributes that I want to use. However, if I use
 that class promotion statement, the p.__class__ lookup returns what
 I want:
 >>> p.__class__
 <class 'vector.Vector'>
 
 Is there a better way to achieve that?
 
 Here is the partial subclass code:
 #-- vector.py
 import numpy as _N
 import math as _M
 #default tolerance for equality tests
 TOL_EQ = 1e-6
 #default format for pretty-printing Vector instances
 FMT_VECTOR_DEFAULT = "%+.5f"

 class Vector(_N.matrix):
     """
     2D/3D vector class that supports numpy matrix operations and more.

     Examples:
     u = Vector([1,2,3])
     v = Vector('3 4 5')
     w = Vector([1, 2])
     """
     def __new__(cls, data="0. 0. 0.", dtype=_N.float64):
         """
         Subclass instance constructor.

         If data is not specified, a zero Vector is constructed.
         The constructor always returns a Vector instance.
         The instance gets a customizable Format attribute, which
         controls the printing precision.
         """
         ret = super(Vector, cls).__new__(cls, data, dtype=dtype)
         #promote the instance to cls type.
         ret.__class__ = cls
         assert ret.size in (2, 3), 'Vector must have either two or three components'
         if ret.shape[0] == 1:
             ret = ret.T
         assert ret.shape == (ret.shape[0], 1), 'could not express Vector as a Mx1 matrix'
         if ret.shape[0] == 2:
             ret = _N.vstack((ret, 0.))
         ret.Format = FMT_VECTOR_DEFAULT
         return ret

     def __str__(self):
         fmt = getattr(self, "Format", FMT_VECTOR_DEFAULT)
         fmt = ', '.join([fmt]*3)
         return ''.join(["(", fmt, ")"]) % (self.X, self.Y, self.Z)

     def __repr__(self):
         fmt = ', '.join(['%s']*3)
         return ''.join(["%s([", fmt, "])"]) % (self.__class__.__name__, self.X, self.Y, self.Z)

     # the remaining methods are Vector-specific math operations,
     # including the X,Y,Z properties...
 
 
 Cheers,
 -Basilisk96


Re: [Numpy-discussion] defmatrix.py

2007-03-27 Thread Colin J. Williams

Charles R Harris wrote:
 
 
 On 3/26/07, *Travis Oliphant* wrote:
 
 
   I think that might be the simplest thing, dot overrides subtypes. BTW,
   here is another ambiguity

   In [6]: dot(array([[1]]),ones(2))
   ---------------------------------------------------------------------------
   exceptions.ValueError                     Traceback (most recent call last)

   /home/charris/<ipython console>

   ValueError: matrices are not aligned

   Note that in this case dot acts like the rhs is always a column vector
   although it returns a 1-d vector. I don't know that this is a bad
   thing, but perhaps we should extend this behaviour to matrices, which
   would be different from the now current 1-d is always a *row* vector, i.e.
 
 
 The rule 1-d is always a *row* vector only applies when converting to a
 matrix.
 
 In this case, the dot operator does not convert to a matrix but uses
 rules for operating with mixed 2-d and 1-d arrays inherited from
 Numeric.
 
 I'm very hesitant to change those rules.
 
 
 I wasn't suggesting that, just noticing that the rule was that a 1-d vector 
 on the right is treated as a column vector by dot, which is why an exception 
 was raised in the posted case. If it is traditional for matrix routines 
 to always treat it as a row vector, so be it.

My recollection is that text books treat the column vector, represented 
by a lower case letter, bold or underlined, as the default.  If b 
(dressed as described before) is a column vector, then b' represents a 
row vector.

For numpy, it makes sense to consider b as a row vector, since the 
underlying array uses the C convention where each row is stored 
contiguously.
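
The storage point can be checked directly. A small sketch (the array is arbitrary) showing that, in NumPy's default C order, a row is a contiguous block while a column is strided:

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape(3, 4)   # C order by default

row = a[1]        # elements sit next to each other in memory
col = a[:, 1]     # elements are one full row apart

print(a.strides, row.strides, col.strides)
print(row.flags["C_CONTIGUOUS"], col.flags["C_CONTIGUOUS"])
```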

Colin W.

 
 Chuck
 
 
 
 
 

Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Colin J. Williams

Alan G Isaac wrote:
 On Mon, 26 Mar 2007, Colin J. Williams apparently wrote: 
 One would expect the iteration over A to return row 
 vectors, represented by (1, n) matrices. 
 
 This is again simple assertion.
 **Why** would one expect this?
 Some people clearly do not.
 
 One person commented that this unexpected behavior was 
 a source of error in their code.
 
 Another person commented that they did not even guess that 
 such a thing would be possible.
 
 Experience with Python should lead to the ability to 
 anticipate the outcome.  Apparently this is not the case.
 That suggests a design problem.
 
 What about **Python** would lead us to expect this behavior??
 
 In *contrast*, everyone agrees that for a matrix M,
 we should get a matrix from M[0,:].
 This is expected and desirable.

Perhaps our difference lies in two things:

1. the fact that the text books typically take the column vector as the 
default.  For a Python version, based on C, it makes more sense to treat 
the rows as vectors, as data is stored contiguously by row.

2. the convention has been proposed that the vector is more conveniently 
implemented as a matrix, where one dimension is one. The vector could be 
treated as a subclass of the matrix but this adds complexity with little 
clear benefit.  PyMatrix has matrix methods isVector, isCVector and 
isRVector.

I can see some merit in conforming to text book usage and would be glad 
to consider changes when I complete the port to numpy, in a few months.

Colin W.
 
 Cheers,
 Alan Isaac


Re: [Numpy-discussion] matrix indexing question

Bill Baxter wrote:
On 3/26/07, Colin J. Williams [EMAIL PROTECTED] wrote:
Bill Baxter wrote:
This may sound silly, but I really think seeing all those brackets is
what makes it feel wrong. Matlab's output doesn't put it in your
face that your 4 is really a matrix([[4]]), even though that's what it
is to Matlab. But I don't see a good way to change that behavior.

The other thing I find problematic about matrices is the inability to
go higher than 2d. To me that means that it's impossible to go pure
matrix in my code because I'll have to switch back to arrays any time
I want more than 2d (or use a mixed solution like a list of matrices).
Matlab allows more than 2D.

--bb
pure matrix seems to me an area of exploration, does it have any
application in numerical computation at this time?

I'm not sure what you thought I meant, but all I meant by going pure
matrix was having my Numpy code use the 'matrix' type exclusively
instead of some mix of 'matrix' and the base 'ndarray' type.
It was a term I had not come across before but I assumed that you were
referring to something like this link - beyond my comprehension.

http://72.14.203.104/search?q=cache:Yu9gbUQEfWkJ:math.ca/Events/winter05/abs/pdf/ma-df.pdf+pure+matrixhl=enct=clnkcd=4gl=calr=lang_en
Things
become messy when you mix and match them because you don't know any
more if an expression like A[1] is going to give you a 1-D thing or a
2-D thing, and you can't be sure what A * B will do without always
coercing A and B.

Yes, to my mind it's best to consider the multi-dimensional array and
the matrix to be two distinct data types. In most cases, it's best that
conversions between the two should be explicit.

A list of matrices seems to be a logical structure.

Yes, and it's the only option if you want to make a list of matrices
of different shapes, but I frequently have a need for things like a
list of per-point transformation matrices. Each column from each of
those matrices can be thought of as a vector. Sometimes it's
convenient to consider all the X basis vectors together, for instance,
which is a simple and efficient M[:,:,0] slice if I have all the data
in a 3-D array, but it's a slow list comprehension plus matrix
constructor if I have the matrices in a list -- something like
matrix([m[:,0] for m in M])
but that line is probably incorrect.
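
The two approaches described here can be compared in a short sketch, using made-up data and plain arrays. With the matrices stacked in a 3-d array of shape (N, 3, 3), all the first columns come out as one slice; with a list, each column is pulled out in turn. (With `matrix` objects, `m[:, 0]` stays 2-d, which may be why the quoted one-liner was in doubt.)

```python
import numpy as np

n_points = 5
stack = np.arange(n_points * 9, dtype=float).reshape(n_points, 3, 3)

# All first columns at once: shape (n_points, 3), no Python-level loop.
x_basis = stack[:, :, 0]

# The same data held as a list of separate 2-d arrays.
mats = [stack[i] for i in range(n_points)]
x_basis_from_list = np.array([m[:, 0] for m in mats])

print(x_basis.shape, np.array_equal(x_basis, x_basis_from_list))
```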

Logically, this makes sense, where M is a list of matrices.

My guess is that it would be a little faster to build one larger matrix
and then slice it as needed.

PyMatrix deals with
lists in building a larger matrix from sub-matrices.

Suppose that we have matrices A (3, 4), B (3, 6), C (4, 2) and D (4, 8).

Then E= M([[A, B], [C, D]]) gives E (7, 10).
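
For comparison, numpy can assemble the same block structure. This is a sketch assuming a NumPy recent enough to provide `np.block`; `np.bmat` is the older equivalent that returns a `matrix`. `M` above is PyMatrix's constructor and is not used here.

```python
import numpy as np

A = np.ones((3, 4))
B = np.ones((3, 6))
C = np.zeros((4, 2))
D = np.zeros((4, 8))

E = np.block([[A, B], [C, D]])       # ndarray result
E_mat = np.bmat([[A, B], [C, D]])    # matrix result

print(E.shape, E_mat.shape)
```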

Numpy generally tries to treat all lists and tuples as array literals.
That's not likely to change.
That need not be a problem if there is clarity of thinking about the
essential difference between the matrix data type (even if it is built
as a sub-type of the array) and the multi-dimensional array.

--bb

Colin W.


Re: [Numpy-discussion] matrix indexing question

Alan G Isaac wrote:
 On 3/26/07, Alan G Isaac [EMAIL PROTECTED] wrote: 
 finds itself in basic conflict with the idea that 
 I ought to be able to iterate over the objects in an 
 iterable container.  I mean really, does this not feel 
 wrong? ::
  for item in x: print item.__repr__()
  ...
  matrix([[1, 2]])
  matrix([[3, 4]]) 
 
 
 On Mon, 26 Mar 2007, Bill Baxter apparently wrote: 
 So you're saying this is what you'd find more pythonic? 
 X[1] 
 matrix([2,3]) 
 X[:,1] 
 matrix([[3, 
 4]]) 
 Just trying to make it clear what you're proposing. 
 
 
 No; that is not possible, since a matrix is inherently 2d.
 I just want to get the constituent arrays when I iterate
 over the matrix object or use regular Python indexing, but 
 a matrix when I use matrix/array indexing.  That is ::
 
  X[1] 
 array([2,3]) 
  X[1,:] 
 matrix([[3, 4]]) 
 
 That behavior seems completely natural and unsurprising.

Perhaps things would be clearer if we thought of the constituent groups 
of data in a matrix as being themselves matrices.

X[1] could represent the second row of a matrix. A row of a matrix is a 
row vector, a special case of a matrix.  To get an array, I suggest that 
an explicit conversion X[1].A is a clearer way to handle things.

Similarly, X[2, 3] is best returned as a value which is of a Python type.
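
The conversions mentioned above, in a short sketch with shapes noted in the comments. Note that numpy hands back a NumPy scalar for a single element rather than a built-in Python int.

```python
import numpy as np

X = np.matrix([[1, 2], [3, 4]])

row = X[1]          # still a matrix, shape (1, 2)
row_2d = X[1].A     # explicit conversion to a 2-d ndarray, shape (1, 2)
row_1d = X[1].A1    # flattened 1-d ndarray, shape (2,)
item = X[1, 1]      # a scalar, not a matrix

print(type(row).__name__, row_2d.shape, row_1d.shape, item)
```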

Colin W.
 
 
 Probably about half the bugs I get from mixing and matching matrix and 
 array are things like 
row = A[i] 
... 
z = row[2]
 Which works for an array but not for a matrix. 
 
 
 Exactly!
 That is the evidence of a bad surprise in the current 
 behavior.  Iterating over a Python iterable should provide 
 access to the contained objects.
 
 Cheers,
 Alan Isaac


Re: [Numpy-discussion] matrix indexing question

Alan G Isaac wrote:
 Alan G Isaac wrote: 
 So this :: 
  x[1] 
 matrix([[1, 0]]) 
 feels wrong.  (Similarly when iterating across rows.) 
 Of course I realize that I can just :: 
  x.A[1] 
 array([1, 0]) 
 
 
 On Sun, 25 Mar 2007, Colin J. Williams apparently wrote: 
 An array and a matrix are different animals.  Conversion 
 from one to the other should be spelled out. 
 
 
 But you are just begging the question here.
 The question is: when I iterate across matrix rows,
 why am I iterating across matrices and not arrays.
 This seems quite out of whack with general Python practice.
 
 You cannot just say conversion should be explicit
 because that assumes (incorrectly actually) that
 the rows are matrices.  The conversion should be explicit
 argument actually cuts in the opposite direction of what
 you appear to believe.

Alan,

Yes, this is where we appear to differ.  I believe that vectors are best 
represented as matrices, with a shape of (1, n) or (m, 1).  The choice 
between these determines whether we have a row or a column vector.

Thus any (m, n) matrix can be considered as either a collection of 
column vectors or a collection of row vectors.

If the end result is required as an array or a list, this can be done 
explicitly with X[1].A or A[1].tolist().

Here, A is a property of the M (matrix) class.
 
 Cheers,
 Alan Isaac

A long time ago, you proposed that PyMatrix should provide for matrix 
division in two ways, as is done in MatLab.  This was implemented, but 
PyMatrix has not yet been ported to numpy - perhaps this summer.

Regards,

Colin W.


Re: [Numpy-discussion] matrix indexing question

Alan G Isaac wrote:
 On Mon, 26 Mar 2007, Colin J. Williams apparently wrote: 
 Perhaps things would be clearer if we thought of the 
 constituent groups of data in a matrix as being themselves 
 matrices. 
 
 This thinking of is what you have suggested before.
 You need to explain why it is not begging the question.
 
 Cheers,
 Alan Isaac

Perhaps it would now help if you redefined the question.

In an earlier posting, you appeared anxious that the matrix and the 
array behave in the same way.  Since they are different animals, I see 
sameness of behaviour as being lower on the list of desirables than 
fitting the standard ideas of matrix algebra.

Suppose that a is a row vector, b a column vector and A a conforming 
matrix then:
  a * A
  A * b
and  b.T * A are all acceptable operations.

One would expect the iteration over A to return row vectors, represented 
by (1, n) matrices.
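
A sketch of these shape rules with numpy's `matrix` type (the values are arbitrary):

```python
import numpy as np

A = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
a = np.matrix([[1, 0, 2]])          # row vector, shape (1, 3)
b = np.matrix([[1], [0], [2]])      # column vector, shape (3, 1)

shapes = [(a * A).shape, (A * b).shape, (b.T * A).shape]
row_shapes = [r.shape for r in A]   # iteration yields (1, n) matrices

print(shapes, row_shapes)
```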

Colin W.


Re: [Numpy-discussion] Detect subclass of ndarray

Alan G Isaac wrote:
 On Sat, 24 Mar 2007, Charles R Harris apparently wrote: 
 Yes, that is what I am thinking. Given that there are only the two 
 possibilities, row or column, choose the only one that is compatible with 
 the multiplying matrix. The result will not always be a column vector, for 
 instance, mat([[1]])*ones(3) will be a 1x3 row vector. 
 
 
 
 Ack!  The simple rule `post multiply means its a column vector`
 would be horrible enough: A*ones(n)*B becomes utterly obscure.
 Now even that simple rule is to be violated??

It depends whether ones delivers an instance of the Matrix/vector class 
or a simple array.

I assume that, in the above A and B represent matrices.

Colin W.
 
 Down this path lies madness.
 Please, just raise an exception.
 
 Cheers,
 Alan Isaac


Re: [Numpy-discussion] Detect subclass of ndarray

Colin J. Williams wrote:
 Alan G Isaac wrote:
 On Sat, 24 Mar 2007, Charles R Harris apparently wrote: 
 Yes, that is what I am thinking. Given that there are only the two 
 possibilities, row or column, choose the only one that is compatible with 
 the multiplying matrix. The result will not always be a column vector, for 
 instance, mat([[1]])*ones(3) will be a 1x3 row vector. 


 Ack!  The simple rule `post multiply means its a column vector`
 would be horrible enough: A*ones(n)*B becomes utterly obscure.
 Now even that simple rule is to be violated??
 
 It depends whether ones delivers an instance of the Matrix/vector class 
 or a simple array.
 
 I assume that, in the above A and B represent matrices.
 
 Colin W.

Postscript:  I hadn't read the later postings when I posted the above.

PyMatrix used the convention mentioned in an earlier posting.  Simply, a 
vector is considered as a single-row matrix or a single-column matrix.

This same approach can largely be used with numpy's mat:

*** Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32. ***
>>> import numpy as _n
>>> _n.ones(3)
array([ 1.,  1.,  1.])
>>> a= _n.ones(3)
>>> a.T
array([ 1.,  1.,  1.])
>>> _n.mat(a)
matrix([[ 1.,  1.,  1.]])
>>> _n.mat(a).T
matrix([[ 1.],
        [ 1.],
        [ 1.]])
>>> b= _n.mat(a).T
>>> a * b
matrix([[ 3.]])   #  Something has gone wrong here - it looks as though there is normalization under the counter.
 

In any event, the problem posed by Alan Isaac can be handled with this 
approach:

A * mat(ones(3)).T * B can produce the desired result.  I haven't tested it.
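
A quick check of both points, as a sketch with arbitrary matrices `A` and `B`. In `a * b` the 1-d array is used as the left operand of a matrix product, so the single entry is the sum 1 + 1 + 1 = 3 rather than a normalization, and the explicit column form multiplies through as proposed. `np.asmatrix` is used in place of the `mat` alias, which newer NumPy releases no longer provide.

```python
import numpy as np

a = np.ones(3)
b = np.asmatrix(a).T            # column vector, shape (3, 1)

inner = a * b                   # matrix product, not elementwise

A = np.asmatrix(np.arange(6.0).reshape(2, 3))   # arbitrary (2, 3) matrix
B = np.asmatrix([[1.0, 2.0]])                   # arbitrary (1, 2) matrix
result = A * np.asmatrix(np.ones(3)).T * B      # (2, 3)(3, 1)(1, 2) -> (2, 2)

print(inner, result.shape)
```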

Colin W.
 Down this path lies madness.
 Please, just raise an exception.

 Cheers,
 Alan Isaac


Re: [Numpy-discussion] Simple multi-arg wrapper for dot()

Bill Baxter wrote:
 On 3/25/07, Robert Kern [EMAIL PROTECTED] wrote:
 Bill Baxter wrote:
 
 I don't know. Given our previous history with convenience functions with
 different calling semantics (anyone remember rand()?), I think it probably 
 will
 confuse some people.

 I'd really like to see it on a cookbook page, though. I'd use it.
 
 Done.
 http://www.scipy.org/Cookbook/MultiDot
 
 --bb
I wasn't able to connect to this link but I gather that the proposal was 
to use dot(A, B, C) to represent the product of the 3 arrays.

If A, B and C were matrices then this could more clearly be written as
A * B * C
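
Since the cookbook page could not be reached, the following is only a sketch of what such a helper might look like, not the cookbook's code; `mdot` is a hypothetical name. Newer NumPy versions also provide `np.linalg.multi_dot`, which takes a list of arrays.

```python
from functools import reduce

import numpy as np

def mdot(*arrays):
    """Chain np.dot over the arguments, left to right (hypothetical helper)."""
    return reduce(np.dot, arrays)

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
C = np.ones((4, 2))

left_to_right = mdot(A, B, C)
library = np.linalg.multi_dot([A, B, C])

print(left_to_right.shape, np.allclose(left_to_right, library))
```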

Colin W.


Re: [Numpy-discussion] matrix indexing question

Alan G Isaac wrote:
 One thing keeps bugging me when I use numpy.matrix.
 
 All this is fine::
 
  x=N.mat('1 1;1 0')
  x
 matrix([[1, 1],
 [1, 0]])
  x[1,:]
 matrix([[1, 0]])
 
 But it seems to me that I should be able
 to extract a matrix row as an array.

This can easily be done:
*** Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32. ***
>>> import numpy as _n
>>> A= _n.mat([[1, 2], [3, 4]])
>>> A[1]
matrix([[3, 4]])
>>> A[1].getA1()
array([3, 4])

An array and a matrix are different animals.  Conversion from one to the 
other should be spelled out.

As you have done below.

Colin W.

 So this ::
 
  x[1]
 matrix([[1, 0]])
 
 feels wrong.  (Similarly when iterating across rows.)
 Of course I realize that I can just ::
 
  x.A[1]
 array([1, 0])
 
 but since the above keeps feeling wrong I felt I should 
 raise this as a possible design issue, better discussed
 earlier than later.
 
 Cheers,
 Alan Isaac


Re: [Numpy-discussion] matrix indexing question

Alan G Isaac wrote:
 On Sun, 2007-03-25 at 13:07 -0400, Alan G Isaac wrote:
  x[1]
 matrix([[1, 0]])
 feels wrong.  (Similarly when iterating across rows.)
 
 
 On Sun, 25 Mar 2007, Paulo Jose da Silva e Silva apparently wrote:
 I think the point here is that if you are using matrices, 
 then all you should want are matrices, just like in 
 MATLAB:
  b = A(1, :)
 b =
  1 2
 
 
 Yes, that is the idea behind this, which I am also 
 accustomed to from GAUSS.  But note again that the Matlab 
 equivalent ::
 
  x=N.mat('1 2;3 4')
  x[0,:]
 matrix([[1, 2]])
 
 does provide this behavior.  The question I am raising
 is a design question and is I think really not addressed
 by the rule of thumb you offer.  Specifically, that rule
 of thumb if it is indeed the justification of  ::
 
  x[1]
 matrix([[3, 4]])
 
 finds itself in basic conflict with the idea that I ought to 
 be able to iterate over the objects in an iterable container.
 
 I mean really, does this not feel wrong? ::
 
  for item in x: print item.__repr__()
 ...
 matrix([[1, 2]])
 matrix([[3, 4]])
 
 Cheers,
 Alan Isaac
 
 
Perhaps this would be clearer with:

 for rowVector in x: print rowVector.__repr__()
 ...
 matrix([[1, 2]])
 matrix([[3, 4]])

Colin W.



Re: [Numpy-discussion] matrix indexing question