Re: [Numpy-discussion] 1.0rc1 doesn't seem to work on AMD64

2006-09-21 Thread Peter Bienstman
Cleaning out and rebuilding did the trick!

Thanks,

Peter

On Thursday 21 September 2006 18:33, [EMAIL PROTECTED] wrote:
> Subject: Re: [Numpy-discussion] 1.0rc1 doesn't seem to work on AMD64

> 
>
> I don't see this running the latest from svn on AMD64 here. Not sayin'
> there might not be a problem with rc1, I just don't see it with my sources.
>
> Python 2.4.3 (#1, Jun 13 2006, 11:46:22)
> [GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>
> >>> import numpy
> >>> numpy.version.version
>
> '1.0.dev3202'
>
> >>> numpy.version.os.uname()
>
> ('Linux', 'tethys', '2.6.17-1.2187_FC5', '#1 SMP Mon Sep 11 01:16:59 EDT
> 2006', 'x86_64')
>
> If you are building on Gentoo maybe you could delete the build directory
> (and maybe the numpy site package) and rebuild.
>
> Chuck.




[Numpy-discussion] Putmask/take ?

2006-09-21 Thread PGM

Folks,
I'm running into the following problem with putmask on take. 

>>> import numpy as N
>>> x = N.arange(12.)
>>> m = [1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1]
>>> i = N.nonzero(m)[0]
>>> w = N.array([-1, -2, -3, -4.])
>>> x.putmask(w,m)
>>> x.take(i)
>>> N.allclose(x.take(i),w)
False


I'm wondering if it is intentional, or if it's a problem with my build (1.0b5), or if somebody experienced it as well.
Thanks a lot for your input.

P.
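
For what it's worth, the False looks consistent with putmask cycling its
values argument by the flat index into x rather than by the running count
of True mask entries, so masked slot n receives w[n % len(w)]. A minimal
check of that reading (the modular-indexing interpretation is an
assumption here, not something stated in this thread):

>>> i = N.nonzero(m)[0]
>>> i
array([ 0,  6,  9, 11])
>>> N.array([w[j % len(w)] for j in i])
array([-1., -3., -2., -4.])

If that is what putmask does, x.take(i) comes back as [-1., -3., -2., -4.]
rather than w, which would explain the False above.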


Re: [Numpy-discussion] take behaviour has changed

2006-09-21 Thread Christian Kristukat
Bill Baxter <[EMAIL PROTECTED]> writes:

> 
> Yep, check the release notes:
> http://www.scipy.org/ReleaseNotes/NumPy_1.0
> search for 'take' on that page to find out what others have changed as well.
> --bb

Ok. Does axis=None then mean that take(a, ind) operates on the flattened array?
That is at least what it seems to be. I noticed that the method behaves
differently: a.take(ind) and a.take(ind, axis=0) behave the same, so the default
argument to axis is 0 rather than None.
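
For reference, the function-level behavior in question, repeated from the
examples posted earlier in this thread:

>>> from numpy import arange, reshape, take
>>> a = reshape(arange(12), (3, 4))
>>> take(a, [2, 3])        # axis=None: indexes the flattened array
array([2, 3])
>>> take(a, [2, 3], 1)     # axis=1: takes columns 2 and 3
array([[ 2,  3],
       [ 6,  7],
       [10, 11]])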

Christian








Re: [Numpy-discussion] take behaviour has changed

2006-09-21 Thread Bill Baxter
Yep, check the release notes:
http://www.scipy.org/ReleaseNotes/NumPy_1.0
search for 'take' on that page to find out what others have changed as well.
--bb

On 9/22/06, Christian Kristukat <[EMAIL PROTECTED]> wrote:
> Hi,
> from 1.0b1 to 1.0rc1 the default behaviour of take seems to have changed when
> omitting the axis argument:
>
> In [13]: a = reshape(arange(12),(3,4))
>
> In [14]: take(a,[2,3])
> Out[14]: array([2, 3])
>
> In [15]: take(a,[2,3],1)
> Out[15]:
> array([[ 2,  3],
>        [ 6,  7],
>        [10, 11]])
>
> Is this intended?
>
> Christian



[Numpy-discussion] general version of repmat?

2006-09-21 Thread Bill Baxter
Is there some way to get the equivalent of repmat() for ndim == 1 and ndim > 2?
For ndim == 1, repmat always returns a 2-d array instead of remaining 1-d.
For ndim > 2, repmat just doesn't work.

Maybe we could add a 'reparray', with the signature:
   reparray(A, repeats, axis=None)
where repeats is a scalar or a sequence.
If 'repeats' is a scalar then the matrix is duplicated along 'axis'
that many times.
If 'repeats' is a sequence of length N, then A is duplicated
repeats[i] times along axis[i].  If axis is None then it is assumed to
be (0,1,2...N).

Er, that's not quite complete, because it doesn't specify what happens
when you reparray an array to a higher dimension, like a 1-d to a 3-d,
e.g. reparray([1,2], (2,2,2)).  I guess the axis parameter could have
some 'newaxis' entries to accommodate that; a rough sketch follows below.
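
For what it's worth, here is that signature expressed in terms of
numpy.tile, which handles the higher-dimension case by left-padding A's
shape; reparray is a hypothetical helper, not an existing function, and
the scalar/axis handling is one guess at the semantics described above:

import numpy as np

def reparray(A, repeats, axis=None):
    # Sketch only: a scalar repeat duplicates along a single axis
    # (default 0); a sequence tiles along successive axes, with tile
    # left-padding A's shape when len(repeats) > A.ndim.
    A = np.asarray(A)
    if np.isscalar(repeats):
        reps = [1] * max(A.ndim, 1)
        reps[0 if axis is None else axis] = repeats
        return np.tile(A, reps)
    return np.tile(A, repeats)

reparray([1, 2], 3)           # 1-d stays 1-d: array([1, 2, 1, 2, 1, 2])
reparray([1, 2], (2, 2, 2))   # shape (2, 2, 4) under tile's padding rule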

--bb



[Numpy-discussion] take behaviour has changed

2006-09-21 Thread Christian Kristukat
Hi,
from 1.0b1 to 1.0rc1 the default behaviour of take seems to have changed when
omitting the axis argument:

In [13]: a = reshape(arange(12),(3,4))

In [14]: take(a,[2,3])
Out[14]: array([2, 3])

In [15]: take(a,[2,3],1)
Out[15]:
array([[ 2,  3],
       [ 6,  7],
       [10, 11]])

Is this intended?

Christian





[Numpy-discussion] matrixmultiply moved (release notes?)

2006-09-21 Thread Bill Baxter
Apparently numpy.matrixmultiply got moved into
numpy.oldnumeric.matrixmultiply at some point (or rather ceased to be
imported into the numpy namespace).  Is there any list of all such
methods that got banished?  This would be nice to have in the release
notes.
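
For anyone hitting this: matrixmultiply was an alias for dot, so the two
current spellings are roughly as below (the oldnumeric path being the one
named above):

import numpy
from numpy.oldnumeric import matrixmultiply  # compatibility location

a = numpy.array([[1, 2], [3, 4]])
b = numpy.array([[5], [6]])
numpy.dot(a, b)        # numpy-native replacement
matrixmultiply(a, b)   # same result via the compatibility module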

--bb



[Numpy-discussion] putmask/take ?

2006-09-21 Thread P GM
Folks,
I'm running into the following problem with putmask on take.

>>> import numpy as N
>>> x = N.arange(12.)
>>> m = [1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1]
>>> i = N.nonzero(m)[0]
>>> w = N.array([-1, -2, -3, -4.])
>>> x.putmask(w,m)
>>> x.take(i)
>>> N.allclose(x.take(i),w)
False

I'm wondering if it is intentional, or if it's a problem with my build
(1.0b5), or if somebody experiences that as well.
Thanks a lot for your input.

P.


Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Alan G Isaac
On Thu, 21 Sep 2006, Charles R Harris apparently wrote: 
> As to the oddness of \param or @param, here is an example from 
> Epydoc using Epytext 
> @type  m: number 
> @param m: The slope of the line. 
> @type  b: number 
> @param b: The y intercept of the line. 

Compare to definition list style for consolidated field 
lists in section 5.1 of
http://epydoc.sourceforge.net/fields.html#rst
which is much more elegant, IMO.
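
Roughly, the consolidated definition-list style being referred to looks
like the following, adapted to the slope/intercept example above (see the
linked page for the exact rules):

:Parameters:
  m : number
    The slope of the line.
  b : number
    The y intercept of the line.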

Cheers,
Alan Isaac







Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Alan G Isaac
On Thu, 21 Sep 2006, "David M. Cooke" apparently wrote: 
> Foremost for Python doc strings, I think, is that it look 
> ok when using pydoc or similar (ipython's ?, for 
> instance). That means a minimal amount of markup. 

IMO reStructuredText is very natural for documentation,
and it is nicely handled by epydoc.

fwiw,
Alan Isaac





Re: [Numpy-discussion] change of default dtype

2006-09-21 Thread David Grant
On 9/20/06, Bill Baxter <[EMAIL PROTECTED]> wrote:
Hey Andrew, point taken, but I think it would be better if someone whoactually knows the full extent of the change made the edit.  I knowzeros and ones changed.  Did anything else?Anyway, I'm surprised the release notes page is publicly editable.
I'm glad that it is editable. I hate wikis that are only editable by a select few. Defeats the purpose (or at least does not maximize the capability of a wiki).-- 
David Granthttp://www.davidgrant.ca
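
For reference, the zeros/ones change under discussion (defaults as of
numpy 1.0; Numeric's zeros and ones defaulted to integer, as I recall):

import numpy as np

np.zeros(3).dtype   # dtype('float64') -- the new default
np.ones(3).dtype    # dtype('float64')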


Re: [Numpy-discussion] please change mean to use dtype=float

2006-09-21 Thread Sebastian Haase
On Thursday 21 September 2006 15:28, Tim Hochberg wrote:
> David M. Cooke wrote:
> > On Thu, 21 Sep 2006 11:34:42 -0700
> >
> > Tim Hochberg <[EMAIL PROTECTED]> wrote:
> >> Tim Hochberg wrote:
> >>> Robert Kern wrote:
>  David M. Cooke wrote:
> > On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote:
> >> Let me offer a third path: the algorithms used for .mean() and
> >> .var() are substandard. There are much better incremental algorithms
> >> that entirely avoid the need to accumulate such large (and therefore
> >> precision-losing) intermediate values. The algorithms look like the
> >> following for 1D arrays in Python:
> >>
> >> def mean(a):
> >>  m = a[0]
> >>  for i in range(1, len(a)):
> >>  m += (a[i] - m) / (i + 1)
> >>  return m
> >
> > This isn't really going to be any better than using a simple sum.
> > It'll also be slower (a division per iteration).
> 
>  With one exception, every test that I've thrown at it shows that it's
>  better for float32. That exception is uniformly spaced arrays, like
>  linspace().
> 
>   > You do avoid
>   > accumulating large sums, but then doing the division a[i]/len(a)
>   > and adding that will do the same.
> 
>  Okay, this is true.
> 
> > Now, if you want to avoid losing precision, you want to use a better
> > summation technique, like compensated (or Kahan) summation:
> >
> > def mean(a):
> > s = e = a.dtype.type(0)
> > for i in range(0, len(a)):
> > temp = s
> > y = a[i] + e
> > s = temp + y
> > e = (temp - s) + y
> > return s / len(a)
> >
> >> def var(a):
> >>  m = a[0]
> >>  t = a.dtype.type(0)
> >>  for i in range(1, len(a)):
> >>  q = a[i] - m
> >>  r = q / (i+1)
> >>  m += r
> >>  t += i * q * r
> >>  t /= len(a)
> >>  return t
> >>
> >> Alternatively, from Knuth:
> >>
> >> def var_knuth(a):
> >>  m = a.dtype.type(0)
> >>  variance = a.dtype.type(0)
> >>  for i in range(len(a)):
> >>  delta = a[i] - m
> >>  m += delta / (i+1)
> >>  variance += delta * (a[i] - m)
> >>  variance /= len(a)
> >>  return variance
> >>
> >> I'm going to go ahead and attach a module containing the versions of
> >> mean, var, etc that I've been playing with in case someone wants to mess
> >> with them. Some were stolen from traffic on this list, for others I
> >> grabbed the algorithms from wikipedia or equivalent.
> >
> > I looked into this a bit more. I checked float32 (single precision) and
> > float64 (double precision), using long doubles (float96) for the "exact"
> > results. This is based on your code. Results are compared using
> > abs(exact_stat - computed_stat) / max(abs(values)), with 1 values in
> > the range of [-100, 900]
> >
> > First, the mean. In float32, the Kahan summation in single precision is
> > better by about 2 orders of magnitude than simple summation. However,
> > accumulating the sum in double precision is better by about 9 orders of
> > magnitude than simple summation (7 orders more than Kahan).
> >
> > In float64, Kahan summation is the way to go, by 2 orders of magnitude.
> >
> > For the variance, in float32, Knuth's method is *no better* than the
> > two-pass method. Tim's code does an implicit conversion of intermediate
> > results to float64, which is why he saw a much better result.
>
> Doh! And I fixed that same problem in the mean implementation earlier
> too. I was astounded by how good knuth was doing, but not astounded
> enough apparently.
>
> Does it seem weird to anyone else that in:
> numpy_scalar <op> python_scalar
> the precision ends up being controlled by the python scalar? I would
> expect the numpy_scalar to control the resulting precision just like
> numpy arrays do in similar circumstances. Perhaps the egg on my face is
> just clouding my vision though.
>
> > The two-pass method using
> > Kahan summation (again, in single precision), is better by about 2 orders
> > of magnitude. There is practically no difference when using a
> > double-precision accumulator amongst the techniques: they're all about 9
> > orders of magnitude better than single-precision two-pass.
> >
> > In float64, Kahan summation is again better than the rest, by about 2
> > orders of magnitude.
> >
> > I've put my adaptation of Tim's code, and box-and-whisker plots of the
> > results, at http://arbutus.mcmaster.ca/dmc/numpy/variance/
> >
> > Conclusions:
> >
> > - If you're going to calculate everything in single precision, use Kahan
> > summation. Using it in double-precision also helps.
> > - If you can use a double-precision accumulator, it's much better than
> > any of the techniques in single-precision only.
> >
> > - for speed+precision in the va

[Numpy-discussion] numpy 1.0rc1 bdist_rpm fails

2006-09-21 Thread Christian Kristukat

Hi, 
on linux I get an error when trying to build a rpm package from numpy 1.0rc1:

building extension "numpy.core.umath" sources
  adding 'build/src.linux-i686-2.4/numpy/core/config.h' to sources.
executing numpy/core/code_generators/generate_ufunc_api.py
  adding 'build/src.linux-i686-2.4/numpy/core/__ufunc_api.h' to sources.
creating build/src.linux-i686-2.4/src
conv_template:> build/src.linux-i686-2.4/src/umathmodule.c
error: src/umathmodule.c.src: No such file or directory
error: Bad exit status from /home/ck/testarea/rpm/tmp/rpm-tmp.68597 (%build)


RPM build errors:
Bad exit status from /home/ck/testarea/rpm/tmp/rpm-tmp.68597 (%build)
error: command 'rpmbuild' failed with exit status 1

Christian




Re: [Numpy-discussion] please change mean to use dtype=float

2006-09-21 Thread Tim Hochberg
David M. Cooke wrote:
> On Thu, 21 Sep 2006 11:34:42 -0700
> Tim Hochberg <[EMAIL PROTECTED]> wrote:
>
>   
>> Tim Hochberg wrote:
>> 
>>> Robert Kern wrote:
>>>   
>>>   
 David M. Cooke wrote:
   
 
 
> On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote:
> 
>   
>   
>> Let me offer a third path: the algorithms used for .mean() and .var()
>> are substandard. There are much better incremental algorithms that
>> entirely avoid the need to accumulate such large (and therefore
>> precision-losing) intermediate values. The algorithms look like the
>> following for 1D arrays in Python:
>>
>> def mean(a):
>>  m = a[0]
>>  for i in range(1, len(a)):
>>  m += (a[i] - m) / (i + 1)
>>  return m
>>   
>> 
>> 
> This isn't really going to be any better than using a simple sum.
> It'll also be slower (a division per iteration).
> 
>   
>   
 With one exception, every test that I've thrown at it shows that it's
 better for float32. That exception is uniformly spaced arrays, like
 linspace().

  > You do avoid
  > accumulating large sums, but then doing the division a[i]/len(a) and
  > adding that will do the same.

 Okay, this is true.

   
 
 
> Now, if you want to avoid losing precision, you want to use a better
> summation technique, like compensated (or Kahan) summation:
>
> def mean(a):
> s = e = a.dtype.type(0)
> for i in range(0, len(a)):
> temp = s
> y = a[i] + e
> s = temp + y
> e = (temp - s) + y
> return s / len(a)
>   
 
 
>> def var(a):
>>  m = a[0]
>>  t = a.dtype.type(0)
>>  for i in range(1, len(a)):
>>  q = a[i] - m
>>  r = q / (i+1)
>>  m += r
>>  t += i * q * r
>>  t /= len(a)
>>  return t
>>
>> Alternatively, from Knuth:
>>
>> def var_knuth(a):
>>  m = a.dtype.type(0)
>>  variance = a.dtype.type(0)
>>  for i in range(len(a)):
>>  delta = a[i] - m
>>  m += delta / (i+1)
>>  variance += delta * (a[i] - m)
>>  variance /= len(a)
>>  return variance
>> 
>> I'm going to go ahead and attach a module containing the versions of 
>> mean, var, etc that I've been playing with in case someone wants to mess 
>> with them. Some were stolen from traffic on this list, for others I 
>> grabbed the algorithms from wikipedia or equivalent.
>> 
>
> I looked into this a bit more. I checked float32 (single precision) and
> float64 (double precision), using long doubles (float96) for the "exact"
> results. This is based on your code. Results are compared using
> abs(exact_stat - computed_stat) / max(abs(values)), with 1 values in the
> range of [-100, 900]
>
> First, the mean. In float32, the Kahan summation in single precision is
> better by about 2 orders of magnitude than simple summation. However,
> accumulating the sum in double precision is better by about 9 orders of
> magnitude than simple summation (7 orders more than Kahan).
>
> In float64, Kahan summation is the way to go, by 2 orders of magnitude.
>
> For the variance, in float32, Knuth's method is *no better* than the two-pass
> method. Tim's code does an implicit conversion of intermediate results to
> float64, which is why he saw a much better result. The two-pass method using
> Kahan summation (again, in single precision), is better by about 2 orders of
> magnitude. There is practically no difference when using a double-precision
> accumulator amongst the techniques: they're all about 9 orders of magnitude
> better than single-precision two-pass.
>
> In float64, Kahan summation is again better than the rest, by about 2 orders
> of magnitude.
>
> I've put my adaptation of Tim's code, and box-and-whisker plots of the
> results, at http://arbutus.mcmaster.ca/dmc/numpy/variance/
>
> Conclusions:
>
> - If you're going to calculate everything in single precision, use Kahan
> summation. Using it in double-precision also helps.
> - If you can use a double-precision accumulator, it's much better than any of
> the techniques in single-precision only.
>
> - for speed+precision in the variance, either use Kahan summation in single
> precision with the two-pass method, or use double precision with simple
> summation with the two-pass method. Knuth buys you nothing, except slower
> code :-)
>
> After 1.0 is out, we should look at doing one of the above.
>   
One more little tidbit; it appears possible to "fix up" Knuth's 
algorithm so that it's comparable in accuracy to the two pass Kahan 
version by doing Kahan summation while accumulating the variance. 
Testing on this was 

Re: [Numpy-discussion] please change mean to use dtype=float

2006-09-21 Thread Tim Hochberg
David M. Cooke wrote:
> On Thu, 21 Sep 2006 11:34:42 -0700
> Tim Hochberg <[EMAIL PROTECTED]> wrote:
>
>   
>> Tim Hochberg wrote:
>> 
>>> Robert Kern wrote:
>>>   
>>>   
 David M. Cooke wrote:
   
 
 
> On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote:
> 
>   
>   
>> Let me offer a third path: the algorithms used for .mean() and .var()
>> are substandard. There are much better incremental algorithms that
>> entirely avoid the need to accumulate such large (and therefore
>> precision-losing) intermediate values. The algorithms look like the
>> following for 1D arrays in Python:
>>
>> def mean(a):
>>  m = a[0]
>>  for i in range(1, len(a)):
>>  m += (a[i] - m) / (i + 1)
>>  return m
>>   
>> 
>> 
> This isn't really going to be any better than using a simple sum.
> It'll also be slower (a division per iteration).
> 
>   
>   
 With one exception, every test that I've thrown at it shows that it's
 better for float32. That exception is uniformly spaced arrays, like
 linspace().

  > You do avoid
  > accumulating large sums, but then doing the division a[i]/len(a) and
  > adding that will do the same.

 Okay, this is true.

   
 
 
> Now, if you want to avoid losing precision, you want to use a better
> summation technique, like compensated (or Kahan) summation:
>
> def mean(a):
> s = e = a.dtype.type(0)
> for i in range(0, len(a)):
> temp = s
> y = a[i] + e
> s = temp + y
> e = (temp - s) + y
> return s / len(a)
>   
 
 
>> def var(a):
>>  m = a[0]
>>  t = a.dtype.type(0)
>>  for i in range(1, len(a)):
>>  q = a[i] - m
>>  r = q / (i+1)
>>  m += r
>>  t += i * q * r
>>  t /= len(a)
>>  return t
>>
>> Alternatively, from Knuth:
>>
>> def var_knuth(a):
>>  m = a.dtype.type(0)
>>  variance = a.dtype.type(0)
>>  for i in range(len(a)):
>>  delta = a[i] - m
>>  m += delta / (i+1)
>>  variance += delta * (a[i] - m)
>>  variance /= len(a)
>>  return variance
>> 
>> I'm going to go ahead and attach a module containing the versions of 
>> mean, var, etc that I've been playing with in case someone wants to mess 
>> with them. Some were stolen from traffic on this list, for others I 
>> grabbed the algorithms from wikipedia or equivalent.
>> 
>
> I looked into this a bit more. I checked float32 (single precision) and
> float64 (double precision), using long doubles (float96) for the "exact"
> results. This is based on your code. Results are compared using
> abs(exact_stat - computed_stat) / max(abs(values)), with 1 values in the
> range of [-100, 900]
>
> First, the mean. In float32, the Kahan summation in single precision is
> better by about 2 orders of magnitude than simple summation. However,
> accumulating the sum in double precision is better by about 9 orders of
> magnitude than simple summation (7 orders more than Kahan).
>
> In float64, Kahan summation is the way to go, by 2 orders of magnitude.
>
> For the variance, in float32, Knuth's method is *no better* than the two-pass
> method. Tim's code does an implicit conversion of intermediate results to
> float64, which is why he saw a much better result. 
Doh! And I fixed that same problem in the mean implementation earlier 
too. I was astounded by how good knuth was doing, but not astounded 
enough apparently.

Does it seem weird to anyone else that in:
numpy_scalar <op> python_scalar
the precision ends up being controlled by the python scalar? I would 
expect the numpy_scalar to control the resulting precision just like 
numpy arrays do in similar circumstances. Perhaps the egg on my face is 
just clouding my vision though.

> The two-pass method using
> Kahan summation (again, in single precision), is better by about 2 orders of
> magnitude. There is practically no difference when using a double-precision
> accumulator amongst the techniques: they're all about 9 orders of magnitude
> better than single-precision two-pass.
>
> In float64, Kahan summation is again better than the rest, by about 2 orders
> of magnitude.
>
> I've put my adaptation of Tim's code, and box-and-whisker plots of the
> results, at http://arbutus.mcmaster.ca/dmc/numpy/variance/
>
> Conclusions:
>
> - If you're going to calculate everything in single precision, use Kahan
> summation. Using it in double-precision also helps.
> - If you can use a double-precision accumulator, it's much better than any of
> the techniques in single-precision only.
>
> - for speed+precision in the variance, eith

Re: [Numpy-discussion] immutable arrays

2006-09-21 Thread Martin Wiechert
On Thursday 21 September 2006 18:24, Travis Oliphant wrote:
> Martin Wiechert wrote:
> > Thanks Travis.
> >
> > Do I understand correctly that the only way to be really safe is to make
> > a copy and not to export a reference to it?
> > Because anybody having a reference to the owner of the data can override
> > the flag?
>
> No, that's not quite correct.   Of course in C, anybody can do anything
> they want to the flags.
>
> In Python, only the owner of the object itself can change the writeable
> flag once it is set to False.   So, if you only return a "view" of the
> array (a.view())  then the Python user will not be able to change the
> flags.
>
> Example:
>
> a = array([1,2,3])
> a.flags.writeable = False
>
> b = a.view()
>
> b.flags.writeable = True   # raises an error.
>
> c = a
> c.flags.writeable = True  # can be done because c is a direct alias to a.
>
> Hopefully, that explains the situation a bit better.
>

It does. Thanks Travis.

> -Travis


Re: [Numpy-discussion] please change mean to use dtype=float

2006-09-21 Thread Travis Oliphant
David M. Cooke wrote:

>
>Conclusions:
>
>- If you're going to calculate everything in single precision, use Kahan
>summation. Using it in double-precision also helps.
>- If you can use a double-precision accumulator, it's much better than any of
>the techniques in single-precision only.
>
>- for speed+precision in the variance, either use Kahan summation in single
>precision with the two-pass method, or use double precision with simple
>summation with the two-pass method. Knuth buys you nothing, except slower
>code :-)
>
>After 1.0 is out, we should look at doing one of the above.
>  
>

+1





Re: [Numpy-discussion] please change mean to use dtype=float

2006-09-21 Thread David M. Cooke
On Thu, 21 Sep 2006 11:34:42 -0700
Tim Hochberg <[EMAIL PROTECTED]> wrote:

> Tim Hochberg wrote:
> > Robert Kern wrote:
> >   
> >> David M. Cooke wrote:
> >>   
> >> 
> >>> On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote:
> >>> 
> >>>   
>  Let me offer a third path: the algorithms used for .mean() and .var()
>  are substandard. There are much better incremental algorithms that
>  entirely avoid the need to accumulate such large (and therefore
>  precision-losing) intermediate values. The algorithms look like the
>  following for 1D arrays in Python:
> 
>  def mean(a):
>   m = a[0]
>   for i in range(1, len(a)):
>   m += (a[i] - m) / (i + 1)
>   return m
>    
>  
> >>> This isn't really going to be any better than using a simple sum.
> >>> It'll also be slower (a division per iteration).
> >>> 
> >>>   
> >> With one exception, every test that I've thrown at it shows that it's
> >> better for float32. That exception is uniformly spaced arrays, like
> >> linspace().
> >>
> >>  > You do avoid
> >>  > accumulating large sums, but then doing the division a[i]/len(a) and
> >>  > adding that will do the same.
> >>
> >> Okay, this is true.
> >>
> >>   
> >> 
> >>> Now, if you want to avoid losing precision, you want to use a better
> >>> summation technique, like compensated (or Kahan) summation:
> >>>
> >>> def mean(a):
> >>> s = e = a.dtype.type(0)
> >>> for i in range(0, len(a)):
> >>> temp = s
> >>> y = a[i] + e
> >>> s = temp + y
> >>> e = (temp - s) + y
> >>> return s / len(a)
> >> 
>  def var(a):
>   m = a[0]
>   t = a.dtype.type(0)
>   for i in range(1, len(a)):
>   q = a[i] - m
>   r = q / (i+1)
>   m += r
>   t += i * q * r
>   t /= len(a)
>   return t
> 
>  Alternatively, from Knuth:
> 
>  def var_knuth(a):
>   m = a.dtype.type(0)
>   variance = a.dtype.type(0)
>   for i in range(len(a)):
>   delta = a[i] - m
>   m += delta / (i+1)
>   variance += delta * (a[i] - m)
>   variance /= len(a)
>   return variance
> 
> I'm going to go ahead and attach a module containing the versions of 
> mean, var, etc that I've been playing with in case someone wants to mess 
> with them. Some were stolen from traffic on this list, for others I 
> grabbed the algorithms from wikipedia or equivalent.

I looked into this a bit more. I checked float32 (single precision) and
float64 (double precision), using long doubles (float96) for the "exact"
results. This is based on your code. Results are compared using
abs(exact_stat - computed_stat) / max(abs(values)), with 1 values in the
range of [-100, 900]

First, the mean. In float32, the Kahan summation in single precision is
better by about 2 orders of magnitude than simple summation. However,
accumulating the sum in double precision is better by about 9 orders of
magnitude than simple summation (7 orders more than Kahan).

In float64, Kahan summation is the way to go, by 2 orders of magnitude.

For the variance, in float32, Knuth's method is *no better* than the two-pass
method. Tim's code does an implicit conversion of intermediate results to
float64, which is why he saw a much better result. The two-pass method using
Kahan summation (again, in single precision), is better by about 2 orders of
magnitude. There is practically no difference when using a double-precision
accumulator amongst the techniques: they're all about 9 orders of magnitude
better than single-precision two-pass.

In float64, Kahan summation is again better than the rest, by about 2 orders
of magnitude.

I've put my adaptation of Tim's code, and box-and-whisker plots of the
results, at http://arbutus.mcmaster.ca/dmc/numpy/variance/

Conclusions:

- If you're going to calculate everything in single precision, use Kahan
summation. Using it in double-precision also helps.
- If you can use a double-precision accumulator, it's much better than any of
the techniques in single-precision only.

- for speed+precision in the variance, either use Kahan summation in single
precision with the two-pass method, or use double precision with simple
summation with the two-pass method. Knuth buys you nothing, except slower
code :-)

After 1.0 is out, we should look at doing one of the above.
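
A self-contained sketch of the float32 mean comparison (the array size and
the use of a float64 mean as the reference are illustrative choices, not
the exact benchmark behind the plots):

import numpy as np

def simple_mean(a):
    # Naive accumulation in the array's own precision.
    s = a.dtype.type(0)
    for x in a:
        s += x
    return s / len(a)

def kahan_mean(a):
    # Compensated (Kahan) summation, as in the mean() quoted above.
    s = e = a.dtype.type(0)
    for x in a:
        temp = s
        y = x + e
        s = temp + y
        e = (temp - s) + y
    return s / len(a)

a = (1000 * np.random.random(10000) - 100).astype(np.float32)
exact = a.astype(np.float64).mean()
print("simple:", abs(simple_mean(a) - exact))
print("kahan: ", abs(kahan_mean(a) - exact))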

-- 
|>|\/|<
/--\
|David M. Cooke  http://arbutus.physics.mcmaster.ca/dmc/
|[EMAIL PROTECTED]


Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Charles R Harris
Hi,
On 9/21/06, Robert Kern <[EMAIL PROTECTED]> wrote:
> Steve Lianoglou wrote:
> > So .. I guess I'm wondering why we want to break from the standard?
>
> We don't as far as Python code goes. The code that Chuck added Doxygen-style
> comments to was C code. I presume he was simply answering Sebastian's
> question rather than suggesting we use Doxygen for Python code, too.

Exactly. I also don't think the Python hack description applies to
doxygen any longer. As to the oddness of \param or @param, here is an
example from Epydoc using Epytext

@type  m: number
@param m: The slope of the line.
@type  b: number
@param b: The y intercept of the line.  The X{y intercept} of a
Looks like they borrowed something there ;) The main advantage of
epydoc vs doxygen seems to be that you can use the markup inside the
normal python docstring without having to make a separate comment
block. Or would that be a disadvantage? Then again, I've been thinking
of moving the python function docstrings into the add_newdocs.py file
so everything is together in one spot and that would separate the
Python docstrings from the functions anyway.

I'll fool around with doxygen a bit and see what it does. The C code is the code that most needs documentation in any case.

Chuck



Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Robert Kern
Steve Lianoglou wrote:
> So .. I guess I'm wondering why we want to break from the standard?

We don't as far as Python code goes. The code that Chuck added Doxygen-style 
comments to was C code. I presume he was simply answering Sebastian's question 
rather than suggesting we use Doxygen for Python code, too.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco




Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread David M. Cooke
On Thu, 21 Sep 2006 10:05:58 -0600
"Charles R Harris" <[EMAIL PROTECTED]> wrote:

> Travis,
> 
> A few questions.
> 
> 1) I can't find any systematic code testing units, although there seem to be
> tests for regressions and such. Is there a place we should be putting such
> tests?
> 
> 2) Any plans for code documentation? I documented some of my stuff with
> doxygen markups and wonder if we should include a Doxyfile as part of the
> package.

We don't have much of a defined standard for docs. Personally, I wouldn't use
doxygen: what I've seen for Python versions are hacks, whose output looks
like C++, and which requires markup that's not like commonly-used conventions
in Python (\brief, for instance).

Foremost for Python doc strings, I think, is that it look ok when using pydoc
or similar (ipython's ?, for instance). That means a minimal amount of
markup.

Someone previously mentioned including cross-references; I think that's a
good idea. A 'See also' line, for instance. Examples are good too, especially
if there's been disputes on the interpretation of the command :-)

For the C code, documentation is autogenerated from the /** ... API */
comments that determine which functions are part of the C API. This are put
into files multiarray_api.txt and ufunc_api.txt (in the include/ directory).
The files are in reST format, so the comments should/could be. At some point
I've got to through and add more :-)

-- 
|>|\/|<
/--\
|David M. Cooke  http://arbutus.physics.mcmaster.ca/dmc/
|[EMAIL PROTECTED]



Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Steve Lianoglou
> Are you able to use doxygen for Python code? I thought it only worked
> for C (and alike)?
>
> IIRC correctly, it now does Python too. Let's see... here is an  
> example
> ## Documentation for this module.
> #
> # More details.
>
> ## Documentation for a function.
> #
> # More details.
> def func():
> pass
> Looks like ## replaces the /**

I never found it (although I haven't looked too hard), but I always  
thought there was an official way to document python code --  
minimally to put the documentation in the docstring following the  
function definition:

def func(..):
 """One liner.

 Continue docs -- some type of reStructuredText style
 """
 pass

Isn't that the same docstring that ipython uses to bring up help,  
when you do:

In [1]: myobject.some_func?


So .. I guess I'm wondering why we want to break from the standard?

-steve





Re: [Numpy-discussion] please change mean to use dtype=float

2006-09-21 Thread Tim Hochberg

Tim Hochberg wrote:

Robert Kern wrote:
  

David M. Cooke wrote:
  


On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote:

  
Let me offer a third path: the algorithms used for .mean() and .var() are 
substandard. There are much better incremental algorithms that entirely avoid 
the need to accumulate such large (and therefore precision-losing) intermediate 
values. The algorithms look like the following for 1D arrays in Python:


def mean(a):
 m = a[0]
 for i in range(1, len(a)):
 m += (a[i] - m) / (i + 1)
 return m
  


This isn't really going to be any better than using a simple sum.
It'll also be slower (a division per iteration).

  
With one exception, every test that I've thrown at it shows that it's better for 
float32. That exception is uniformly spaced arrays, like linspace().


 > You do avoid
 > accumulating large sums, but then doing the division a[i]/len(a) and
 > adding that will do the same.

Okay, this is true.

  


Now, if you want to avoid losing precision, you want to use a better
summation technique, like compensated (or Kahan) summation:

def mean(a):
s = e = a.dtype.type(0)
for i in range(0, len(a)):
temp = s
y = a[i] + e
s = temp + y
e = (temp - s) + y
return s / len(a)

Some numerical experiments in Maple using 5-digit precision show that
your mean is maybe a bit better in some cases, but can also be much
worse, than sum(a)/len(a), but both are quite poor in comparision to the
Kahan summation.

(We could probably use a fast implementation of Kahan summation in
addition to a.sum())

  

+1

  


def var(a):
 m = a[0]
 t = a.dtype.type(0)
 for i in range(1, len(a)):
 q = a[i] - m
 r = q / (i+1)
 m += r
 t += i * q * r
 t /= len(a)
 return t

Alternatively, from Knuth:

def var_knuth(a):
 m = a.dtype.type(0)
 variance = a.dtype.type(0)
 for i in range(len(a)):
 delta = a[i] - m
 m += delta / (i+1)
 variance += delta * (a[i] - m)
 variance /= len(a)
 return variance
  


These formulas are good when you can only do one pass over the data
(like in a calculator where you don't store all the data points), but
are slightly worse than doing two passes. Kahan summation would probably
also be good here too.

  
Again, my tests show otherwise for float32. I'll condense my ipython log into a 
module for everyone's perusal. It's possible that the Kahan summation of the 
squared residuals will work better than the current two-pass algorithm and the 
implementations I give above.
  

This is what my tests show as well: var_knuth outperformed any simple two
pass algorithm I could come up with, even ones using Kahan sums.
Interestingly, for 1D arrays the built-in float32 variance performs
better than it should. After a bit of twiddling around I discovered that
it actually does most of its calculations in float64. It uses a two
pass calculation, the result of mean is a scalar, and in the process of 
converting that back to an array we end up with float64 values. Or 
something like that; I was mostly reverse engineering the sequence of 
events from the results.
  
Here's a simple example of how var is a little wacky. A shape-[N]
array will give you a different result than a shape-[1,N] array. The 
reason is clear -- in the second case the mean is not a scalar so there 
isn't the inadvertent promotion to float64, but it's still odd.


>>> data = (1000*(random.random([1]) - 0.1)).astype(float32)
>>> print data.var() - data.reshape([1, -1]).var(-1)
[ 0.1171875]

I'm going to go ahead and attach a module containing the versions of 
mean, var, etc that I've been playing with in case someone wants to mess 
with them. Some were stolen from traffic on this list, for others I 
grabbed the algorithms from wikipedia or equivalent.


-tim






  


def raw_kahan_sum(values):
"""raw_kahan_sum(values) -> sum(values), residual

where sum(values) is computed using Kahan's summation algorithm and the 
residual is the value of the lower order bits when finished.

"""
total = c = values.dtype.type(0)
for x in values:
y = x + c  
t = total + y  
c = y - (t - total)  
total = t 
return total, c

def sum(values):
"""sum(values) -> sum of

Re: [Numpy-discussion] please change mean to use dtype=float

2006-09-21 Thread Tim Hochberg
Robert Kern wrote:
> David M. Cooke wrote:
>   
>> On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote:
>> 
>>> Let me offer a third path: the algorithms used for .mean() and .var() are 
>>> substandard. There are much better incremental algorithms that entirely 
>>> avoid 
>>> the need to accumulate such large (and therefore precision-losing) 
>>> intermediate 
>>> values. The algorithms look like the following for 1D arrays in Python:
>>>
>>> def mean(a):
>>>  m = a[0]
>>>  for i in range(1, len(a)):
>>>  m += (a[i] - m) / (i + 1)
>>>  return m
>>>   
>> This isn't really going to be any better than using a simple sum.
>> It'll also be slower (a division per iteration).
>> 
>
> With one exception, every test that I've thrown at it shows that it's better 
> for 
> float32. That exception is uniformly spaced arrays, like linspace().
>
>  > You do avoid
>  > accumulating large sums, but then doing the division a[i]/len(a) and
>  > adding that will do the same.
>
> Okay, this is true.
>
>   
>> Now, if you want to avoid losing precision, you want to use a better
>> summation technique, like compensated (or Kahan) summation:
>>
>> def mean(a):
>> s = e = a.dtype.type(0)
>> for i in range(0, len(a)):
>> temp = s
>> y = a[i] + e
>> s = temp + y
>> e = (temp - s) + y
>> return s / len(a)
>>
>> Some numerical experiments in Maple using 5-digit precision show that
>> your mean is maybe a bit better in some cases, but can also be much
>> worse, than sum(a)/len(a), but both are quite poor in comparision to the
>> Kahan summation.
>>
>> (We could probably use a fast implementation of Kahan summation in
>> addition to a.sum())
>> 
>
> +1
>
>   
>>> def var(a):
>>>  m = a[0]
>>>  t = a.dtype.type(0)
>>>  for i in range(1, len(a)):
>>>  q = a[i] - m
>>>  r = q / (i+1)
>>>  m += r
>>>  t += i * q * r
>>>  t /= len(a)
>>>  return t
>>>
>>> Alternatively, from Knuth:
>>>
>>> def var_knuth(a):
>>>  m = a.dtype.type(0)
>>>  variance = a.dtype.type(0)
>>>  for i in range(len(a)):
>>>  delta = a[i] - m
>>>  m += delta / (i+1)
>>>  variance += delta * (a[i] - m)
>>>  variance /= len(a)
>>>  return variance
>>>   
>> These formulas are good when you can only do one pass over the data
>> (like in a calculator where you don't store all the data points), but
>> are slightly worse than doing two passes. Kahan summation would probably
>> also be good here too.
>> 
>
> Again, my tests show otherwise for float32. I'll condense my ipython log into 
> a 
> module for everyone's perusal. It's possible that the Kahan summation of the 
> squared residuals will work better than the current two-pass algorithm and 
> the 
> implementations I give above.
>   
This is what my tests show as well: var_knuth outperformed any simple two
pass algorithm I could come up with, even ones using Kahan sums.
Interestingly, for 1D arrays the built-in float32 variance performs
better than it should. After a bit of twiddling around I discovered that
it actually does most of its calculations in float64. It uses a two
pass calculation, the result of mean is a scalar, and in the process of 
converting that back to an array we end up with float64 values. Or 
something like that; I was mostly reverse engineering the sequence of 
events from the results.

-tim






Re: [Numpy-discussion] Question about recarray

2006-09-21 Thread Travis Oliphant
Lionel Roubeyrie wrote:
> find any solution for that. I have tried with arrays of dtype=object, but I 
> have problem when I want to compute min, max, ... with an error like:
> TypeError: function not supported for these types, and can't coerce safely to 
> supported types.
>   
I just added support for min and max methods of object arrays, by adding 
support for Object arrays to the minimum and maximum functions.
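
A quick check of the sort of thing this enables (dates are just one
example of orderable Python objects; the example is mine, not from the
thread):

import numpy as np
from datetime import date

a = np.array([date(2006, 9, 21), date(2006, 1, 1)], dtype=object)
a.min()   # -> datetime.date(2006, 1, 1)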

-Travis




Re: [Numpy-discussion] arr.dtype.kind is 'i' for dtype=uint !?

2006-09-21 Thread Travis Oliphant
Matthew Brett wrote:
> Hi,
>
>   
>> It's in the array interface specification:
>>
>> http://numpy.scipy.org/array_interface.shtml
>> 
>
> I was interested in the 't' (bitfield) type - is there an example of
> usage somewhere?
>   
No, it's not implemented in NumPy.  It's just part of the array
interface specification for completeness.

-Travis





Re: [Numpy-discussion] arr.dtype.kind is 'i' for dtype=uint !?

2006-09-21 Thread Matthew Brett
Hi,

> It's in the array interface specification:
>
> http://numpy.scipy.org/array_interface.shtml

I was interested in the 't' (bitfield) type - is there an example of
usage somewhere?

In [13]: dtype('t8')
---------------------------------------------------------------------------
exceptions.TypeError                       Traceback (most recent call last)

/home/mb312/python/

TypeError: data type not understood

Best,

Matthew



Re: [Numpy-discussion] Question about recarray

2006-09-21 Thread Travis Oliphant
Lionel Roubeyrie wrote:
> Hi all,
> Is it possible to put masked values into recarrays? I need an array with
> heterogeneous types of data (datetime objects in the first col, all others
> are float) but with missing values in some records. For the moment, I don't
> find any solution for that.
Either use "nans" or "inf" for missing values or use the masked array 
object with a complex data-type.   You don't need to use a recarray 
object to get "records".  Any array can have "records".  Therefore, you 
can have a masked array of "records" by creating an array with the 
appropriate data-type.  

It may also be possible to use a recarray as the "array" for the masked
array object because the recarray is a sub-class of the array.
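
A sketch of the masked-array-of-records route with numpy.ma; the field
names, the object column for dates, and the per-field mask layout are
illustrative assumptions, not code from this thread:

import numpy as np
import numpy.ma as ma
from datetime import date

# Structured dtype: an object field for datetimes, floats for the rest.
dt = np.dtype([('when', object), ('v1', float), ('v2', float)])
rows = [(date(2006, 9, 20), 1.5, 2.5),
        (date(2006, 9, 21), 3.5, 0.0)]
# Mask the missing v2 entry in the second record (masks are per field).
a = ma.array(rows, dtype=dt, mask=[(False, False, False),
                                   (False, False, True)])
a['v1'].min()   # -> 1.5, over the unmasked float field
a['v2'].max()   # -> 2.5; the masked placeholder is ignored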

> I have tried with arrays of dtype=object, but I 
> have problem when I want to compute min, max, ... with an error like:
> TypeError: function not supported for these types, and can't coerce safely to 
> supported types.
>   
It looks like the max and min functions are not supported for Object 
arrays.

import numpy as N
N.maximum.types

does not include Object arrays. 

It probably should.

-Travis




Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Charles R Harris
On 9/21/06, Sebastian Haase <[EMAIL PROTECTED]> wrote:
> On Thursday 21 September 2006 09:05, Charles R Harris wrote:
> > Travis,
> >
> > A few questions.
> >
> > 1) I can't find any systematic code testing units, although there seem to
> > be tests for regressions and such. Is there a place we should be putting
> > such tests?
> >
> > 2) Any plans for code documentation? I documented some of my stuff with
> > doxygen markups and wonder if we should include a Doxyfile as part of the
> > package.
>
> Are you able to use doxygen for Python code? I thought it only worked for C
> (and alike)?

IIRC correctly, it now does Python too. Let's see... here is an example

## Documentation for this module.
#
#  More details.

## Documentation for a function.
#
#  More details.
def func():
    pass

Looks like ## replaces the /**

Chuck


Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Louis Cordier

> Are you able to use doxygen for Python code? I thought it only worked for C
> (and alike)?

There is an ugly-hack :)
http://i31www.ira.uka.de/~baas/pydoxy/

But I wouldn't recommend using it, rather stick with Epydoc.


-- 
Louis Cordier <[EMAIL PROTECTED]> cell: +27721472305
Point45 Entertainment (Pty) Ltd. http://www.point45.org




Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Travis Oliphant
Charles R Harris wrote:
> Travis,
>
> A few questions.
>
> 1) I can't find any systematic code testing units, although there seem 
> to be tests for regressions and such. Is there a place we should be 
> putting such tests?
All tests are placed under the tests directory of the corresponding
sub-package.  They will only be picked up by .test(level < 10) if the
file is named test_...; .test(level > 10) should pick up all test files.
If you want to name something different but still have it run at a test
level < 10, then you need to run the test from one of the other test
files that will be picked up (test_regression.py and test_unicode.py are
doing that for example).
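
For reference, a minimal run under that convention (the level keyword
follows the old NumpyTest-style runner described here; newer NumPy test
runners take different arguments):

import numpy
numpy.test(level=1)    # picks up tests/test_*.py in each subpackage
numpy.test(level=11)   # level > 10: should pick up all test files
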
>
> 2) Any plans for code documentation? I documented some of my stuff 
> with doxygen markups and wonder if we should include a Doxyfile as 
> part of the package.
I'm not familiar with Doxygen, but would welcome any improvements to the 
code documentation.
>
> 3) Would you consider breaking out the Converters into a separate .c 
> file for inclusion? The code generator seems to take care of the ordering.
You are right that it doesn't matter which order the API subroutines are 
placed.  I'm not opposed to more breaking up of the .c files, as long as 
it is clear where things will be located.  The #include strategy is
necessary to get it all in one Python module, but having smaller .c 
files usually makes for faster editing.   It's the arrayobject.c file 
that is "too-large" IMHO, however.   That's where I would look for ways 
to break it up.

The iterobject and the data-type object could be taken out, for example.


-Travis





Re: [Numpy-discussion] immutable arrays

2006-09-21 Thread Travis Oliphant
Martin Wiechert wrote:
> Thanks Travis.
>
> Do I understand correctly that the only way to be really safe is to make a 
> copy and not to export a reference to it?
> Because anybody having a reference to the owner of the data can override the 
> flag?
>   
No, that's not quite correct.   Of course in C, anybody can do anything 
they want to the flags.

In Python, only the owner of the object itself can change the writeable 
flag once it is set to False.   So, if you only return a "view" of the 
array (a.view())  then the Python user will not be able to change the 
flags.

Example:

from numpy import array

a = array([1, 2, 3])
a.flags.writeable = False

b = a.view()

b.flags.writeable = True   # raises an error.

c = a
c.flags.writeable = True  # can be done because c is a direct alias to a.

Hopefully, that explains the situation a bit better.

-Travis



Re: [Numpy-discussion] Tests and code documentation

2006-09-21 Thread Sebastian Haase
On Thursday 21 September 2006 09:05, Charles R Harris wrote:
> Travis,
>
> A few questions.
>
> 1) I can't find any systematic code testing units, although there seem to
> be tests for regressions and such. Is there a place we should be putting
> such tests?
>
> 2) Any plans for code documentation? I documented some of my stuff with
> doxygen markups and wonder if we should include a Doxyfile as part of the
> package.

Are you able to use doxygen for Python code? I thought it only worked for 
C (and alike)?

>
> 3) Would you consider breaking out the Converters into a separate .c file
> for inclusion? The code generator seems to take care of the ordering.
>
> Chuck



[Numpy-discussion] Tests and code documentation

2006-09-21 Thread Charles R Harris
Travis,

A few questions.

1) I can't find any systematic code testing units, although there seem to 
be tests for regressions and such. Is there a place we should be putting 
such tests?

2) Any plans for code documentation? I documented some of my stuff with 
doxygen markups and wonder if we should include a Doxyfile as part of the 
package.

3) Would you consider breaking out the Converters into a separate .c file 
for inclusion? The code generator seems to take care of the ordering.

Chuck


Re: [Numpy-discussion] 1.0rc1 doesn't seem to work on AMD64

2006-09-21 Thread Charles R Harris
On 9/21/06, Peter Bienstman <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I just installed rc1 on an AMD64 machine, but I get this error message when
> trying to import it:
>
> Python 2.4.3 (#1, Sep 21 2006, 13:06:42)
> [GCC 4.1.1 (Gentoo 4.1.1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy
> Traceback (most recent call last):

I don't see this running the latest from svn on AMD64 here. Not sayin'
there might not be a problem with rc1, I just don't see it with my sources.

Python 2.4.3 (#1, Jun 13 2006, 11:46:22)
[GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.version.version
'1.0.dev3202'
>>> numpy.version.os.uname()
('Linux', 'tethys', '2.6.17-1.2187_FC5', '#1 SMP Mon Sep 11 01:16:59 EDT
2006', 'x86_64')

If you are building on Gentoo maybe you could delete the build directory
(and maybe the numpy site package) and rebuild.

Chuck.


[Numpy-discussion] Question about recarray

2006-09-21 Thread Lionel Roubeyrie
Hi all,
Is it possible to put masked values into recarrays? I need an array with 
heterogeneous data types (datetime objects in the first column, all the 
others are floats) but with missing values in some records. For the 
moment, I can't find any solution for that. I have tried with arrays of 
dtype=object, but I have a problem when I want to compute min, max, ..., 
with an error like:
TypeError: function not supported for these types, and can't coerce safely to 
supported types.
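
One possible angle for the float columns, sketched with a masked array
(module layout and exact API differed in 1.0-era releases, so treat this
as illustrative only):

import numpy.ma as MA

# Masked slots stand in for the missing values; reductions skip them.
vals = MA.masked_array([1.5, 2.0, 3.5], mask=[False, True, False])
print(vals.min())   # 1.5 -- the masked entry is ignored
print(vals.max())   # 3.5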
Thanks

-- 
Lionel Roubeyrie - [EMAIL PROTECTED]
LIMAIR
http://www.limair.asso.fr



[Numpy-discussion] 1.0rc1 doesn't seem to work on AMD64

2006-09-21 Thread Peter Bienstman
Hi,

I just installed rc1 on an AMD64 machine, but I get this error message when 
trying to import it:

Python 2.4.3 (#1, Sep 21 2006, 13:06:42)
[GCC 4.1.1 (Gentoo 4.1.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/lib64/python2.4/site-packages/numpy/__init__.py", line 36, in ?
import core
  File "/usr/lib64/python2.4/site-packages/numpy/core/__init__.py", line 7, 
in ?
import numerictypes as nt
  File "/usr/lib64/python2.4/site-packages/numpy/core/numerictypes.py", line 
191, in ?
_add_aliases()
  File "/usr/lib64/python2.4/site-packages/numpy/core/numerictypes.py", line 
169, in _add_aliases
base, bit, char = bitname(typeobj)
  File "/usr/lib64/python2.4/site-packages/numpy/core/numerictypes.py", line 
119, in bitname
char = base[0]
IndexError: string index out of range

Thanks!

Peter



Re: [Numpy-discussion] immutable arrays

2006-09-21 Thread Martin Wiechert
Thanks Travis.

Do I understand correctly that the only way to be really safe is to make a 
copy and not to export a reference to it?
Because anybody having a reference to the owner of the data can override the 
flag?

Cheers,
Martin

On Wednesday 20 September 2006 20:18, Travis Oliphant wrote:
> Martin Wiechert wrote:
> > Hi list,
> >
> > I just stumbled accross NPY_WRITEABLE flag.
> > Now I'd like to know if there are ways either from Python or C to make an
> > array temporarily immutable.
>
> Just setting the flag
>
> Python:
>
>   make immutable:
>   a.flags.writeable = False
>
>   make mutable again:
>   a.flags.writeable = True
>
>
> C:
>
>   make immutable:
>   a->flags &= ~NPY_WRITEABLE
>
>   make mutable again:
>   a->flags |= NPY_WRITEABLE
>
>
> In C you can play with immutability all you want.  In Python you can
> only make something writeable if you either 1) own the data or 2) the
> object that owns the data is itself "writeable"
>
>
> -Travis
>
>
