Re: [Numpy-discussion] 1.0rc1 doesn't seem to work on AMD64
Cleaning out and rebuilding did the trick! Thanks,

Peter

On Thursday 21 September 2006 18:33, [EMAIL PROTECTED] wrote:
> Subject: Re: [Numpy-discussion] 1.0rc1 doesn't seem to work on AMD64
>
> I don't see this running the latest from svn on AMD64 here. Not sayin'
> there might not be a problem with rc1, I just don't see it with my sources.
>
> Python 2.4.3 (#1, Jun 13 2006, 11:46:22)
> [GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy
> >>> numpy.version.version
> '1.0.dev3202'
> >>> numpy.version.os.uname()
> ('Linux', 'tethys', '2.6.17-1.2187_FC5', '#1 SMP Mon Sep 11 01:16:59 EDT 2006', 'x86_64')
>
> If you are building on Gentoo maybe you could delete the build directory
> (and maybe the numpy site package) and rebuild.
>
> Chuck

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
___
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion
[Numpy-discussion] Putmask/take ?
Folks, I'm running into the following problem with putmask and take:

>>> import numpy as N
>>> x = N.arange(12.)
>>> m = [1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1]
>>> i = N.nonzero(m)[0]
>>> w = N.array([-1, -2, -3, -4.])
>>> x.putmask(w, m)
>>> N.allclose(x.take(i), w)
False

I'm wondering if this is intentional, if it's a problem with my build (1.0b5), or if somebody else has experienced it as well. Thanks a lot for your input.

P.
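For what it's worth, the False above is consistent with putmask's documented fill rule rather than a broken build: when the value array is shorter than the target, putmask assigns values[i % len(values)] at each masked flat position i, so the masked slots do not receive w in order. put with explicit indices assigns sequentially. A sketch using the module-level np.putmask and ndarray.put, which current NumPy still provides:

```python
import numpy as np

x = np.arange(12.)
m = np.array([1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1], dtype=bool)
i = np.nonzero(m)[0]              # masked positions: 0, 6, 9, 11
w = np.array([-1., -2., -3., -4.])

# putmask repeats w over the *flat index* of x: x[6] gets w[6 % 4] = w[2],
# x[9] gets w[1], x[11] gets w[3] -- not w in order.
np.putmask(x, m, w)
print(x.take(i))                  # [-1. -3. -2. -4.]

# put assigns w to the given indices sequentially, which is what the
# allclose test above expects:
y = np.arange(12.)
y.put(i, w)
print(np.allclose(y.take(i), w))  # True
```

So x.take(i) comes back permuted relative to w, and allclose is rightly False.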
Re: [Numpy-discussion] take behaviour has changed
Bill Baxter <gmail.com> writes:
> Yep, check the release notes:
> http://www.scipy.org/ReleaseNotes/NumPy_1.0
> search for 'take' on that page to find out what others have changed as well.
> --bb

Ok. Does axis=None then mean that take(a, ind) operates on the flattened array? That is at least what it seems to be. I noticed that the method behaves differently: a.take(ind) and a.take(ind, axis=0) behave the same, so the default axis of the method is 0 rather than None.

Christian
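Right — with axis=None the indices address the flattened array. A quick sketch against current NumPy, where both the function and the method default to axis=None:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Default axis=None: indices index the flattened array
print(np.take(a, [2, 3]))            # [2 3]
print(a.take([2, 3]))                # [2 3] -- the method agrees

# axis=1: indices select columns
print(np.take(a, [2, 3], axis=1))
# [[ 2  3]
#  [ 6  7]
#  [10 11]]
```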
Re: [Numpy-discussion] take behaviour has changed
Yep, check the release notes:
http://www.scipy.org/ReleaseNotes/NumPy_1.0
search for 'take' on that page to find out what others have changed as well.
--bb

On 9/22/06, Christian Kristukat <[EMAIL PROTECTED]> wrote:
> Hi,
> from 1.0b1 to 1.0rc1 the default behaviour of take seems to have changed when
> omitting the axis argument:
>
> In [13]: a = reshape(arange(12),(3,4))
>
> In [14]: take(a,[2,3])
> Out[14]: array([2, 3])
>
> In [15]: take(a,[2,3],1)
> Out[15]:
> array([[ 2,  3],
>        [ 6,  7],
>        [10, 11]])
>
> Is this intended?
>
> Christian
[Numpy-discussion] general version of repmat?
Is there some way to get the equivalent of repmat() for ndim == 1 and ndim > 2? For ndim == 1, repmat always returns a 2-d array instead of remaining 1-d. For ndim > 2, repmat just doesn't work.

Maybe we could add a 'reparray', with the signature:

    reparray(A, repeats, axis=None)

where repeats is a scalar or a sequence. If 'repeats' is a scalar, then the array is duplicated along 'axis' that many times. If 'repeats' is a sequence of length N, then A is duplicated repeats[i] times along axis[i]. If axis is None then it is assumed to be (0,1,2...N).

Er, that's not quite complete, because it doesn't specify what happens when you reparray an array to a higher dimension, like a 1-d to a 3-d, e.g. reparray([1,2], (2,2,2)). I guess the axis parameter could have some 'newaxis' entries to accommodate that.

--bb
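For what it's worth, numpy.tile (added during the 1.0 cycle as the repmat generalization) already behaves much like the reparray sketched above, except that repeats is given per trailing axis rather than with a separate axis argument: a 1-d input stays 1-d for a scalar repeat, and a repeats tuple longer than A.ndim promotes the result to the higher dimension:

```python
import numpy as np

a = np.array([1, 2])

print(np.tile(a, 3))                 # [1 2 1 2 1 2] -- stays 1-d
print(np.tile(a, (2, 2)))            # 2-d: [[1 2 1 2], [1 2 1 2]]
print(np.tile(a, (2, 2, 2)).shape)   # (2, 2, 4) -- 1-d promoted to 3-d
```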
[Numpy-discussion] take behaviour has changed
Hi,
from 1.0b1 to 1.0rc1 the default behaviour of take seems to have changed when omitting the axis argument:

In [13]: a = reshape(arange(12),(3,4))

In [14]: take(a,[2,3])
Out[14]: array([2, 3])

In [15]: take(a,[2,3],1)
Out[15]:
array([[ 2,  3],
       [ 6,  7],
       [10, 11]])

Is this intended?

Christian
[Numpy-discussion] matrixmultiply moved (release notes?)
Apparently numpy.matrixmultiply got moved into numpy.oldnumeric.matrixmultiply at some point (or rather ceased to be imported into the numpy namespace). Is there any list of all such methods that got banished? This would be nice to have in the release notes.
--bb
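For anyone bitten by this: matrixmultiply was the old Numeric spelling, and its replacement in the numpy namespace is dot (the oldnumeric compatibility layer was itself dropped in later releases, so updating calls is the durable fix). A quick sketch:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# numpy.dot replaces Numeric's matrixmultiply
print(np.dot(a, b))
# [[19 22]
#  [43 50]]
```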
Re: [Numpy-discussion] Tests and code documentation
On Thu, 21 Sep 2006, Charles R Harris apparently wrote:
> As to the oddness of \param or @param, here is an example from
> Epydoc using Epytext
>
> @type m: number
> @param m: The slope of the line.
> @type b: number
> @param b: The y intercept of the line.

Compare to definition list style for consolidated field lists in section 5.1 of
http://epydoc.sourceforge.net/fields.html#rst
which is much more elegant, IMO.

Cheers,
Alan Isaac
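For concreteness, a minimal sketch of that consolidated-field style inside an ordinary docstring (the line function here is hypothetical; epydoc renders the :Parameters: block as a definition list, with no @-markup needed):

```python
def line(m, b):
    """Return the line with slope m and y intercept b, as a function of x.

    :Parameters:
      m : number
        The slope of the line.
      b : number
        The y intercept of the line.
    """
    return lambda x: m * x + b

print(line(2, 1)(3))  # 7
```

The docstring also reads cleanly under plain pydoc or ipython's ?, which was the concern raised earlier in the thread.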
Re: [Numpy-discussion] Tests and code documentation
On Thu, 21 Sep 2006, "David M. Cooke" apparently wrote:
> Foremost for Python doc strings, I think, is that it look
> ok when using pydoc or similar (ipython's ?, for
> instance). That means a minimal amount of markup.

IMO reStructuredText is very natural for documentation, and it is nicely handled by epydoc.

fwiw,
Alan Isaac
Re: [Numpy-discussion] change of default dtype
On 9/20/06, Bill Baxter <[EMAIL PROTECTED]> wrote:
> Hey Andrew, point taken, but I think it would be better if someone who
> actually knows the full extent of the change made the edit. I know
> zeros and ones changed. Did anything else?
> Anyway, I'm surprised the release notes page is publicly editable.

I'm glad that it is editable. I hate wikis that are only editable by a select few. It defeats the purpose (or at least does not maximize the capability) of a wiki.

--
David Grant
http://www.davidgrant.ca
[Numpy-discussion] numpy 1.0rc1 bdist_rpm fails
Hi,
on linux I get an error when trying to build a rpm package from numpy 1.0rc1:

building extension "numpy.core.umath" sources
adding 'build/src.linux-i686-2.4/numpy/core/config.h' to sources.
executing numpy/core/code_generators/generate_ufunc_api.py
adding 'build/src.linux-i686-2.4/numpy/core/__ufunc_api.h' to sources.
creating build/src.linux-i686-2.4/src
conv_template:> build/src.linux-i686-2.4/src/umathmodule.c
error: src/umathmodule.c.src: No such file or directory
error: Bad exit status from /home/ck/testarea/rpm/tmp/rpm-tmp.68597 (%build)

RPM build errors:
    Bad exit status from /home/ck/testarea/rpm/tmp/rpm-tmp.68597 (%build)
error: command 'rpmbuild' failed with exit status 1

Christian
Re: [Numpy-discussion] please change mean to use dtype=float
David M. Cooke wrote:
> [...]
> Conclusions:
>
> - If you're going to calculate everything in single precision, use Kahan
>   summation. Using it in double-precision also helps.
> - If you can use a double-precision accumulator, it's much better than any
>   of the techniques in single-precision only.
> - for speed+precision in the variance, either use Kahan summation in single
>   precision with the two-pass method, or use double precision with simple
>   summation with the two-pass method. Knuth buys you nothing, except slower
>   code :-)
>
> After 1.0 is out, we should look at doing one of the above.

One more little tidbit; it appears possible to "fix up" Knuth's algorithm so that it's comparable in accuracy to the two-pass Kahan version by doing Kahan summation while accumulating the variance. Testing on this was
Re: [Numpy-discussion] please change mean to use dtype=float
David M. Cooke wrote: > On Thu, 21 Sep 2006 11:34:42 -0700 > Tim Hochberg <[EMAIL PROTECTED]> wrote: > > >> Tim Hochberg wrote: >> >>> Robert Kern wrote: >>> >>> David M. Cooke wrote: > On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote: > > > >> Let me offer a third path: the algorithms used for .mean() and .var() >> are substandard. There are much better incremental algorithms that >> entirely avoid the need to accumulate such large (and therefore >> precision-losing) intermediate values. The algorithms look like the >> following for 1D arrays in Python: >> >> def mean(a): >> m = a[0] >> for i in range(1, len(a)): >> m += (a[i] - m) / (i + 1) >> return m >> >> >> > This isn't really going to be any better than using a simple sum. > It'll also be slower (a division per iteration). > > > With one exception, every test that I've thrown at it shows that it's better for float32. That exception is uniformly spaced arrays, like linspace(). > You do avoid > accumulating large sums, but then doing the division a[i]/len(a) and > adding that will do the same. Okay, this is true. > Now, if you want to avoid losing precision, you want to use a better > summation technique, like compensated (or Kahan) summation: > > def mean(a): > s = e = a.dtype.type(0) > for i in range(0, len(a)): > temp = s > y = a[i] + e > s = temp + y > e = (temp - s) + y > return s / len(a) > >> def var(a): >> m = a[0] >> t = a.dtype.type(0) >> for i in range(1, len(a)): >> q = a[i] - m >> r = q / (i+1) >> m += r >> t += i * q * r >> t /= len(a) >> return t >> >> Alternatively, from Knuth: >> >> def var_knuth(a): >> m = a.dtype.type(0) >> variance = a.dtype.type(0) >> for i in range(len(a)): >> delta = a[i] - m >> m += delta / (i+1) >> variance += delta * (a[i] - m) >> variance /= len(a) >> return variance >> >> I'm going to go ahead and attach a module containing the versions of >> mean, var, etc that I've been playing with in case someone wants to mess >> with them. 
Some were stolen from traffic on this list, for others I >> grabbed the algorithms from wikipedia or equivalent. >> > > I looked into this a bit more. I checked float32 (single precision) and > float64 (double precision), using long doubles (float96) for the "exact" > results. This is based on your code. Results are compared using > abs(exact_stat - computed_stat) / max(abs(values)), with 1 values in the > range of [-100, 900] > > First, the mean. In float32, the Kahan summation in single precision is > better by about 2 orders of magnitude than simple summation. However, > accumulating the sum in double precision is better by about 9 orders of > magnitude than simple summation (7 orders more than Kahan). > > In float64, Kahan summation is the way to go, by 2 orders of magnitude. > > For the variance, in float32, Knuth's method is *no better* than the two-pass > method. Tim's code does an implicit conversion of intermediate results to > float64, which is why he saw a much better result. Doh! And I fixed that same problem in the mean implementation earlier too. I was astounded by how good knuth was doing, but not astounded enough apparently. Does it seem weird to anyone else that in: numpy_scalar python_scalar the precision ends up being controlled by the python scalar? I would expect the numpy_scalar to control the resulting precision just like numpy arrays do in similar circumstances. Perhaps the egg on my face is just clouding my vision though. > The two-pass method using > Kahan summation (again, in single precision), is better by about 2 orders of > magnitude. There is practically no difference when using a double-precision > accumulator amongst the techniques: they're all about 9 orders of magnitude > better than single-precision two-pass. > > In float64, Kahan summation is again better than the rest, by about 2 orders > of magnitude. 
> > I've put my adaptation of Tim's code, and box-and-whisker plots of the > results, at http://arbutus.mcmaster.ca/dmc/numpy/variance/ > > Conclusions: > > - If you're going to calculate everything in single precision, use Kahan > summation. Using it in double-precision also helps. > - If you can use a double-precision accumulator, it's much better than any of > the techniques in single-precision only. > > - for speed+precision in the variance, eith
Re: [Numpy-discussion] immutable arrays
On Thursday 21 September 2006 18:24, Travis Oliphant wrote:
> Martin Wiechert wrote:
> > Thanks Travis.
> >
> > Do I understand correctly that the only way to be really safe is to make
> > a copy and not to export a reference to it?
> > Because anybody having a reference to the owner of the data can override
> > the flag?
>
> No, that's not quite correct. Of course in C, anybody can do anything
> they want to the flags.
>
> In Python, only the owner of the object itself can change the writeable
> flag once it is set to False. So, if you only return a "view" of the
> array (a.view()) then the Python user will not be able to change the
> flags.
>
> Example:
>
> a = array([1,2,3])
> a.flags.writeable = False
>
> b = a.view()
>
> b.flags.writeable = True  # raises an error.
>
> c = a
> c.flags.writeable = True  # can be done because c is a direct alias to a.
>
> Hopefully, that explains the situation a bit better.

It does. Thanks Travis.

> -Travis
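Travis's example can be run as-is; a sketch with the error made explicit (current NumPy raises ValueError when re-enabling writeable on a view whose base is read-only):

```python
import numpy as np

a = np.array([1, 2, 3])
a.flags.writeable = False

b = a.view()
view_refused = False
try:
    b.flags.writeable = True   # view of a read-only base: refused
except ValueError:
    view_refused = True
print(view_refused)            # True

c = a                          # plain alias: c *is* a
c.flags.writeable = True       # allowed, since c owns its data
c[0] = 10
print(a)                       # [10  2  3]
```

So handing out a.view() protects the data from Python-level callers, while any direct alias to the owning array can flip the flag back.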
Re: [Numpy-discussion] please change mean to use dtype=float
David M. Cooke wrote:
> Conclusions:
>
> - If you're going to calculate everything in single precision, use Kahan
>   summation. Using it in double-precision also helps.
> - If you can use a double-precision accumulator, it's much better than any
>   of the techniques in single-precision only.
> - for speed+precision in the variance, either use Kahan summation in single
>   precision with the two-pass method, or use double precision with simple
>   summation with the two-pass method. Knuth buys you nothing, except slower
>   code :-)
>
> After 1.0 is out, we should look at doing one of the above.

+1
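As a sketch of the recommended recipe (two-pass variance with a double-precision accumulator), under the assumption that float64 is available:

```python
import numpy as np

def var_twopass(a):
    # Pass 1: mean accumulated in float64; pass 2: mean of squared
    # deviations, also accumulated in float64.
    m = a.mean(dtype=np.float64)
    d = a.astype(np.float64) - m
    return np.mean(d * d)

rng = np.random.default_rng(0)
a = rng.uniform(-100, 900, 10000).astype(np.float32)
print(np.allclose(var_twopass(a), np.var(a, dtype=np.float64)))  # True
```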
Re: [Numpy-discussion] please change mean to use dtype=float
On Thu, 21 Sep 2006 11:34:42 -0700 Tim Hochberg <[EMAIL PROTECTED]> wrote:

> Tim Hochberg wrote:
> > Robert Kern wrote:
> >> David M. Cooke wrote:
> >>> On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote:
> >>>> Let me offer a third path: the algorithms used for .mean() and .var()
> >>>> are substandard. There are much better incremental algorithms that
> >>>> entirely avoid the need to accumulate such large (and therefore
> >>>> precision-losing) intermediate values. The algorithms look like the
> >>>> following for 1D arrays in Python:
> >>>>
> >>>> def mean(a):
> >>>>     m = a[0]
> >>>>     for i in range(1, len(a)):
> >>>>         m += (a[i] - m) / (i + 1)
> >>>>     return m
> >>>
> >>> This isn't really going to be any better than using a simple sum.
> >>> It'll also be slower (a division per iteration).
> >>
> >> With one exception, every test that I've thrown at it shows that it's
> >> better for float32. That exception is uniformly spaced arrays, like
> >> linspace().
> >>
> >>> You do avoid accumulating large sums, but then doing the division
> >>> a[i]/len(a) and adding that will do the same.
> >>
> >> Okay, this is true.
> >>
> >>> Now, if you want to avoid losing precision, you want to use a better
> >>> summation technique, like compensated (or Kahan) summation:
> >>>
> >>> def mean(a):
> >>>     s = e = a.dtype.type(0)
> >>>     for i in range(0, len(a)):
> >>>         temp = s
> >>>         y = a[i] + e
> >>>         s = temp + y
> >>>         e = (temp - s) + y
> >>>     return s / len(a)
> >>
> >>>> def var(a):
> >>>>     m = a[0]
> >>>>     t = a.dtype.type(0)
> >>>>     for i in range(1, len(a)):
> >>>>         q = a[i] - m
> >>>>         r = q / (i+1)
> >>>>         m += r
> >>>>         t += i * q * r
> >>>>     t /= len(a)
> >>>>     return t
> >>>>
> >>>> Alternatively, from Knuth:
> >>>>
> >>>> def var_knuth(a):
> >>>>     m = a.dtype.type(0)
> >>>>     variance = a.dtype.type(0)
> >>>>     for i in range(len(a)):
> >>>>         delta = a[i] - m
> >>>>         m += delta / (i+1)
> >>>>         variance += delta * (a[i] - m)
> >>>>     variance /= len(a)
> >>>>     return variance
>
> I'm going to go ahead and attach a module containing the versions of
> mean, var, etc that I've been playing with in case someone wants to mess
> with them. Some were stolen from traffic on this list, for others I
> grabbed the algorithms from wikipedia or equivalent.

I looked into this a bit more. I checked float32 (single precision) and
float64 (double precision), using long doubles (float96) for the "exact"
results. This is based on your code. Results are compared using
abs(exact_stat - computed_stat) / max(abs(values)), with 1 values in the
range of [-100, 900]

First, the mean. In float32, the Kahan summation in single precision is
better by about 2 orders of magnitude than simple summation. However,
accumulating the sum in double precision is better by about 9 orders of
magnitude than simple summation (7 orders more than Kahan).

In float64, Kahan summation is the way to go, by 2 orders of magnitude.

For the variance, in float32, Knuth's method is *no better* than the
two-pass method. Tim's code does an implicit conversion of intermediate
results to float64, which is why he saw a much better result. The two-pass
method using Kahan summation (again, in single precision), is better by
about 2 orders of magnitude. There is practically no difference when using
a double-precision accumulator amongst the techniques: they're all about 9
orders of magnitude better than single-precision two-pass.

In float64, Kahan summation is again better than the rest, by about 2
orders of magnitude.

I've put my adaptation of Tim's code, and box-and-whisker plots of the
results, at http://arbutus.mcmaster.ca/dmc/numpy/variance/

Conclusions:

- If you're going to calculate everything in single precision, use Kahan
  summation. Using it in double-precision also helps.
- If you can use a double-precision accumulator, it's much better than any
  of the techniques in single-precision only.
- for speed+precision in the variance, either use Kahan summation in single
  precision with the two-pass method, or use double precision with simple
  summation with the two-pass method. Knuth buys you nothing, except slower
  code :-)

After 1.0 is out, we should look at doing one of the above.

--
|>|\/|< /--\
|David M. Cooke              http://arbutus.physics.mcmaster.ca/dmc/
|[EMAIL PROTECTED]
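David's Kahan mean can be exercised directly; a minimal sketch, keeping the accumulator in the array's own dtype as in the code quoted in this thread, on an input chosen so plain float32 accumulation visibly drifts:

```python
import numpy as np

def kahan_mean(a):
    # Compensated (Kahan) summation: e carries the rounding error of
    # each addition and feeds it back into the next one.
    s = e = a.dtype.type(0)
    for x in a:
        temp = s
        y = x + e
        s = temp + y
        e = (temp - s) + y
    return s / len(a)

# 100000 copies of float32(0.1): a naive float32 running sum drifts once
# the accumulator grows large, while Kahan stays within a few ulps.
a = np.full(100000, 0.1, dtype=np.float32)
print(abs(float(kahan_mean(a)) - float(np.float32(0.1))) < 1e-6)  # True
```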
Re: [Numpy-discussion] Tests and code documentation
Hi, On 9/21/06, Robert Kern <[EMAIL PROTECTED]> wrote: > Steve Lianoglou wrote: >> So .. I guess I'm wondering why we want to break from the standard? > We don't as far as Python code goes. The code that Chuck added Doxygen-style comments to was C code. I presume he was simply answering Sebastian's question rather than suggesting we use Doxygen for Python code, too. Exactly. I also don't think the Python hack description applies to doxygen any longer. As to the oddness of \param or @param, here is an example from Epydoc using Epytext:

@type m: number
@param m: The slope of the line.
@type b: number
@param b: The y intercept of the line.

The X{y intercept} of a Looks like they borrowed something there ;) The main advantage of epydoc vs doxygen seems to be that you can use the markup inside the normal python docstring without having to make a separate comment block. Or would that be a disadvantage? Then again, I've been thinking of moving the python function docstrings into the add_newdocs.py file so everything is together in one spot, and that would separate the Python docstrings from the functions anyway. I'll fool around with doxygen a bit and see what it does. The C code is the code that most needs documentation in any case. Chuck
Re: [Numpy-discussion] Tests and code documentation
Steve Lianoglou wrote: > So .. I guess I'm wondering why we want to break from the standard? We don't as far as Python code goes. The code that Chuck added Doxygen-style comments to was C code. I presume he was simply answering Sebastian's question rather than suggesting we use Doxygen for Python code, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: [Numpy-discussion] Tests and code documentation
On Thu, 21 Sep 2006 10:05:58 -0600 "Charles R Harris" <[EMAIL PROTECTED]> wrote: > Travis, > > A few questions. > > 1) I can't find any systematic code testing units, although there seem to be > tests for regressions and such. Is there a place we should be putting such > tests? > > 2) Any plans for code documentation? I documented some of my stuff with > doxygen markups and wonder if we should include a Doxyfile as part of the > package. We don't have much of a defined standard for docs. Personally, I wouldn't use doxygen: what I've seen for Python versions are hacks, whose output looks like C++, and which require markup that's not like commonly-used conventions in Python (\brief, for instance). Foremost for Python doc strings, I think, is that they look ok when using pydoc or similar (ipython's ?, for instance). That means a minimal amount of markup. Someone previously mentioned including cross-references; I think that's a good idea. A 'See also' line, for instance. Examples are good too, especially if there have been disputes on the interpretation of the command :-) For the C code, documentation is autogenerated from the /** ... API */ comments that determine which functions are part of the C API. These are put into the files multiarray_api.txt and ufunc_api.txt (in the include/ directory). The files are in reST format, so the comments should/could be too. At some point I've got to go through and add more :-) -- |>|\/|< /--\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |[EMAIL PROTECTED]
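A sketch of the kind of minimal-markup docstring being argued for here, with the 'See also' cross-reference and an example section (the function, its wording, and the cross-referenced name are made up for illustration, not NumPy code):

```python
def line(m, b, x):
    """Evaluate the line y = m*x + b at x.

    Plain sections like these render cleanly under pydoc and
    ipython's `?`, with no C++-style markup.

    See also: polyval

    Examples
    --------
    >>> line(2.0, 1.0, 3.0)
    7.0
    """
    return m * x + b
```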
Re: [Numpy-discussion] Tests and code documentation
> Are able to use doxygen for Python code ? I thought it only worked > for C (and > alike) ? > > IIRC correctly, it now does Python too. Let's see... here is an > example > ## Documentation for this module. > # > # More details. > > ## Documentation for a function. > # > # More details. > def func(): > pass > Looks like ## replaces the /** I never found it (although I haven't looked too hard), but I always thought there was an official way to document python code -- minimally to put the documentation in the docstring following the function definition:

def func(..):
    """One liner.

    Continue docs -- some type of reStructuredText style
    """
    pass

Isn't that the same docstring that ipython uses to bring up help, when you do:

In [1]: myobject.some_func?

So .. I guess I'm wondering why we want to break from the standard? -steve
Re: [Numpy-discussion] please change mean to use dtype=float
Tim Hochberg wrote: Robert Kern wrote: David M. Cooke wrote: On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote: Let me offer a third path: the algorithms used for .mean() and .var() are substandard. There are much better incremental algorithms that entirely avoid the need to accumulate such large (and therefore precision-losing) intermediate values. The algorithms look like the following for 1D arrays in Python:

def mean(a):
    m = a[0]
    for i in range(1, len(a)):
        m += (a[i] - m) / (i + 1)
    return m

This isn't really going to be any better than using a simple sum. It'll also be slower (a division per iteration). With one exception, every test that I've thrown at it shows that it's better for float32. That exception is uniformly spaced arrays, like linspace(). > You do avoid > accumulating large sums, but then doing the division a[i]/len(a) and > adding that will do the same. Okay, this is true. Now, if you want to avoid losing precision, you want to use a better summation technique, like compensated (or Kahan) summation:

def mean(a):
    s = e = a.dtype.type(0)
    for i in range(0, len(a)):
        temp = s
        y = a[i] + e
        s = temp + y
        e = (temp - s) + y
    return s / len(a)

Some numerical experiments in Maple using 5-digit precision show that your mean is maybe a bit better in some cases, but can also be much worse, than sum(a)/len(a), but both are quite poor in comparison to the Kahan summation.

(We could probably use a fast implementation of Kahan summation in addition to a.sum()) +1

def var(a):
    m = a[0]
    t = a.dtype.type(0)
    for i in range(1, len(a)):
        q = a[i] - m
        r = q / (i+1)
        m += r
        t += i * q * r
    t /= len(a)
    return t

Alternatively, from Knuth:

def var_knuth(a):
    m = a.dtype.type(0)
    variance = a.dtype.type(0)
    for i in range(len(a)):
        delta = a[i] - m
        m += delta / (i+1)
        variance += delta * (a[i] - m)
    variance /= len(a)
    return variance

These formulas are good when you can only do one pass over the data (like in a calculator where you don't store all the data points), but are slightly worse than doing two passes. Kahan summation would probably also be good here too. Again, my tests show otherwise for float32. I'll condense my ipython log into a module for everyone's perusal. It's possible that the Kahan summation of the squared residuals will work better than the current two-pass algorithm and the implementations I give above. This is what my tests show as well: var_knuth outperformed any simple two-pass algorithm I could come up with, even ones using Kahan sums. Interestingly, for 1D arrays the built-in float32 variance performs better than it should. After a bit of twiddling around I discovered that it actually does most of its calculations in float64. It uses a two-pass calculation, the result of mean is a scalar, and in the process of converting that back to an array we end up with float64 values. Or something like that; I was mostly reverse engineering the sequence of events from the results. Here's a simple example of how var is a little wacky. A shape-[N] array will give you a different result than a shape-[1,N] array. The reason is clear -- in the second case the mean is not a scalar so there isn't the inadvertent promotion to float64, but it's still odd.

>>> data = (1000*(random.random([1]) - 0.1)).astype(float32)
>>> print data.var() - data.reshape([1, -1]).var(-1)
[ 0.1171875]

I'm going to go ahead and attach a module containing the versions of mean, var, etc that I've been playing with in case someone wants to mess with them. Some were stolen from traffic on this list, for others I grabbed the algorithms from wikipedia or equivalent.

-tim

def raw_kahan_sum(values):
    """raw_kahan_sum(values) -> sum(values), residual

    where sum(values) is computed using Kahan's summation algorithm
    and the residual is the value of the lower order bits when finished.
    """
    total = c = values.dtype.type(0)
    for x in values:
        y = x + c
        t = total + y
        c = y - (t - total)
        total = t
    return total, c

def sum(values):
    """sum(values) -> sum of
Re: [Numpy-discussion] please change mean to use dtype=float
Robert Kern wrote: > David M. Cooke wrote: > >> On Wed, Sep 20, 2006 at 03:01:18AM -0500, Robert Kern wrote: >> >>> Let me offer a third path: the algorithms used for .mean() and .var() are >>> substandard. There are much better incremental algorithms that entirely >>> avoid >>> the need to accumulate such large (and therefore precision-losing) >>> intermediate >>> values. The algorithms look like the following for 1D arrays in Python: >>> >>> def mean(a): >>> m = a[0] >>> for i in range(1, len(a)): >>> m += (a[i] - m) / (i + 1) >>> return m >>> >> This isn't really going to be any better than using a simple sum. >> It'll also be slower (a division per iteration). >> > > With one exception, every test that I've thrown at it shows that it's better > for > float32. That exception is uniformly spaced arrays, like linspace(). > > > You do avoid > > accumulating large sums, but then doing the division a[i]/len(a) and > > adding that will do the same. > > Okay, this is true. > > >> Now, if you want to avoid losing precision, you want to use a better >> summation technique, like compensated (or Kahan) summation: >> >> def mean(a): >> s = e = a.dtype.type(0) >> for i in range(0, len(a)): >> temp = s >> y = a[i] + e >> s = temp + y >> e = (temp - s) + y >> return s / len(a) >> >> Some numerical experiments in Maple using 5-digit precision show that >> your mean is maybe a bit better in some cases, but can also be much >> worse, than sum(a)/len(a), but both are quite poor in comparision to the >> Kahan summation. 
>> >> (We could probably use a fast implementation of Kahan summation in >> addition to a.sum()) >> > > +1 > > >>> def var(a): >>> m = a[0] >>> t = a.dtype.type(0) >>> for i in range(1, len(a)): >>> q = a[i] - m >>> r = q / (i+1) >>> m += r >>> t += i * q * r >>> t /= len(a) >>> return t >>> >>> Alternatively, from Knuth: >>> >>> def var_knuth(a): >>> m = a.dtype.type(0) >>> variance = a.dtype.type(0) >>> for i in range(len(a)): >>> delta = a[i] - m >>> m += delta / (i+1) >>> variance += delta * (a[i] - m) >>> variance /= len(a) >>> return variance >>> >> These formulas are good when you can only do one pass over the data >> (like in a calculator where you don't store all the data points), but >> are slightly worse than doing two passes. Kahan summation would probably >> also be good here too. >> > > Again, my tests show otherwise for float32. I'll condense my ipython log into > a > module for everyone's perusal. It's possible that the Kahan summation of the > squared residuals will work better than the current two-pass algorithm and > the > implementations I give above. > This is what my tests show as well: var_knuth outperformed any simple two-pass algorithm I could come up with, even ones using Kahan sums. Interestingly, for 1D arrays the built-in float32 variance performs better than it should. After a bit of twiddling around I discovered that it actually does most of its calculations in float64. It uses a two-pass calculation, the result of mean is a scalar, and in the process of converting that back to an array we end up with float64 values. Or something like that; I was mostly reverse engineering the sequence of events from the results. -tim
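For reference, the one-pass (Knuth/Welford) and two-pass variance formulas traded in this thread can be written side by side in plain Python. In double precision on a small example both agree; reproducing the float32 differences discussed above requires NumPy dtypes, so this is only a sketch of the algorithms themselves:

```python
def var_knuth(a):
    """One-pass (Welford/Knuth) population variance."""
    m = 0.0
    acc = 0.0
    for i, x in enumerate(a):
        delta = x - m
        m += delta / (i + 1)      # running mean
        acc += delta * (x - m)    # running sum of squared residuals
    return acc / len(a)

def var_two_pass(a):
    """Classic two-pass population variance."""
    mean = sum(a) / len(a)
    return sum((x - mean) ** 2 for x in a) / len(a)

# Small dataset with mean 5 and population variance 4.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
```

The one-pass form matters when the data can't be stored (or traversed twice); as noted above, in a given precision it is not automatically more accurate than two passes.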
Re: [Numpy-discussion] Question about recarray
Lionel Roubeyrie wrote: > find any solution for that. I have tried with arrays of dtype=object, but I > have problems when I want to compute min, max, ... with an error like: > TypeError: function not supported for these types, and can't coerce safely to > supported types. I just added support for the min and max methods of object arrays, by adding support for Object arrays to the minimum and maximum functions. -Travis
Re: [Numpy-discussion] arr.dtype.kind is 'i' for dtype=unit !?
Matthew Brett wrote: > Hi, > >> It's in the array interface specification: >> >> http://numpy.scipy.org/array_interface.shtml > > I was interested in the 't' (bitfield) type - is there an example of > usage somewhere? No, it's not implemented in NumPy. It's just part of the array interface specification, for completeness. -Travis
Re: [Numpy-discussion] arr.dtype.kind is 'i' for dtype=unit !?
Hi, > It's in the array interface specification: > > http://numpy.scipy.org/array_interface.shtml I was interested in the 't' (bitfield) type - is there an example of usage somewhere?

In [13]: dtype('t8')
---
exceptions.TypeError  Traceback (most recent call last)
/home/mb312/python/
TypeError: data type not understood

Best, Matthew
Re: [Numpy-discussion] Question about recarray
Lionel Roubeyrie wrote: > Hi all, > Is it possible to put masked values into recarrays? I need an array with > heterogeneous types of data (datetime objects in the first col, all others > are float) but with missing values in some records. For the moment, I don't > find any solution for that. Either use "nans" or "inf" for missing values, or use the masked array object with a complex data-type. You don't need to use a recarray object to get "records". Any array can have "records". Therefore, you can have a masked array of "records" by creating an array with the appropriate data-type. It may also be possible to use a recarray as the "array" for the masked array object because the recarray is a sub-class of the array. > I have tried with arrays of dtype=object, but I > have problems when I want to compute min, max, ... with an error like: > TypeError: function not supported for these types, and can't coerce safely to > supported types. It looks like the max and min functions are not supported for Object arrays:

import numpy as N
N.maximum.types

does not include Object arrays. It probably should. -Travis
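A sketch of the NaN-sentinel route with a structured ("record") dtype, in modern NumPy spelling (the field names are made up for illustration; np.nanmin/np.nanmax are the nan-aware reductions that skip the missing entries):

```python
import numpy as np

# A structured dtype gives "records" without needing a recarray.
dt = np.dtype([('t', 'f8'), ('value', 'f8')])
rows = np.array([(1.0, 10.0),
                 (2.0, np.nan),   # NaN marks the missing value
                 (3.0, 30.0)], dtype=dt)

# The nan-aware reductions ignore the missing entry.
lo = np.nanmin(rows['value'])
hi = np.nanmax(rows['value'])
```

This sidesteps the object-array TypeError entirely, at the cost of only working for float fields; for non-float fields a masked array over the same structured dtype is the alternative Travis describes.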
Re: [Numpy-discussion] Tests and code documentation
On 9/21/06, Sebastian Haase <[EMAIL PROTECTED]> wrote: On Thursday 21 September 2006 09:05, Charles R Harris wrote: > Travis, > > A few questions. > > 1) I can't find any systematic code testing units, although there seem to be tests for regressions and such. Is there a place we should be putting such tests? > > 2) Any plans for code documentation? I documented some of my stuff with doxygen markups and wonder if we should include a Doxyfile as part of the package. Are able to use doxygen for Python code ? I thought it only worked for C (and alike) ? IIRC correctly, it now does Python too. Let's see... here is an example:

## Documentation for this module.
#
# More details.

## Documentation for a function.
#
# More details.
def func():
    pass

Looks like ## replaces the /** Chuck
Re: [Numpy-discussion] Tests and code documentation
> Are able to use doxygen for Python code ? I thought it only worked for C (and > alike) ? There is an ugly-hack :) http://i31www.ira.uka.de/~baas/pydoxy/ But I wouldn't recommend using it, rather stick with Epydoc. -- Louis Cordier <[EMAIL PROTECTED]> cell: +27721472305 Point45 Entertainment (Pty) Ltd. http://www.point45.org
Re: [Numpy-discussion] Tests and code documentation
Charles R Harris wrote: > Travis, > > A few questions. > > 1) I can't find any systematic code testing units, although there seem > to be tests for regressions and such. Is there a place we should be > putting such tests? All tests are placed under the tests directory of the corresponding sub-package. They will only be picked up by .test(level < 10) if the file is named test_*; .test(level > 10) should pick up all test files. If you want to name something different but still have it run at a test level < 10, then you need to run the test from one of the other test files that will be picked up (test_regression.py and test_unicode.py are doing that, for example). > > 2) Any plans for code documentation? I documented some of my stuff > with doxygen markups and wonder if we should include a Doxyfile as > part of the package. I'm not familiar with Doxygen, but would welcome any improvements to the code documentation. > > 3) Would you consider breaking out the Converters into a separate .c > file for inclusion? The code generator seems to take care of the ordering. You are right that it doesn't matter which order the API subroutines are placed in. I'm not opposed to more breaking up of the .c files, as long as it is clear where things will be located. The #include strategy is necessary to get it all in one Python module, but having smaller .c files usually makes for faster editing. It's the arrayobject.c file that is "too large" IMHO, however. That's where I would look for ways to break it up. The iterobject and the data-type object could be taken out, for example. -Travis
Re: [Numpy-discussion] immutable arrays
Martin Wiechert wrote: > Thanks Travis. > > Do I understand correctly that the only way to be really safe is to make a > copy and not to export a reference to it? > Because anybody having a reference to the owner of the data can override the > flag? No, that's not quite correct. Of course in C, anybody can do anything they want to the flags. In Python, only the owner of the object itself can change the writeable flag once it is set to False. So, if you only return a "view" of the array (a.view()), then the Python user will not be able to change the flags. Example:

a = array([1,2,3])
a.flags.writeable = False

b = a.view()
b.flags.writeable = True   # raises an error.

c = a
c.flags.writeable = True   # can be done because c is a direct alias to a.

Hopefully, that explains the situation a bit better. -Travis
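A runnable version of Travis's example in modern NumPy spelling (a sketch, not part of the original mail; the `view_unlocked` flag and the ValueError catch reflect what current NumPy raises when a view tries to re-enable writing on a locked owner):

```python
import numpy as np

a = np.array([1, 2, 3])
a.flags.writeable = False      # lock the owner

b = a.view()                   # exported view of the locked array
try:
    b.flags.writeable = True   # refused: the base is not writeable
    view_unlocked = True
except ValueError:
    view_unlocked = False

c = a                          # direct alias: the very same object
c.flags.writeable = True       # allowed, since c owns its data
c[0] = 99
```

So exporting `a.view()` rather than `a` itself is what keeps outside Python code from flipping the flag back.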
Re: [Numpy-discussion] Tests and code documentation
On Thursday 21 September 2006 09:05, Charles R Harris wrote: > Travis, > > A few questions. > > 1) I can't find any systematic code testing units, although there seem to > be tests for regressions and such. Is there a place we should be putting > such tests? > > 2) Any plans for code documentation? I documented some of my stuff with > doxygen markups and wonder if we should include a Doxyfile as part of the > package. Are able to use doxygen for Python code ? I thought it only worked for C (and alike) ? > > 3) Would you consider breaking out the Converters into a separate .c file > for inclusion? The code generator seems to take care of the ordering. > > Chuck
[Numpy-discussion] Tests and code documentation
Travis,

A few questions.

1) I can't find any systematic code testing units, although there seem to be tests for regressions and such. Is there a place we should be putting such tests?

2) Any plans for code documentation? I documented some of my stuff with doxygen markups and wonder if we should include a Doxyfile as part of the package.

3) Would you consider breaking out the Converters into a separate .c file for inclusion? The code generator seems to take care of the ordering.

Chuck
Re: [Numpy-discussion] 1.0rc1 doesn't seem to work on AMD64
On 9/21/06, Peter Bienstman <[EMAIL PROTECTED]> wrote: Hi, I just installed rc1 on an AMD64 machine, but I get this error message when trying to import it:

Python 2.4.3 (#1, Sep 21 2006, 13:06:42)
[GCC 4.1.1 (Gentoo 4.1.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):

I don't see this running the latest from svn on AMD64 here. Not sayin' there might not be a problem with rc1, I just don't see it with my sources.

Python 2.4.3 (#1, Jun 13 2006, 11:46:22)
[GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.version.version
'1.0.dev3202'
>>> numpy.version.os.uname()
('Linux', 'tethys', '2.6.17-1.2187_FC5', '#1 SMP Mon Sep 11 01:16:59 EDT 2006', 'x86_64')

If you are building on Gentoo maybe you could delete the build directory (and maybe the numpy site package) and rebuild.

Chuck.
[Numpy-discussion] Question about recarray
Hi all, Is it possible to put masked values into recarrays? I need an array with heterogeneous types of data (datetime objects in the first col, all others are float) but with missing values in some records. For the moment, I don't find any solution for that. I have tried with arrays of dtype=object, but I have problems when I want to compute min, max, ... with an error like: TypeError: function not supported for these types, and can't coerce safely to supported types. Thanks -- Lionel Roubeyrie - [EMAIL PROTECTED] LIMAIR http://www.limair.asso.fr
[Numpy-discussion] 1.0rc1 doesn't seem to work on AMD64
Hi, I just installed rc1 on an AMD64 machine, but I get this error message when trying to import it:

Python 2.4.3 (#1, Sep 21 2006, 13:06:42)
[GCC 4.1.1 (Gentoo 4.1.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/lib64/python2.4/site-packages/numpy/__init__.py", line 36, in ?
    import core
  File "/usr/lib64/python2.4/site-packages/numpy/core/__init__.py", line 7, in ?
    import numerictypes as nt
  File "/usr/lib64/python2.4/site-packages/numpy/core/numerictypes.py", line 191, in ?
    _add_aliases()
  File "/usr/lib64/python2.4/site-packages/numpy/core/numerictypes.py", line 169, in _add_aliases
    base, bit, char = bitname(typeobj)
  File "/usr/lib64/python2.4/site-packages/numpy/core/numerictypes.py", line 119, in bitname
    char = base[0]
IndexError: string index out of range

Thanks! Peter
Re: [Numpy-discussion] immutable arrays
Thanks Travis. Do I understand correctly that the only way to be really safe is to make a copy and not to export a reference to it? Because anybody having a reference to the owner of the data can override the flag? Cheers, Martin On Wednesday 20 September 2006 20:18, Travis Oliphant wrote: > Martin Wiechert wrote: > > Hi list, > > > > I just stumbled across NPY_WRITEABLE flag. > > Now I'd like to know if there are ways either from Python or C to make an > > array temporarily immutable. > > Just setting the flag > > Python: > > make immutable: > a.flags.writeable = False > > make mutable again: > a.flags.writeable = True > > > C: > > make immutable: > a->flags &= ~NPY_WRITEABLE > > make mutable again: > a->flags |= NPY_WRITEABLE > > > In C you can play with immutability all you want. In Python you can > only make something writeable if you either 1) own the data or 2) the > object that owns the data is itself "writeable" > > -Travis