Re: [R] cumsum vs. sum
GaGr == Gabor Grothendieck ggrothendi...@gmail.com on Tue, 17 Feb 2009 20:53:18 -0500 writes: GaGr Check out sum.exact and cumsum.exact in the caTools package. library(caTools) GaGr Loading required package: bitops x - 1/(12:14) sum(x) - cumsum(x)[3] GaGr [1] 2.775558e-17 sum.exact(x) - cumsum.exact(x)[3] GaGr [1] 0 [ buuh, humbug! ] The 'NEWS' for R-devel (to become R 2.9.0 in April) has the following entry o cumsum(x) and cumprod(x) for double precision x now use a long double accumulator where available and so more closely match sum() and prod() in potentially being more accurate. and indeed, in R-devel, sum(x) - cumsum(x)[length(x)] gives 0 for your example. Martin Maechler, ETH Zurich and R-core team GaGr On Tue, Feb 17, 2009 at 5:12 PM, Stavros Macrakis macra...@alum.mit.edu wrote: I recently traced a bug of mine to the fact that cumsum(s)[length(s)] is not always exactly equal to sum(s). For example, x-1/(12:14) sum(x) - cumsum(x)[3] = 2.8e-17 Floating-point addition is of course not exact, and in particular is not associative, so there are various possible reasons for this. Perhaps sum uses clever summing tricks to get more accurate results? In some quick experiments, it does seem to get more accurate results than cumsum. It might be worth documenting. -s __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cumsum vs. sum
SM == Stavros Macrakis macra...@alum.mit.edu on Wed, 18 Feb 2009 10:00:40 -0500 writes: SM Nice! Glad to hear it. It sounds as though it is still possible for SM cumsum(x)[length(x)] to not be exactly equal to sum, though? Well, possible, probably yes, platform-dependently; However I vaguely remember that I didn't see one such case in the few experiments I did. Martin SM On Wed, Feb 18, 2009 at 8:03 AM, Martin Maechler SM maech...@stat.math.ethz.ch wrote: SM ... o cumsum(x) and cumprod(x) for double precision x now use a long double accumulator where available and so more closely match sum() and prod() in potentially being more accurate. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cumsum vs. sum
Hmm. Why not use the same method to guarantee the same result? Or at least document the possibility that cumsum(x)[length(x)] != sum(x)... that seems like an easy trap to fall into. -s On Wed, Feb 18, 2009 at 11:39 AM, Martin Maechler maech...@stat.math.ethz.ch wrote: SM == Stavros Macrakis macra...@alum.mit.edu on Wed, 18 Feb 2009 10:00:40 -0500 writes: SM Nice! Glad to hear it. It sounds as though it is still possible for SM cumsum(x)[length(x)] to not be exactly equal to sum, though? Well, possible, probably yes, platform-dependently; However I vaguely remember that I didn't see one such case in the few experiments I did. Martin SM On Wed, Feb 18, 2009 at 8:03 AM, Martin Maechler SM maech...@stat.math.ethz.ch wrote: SM ... o cumsum(x) and cumprod(x) for double precision x now use a long double accumulator where available and so more closely match sum() and prod() in potentially being more accurate. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cumsum vs. sum
On 18/02/2009 12:41 PM, Stavros Macrakis wrote: Hmm. Why not use the same method to guarantee the same result? Or at least document the possibility that cumsum(x)[length(x)] != sum(x)... that seems like an easy trap to fall into. Assuming equality of floating point numbers computed by two different paths is always a trap. R doesn't try to obtain results that are equal to the last bit in other circumstances; why should it do so here? For example, one somewhat controversial choice in R is to use 64 bit precision in intermediate computations when available, rather than rounding everything to 52 bits as it does when stored to memory in doubles. This means that the value you get is likely to be closer to the truth than if you did the rounding earlier, but it is also subject to change according to optimization level, compiler version, etc. Duncan Murdoch -s On Wed, Feb 18, 2009 at 11:39 AM, Martin Maechler maech...@stat.math.ethz.ch wrote: SM == Stavros Macrakis macra...@alum.mit.edu on Wed, 18 Feb 2009 10:00:40 -0500 writes: SM Nice! Glad to hear it. It sounds as though it is still possible for SM cumsum(x)[length(x)] to not be exactly equal to sum, though? Well, possible, probably yes, platform-dependently; However I vaguely remember that I didn't see one such case in the few experiments I did. Martin SM On Wed, Feb 18, 2009 at 8:03 AM, Martin Maechler SM maech...@stat.math.ethz.ch wrote: SM ... o cumsum(x) and cumprod(x) for double precision x now use a long double accumulator where available and so more closely match sum() and prod() in potentially being more accurate. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cumsum vs. sum
G'day all, On Wed, 18 Feb 2009 12:41:27 -0500 Stavros Macrakis macra...@alum.mit.edu wrote: Hmm. Why not use the same method to guarantee the same result? Hmm, I did not look at the source code but, potentially, sum() could use some tricks to reduce rounding errors further that would not be available to cumsum(); e.g. sorting the data before adding summing them; or grouping them into groups of roughly similar magnitude and then sum group-by-group. So it does may be counter-productive to use the same method. Or at least document the possibility that cumsum(x)[length(x)] != sum(x)... that seems like an easy trap to fall into. But this is already documented, isn't it? FAQ 7.31. ;-)) Cheers, Berwin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cumsum vs. sum
Duncan, Berwin, Martin, Thanks for your thoughtful explanations, which make perfect sense. May I simply suggest that the non-identity between last(cumsum) and sum might be worth mentioning in the cumsum doc page? -s __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cumsum vs. sum
I recently traced a bug of mine to the fact that cumsum(s)[length(s)] is not always exactly equal to sum(s). For example, x-1/(12:14) sum(x) - cumsum(x)[3] = 2.8e-17 Floating-point addition is of course not exact, and in particular is not associative, so there are various possible reasons for this. Perhaps sum uses clever summing tricks to get more accurate results? In some quick experiments, it does seem to get more accurate results than cumsum. It might be worth documenting. -s __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cumsum vs. sum
Check out sum.exact and cumsum.exact in the caTools package. library(caTools) Loading required package: bitops x - 1/(12:14) sum(x) - cumsum(x)[3] [1] 2.775558e-17 sum.exact(x) - cumsum.exact(x)[3] [1] 0 On Tue, Feb 17, 2009 at 5:12 PM, Stavros Macrakis macra...@alum.mit.edu wrote: I recently traced a bug of mine to the fact that cumsum(s)[length(s)] is not always exactly equal to sum(s). For example, x-1/(12:14) sum(x) - cumsum(x)[3] = 2.8e-17 Floating-point addition is of course not exact, and in particular is not associative, so there are various possible reasons for this. Perhaps sum uses clever summing tricks to get more accurate results? In some quick experiments, it does seem to get more accurate results than cumsum. It might be worth documenting. -s __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.