Re: Maths error
In article [EMAIL PROTECTED], Hendrik van Rooyen [EMAIL PROTECTED] writes: | | [ Interval arithmetic ] | | | For people just getting into it, it can be shocking to realize just how | | wide the interval can become after some computations. | | Yes. Even when you can prove (mathematically) that the bounds are | actually quite tight :-) | | This sounds like one of those pesky: | but you should be able to do better - kinds of things... It's worse :-( It is rather like global optimisation (including linear programming etc.) The algorithms that are guaranteed to work are so catastrophically slow that they are of theoretical interest only, but almost every practical problem can be solved well enough with a hack, IF it is coded by someone who understands both the problem and global optimisation. This is why the statistical methods (so disliked by Kahan) are used. In a fair number of cases, they give reasonable estimates of the error. In others, they give a false sense of security :-( Regards, Nick Maclaren. -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
In article [EMAIL PROTECTED], Tim Roberts [EMAIL PROTECTED] writes: | Hendrik van Rooyen [EMAIL PROTECTED] wrote: | | What I don't know is how much precision this approximation loses when | used in real applications, and I have never found anyone else who has | much of a clue, either. | | I would suspect that this is one of those questions which are simple | to ask, but horribly difficult to answer - I mean - if the hardware has | thrown it away, how do you study it - you need somehow two | different parallel engines doing the same stuff, and comparing the | results, or you have to write a big simulation, and then you bring | your simulation errors into the picture - There be Dragons... | | Actually, this is a very well studied part of computer science called | interval arithmetic. As you say, you do every computation twice, once to | compute the minimum, once to compute the maximum. When you're done, you | can be confident that the true answer lies within the interval. The problem with it is that it is an unrealistically pessimal model, and there are huge classes of algorithm that it can't handle at all; anything involving iterative convergence for a start. It has been around for yonks (I first dabbled with it 30+ years ago), and it has never reached viability for most real applications. In 30 years, it has got almost nowhere. Don't confuse interval methods with interval arithmetic, because you don't need the latter for the former, despite the claims that you do. | For people just getting into it, it can be shocking to realize just how | wide the interval can become after some computations. Yes. Even when you can prove (mathematically) that the bounds are actually quite tight :-) Regards, Nick Maclaren. -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
Nick Maclaren wrote: The problem with it is that it is an unrealistically pessimal model, and there are huge classes of algorithm that it can't handle at all; anything involving iterative convergence for a start. It has been around for yonks (I first dabbled with it 30+ years ago), and it has never reached viability for most real applications. In 30 years, it has got almost nowhere. Don't confuse interval methods with interval arithmetic, because you don't need the latter for the former, despite the claims that you do. | For people just getting into it, it can be shocking to realize just how | wide the interval can become after some computations. Yes. Even when you can prove (mathematically) that the bounds are actually quite tight :-) I've been experimenting with a fixed-point interval type in Python. I expect many algorithms would require you to explicitly round/collapse/whatever-term the interval as they go along, essentially making it behave like a float. Do you think it'd be suitable for general use, assuming you didn't mind the explicit rounding? Unfortunately I lack a math background, so it's unlikely to progress past an experiment. -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
In article [EMAIL PROTECTED], Rhamphoryncus [EMAIL PROTECTED] writes: | | I've been experimenting with a fixed-point interval type in Python. I | expect many algorithms would require you to explicitly | round/collapse/whatever-term the interval as they go along, essentially | making it behave like a float. Yes, quite. | Do you think it'd be suitable for | general use, assuming you didn't mind the explicit rounding? I doubt it. Sorry. | Unfortunately I lack a math background, so it's unlikely to progress | past an experiment. As the same is true for what plenty of people have done, despite them having good backgrounds in mathematics, don't feel inferior! Regards, Nick Maclaren. -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
Tim Peters wrote: ... Alas, most people wouldn't read that either <0.5 wink>. Oh the loss, you missed the chance for a <0.47684987 wink>. --Scott David Daniels [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
Nick Maclaren [EMAIL PROTECTED] wrote: [Tim Roberts] | Actually, this is a very well studied part of computer science called | interval arithmetic. As you say, you do every computation twice, once to | compute the minimum, once to compute the maximum. When you're done, you | can be confident that the true answer lies within the interval. The problem with it is that it is an unrealistically pessimal model, and there are huge classes of algorithm that it can't handle at all; anything involving iterative convergence for a start. It has been around for yonks (I first dabbled with it 30+ years ago), and it has never reached viability for most real applications. In 30 years, it has got almost nowhere. Don't confuse interval methods with interval arithmetic, because you don't need the latter for the former, despite the claims that you do. | For people just getting into it, it can be shocking to realize just how | wide the interval can become after some computations. Yes. Even when you can prove (mathematically) that the bounds are actually quite tight :-) This sounds like one of those pesky: but you should be able to do better - kinds of things... - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
Tim Peters [EMAIL PROTECTED] wrote: [Nick Maclaren] ... Yes, but that wasn't their point. It was that in (say) iterative algorithms, the error builds up by a factor of the base at every step. If it wasn't for the fact that errors build up, almost all programs could ignore numerical analysis and still get reliable answers! Actually, my (limited) investigations indicated that such an error build-up was extremely rare - I could achieve it only in VERY artificial programs. But I did find that the errors built up faster for higher bases, so that a reasonable rule of thumb is that 28 digits with a decimal base was comparable to (say) 80 bits with a binary base. [Hendrik van Rooyen] I would have thought that this sort of thing was a natural consequence of rounding errors - if I round (or worse truncate) a binary, I can be off by at most one, with an expectation of a half of a least significant digit, while if I use hex digits, my expectation is around eight, and for decimal around five... Which, in all cases, is a half ULP at worst (when rounding -- as everyone does now). So it would seem natural that errors would propagate faster on big base systems, AOTBE, but this may be a naive view.. I don't know of any current support for this view. In the bad old days, such things were often confused by architectures that mixed non-binary bases with creative rounding rules (like truncation indeed), and it could be hard to know where to pin the blame. What you will still see stated is variations on Kahan's telegraphic "binary is better than any other radix for error analysis (but not very much)", listed as one of two technical advantages for binary fp in: http://www.cs.berkeley.edu/~wkahan/MktgMath.pdf It's important to note that he says "error analysis", not "error propagation" -- regardless of base in use, rounding is good to <= 1/2 ULP.
A fuller elementary explanation of this can be found in David Goldberg's widely available "What Every Computer Scientist Should Know About Floating-Point", in its "Relative Error and Ulps" section. The short course is that rigorous forward error analysis of fp algorithms is usually framed in terms of relative error: given a computed approximation x' to the mathematically exact result x, what's the largest possible absolute value of the mathematical r = (x'-x)/x (the relative error of x')? This framework gets used because it's more-or-less tractable, starting by assuming inputs are exact (or not, in which case you start by bounding the inputs' relative errors), then successively computing relative errors for each step of the algorithm. Goldberg's paper, and Knuth volume 2, contain many introductory examples of rigorous analysis using this approach. Analysis of relative error generally goes along independent of FP base. It's at the end, when you want to transform a statement about relative error into a statement about error as measured by ULPs (units in the last place), where the base comes in strongly. As Goldberg explains, the larger the fp base the sloppier the relative-error-converted-to-ULPs bound is -- but this is by a constant factor independent of the algorithm being analyzed, hence Kahan's "... better ... but not very much". In more words from Goldberg: "Since epsilon [a measure of relative error] can overestimate the effect of rounding to the nearest floating-point number by the wobble factor of B [the FP base, like 2 for binary or 10 for decimal], error estimates of formulas will be tighter on machines with a small B. When only the order of magnitude of rounding error is of interest, ulps and epsilon may be used interchangeably, since they differ by at most a factor of B." So that factor of B is irrelevant to most apps most of the time. For a combination of an fp algorithm + set of inputs near the edge of giving gibberish results, of course it can be important.
Someone using Python's decimal implementation has an often very effective workaround then, short of writing a more robust fp algorithm: just boost the precision. Thanks Tim, for taking the trouble. - really nice explanation. My basic error of thinking ( ? - more like gut feel ) was that the bigger bases somehow lose more bits at every round, forgetting that half a microvolt is still half a microvolt, whether it is rounded in binary, decimal, or hex... - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
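To make the relative-error versus ULP distinction above concrete, here is a minimal sketch for a single rounding in binary: the conversion of 1/10 to a Python float. It assumes Python floats are IEEE 754 doubles; the variable names are illustrative only and do not come from the thread.

```python
import math
import sys
from fractions import Fraction

x_exact = Fraction(1, 10)   # the mathematically exact value x
x_computed = 0.1            # x', the double nearest to 1/10

# Fraction(float) converts exactly, so the errors below are exact rationals.
abs_err = abs(Fraction(x_computed) - x_exact)
rel_err = abs_err / x_exact                     # |x' - x| / |x|

# One ULP of x': frexp gives x' = m * 2**e with 0.5 <= m < 1,
# so the last place of a p-bit significand is worth 2**(e - p).
_, e = math.frexp(x_computed)
ulp = Fraction(2) ** (e - sys.float_info.mant_dig)
err_in_ulps = abs_err / ulp

print("relative error:", float(rel_err))
print("error in ULPs :", float(err_in_ulps))
```

The error in ULPs stays at or below one half, as the rounding rule promises, while the relative error depends on where x' sits between two powers of the base; that spread is the "wobble" Goldberg refers to.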
Re: Maths error
In article [EMAIL PROTECTED], Hendrik van Rooyen [EMAIL PROTECTED] writes: | Tim Peters [EMAIL PROTECTED] wrote: | | What you will still see stated is variations on Kahan's telegraphic | "binary is better than any other radix for error analysis (but not very | much)", listed as one of two technical advantages for binary fp in: | | http://www.cs.berkeley.edu/~wkahan/MktgMath.pdf Which I believe to be the final statement of the matter. It was a minority view 30 years ago, but I now know of little dissent. He has omitted that mid-point invariant as a third advantage of binary, but I agree that it could be phrased as "one or two extra mathematical invariants hold for binary (but not very important ones)". | My basic error of thinking ( ? - more like gut feel ) was that the | bigger bases somehow lose more bits at every round, | forgetting that half a microvolt is still half a microvolt, whether | it is rounded in binary, decimal, or hex... That is not an error, but only a mistake :-) Yes, you have hit the nail on the head. Some people claimed that some important algorithms did that, and that binary was consequently much better. If it were true, then the precision you would need would be pro rata to the case - so the decimal equivalent of 64-bit binary would need 160 bits. Experience failed to confirm their viewpoint, and the effect was seen in only artificial algorithms (sorry - I can no longer remember the examples and am reluctant to waste time trying to reinvent them). But it was ALSO found that the converse was not QUITE true, either, and the effective numerical precision is not FULLY independent of the base. So, at a wild guesstimate, 64-bit decimal will deliver a precision comparable to about 56-bit binary, and will cause significant numerical problems to a FEW applications. Hence people will have to convert to the much more expensive 128-bit decimal format for such work. Bloatware rules. All your bits are belong to us. Regards, Nick Maclaren.
Re: Maths error
In article [EMAIL PROTECTED], Hendrik van Rooyen [EMAIL PROTECTED] writes: | | I would suspect that this is one of those questions which are simple | to ask, but horribly difficult to answer - I mean - if the hardware has | thrown it away, how do you study it - you need somehow two | different parallel engines doing the same stuff, and comparing the | results, or you have to write a big simulation, and then you bring | your simulation errors into the picture - There be Dragons... No. You just emulate floating-point in software and throw a switch selecting between the two rounding rules. Regards, Nick Maclaren. -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
Hendrik van Rooyen [EMAIL PROTECTED] wrote: Nick Maclaren [EMAIL PROTECTED] wrote: What I don't know is how much precision this approximation loses when used in real applications, and I have never found anyone else who has much of a clue, either. I would suspect that this is one of those questions which are simple to ask, but horribly difficult to answer - I mean - if the hardware has thrown it away, how do you study it - you need somehow two different parallel engines doing the same stuff, and comparing the results, or you have to write a big simulation, and then you bring your simulation errors into the picture - There be Dragons... Actually, this is a very well studied part of computer science called interval arithmetic. As you say, you do every computation twice, once to compute the minimum, once to compute the maximum. When you're done, you can be confident that the true answer lies within the interval. For people just getting into it, it can be shocking to realize just how wide the interval can become after some computations. -- Tim Roberts, [EMAIL PROTECTED] Providenza Boekelheide, Inc. -- http://mail.python.org/mailman/listinfo/python-list
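A rough sketch of why intervals widen so quickly, under a deliberately simplified model: the endpoints are exact rationals, so rounding is ignored entirely and only the bookkeeping of minimum and maximum is shown. The `Interval` class is hypothetical and written for this illustration; a real implementation would also round the lower bound down and the upper bound up.

```python
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi] with exact endpoints (no rounding modelled)."""

    def __init__(self, lo, hi=None):
        self.lo = Fraction(lo)
        self.hi = Fraction(lo if hi is None else hi)

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Smallest result: our minimum minus their maximum, and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

x = Interval(0, 1)
diff = x - x                      # mathematically always 0
prod = x * (Interval(1) - x)      # x*(1-x) really lies in [0, 1/4]

print("x - x     =", diff)
print("x * (1-x) =", prod)
```

Each operation treats its operands as unrelated, so `x - x` comes out as [-1, 1] and `x * (1 - x)` as [0, 1], even though the true ranges are far tighter. The computed bounds are correct but pessimistic, and the pessimism compounds with every further step.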
Re: Maths error
Dennis Lee Bieber [EMAIL PROTECTED] wrote: On Sun, 14 Jan 2007 07:18:11 +0200, Hendrik van Rooyen [EMAIL PROTECTED] declaimed the following in comp.lang.python: I recall an SF character known as Slipstick Libby, who was supposed to be a Genius - but I forget the setting and the author. Robert Heinlein. Appears in a few of the Lazarus Long books. It is something that has become quietly extinct, and we did not even notice. And get collector prices -- http://www.sphere.bc.ca/test/sruniverse.html Thanks Dennis - Fascinating site ! - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
In article [EMAIL PROTECTED], Hendrik van Rooyen [EMAIL PROTECTED] writes: | | *grin* - I was around at that time, and some of the inappropriate habits | almost forced by the lack of processing power still linger in my mind, | like - Don't use division if you can possibly avoid it, - it's EXPENSIVE! | - it seems so silly nowadays. Yes, indeed, but that one is actually still with us! Integer division is done by software on a few systems, and floating-point division is often not vectorisable or pipelines poorly. But, except for special cases of little relevance to Python, it is not the poison that it was back then. | As an old slide rule user - I can agree with this - if you know the order | of the answer, and maybe two points after the decimal, it will tell you | if the bridge will fall down or not. Having an additional fifty decimal | places of accuracy does not really add any real information in these | cases. It's nice of course if it's free, like it has almost become - but | I think people get mesmerized by the numbers, without giving any | thought to what they mean - which is probably why we often see | threads complaining about the error in the fifteenth decimal place.. Agreed. But the issue is really error build-up, and algorithms that are numerically 'unstable' - THEN, such subtle differences do matter. You still aren't interested in more than a few digits in the result, but you may have to sweat blood to get them. | [*] Assuming signed magnitude, calculate the answer truncated towards | zero but keep track of whether it is exact. If not, force the last | bit to 1. An old, cheap approximation to rounding. | | This is not so cheap - it's good solid reasoning in my book - | after all, something is a lot more than nothing and should | not be thrown away... The "cheap" means cheap in hardware - it needs very little logic, which is why it was used on the old, discrete-logic, machines.
I have been told by hardware people that implementing IEEE 754 rounding and denormalised numbers needs a horrific amount of logic - which is why only IBM do it all in hardware. And the decimal formats are significantly more complicated. What I don't know is how much precision this approximation loses when used in real applications, and I have never found anyone else who has much of a clue, either. Regards, Nick Maclaren. -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
[Nick Maclaren] ... Yes, but that wasn't their point. It was that in (say) iterative algorithms, the error builds up by a factor of the base at every step. If it wasn't for the fact that errors build up, almost all programs could ignore numerical analysis and still get reliable answers! Actually, my (limited) investigations indicated that such an error build-up was extremely rare - I could achieve it only in VERY artificial programs. But I did find that the errors built up faster for higher bases, so that a reasonable rule of thumb is that 28 digits with a decimal base was comparable to (say) 80 bits with a binary base. [Hendrik van Rooyen] I would have thought that this sort of thing was a natural consequence of rounding errors - if I round (or worse truncate) a binary, I can be off by at most one, with an expectation of a half of a least significant digit, while if I use hex digits, my expectation is around eight, and for decimal around five... Which, in all cases, is a half ULP at worst (when rounding -- as everyone does now). So it would seem natural that errors would propagate faster on big base systems, AOTBE, but this may be a naive view.. I don't know of any current support for this view. In the bad old days, such things were often confused by architectures that mixed non-binary bases with creative rounding rules (like truncation indeed), and it could be hard to know where to pin the blame. What you will still see stated is variations on Kahan's telegraphic "binary is better than any other radix for error analysis (but not very much)", listed as one of two technical advantages for binary fp in: http://www.cs.berkeley.edu/~wkahan/MktgMath.pdf It's important to note that he says "error analysis", not "error propagation" -- regardless of base in use, rounding is good to <= 1/2 ULP.
A fuller elementary explanation of this can be found in David Goldberg's widely available "What Every Computer Scientist Should Know About Floating-Point", in its "Relative Error and Ulps" section. The short course is that rigorous forward error analysis of fp algorithms is usually framed in terms of relative error: given a computed approximation x' to the mathematically exact result x, what's the largest possible absolute value of the mathematical r = (x'-x)/x (the relative error of x')? This framework gets used because it's more-or-less tractable, starting by assuming inputs are exact (or not, in which case you start by bounding the inputs' relative errors), then successively computing relative errors for each step of the algorithm. Goldberg's paper, and Knuth volume 2, contain many introductory examples of rigorous analysis using this approach. Analysis of relative error generally goes along independent of FP base. It's at the end, when you want to transform a statement about relative error into a statement about error as measured by ULPs (units in the last place), where the base comes in strongly. As Goldberg explains, the larger the fp base the sloppier the relative-error-converted-to-ULPs bound is -- but this is by a constant factor independent of the algorithm being analyzed, hence Kahan's "... better ... but not very much". In more words from Goldberg: "Since epsilon [a measure of relative error] can overestimate the effect of rounding to the nearest floating-point number by the wobble factor of B [the FP base, like 2 for binary or 10 for decimal], error estimates of formulas will be tighter on machines with a small B. When only the order of magnitude of rounding error is of interest, ulps and epsilon may be used interchangeably, since they differ by at most a factor of B." So that factor of B is irrelevant to most apps most of the time. For a combination of an fp algorithm + set of inputs near the edge of giving gibberish results, of course it can be important.
Someone using Python's decimal implementation has an often very effective workaround then, short of writing a more robust fp algorithm: just boost the precision. -- http://mail.python.org/mailman/listinfo/python-list
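As a small illustration of the "just boost the precision" workaround, the sketch below evaluates the same expression under two precisions using the standard `decimal` module. The `residual` helper is a made-up name for this example, and the expression is chosen only because its exact value (zero) is obvious.

```python
from decimal import Decimal, localcontext

def residual(prec):
    """Return (1/3)*3 - 1 computed with `prec` significant decimal digits."""
    with localcontext() as ctx:
        ctx.prec = prec
        third = Decimal(1) / Decimal(3)   # rounded to prec digits
        return third * 3 - 1              # exact answer would be 0

print(residual(28))   # 28 digits is the decimal module's default precision
print(residual(50))
```

The error does not go away at higher precision, it only moves further out; that is enough when the trouble is a marginal combination of algorithm and inputs, and not enough when the algorithm is unstable.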
Re: Maths error
Dennis Lee Bieber [EMAIL PROTECTED] wrote: {My 8th grade teacher was a bit worried at seeing me with a slipstick <G>; and my HighSchool Trig/Geometry teacher only required 3 significant digits for answers -- even though half the class had calculators by then} LOL - I haven't seen the word slipstick for yonks... I recall an SF character known as Slipstick Libby, who was supposed to be a Genius - but I forget the setting and the author. It is something that has become quietly extinct, and we did not even notice. We should start a movement for reviving them - on grounds of their greenness - they use no batteries... Fat chance. - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
Nick Maclaren [EMAIL PROTECTED] wrote: The cheap means cheap in hardware - it needs very little logic, which is why it was used on the old, discrete-logic, machines. I have been told by hardware people that implementing IEEE 754 rounding and denormalised numbers needs a horrific amount of logic - which is why only IBM do it all in hardware. And the decimal formats are significantly more complicated. What I don't know is how much precision this approximation loses when used in real applications, and I have never found anyone else who has much of a clue, either. I would suspect that this is one of those questions which are simple to ask, but horribly difficult to answer - I mean - if the hardware has thrown it away, how do you study it - you need somehow two different parallel engines doing the same stuff, and comparing the results, or you have to write a big simulation, and then you bring your simulation errors into the picture - There be Dragons... - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
In article [EMAIL PROTECTED], Hendrik van Rooyen [EMAIL PROTECTED] writes: | | I would have thought that this sort of thing was a natural consequence | of rounding errors - if I round (or worse truncate) a binary, I can be off | by at most one, with an expectation of a half of a least significant digit, | while if I use hex digits, my expectation is around eight, and for decimal | around five... | | So it would seem natural that errors would propagate | faster on big base systems, AOTBE, but this may be | a naive view.. Yes, indeed, and that is precisely why the "we must use binary" camp won out. The problem was that computers of the early 1970s were not quite powerful enough to run real applications with simulated floating-point arithmetic. I am one of the half-dozen people who did ANY actual tests on real numerical code, but there may have been some work since! Nowadays, it would be easy, and it would make quite a good PhD. The points to look at would be the base and the rounding rules (including IEEE rounding versus probabilistic versus last bit forced[*]). We know that the use or not of denormalised numbers and the exact details of true rounding make essentially no difference. In a world ruled by reason rather than spin, this investigation would have been done before claiming that decimal floating-point is an adequate replacement for binary for numerical work, but we don't live in such a world. No matter. Almost everyone in the area agrees that decimal floating-point isn't MUCH worse than binary, from a numerical point of view :-) [*] Assuming signed magnitude, calculate the answer truncated towards zero but keep track of whether it is exact. If not, force the last bit to 1. An old, cheap approximation to rounding. Regards, Nick Maclaren. -- http://mail.python.org/mailman/listinfo/python-list
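The "last bit forced" rule in the footnote can be emulated in a few lines, which also gives a feel for how a software comparison of rounding rules might look. This is only a sketch under stated assumptions: values are exact `Fraction`s, the exponent range is unbounded, and the function name `round_to_bits` and its `rule` switch are invented for this example.

```python
from fractions import Fraction

def round_to_bits(x, bits, rule):
    """Round Fraction x to `bits` significant bits.

    rule="nearest": round to nearest, ties to even.
    rule="forced":  truncate toward zero; if inexact, set the last bit to 1.
    """
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1          # signed magnitude: round |x|, reapply sign
    mag = abs(x)

    # Find e so that 2**(bits-1) <= mag / 2**e < 2**bits.
    e = 0
    while mag / Fraction(2) ** e >= 2 ** bits:
        e += 1
    while mag / Fraction(2) ** e < 2 ** (bits - 1):
        e -= 1
    scaled = mag / Fraction(2) ** e

    m = scaled.numerator // scaled.denominator   # truncated significand
    rem = scaled - m                             # discarded part, 0 <= rem < 1
    if rule == "forced":
        if rem != 0:
            m |= 1                               # force the last bit to 1
    elif rule == "nearest":
        half = Fraction(1, 2)
        if rem > half or (rem == half and m & 1):
            m += 1
    else:
        raise ValueError("unknown rounding rule: %r" % (rule,))
    return sign * m * Fraction(2) ** e

x = Fraction(3, 7)
print(round_to_bits(x, 4, "forced"), round_to_bits(x, 4, "nearest"))
```

Running a real numerical algorithm through such an emulation twice, once per rule, and comparing the results is one plausible way to measure how much precision the cheap rule loses in practice.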
Re: Maths error
Nick Maclaren [EMAIL PROTECTED] wrote: In article [EMAIL PROTECTED], Hendrik van Rooyen [EMAIL PROTECTED] writes: | | I would have thought that this sort of thing was a natural consequence | of rounding errors - if I round (or worse truncate) a binary, I can be off | by at most one, with an expectation of a half of a least significant digit, | while if I use hex digits, my expectation is around eight, and for decimal | around five... | | So it would seem natural that errors would propagate | faster on big base systems, AOTBE, but this may be | a naive view.. Yes, indeed, and that is precisely why the "we must use binary" camp won out. The problem was that computers of the early 1970s were not quite powerful enough to run real applications with simulated floating-point arithmetic. I am one of the half-dozen people who did ANY actual tests on real numerical code, but there may have been some work since! *grin* - I was around at that time, and some of the inappropriate habits almost forced by the lack of processing power still linger in my mind, like - Don't use division if you can possibly avoid it, - it's EXPENSIVE! - it seems so silly nowadays. Nowadays, it would be easy, and it would make quite a good PhD. The points to look at would be the base and the rounding rules (including IEEE rounding versus probabilistic versus last bit forced[*]). We know that the use or not of denormalised numbers and the exact details of true rounding make essentially no difference. In a world ruled by reason rather than spin, this investigation would have been done before claiming that decimal floating-point is an adequate replacement for binary for numerical work, but we don't live in such a world. No matter.
Almost everyone in the area agrees that decimal floating-point isn't MUCH worse than binary, from a numerical point of view :-) As an old slide rule user - I can agree with this - if you know the order of the answer, and maybe two points after the decimal, it will tell you if the bridge will fall down or not. Having an additional fifty decimal places of accuracy does not really add any real information in these cases. It's nice of course if it's free, like it has almost become - but I think people get mesmerized by the numbers, without giving any thought to what they mean - which is probably why we often see threads complaining about the error in the fifteenth decimal place.. [*] Assuming signed magnitude, calculate the answer truncated towards zero but keep track of whether it is exact. If not, force the last bit to 1. An old, cheap approximation to rounding. This is not so cheap - it's good solid reasoning in my book - after all, something is a lot more than nothing and should not be thrown away... - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
In article [EMAIL PROTECTED], Tim Peters [EMAIL PROTECTED] writes: | | Sure. Possibly even most. Short of writing a long gentle tutorial, | can that be improved? Alas, most people wouldn't read that either <0.5 | wink>. Yes. Improved wording would be only slightly longer, and it is never appropriate to omit all negative aspects. The truth, the whole truth and nothing but the truth :-) | Worse, I expect most people have no real idea that there's a possible | difference between internal and external representations. This is often | given as a selling point for decimal arithmetic: it's WYSIWYG in ways | binary fp can't be (short of inventing power-of-2 fp representations for | I/O, which few people would use). Right. Another case when none of the problems show up on dinky little examples but do in real code :-( | A lot of very well-respected numerical analysts said that larger bases | led to a faster build-up of error (independent of the precision). My | limited investigations indicated that there was SOME truth in that, | but it wasn't a major matter; I never saw the matter settled | definitively. | | My point was that 28 decimal digits of precision is far greater than | supplied even by 64-bit binary floats today (let alone the smaller sizes | in most-common use back in the 60s and 70s). Pollution of low-order | bits is far less of a real concern when there are some number of low- | order bits you don't care about at all. Yes, but that wasn't their point. It was that in (say) iterative algorithms, the error builds up by a factor of the base at every step. If it wasn't for the fact that errors build up, almost all programs could ignore numerical analysis and still get reliable answers! Actually, my (limited) investigations indicated that such an error build-up was extremely rare - I could achieve it only in VERY artificial programs.
But I did find that the errors built up faster for higher bases, so that a reasonable rule of thumb is that 28 digits with a decimal base was comparable to (say) 80 bits with a binary base. And, IN GENERAL, programs won't be using 128-bit IEEE representations. Given Python's overheads, there is no reason not to, unless the hardware is catastrophically slower (which is plausible). | If you know a < b, doing | | c = a + (b-a)/2 | | instead of | | c = (a+b)/2 | | at least guarantees (ignoring possible overflow) a <= c <= b. As shown | last time, it's not even always the case that (x+x)/2 == x in decimal fp | (or in any fp base > 2, for that matter). Yes. Back in the days before binary floating-point started to dominate, we taught that as a matter of routine, but it has not been taught to all users of floating-point for a couple of decades. Indeed, a lot of modern programmers regard having to distort simple expressions in that way as anathema. It isn't a major issue, because our experience from then is that it is both teachable and practical, but it IS a way in which any base above 2 is significantly worse than base 2. Regards, Nick Maclaren. -- http://mail.python.org/mailman/listinfo/python-list
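The mid-point problem is easy to reproduce with Python's `decimal` module once the precision is small enough to see by eye. The sketch below uses 3 significant digits purely for readability; the same effect occurs at the default 28 digits with suitably long operands. The values of `a` and `b` are arbitrary choices for this illustration.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 3                      # 3 significant digits, to keep it visible
    a, b = Decimal("0.501"), Decimal("0.503")

    naive = (a + b) / 2               # a + b = 1.004 is rounded to 1.00 first
    safe = a + (b - a) / 2            # b - a = 0.002 is exact, so nothing is lost
    doubled = (a + a) / 2             # the (x+x)/2 == x case

print("naive  :", naive)
print("safe   :", safe)
print("doubled:", doubled)
```

The naive mid-point lands below `a`, outside the range it was supposed to bisect, and `(a + a) / 2` does not return `a`. The rearranged form stays within the range here because the difference is computed exactly before anything is rounded.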
Re: Maths error
Nick Maclaren [EMAIL PROTECTED] wrote: Yes, but that wasn't their point. It was that in (say) iterative algorithms, the error builds up by a factor of the base at every step. If it wasn't for the fact that errors build up, almost all programs could ignore numerical analysis and still get reliable answers! Actually, my (limited) investigations indicated that such an error build-up was extremely rare - I could achieve it only in VERY artificial programs. But I did find that the errors built up faster for higher bases, so that a reasonable rule of thumb is that 28 digits with a decimal base was comparable to (say) 80 bits with a binary base. I would have thought that this sort of thing was a natural consequence of rounding errors - if I round (or worse truncate) a binary, I can be off by at most one, with an expectation of a half of a least significant digit, while if I use hex digits, my expectation is around eight, and for decimal around five... So it would seem natural that errors would propagate faster on big base systems, AOTBE, but this may be a naive view.. - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: Maths error
In article [EMAIL PROTECTED], Tim Peters [EMAIL PROTECTED] writes:

|
| Huh. I don't read it that way. If it said "numbers can be ..." I
| might, but reading that way seems to require effort to overlook the
| "decimal" in "decimal numbers can be ...".

I wouldn't expect YOU to read it that way, but I can assure you from experience that many people do. What it MEANS is "Numbers with a short representation in decimal can be represented exactly in decimal arithmetic", which is tautologous. What they READ it to mean is "One advantage of representing numbers in decimal is that they can be represented exactly", and they then assume that also applies to pi, sqrt(2), 1/3 and so on.

The point is that the "decimal" could apply equally well to the external or internal representation and, if you aren't fairly clued-up in this area, it is easy to choose the wrong one.

| | and how is decimal no better than binary?
|
| | Basically, they both lose info when rounding does occur. For
| | example,
|
| Yes, but there are two ways in which binary is superior. Let's skip
| the superior 'smoothness', as being too arcane an issue for this
| group,
|
| With 28 decimal digits used by default, few apps would care about this
| anyway.

Were you in the computer arithmetic area during the base wars of the 1960s and 1970s that culminated with binary winning out? A lot of very well-respected numerical analysts said that larger bases led to a faster build-up of error (independent of the precision). My limited investigations indicated that there was SOME truth in that, but it wasn't a major matter; I never saw the matter settled definitively.

| and deal with the other. In binary, calculating the mid-point
| of two numbers (a very common operation) is guaranteed to be within
| the range defined by those numbers, or to over/under-flow.
|
| Neither (x+y)/2.0 nor (x/2.0+y/2.0) are necessarily within the range
| (x,y) in decimal, even for the most respectable values of x and y.
| This was a MAJOR gotcha in the days before binary became standard,
| and will clearly return with decimal.
|
| I view this as being an instance of "lose info when rounding does
| occur". For example,

No, absolutely NOT! This is an orthogonal matter, and is about the loss of an important invariant when using any base above 2. Back in the days when there were multiple bases, virtually every programmer who wrote large numerical code got caught by it at least once, and many got caught several times (it has multiple guises). For example, take the following algorithm for binary chop:

    while 1 :
        c = (a+b)/2
        if f(c) > y :
            if c == b : break
            b = c
        else :
            if c == a : break
            a = c

That works in binary, but in no base above 2 (assuming that I haven't made a stupid error writing it down). In THAT case, it is easy to fix for decimal, but there are ways that it can show up that can be quite tricky to fix.

Regards, Nick Maclaren.
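One way the "easy to fix for decimal" remark might be realised is to compute the midpoint as a + (b-a)/2, as suggested elsewhere in the thread. The following Python 3 sketch is an illustration under stated assumptions, not the code anyone in the thread ran: the name `binary_chop`, the requirement that f be increasing, and the `max_iter` safety cap are all additions made here.

```python
from decimal import Decimal

def binary_chop(f, y, a, b, max_iter=10_000):
    """Narrow [a, b] around the point where an increasing f crosses y.

    Assumes a < b and f(a) <= y <= f(b). Returns the last midpoint computed.
    """
    c = a
    for _ in range(max_iter):        # cap is a safety net, not part of the method
        c = a + (b - a) / 2          # shifted midpoint instead of (a + b) / 2
        if c == a or c == b:         # the interval can no longer be split
            break
        if f(c) > y:
            b = c
        else:
            a = c
    return c

# The same routine runs on binary floats and on Decimal operands.
root_float = binary_chop(lambda x: x * x, 2.0, 1.0, 2.0)
root_decimal = binary_chop(lambda x: x * x, Decimal(2), Decimal(1), Decimal(2))
print(root_float)
print(root_decimal)
```

The termination test is the one from the quoted algorithm; only the midpoint expression differs.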
Re: Maths error
[Tim Peters]
> ...
> Huh. I don't read it that way. If it said "numbers can be ..." I
> might, but reading that way seems to require effort to overlook the
> "decimal" in "decimal numbers can be ...".

[Nick Maclaren]
> I wouldn't expect YOU to read it that way,

Of course; I meant that, putting myself in others' shoes, I don't.

> but I can assure you from experience that many people do.

Sure. Possibly even most. Short of writing a long gentle tutorial, can that be improved? Alas, most people wouldn't read that either <0.5 wink>.

> What it MEANS is "Numbers with a short representation in decimal

"short" is a red herring here: Python's Decimal constructor ignores the precision setting, retaining all the digits you give. For example, if you pass a string with a million decimal digits, you'll end up with a very fat Decimal instance -- no info is lost.

> can be represented exactly in decimal arithmetic", which is
> tautologous. What they READ it to mean is "One advantage of
> representing numbers in decimal is that they can be represented
> exactly", and they then assume that also applies to pi, sqrt(2),
> 1/3 and so on.
>
> The point is that the "decimal" could apply equally well to the
> external or internal representation and, if you aren't fairly
> clued-up in this area, it is easy to choose the wrong one.

Worse, I expect most people have no real idea that there's a possible difference between internal and external representations. This is often given as a selling point for decimal arithmetic: it's WYSIWYG in ways binary fp can't be (short of inventing power-of-2 fp representations for I/O, which few people would use).

[attribution lost]
> and how is decimal no better than binary?

[Tim]
> Basically, they both lose info when rounding does occur. For
> example,

[Nick]
> Yes, but there are two ways in which binary is superior. Let's skip
> the superior 'smoothness', as being too arcane an issue for this
> group,

> With 28 decimal digits used by default, few apps would care about
> this anyway.
> Were you in the computer arithmetic area during the base wars of the
> 1960s and 1970s that culminated with binary winning out?

Yes, although I came in on the tail end of that and never actually used a non-binary machine.

> A lot of very well-respected numerical analysts said that larger bases
> led to a faster build-up of error (independent of the precision). My
> limited investigations indicated that there was SOME truth in that,
> but it wasn't a major matter; I never saw the matter settled
> definitively.

My point was that 28 decimal digits of precision is far greater than supplied even by 64-bit binary floats today (let alone the smaller sizes in most-common use back in the 60s and 70s). "Pollution" of low-order bits is far less of a real concern when there are some number of low-order bits you don't care about at all.

> and deal with the other. In binary, calculating the mid-point
> of two numbers (a very common operation) is guaranteed to be within
> the range defined by those numbers, or to over/under-flow.
>
> Neither (x+y)/2.0 nor (x/2.0+y/2.0) are necessarily within the range
> (x,y) in decimal, even for the most respectable values of x and y.
> This was a MAJOR gotcha in the days before binary became standard,
> and will clearly return with decimal.

> I view this as being an instance of "lose info when rounding does
> occur". For example,

> No, absolutely NOT!

Of course it is. If there were no rounding errors, the computed result would be exactly right -- that's darned near tautological too. You snipped the examples I gave showing exactly where and how rounding error created the problems in (x+y)/2 and x/2+y/2 for some specific values of x and y using decimal arithmetic. If you don't like those examples, supply your own, and if you get a similarly surprising result you'll find rounding error(s) occur(s) in yours too.
It so happens that rounding errors in binary fp can't lead to the same counterintuitive /outcome/, essentially because x+x == y+y implies x == y in base 2 fp, which is indeed a bit of magic specific to base 2. The fact that there /do/ exist fp x and y such that x != y yet x+x == y+y in bases > 2 is entirely due to fp rounding error losing info.

> This is an orthogonal matter,

Disagree.

> and is about the loss of an important invariant when using any base
> above 2.

It is that.

> Back in the days when there were multiple bases, virtually every
> programmer who wrote large numerical code got caught by it at least
> once, and many got caught several times (it has multiple guises).
> For example, take the following algorithm for binary chop:
>
>     while 1 :
>         c = (a+b)/2
>         if f(c) > y :
>             if c == b : break
>             b = c
>         else :
>             if c == a : break
>             a = c
>
> That works in binary, but in no base above 2 (assuming that I haven't
> made a stupid error writing it down). In THAT case, it is easy to fix
> for decimal, but there are ways that it can show
Re: Maths error
| Rory Campbell-Lange wrote:
|
| Is using the decimal module the best way around this? (I'm
| expecting the first sum to match the second). It seems
| anachronistic that decimal takes strings as input, though.

As Dan Bishop says, probably not. The introduction to the decimal module makes exaggerated claims of accuracy, amounting to propaganda. It is numerically no better than binary, and has some advantages and some disadvantages.

| Also check the recent thread "bizarre floating point output".

No, don't. That is about another matter entirely, and will merely confuse you.

I have a course on computer arithmetic, and am just now writing one on Python numerics, and confused people may contact me - though I don't guarantee to help.

Regards, Nick Maclaren.
Re: Maths error
On Tue, 2007-01-09 at 11:38 +0000, Nick Maclaren wrote:

> | Rory Campbell-Lange wrote:
> |
> | Is using the decimal module the best way around this? (I'm
> | expecting the first sum to match the second). It seems
> | anachronistic that decimal takes strings as input, though.
>
> As Dan Bishop says, probably not. The introduction to the decimal
> module makes exaggerated claims of accuracy, amounting to propaganda.
> It is numerically no better than binary, and has some advantages
> and some disadvantages.

Please elaborate. Which exaggerated claims are made, and how is decimal no better than binary?

-Carsten
Re: Maths error
[Rory Campbell-Lange]
> Is using the decimal module the best way around this? (I'm
> expecting the first sum to match the second). It seems
> anachronistic that decimal takes strings as input, though.

[Nick Maclaren]
> As Dan Bishop says, probably not. The introduction to the decimal
> module makes exaggerated claims of accuracy, amounting to propaganda.
> It is numerically no better than binary, and has some advantages
> and some disadvantages.

[Carsten Haese]
> Please elaborate. Which exaggerated claims are made,

Well, just about any technical statement can be misleading if not qualified to such an extent that the only people who can still understand it knew it to begin with <0.8 wink>. The most dubious statement here to my eyes is the intro's "exactness carries over into arithmetic". It takes a world of additional words to explain exactly what it is about the example given (0.1 + 0.1 + 0.1 - 0.3 = 0 exactly in decimal fp, but not in binary fp) that does, and does not, generalize. Roughly, it does generalize to one important real-life use-case: adding and subtracting any number of decimal quantities delivers the exact decimal result, /provided/ that precision is set high enough that no rounding occurs.

> and how is decimal no better than binary?

Basically, they both lose info when rounding does occur. For example,

    >>> import decimal
    >>> 1 / decimal.Decimal(3)
    Decimal("0.3333333333333333333333333333")
    >>> _ * 3
    Decimal("0.9999999999999999999999999999")

That is, (1/3)*3 != 1 in decimal. The reason why is obvious "by eyeball", but only because you have a lifetime of experience working in base 10. A bit ironically, the rounding in binary just happens to be such that (1/3)*3 does equal 1:

    >>> 1./3
    0.33333333333333331
    >>> _ * 3
    1.0

It's not just * and /. The real thing at work in the 0.1 + 0.1 + 0.1 - 0.3 example is representation error, not sloppy +/-: 0.1 and 0.3 can't be /represented/ exactly as binary floats to begin with.
Much the same can happen if you instead use inputs exactly representable in base 2 but not in base 10 (and while there are none such if precision is infinite, precision isn't infinite):

    >>> x = decimal.Decimal(1) / 2**90
    >>> print x
    8.077935669463160887416100508E-28
    >>> print x + x + x - 3*x   # not exactly 0
    1E-54

The same in binary f.p. is exact, because 1./2**90 is exactly representable in binary fp:

    >>> x = 1. / 2**90
    >>> print x  # this displays an inexact decimal approx. to 1./2**90
    8.07793566946e-028
    >>> print x + x + x - 3*x  # but the binary arithmetic is exact
    0.0

If you boost decimal's precision high enough, then this specific example is also exact using decimal; but with the default precision of 28, 1./2**90 can't be represented exactly in decimal to begin with; e.g.,

    >>> decimal.Decimal(1) / 2**90 * 2**90
    Decimal("0.9999999999999999999999999999")

All forms of fp are subject to representation and rounding errors. The biggest practical difference here is that the `decimal` module is not subject to representation error for "natural" decimal quantities, provided precision is set high enough to retain all the input digits. That's worth something to many apps, and is the whole ball of wax for some apps -- but leaves a world of possible surprises nevertheless.
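The interpreter sessions above are Python 2. A rough Python 3 restatement of the same checks follows, as a sketch only: the helper name `residue` and the choice of 100 digits for the "boosted" precision are assumptions made here, picked because 1/2**90 needs fewer than 100 significant decimal digits.

```python
from decimal import Decimal, localcontext

def residue(x):
    """x + x + x - 3*x, which is zero unless some step had to round."""
    return x + x + x - 3 * x

with localcontext() as ctx:
    ctx.prec = 28                                     # the decimal module's default
    third_times_three = Decimal(1) / Decimal(3) * 3   # falls just short of 1
    decimal_residue = residue(Decimal(1) / 2**90)     # the quotient was rounded

with localcontext() as ctx:
    ctx.prec = 100                                    # room to hold 1/2**90 exactly
    wide_residue = residue(Decimal(1) / 2**90)

float_residue = residue(1.0 / 2**90)                  # 2**-90 is exact in binary

print(third_times_three, decimal_residue, wide_residue, float_residue)
```

Neither representation is immune; which inputs trigger the rounding simply depends on the base.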
Re: Maths error
Nick Maclaren wrote:

> No, don't. That is about another matter entirely,

It isn't.

Regards, Björn

-- BOFH excuse #366: ATM cell has no roaming feature turned on, notebooks can't connect
Re: Maths error
In article [EMAIL PROTECTED], Tim Peters [EMAIL PROTECTED] writes:

|
| Well, just about any technical statement can be misleading if not qualified
| to such an extent that the only people who can still understand it knew it
| to begin with <0.8 wink>. The most dubious statement here to my eyes is
| the intro's "exactness carries over into arithmetic". It takes a world of
| additional words to explain exactly what it is about the example given (0.1
| + 0.1 + 0.1 - 0.3 = 0 exactly in decimal fp, but not in binary fp) that
| does, and does not, generalize. Roughly, it does generalize to one
| important real-life use-case: adding and subtracting any number of decimal
| quantities delivers the exact decimal result, /provided/ that precision is
| set high enough that no rounding occurs.

Precisely. There is one other such statement, too: "Decimal numbers can be represented exactly." What it MEANS is that numbers with a short representation in decimal can be represented exactly in decimal, which is tautologous, but many people READ it to say that numbers that they are interested in can be represented exactly in decimal. Such as pi, sqrt(2), 1/3 and so on.

| and how is decimal no better than binary?
|
| Basically, they both lose info when rounding does occur. For example,

Yes, but there are two ways in which binary is superior. Let's skip the superior 'smoothness', as being too arcane an issue for this group, and deal with the other. In binary, calculating the mid-point of two numbers (a very common operation) is guaranteed to be within the range defined by those numbers, or to over/under-flow.

Neither (x+y)/2.0 nor (x/2.0+y/2.0) are necessarily within the range (x,y) in decimal, even for the most respectable values of x and y. This was a MAJOR gotcha in the days before binary became standard, and will clearly return with decimal.

Regards, Nick Maclaren.
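The binary half of that claim can be spot-checked, though not proved, by sampling. The sketch below is Python 3; the helper name `midpoint_in_range`, the seed and the sampling range are arbitrary assumptions, chosen to stay well away from overflow and underflow.

```python
import random

def midpoint_in_range(x, y):
    """True if the binary-float midpoint (x + y) / 2 lies between x and y."""
    lo, hi = min(x, y), max(x, y)
    return lo <= (x + y) / 2 <= hi

rng = random.Random(12345)   # fixed seed so the run is repeatable
pairs = [(rng.uniform(-1e6, 1e6), rng.uniform(-1e6, 1e6)) for _ in range(100_000)]

# Count the sampled pairs whose midpoint escapes the range.
violations = sum(not midpoint_in_range(x, y) for x, y in pairs)
print(violations)
```

A clean run is consistent with the invariant described above; it says nothing about operands near the overflow or underflow limits, which the message explicitly excludes.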
Re: Maths error
Carsten Haese [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

| On Tue, 2007-01-09 at 11:38 +0000, Nick Maclaren wrote:
| > As Dan Bishop says, probably not. The introduction to the decimal
| > module makes exaggerated claims of accuracy, amounting to propaganda.
| > It is numerically no better than binary, and has some advantages
| > and some disadvantages.
|
| Please elaborate. Which exaggerated claims are made, and how is decimal
| no better than binary?

As to the latter question: calculating with decimals instead of binaries eliminates conversion errors introduced when one has *exact* decimal inputs, such as in financial calculations (which were the motivating use case for the decimal module). But it does not eliminate errors inherent in approximating reals with (a limited set of) rationals. Nor does it eliminate errors inherent in approximation algorithms (such as using a finite number of terms of an infinite series).

Terry Jan Reedy
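The last point, that a truncated series stays inexact however many digits the arithmetic carries, can be sketched as follows. The function name `exp1_partial_sum`, the choice of the series for e, the ten terms and the 50-digit precision are all assumptions made for this illustration.

```python
from decimal import Decimal, localcontext

def exp1_partial_sum(terms):
    """Sum of 1/k! for k = 0 .. terms-1, in the current decimal context."""
    total = Decimal(0)
    term = Decimal(1)            # starts as 1/0!
    for k in range(1, terms + 1):
        total += term
        term /= k                # now holds 1/k!
    return total

with localcontext() as ctx:
    ctx.prec = 50                # far more digits than the truncation error needs
    approx = exp1_partial_sum(10)
    truncation_error = Decimal(1).exp() - approx

print(approx)
print(truncation_error)
```

Raising the precision further changes the trailing digits of both numbers but not the size of the gap, which comes from the omitted terms rather than from rounding.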
Re: Maths error
On 1/9/07, Tim Peters [EMAIL PROTECTED] wrote:

> Well, just about any technical statement can be misleading if not
> qualified to such an extent that the only people who can still
> understand it knew it to begin with <0.8 wink>.

+1 QOTW

-- Cheers, Simon B [EMAIL PROTECTED]
Re: Maths error
Bjoern Schliessmann wrote:

> Nick Maclaren wrote:
>
>> No, don't. That is about another matter entirely,
>
> It isn't.

Actually it really is. That thread is about the difference between str(some_float) and repr(some_float) and why str(some_tuple) uses the repr() of its elements.

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: Maths error
In article [EMAIL PROTECTED], Robert Kern [EMAIL PROTECTED] writes:

| | No, don't. That is about another matter entirely,
|
| It isn't.
|
| Actually it really is. That thread is about the difference between
| str(some_float) and repr(some_float) and why str(some_tuple) uses the repr() of
| its elements.

Precisely. And it also applies to strings, which I had failed to notice:

    >>> print ("1","2")
    ('1', '2')
    >>> print "1", "2"
    1 2

Regards, Nick Maclaren.
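The same observation in Python 3 syntax, where print is a function. This is only a restatement of the session above; the variable names are assumptions.

```python
pair = ("1", "2")

print(pair)    # the tuple's str() shows each element's repr(), quotes included
print(*pair)   # printing the elements themselves uses str(), so no quotes

tuple_text = str(pair)
element_text = " ".join(str(item) for item in pair)
```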
Re: Maths error
[Tim Peters]
> ...
> | Well, just about any technical statement can be misleading if not
> | qualified to such an extent that the only people who can still
> | understand it knew it to begin with <0.8 wink>. The most dubious
> | statement here to my eyes is the intro's "exactness carries over
> | into arithmetic". It takes a world of additional words to explain
> | exactly what it is about the example given (0.1 + 0.1 + 0.1 - 0.3 =
> | 0 exactly in decimal fp, but not in binary fp) that does, and does
> | not, generalize. Roughly, it does generalize to one important
> | real-life use-case: adding and subtracting any number of decimal
> | quantities delivers the exact decimal result, /provided/ that
> | precision is set high enough that no rounding occurs.

[Nick Maclaren]
> Precisely. There is one other such statement, too: "Decimal numbers
> can be represented exactly." What it MEANS is that numbers with a
> short representation in decimal can be represented exactly in decimal,
> which is tautologous, but many people READ it to say that numbers that
> they are interested in can be represented exactly in decimal. Such as
> pi, sqrt(2), 1/3 and so on.

Huh. I don't read it that way. If it said "numbers can be ..." I might, but reading that way seems to require effort to overlook the "decimal" in "decimal numbers can be ...".

[attribution lost]
> | and how is decimal no better than binary?

> | Basically, they both lose info when rounding does occur. For
> | example,

> Yes, but there are two ways in which binary is superior. Let's skip
> the superior 'smoothness', as being too arcane an issue for this
> group,

With 28 decimal digits used by default, few apps would care about this anyway.

> and deal with the other. In binary, calculating the mid-point of two
> numbers (a very common operation) is guaranteed to be within the range
> defined by those numbers, or to over/under-flow.
>
> Neither (x+y)/2.0 nor (x/2.0+y/2.0) are necessarily within the range
> (x,y) in decimal, even for the most respectable values of x and y.
> This was a MAJOR gotcha in the days before binary became standard,
> and will clearly return with decimal.

I view this as being an instance of "lose info when rounding does occur". For example,

    >>> import decimal as d
    >>> s = d.Decimal("." + "9" * d.getcontext().prec)
    >>> s
    Decimal("0.9999999999999999999999999999")
    >>> (s+s)/2
    Decimal("1.000000000000000000000000000")
    >>> s/2 + s/2
    Decimal("1.000000000000000000000000000")

The problems there are due to rounding error:

    >>> s/2  # the problem in s/2+s/2 is that s/2 rounds up to exactly 1/2
    Decimal("0.5000000000000000000000000000")
    >>> s+s  # the problem in (s+s)/2 is that s+s rounds up to exactly 2
    Decimal("2.000000000000000000000000000")

It's always something ;-)
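For readers on Python 3, the intermediate values in that session can be checked with something like the following sketch. The precision is pinned to 28 explicitly, and the variable names are assumptions, so the result does not depend on whatever context happens to be active.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 28
    s = Decimal("0." + "9" * ctx.prec)   # the largest 28-digit value below 1
    doubled = s + s                      # the exact sum needs 29 digits, so it rounds
    halved = s / 2                       # the exact quotient needs 29 digits, so it rounds
    mid_a = (s + s) / 2
    mid_b = s / 2 + s / 2

print(doubled, halved)
print(mid_a > s, mid_b > s)              # both "midpoints" of s and s exceed s
```

Each formula goes wrong at a different intermediate step, which is why rewriting one as the other does not help.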
Re: Maths error
Rory Campbell-Lange wrote:

> Is using the decimal module the best way around this? (I'm
> expecting the first sum to match the second). It seems
> anachronistic that decimal takes strings as input, though.

What's your problem with the result, or what's your goal? Such precision errors with floating point numbers are normal because the precision is limited technically.

For floats a and b, you'd seldom say "if a == b:" (because it's often false as in your case) but rather "if abs(a - b) < threshold:" for a reasonable threshold value which depends on your application.

Also check the recent thread "bizarre floating point output".

Regards, Björn

-- BOFH excuse #333: A plumber is needed, the network drain is clogged
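A small sketch of the threshold comparison suggested above, using the sums from the original question. The tolerance of 1e-9 is an arbitrary assumption; `math.isclose` is a standard-library helper in current Python 3 that did not exist when this thread was written.

```python
import math

total = (1.0 / 10.0) + (2.0 / 10.0) + (3.0 / 10.0)
expected = 6.0 / 10.0

exactly_equal = total == expected                     # the comparison that surprised the poster
within_threshold = abs(total - expected) < 1e-9       # absolute tolerance
close = math.isclose(total, expected, rel_tol=1e-9)   # relative tolerance

print(exactly_equal, within_threshold, close)
```

An absolute threshold suits values of known magnitude; a relative one scales with the operands, so the right choice depends on the application, as the message says.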
Re: Maths error
At Monday 8/1/2007 19:20, Bjoern Schliessmann wrote:

> Rory Campbell-Lange wrote:
>
>> Is using the decimal module the best way around this? (I'm
>> expecting the first sum to match the second). It seems
>> anachronistic that decimal takes strings as input, though.
>
> [...]
> Also check the recent thread "bizarre floating point output".

And the last section on the Python Tutorial, "Floating Point Arithmetic: Issues and Limitations".

-- Gabriel Genellina, Softlab SRL
Re: Maths error
On Jan 8, 3:30 pm, Rory Campbell-Lange [EMAIL PROTECTED] wrote:

>     >>> (1.0/10.0) + (2.0/10.0) + (3.0/10.0)
>     0.60000000000000009
>     >>> 6.0/10.0
>     0.59999999999999998
>
> Is using the decimal module the best way around this? (I'm
> expecting the first sum to match the second).

Probably not. Decimal arithmetic is NOT a cure-all for floating-point arithmetic errors.

    >>> Decimal(1) / Decimal(3) * Decimal(3)
    Decimal("0.9999999999999999999999999999")
    >>> Decimal(2).sqrt() ** 2
    Decimal("1.999999999999999999999999999")

> It seems anachronistic that decimal takes strings as input, though.

How else would you distinguish Decimal('0.1') from Decimal('0.1000000000000000055511151231257827021181583404541015625')?
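The distinction drawn in that last question can be seen directly in current Python 3, where the Decimal constructor also accepts a float and converts it exactly (the decimal module of the time this thread was written did not accept floats there). The variable names below are assumptions.

```python
from decimal import Decimal

from_string = Decimal("0.1")   # the decimal value the text spells out
from_float = Decimal(0.1)      # the binary float nearest to 0.1, converted exactly

print(from_string)
print(from_float)
print(from_string == from_float)
```

A string argument states the intended decimal digits, whereas a float argument has already been rounded to binary before the constructor sees it.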