To comment on the following update, log in, then open the issue: http://www.openoffice.org/issues/show_bug.cgi?id=104430 Issue #|104430 Summary|Modularize Calc's expression evaluator to support alte |rnate arithmetics Component|Spreadsheet Version|OOO310m15 Platform|All URL|http://speleotrove.com/decimal/ OS/Version|All Status|UNCONFIRMED Status whiteboard| Keywords| Resolution| Issue type|ENHANCEMENT Priority|P3 Subcomponent|programming Assigned to|spreadsheet Reported by|jeffrey_s
------- Additional comments from jeffre...@openoffice.org Sat Aug 22 16:35:13 +0000 2009 ------- Please consider modularizing Calc's expression evaluator and documenting the resulting interface. It would be desirable to move away from binary floating point arithmetic for spreadsheet calculations for the reasons mentioned in http://speleotrove.com/decimal/. (It would also be nice to be able to adjust the precision with which calculations are done, and implement alternate semantics such as interval arithmetic, but I digress.) This is, of course, practicably impossible, given the conflicting goals of maintaining approximate compatibility with older spreadsheet versions, various macro & programming APIs, plug-ins, and an infamous spreadsheet from a Redmond, Washington company. However, if we modularize things, we can replace the expression evaluator to enable new semantics later on. Right now, all of the spreadsheet values are passed back and forth internally as IEEE 754 double precision binary floats. Cosmetic rounding to suppress binary floating point artifacts is performed after addition and subtraction (maybe others) by flushing answers close to zero to zero. This is not documented anywhere that I have found. (It's probably a better policy than the one cooked up in Redmond. Refer to some of William Kahan's papers on the topic.) However, it still creates a chasm between zero and the nearest nonzero numbers. (For reasons why this is bad, refer to the discussions surrounding the denormalized values in IEEE 754, especially with the new revision. Flush-to-zero is wonderful if you're just doing graphics for a game, but deadly if the numbers actually mean something. This isn't quite the same issue, but it has the same effect.) There is a check-box in "Options" to have the string value of the spreadsheet cell reparsed for use as an input to calculations that refer to it instead of using a cached binary floating point value, which sort of addresses this issue, but (1) it will always suffer from double rounding, which makes it break some corner cases when doing modular arithmetic and (2) it can't do any better than double precision binary floating point, and (3) it gets in the way of formatting a spreadsheet to carry all the decimal places when you're trying to simulate that kind of thing. If the expression evaluator could be moved to a .DLL/.so, and the API generalized to pass a string or general-purpose data structure (instead of just a binary double-precision float) (and documented thus), then it could be replaced in a separate project with a tool that would provide expression evaluation using decimal semantics, rational semantics, or otherwise modified semantics. In order to actually make this doable in polynomial time, I suggest replacing the internal representation of the double-precision cell contents value with a string that will initially just hex-encode (or maybe octal would be faster) the IEEE 754 double bitfields (and treat them as opaque outside the expression evaluator). Make all the conversions to and from and all calculations the responsibility of the expression evaluator .DLL/.so, so that everything else outside the expression evaluator .DLL/.so treats the encoded float as opaque. All of the standard spreadsheet formulas would be the responsibility of the expression evaluator (as they depend on the format of the floating point numbers used). With the expression evaluation mechanism parameterized out of the rest of the spreadsheet module, it would be possible to create alternate expression evaluators that did not depend of having everything reduced to a common binary floating point representation, which would in turn permit representations that carry more precision or follow different rounding semantics. Ultimately, we're going to have to move spreadsheets away from binary floating point, and this will be true of all spreadsheets, not just OpenOffice. People expect their calculators to follow the rules that they learned in school, and that's why calculator universally use decimal representation internally. They also expect spreadsheets to follow the rules, but for performance reasons about 20 years ago, we needed to use binary floating point, and we've been stuck with it ever since. We're not facing performace issues of the type we faced 20 years ago. There's no reason to continue to inflict this on ourselves. --------------------------------------------------------------------- Please do not reply to this automatically generated notification from Issue Tracker. Please log onto the website and enter your comments. http://qa.openoffice.org/issue_handling/project_issues.html#notification --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@sc.openoffice.org For additional commands, e-mail: issues-h...@sc.openoffice.org --------------------------------------------------------------------- To unsubscribe, e-mail: allbugs-unsubscr...@openoffice.org For additional commands, e-mail: allbugs-h...@openoffice.org