[sc-issues] [Issue 104430] Modularize Calc's express ion evaluator to support alternate arithmetics

jeffrey_s Sat, 22 Aug 2009 09:35:17 -0700

To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=104430
                 Issue #|104430
                 Summary|Modularize Calc's expression evaluator to support alte
                        |rnate arithmetics
               Component|Spreadsheet
                 Version|OOO310m15
                Platform|All
                     URL|http://speleotrove.com/decimal/
              OS/Version|All
                  Status|UNCONFIRMED
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|ENHANCEMENT
                Priority|P3
            Subcomponent|programming
             Assigned to|spreadsheet
             Reported by|jeffrey_s






------- Additional comments from jeffre...@openoffice.org Sat Aug 22 16:35:13 
+0000 2009 -------
Please consider modularizing Calc's expression evaluator and documenting the
resulting interface.

It would be desirable to move away from binary floating point arithmetic for
spreadsheet calculations for the reasons mentioned in
http://speleotrove.com/decimal/.  (It would also be nice to be able to adjust
the precision with which calculations are done, and implement alternate
semantics such as interval arithmetic, but I digress.)

This is, of course, practicably impossible, given the conflicting goals of
maintaining approximate compatibility with older spreadsheet versions, various
macro & programming APIs, plug-ins, and an infamous spreadsheet from a Redmond,
Washington company.  However, if we modularize things, we can replace the
expression evaluator to enable new semantics later on.

Right now, all of the spreadsheet values are passed back and forth internally as
IEEE 754 double precision binary floats.  Cosmetic rounding to suppress binary
floating point artifacts is performed after addition and subtraction (maybe
others) by flushing answers close to zero to zero.  This is not documented
anywhere that I have found.  (It's probably a better policy than the one cooked
up in Redmond.  Refer to some of William Kahan's papers on the topic.)  However,
it still creates a chasm between zero and the nearest nonzero numbers.  (For
reasons why this is bad, refer to the discussions surrounding the denormalized
values in IEEE 754, especially with the new revision.  Flush-to-zero is
wonderful if you're just doing graphics for a game, but deadly if the numbers
actually mean something.  This isn't quite the same issue, but it has the same
effect.)

There is a check-box in "Options" to have the string value of the spreadsheet
cell reparsed for use as an input to calculations that refer to it instead of
using a cached binary floating point value, which sort of addresses this issue,
but (1) it will always suffer from double rounding, which makes it break some
corner cases when doing modular arithmetic and (2) it can't do any better than
double precision binary floating point, and (3) it gets in the way of formatting
a spreadsheet to carry all the decimal places when you're trying to simulate
that kind of thing.

If the expression evaluator could be moved to a .DLL/.so, and the API
generalized to pass a string or general-purpose data structure (instead of just
a binary double-precision float) (and documented thus), then it could be
replaced in a separate project with a tool that would provide expression
evaluation using decimal semantics, rational semantics, or otherwise modified
semantics.  

In order to actually make this doable in polynomial time, I suggest replacing
the internal representation of the double-precision cell contents value with a
string that will initially just hex-encode (or maybe octal would be faster) the
IEEE 754 double bitfields (and treat them as opaque outside the expression
evaluator).  Make all the conversions to and from and all calculations the
responsibility of the expression evaluator .DLL/.so, so that everything else
outside the expression evaluator .DLL/.so treats the encoded float as opaque. 
All of the standard spreadsheet formulas would be the responsibility of the
expression evaluator (as they depend on the format of the floating point numbers
used).

With the expression evaluation mechanism parameterized out of the rest of the
spreadsheet module, it would be possible to create alternate expression
evaluators that did not depend of having everything reduced to a common binary
floating point representation, which would in turn permit representations that
carry more precision or follow different rounding semantics.

Ultimately, we're going to have to move spreadsheets away from binary floating
point, and this will be true of all spreadsheets, not just OpenOffice.  People
expect their calculators to follow the rules that they learned in school, and
that's why calculator universally use decimal representation internally.  They
also expect spreadsheets to follow the rules, but for performance reasons about
20 years ago, we needed to use binary floating point, and we've been stuck with
it ever since.  We're not facing performace issues of the type we faced 20 years
ago.  There's no reason to continue to inflict this on ourselves.

---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@sc.openoffice.org
For additional commands, e-mail: issues-h...@sc.openoffice.org


---------------------------------------------------------------------
To unsubscribe, e-mail: allbugs-unsubscr...@openoffice.org
For additional commands, e-mail: allbugs-h...@openoffice.org

[sc-issues] [Issue 104430] Modularize Calc's express ion evaluator to support alternate arithmetics

Reply via email to