Please, let's revisit this, and not postpone it without further discussion. I never knew about the correctness issues in div_var(), but since I know about it, I feel I am just waiting until that problem will hit me or anyone else. So can you, Tom, please describe in what situations the old code is really unsafe?

We usually *round* all values to at maximum 4 decimal places -- are we on the save side? Does this only affect high precision division, or any divisions?

Best Regards
Michael Paesold

Bruce Momjian wrote:
Because this has not been applied, this has been saved for the 8.4 release:

        http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Tom Lane wrote:
I wrote:
I just blew the dust off my old copy of Knuth vol 2, and see that his
algorithm for multi-precision division generates output digits that are
correct to start with (or at least he never needs to revisit a digit
after moving on to the next).  ISTM we should go over to an approach
like that.
The attached proposed patch rewrites div_var() using Knuth's algorithm,
meaning that division should always produce an exact truncated output
when asked to truncate at X number of places.  This passes regression
tests and fixes both of the cases previously exhibited:
http://archives.postgresql.org/pgsql-bugs/2007-06/msg00068.php
http://archives.postgresql.org/pgsql-general/2005-05/msg01109.php

The bad news is that it's significantly slower than the CVS-HEAD code;
it appears that for long inputs, div_var is something like 4X slower
than before, depending on platform.  The numeric_big regression test
takes about twice as long as before on one of my machines, and 50%
longer on another.  This is because the innermost loop now involves
integer division, which it didn't before.  (According to oprofile,
just about all the time goes into the loop that subtracts qhat * divisor
from the working dividend, which is what you'd expect.)

Now it's unlikely that real-world applications are going to be as
dependent on the speed of div_var for long inputs as numeric_big is.
And getting the right answer has to take priority over speed anyway.
Still this is a bit annoying.  Anyone see a way to speed it up, or
have another approach?

                        regards, tom lane


Content-Description: numeric-div.patch.gz

[ Type application/octet-stream treated as attachment, skipping... ]


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Reply via email to