Hi, On Sun, Jan 29, 2023, at 14:33, Dean Rasheed wrote: > On Sat, 28 Jan 2023 at 22:14, Joel Jacobson <j...@compiler.org> wrote: >> HEAD, patched: >> sweight = (arg.weight * DEC_DIGITS) / 2 + 1 > > You haven't actually said why this formula is more correct than the > current one. I believe that it is when arg.weight >= 0, but I'm not > convinced it's correct for arg.weight < 0.
Ops, I totally failed to consider arg.weight < 0. Nice catch, thanks! It seems that, when arg.weight < 0, the current sweight formula in HEAD is correct, and, when arg.weight >= 0, the formula I suggested seems to be an improvement. I think this is what we want: if (arg.weight < 0) sweight = (arg.weight + 1) * DEC_DIGITS / 2 - 1; else sweight = arg.weight * DEC_DIGITS / 2 + 1; For DEC_DIGITS == 4, they are both the same, and become (arg.weight * 2 + 1). But when DEC_DIGITS != 4 && arg.weight >= 0, it's an improvement, as we then don't need to generate unnecessarily many sig. digits, and more often get exactly NUMERIC_MIN_SIG_DIGITS in the result, while still never getting fewer than NUMERIC_MIN_SIG_DIGITS. When DEC_DIGITS == 4, the compiler optimizes away the if/else, and compiles it to exactly the same code as the current sweight formula, tested with Godbolt: Test code: #define DEC_DIGITS 4 int sweight(weight) { if (weight < 0) return (weight + 1) * DEC_DIGITS / 2 - 1; else return weight * DEC_DIGITS / 2 + 1; } Output for x86-64 gcc (trunk): sweight: lea eax, [rdi+1+rdi] ret So, the extra if/else shouldn't cause any overhead, when DEC_DIGITS == 4. Not sure how/if it can be mathematically proven why it's more correct when DEC_DIGITS != 4, i.e., when it makes a difference. I derived it based on the insight that the square root weight should naturally be half the arg weight, and that the integer divison means we must adjust for when the weight is not evenly divisable by two. But it seems possible to gain some confidence about the correctness of the improvement by experimentally testing a wide range of negative/positive arg.weight. I wrote the attached script, test_sweight_formulas.sh, for that purpose. It compares the number of sig. digits in the result for sqrt(trim_scale(2::numeric*10::numeric^exp)) where exp is -40..40, that is, for args between 0.0000000000000000000000000000000000000002 0.000000000000000000000000000000000000002 ... 2000000000000000000000000000000000000000 20000000000000000000000000000000000000000 The last column shows the diff in number of sig. figs between HEAD and fix-sweight-v2. As expected, there is no difference for DEC_DIGITS == 4. But note the differences for DEC_DIGITS == 2 and DEC_DIGITS == 1 further down. -- -- Comparison of sig_figs(sqrt(n)) for HEAD vs fix-sweight-v2 -- when DEC_DIGITS == 4 -- DEC_DIGITS | n | HEAD | fix-sweight-v2 | diff ------------+--------------+------+----------------+------ 4 | 2e-40 | 21 | 21 | 0 4 | 2e-39 | 20 | 20 | 0 4 | 2e-38 | 20 | 20 | 0 4 | 2e-37 | 19 | 19 | 0 4 | 2e-36 | 19 | 19 | 0 4 | 2e-35 | 18 | 18 | 0 4 | 2e-34 | 18 | 18 | 0 4 | 2e-33 | 17 | 17 | 0 4 | 2e-32 | 17 | 17 | 0 4 | 2e-31 | 16 | 16 | 0 4 | 2e-30 | 17 | 17 | 0 4 | 2e-29 | 17 | 17 | 0 4 | 2e-28 | 16 | 16 | 0 4 | 2e-27 | 16 | 16 | 0 4 | 2e-26 | 17 | 17 | 0 4 | 2e-25 | 17 | 17 | 0 4 | 2e-24 | 16 | 16 | 0 4 | 2e-23 | 16 | 16 | 0 4 | 2e-22 | 17 | 17 | 0 4 | 2e-21 | 17 | 17 | 0 4 | 2e-20 | 16 | 16 | 0 4 | 2e-19 | 16 | 16 | 0 4 | 2e-18 | 17 | 17 | 0 4 | 2e-17 | 17 | 17 | 0 4 | 2e-16 | 16 | 16 | 0 4 | 2e-15 | 16 | 16 | 0 4 | 2e-14 | 17 | 17 | 0 4 | 2e-13 | 17 | 17 | 0 4 | 2e-12 | 16 | 16 | 0 4 | 2e-11 | 16 | 16 | 0 4 | 0.0000000002 | 17 | 17 | 0 4 | 0.000000002 | 17 | 17 | 0 4 | 0.00000002 | 16 | 16 | 0 4 | 0.0000002 | 16 | 16 | 0 4 | 0.000002 | 17 | 17 | 0 4 | 0.00002 | 17 | 17 | 0 4 | 0.0002 | 16 | 16 | 0 4 | 0.002 | 16 | 16 | 0 4 | 0.02 | 17 | 17 | 0 4 | 0.2 | 17 | 17 | 0 4 | 2 | 16 | 16 | 0 4 | 20 | 16 | 16 | 0 4 | 200 | 17 | 17 | 0 4 | 2000 | 17 | 17 | 0 4 | 20000 | 16 | 16 | 0 4 | 200000 | 16 | 16 | 0 4 | 2000000 | 17 | 17 | 0 4 | 20000000 | 17 | 17 | 0 4 | 200000000 | 16 | 16 | 0 4 | 2000000000 | 16 | 16 | 0 4 | 20000000000 | 17 | 17 | 0 4 | 2e+11 | 17 | 17 | 0 4 | 2e+12 | 16 | 16 | 0 4 | 2e+13 | 16 | 16 | 0 4 | 2e+14 | 17 | 17 | 0 4 | 2e+15 | 17 | 17 | 0 4 | 2e+16 | 16 | 16 | 0 4 | 2e+17 | 16 | 16 | 0 4 | 2e+18 | 17 | 17 | 0 4 | 2e+19 | 17 | 17 | 0 4 | 2e+20 | 16 | 16 | 0 4 | 2e+21 | 16 | 16 | 0 4 | 2e+22 | 17 | 17 | 0 4 | 2e+23 | 17 | 17 | 0 4 | 2e+24 | 16 | 16 | 0 4 | 2e+25 | 16 | 16 | 0 4 | 2e+26 | 17 | 17 | 0 4 | 2e+27 | 17 | 17 | 0 4 | 2e+28 | 16 | 16 | 0 4 | 2e+29 | 16 | 16 | 0 4 | 2e+30 | 17 | 17 | 0 4 | 2e+31 | 17 | 17 | 0 4 | 2e+32 | 17 | 17 | 0 4 | 2e+33 | 17 | 17 | 0 4 | 2e+34 | 18 | 18 | 0 4 | 2e+35 | 18 | 18 | 0 4 | 2e+36 | 19 | 19 | 0 4 | 2e+37 | 19 | 19 | 0 4 | 2e+38 | 20 | 20 | 0 4 | 2e+39 | 20 | 20 | 0 4 | 2e+40 | 21 | 21 | 0 (81 rows) -- -- Comparison of sig_figs(sqrt(n)) for HEAD vs fix-sweight-v2 -- when DEC_DIGITS == 2 -- DEC_DIGITS | n | HEAD | fix-sweight-v2 | diff ------------+--------------+------+----------------+------ 2 | 2e-40 | 21 | 21 | 0 2 | 2e-39 | 20 | 20 | 0 2 | 2e-38 | 20 | 20 | 0 2 | 2e-37 | 19 | 19 | 0 2 | 2e-36 | 19 | 19 | 0 2 | 2e-35 | 18 | 18 | 0 2 | 2e-34 | 18 | 18 | 0 2 | 2e-33 | 17 | 17 | 0 2 | 2e-32 | 17 | 17 | 0 2 | 2e-31 | 17 | 17 | 0 2 | 2e-30 | 17 | 17 | 0 2 | 2e-29 | 17 | 17 | 0 2 | 2e-28 | 17 | 17 | 0 2 | 2e-27 | 17 | 17 | 0 2 | 2e-26 | 17 | 17 | 0 2 | 2e-25 | 17 | 17 | 0 2 | 2e-24 | 17 | 17 | 0 2 | 2e-23 | 17 | 17 | 0 2 | 2e-22 | 17 | 17 | 0 2 | 2e-21 | 17 | 17 | 0 2 | 2e-20 | 17 | 17 | 0 2 | 2e-19 | 17 | 17 | 0 2 | 2e-18 | 17 | 17 | 0 2 | 2e-17 | 17 | 17 | 0 2 | 2e-16 | 17 | 17 | 0 2 | 2e-15 | 17 | 17 | 0 2 | 2e-14 | 17 | 17 | 0 2 | 2e-13 | 17 | 17 | 0 2 | 2e-12 | 17 | 17 | 0 2 | 2e-11 | 17 | 17 | 0 2 | 0.0000000002 | 17 | 17 | 0 2 | 0.000000002 | 17 | 17 | 0 2 | 0.00000002 | 17 | 17 | 0 2 | 0.0000002 | 17 | 17 | 0 2 | 0.000002 | 17 | 17 | 0 2 | 0.00002 | 17 | 17 | 0 2 | 0.0002 | 17 | 17 | 0 2 | 0.002 | 17 | 17 | 0 2 | 0.02 | 17 | 17 | 0 2 | 0.2 | 17 | 17 | 0 2 | 2 | 17 | 16 | -1 2 | 20 | 17 | 16 | -1 2 | 200 | 17 | 16 | -1 2 | 2000 | 17 | 16 | -1 2 | 20000 | 17 | 16 | -1 2 | 200000 | 17 | 16 | -1 2 | 2000000 | 17 | 16 | -1 2 | 20000000 | 17 | 16 | -1 2 | 200000000 | 17 | 16 | -1 2 | 2000000000 | 17 | 16 | -1 2 | 20000000000 | 17 | 16 | -1 2 | 2e+11 | 17 | 16 | -1 2 | 2e+12 | 17 | 16 | -1 2 | 2e+13 | 17 | 16 | -1 2 | 2e+14 | 17 | 16 | -1 2 | 2e+15 | 17 | 16 | -1 2 | 2e+16 | 17 | 16 | -1 2 | 2e+17 | 17 | 16 | -1 2 | 2e+18 | 17 | 16 | -1 2 | 2e+19 | 17 | 16 | -1 2 | 2e+20 | 17 | 16 | -1 2 | 2e+21 | 17 | 16 | -1 2 | 2e+22 | 17 | 16 | -1 2 | 2e+23 | 17 | 16 | -1 2 | 2e+24 | 17 | 16 | -1 2 | 2e+25 | 17 | 16 | -1 2 | 2e+26 | 17 | 16 | -1 2 | 2e+27 | 17 | 16 | -1 2 | 2e+28 | 17 | 16 | -1 2 | 2e+29 | 17 | 16 | -1 2 | 2e+30 | 17 | 16 | -1 2 | 2e+31 | 17 | 16 | -1 2 | 2e+32 | 17 | 17 | 0 2 | 2e+33 | 17 | 17 | 0 2 | 2e+34 | 18 | 18 | 0 2 | 2e+35 | 18 | 18 | 0 2 | 2e+36 | 19 | 19 | 0 2 | 2e+37 | 19 | 19 | 0 2 | 2e+38 | 20 | 20 | 0 2 | 2e+39 | 20 | 20 | 0 2 | 2e+40 | 21 | 21 | 0 (81 rows) -- -- Comparison of sig_figs(sqrt(n)) for HEAD vs fix-sweight-v2 -- when DEC_DIGITS == 1 -- DEC_DIGITS | n | HEAD | fix-sweight-v2 | diff ------------+--------------+------+----------------+------ 1 | 2e-40 | 21 | 21 | 0 1 | 2e-39 | 20 | 20 | 0 1 | 2e-38 | 20 | 20 | 0 1 | 2e-37 | 19 | 19 | 0 1 | 2e-36 | 19 | 19 | 0 1 | 2e-35 | 18 | 18 | 0 1 | 2e-34 | 18 | 18 | 0 1 | 2e-33 | 17 | 17 | 0 1 | 2e-32 | 17 | 17 | 0 1 | 2e-31 | 17 | 17 | 0 1 | 2e-30 | 17 | 17 | 0 1 | 2e-29 | 17 | 17 | 0 1 | 2e-28 | 17 | 17 | 0 1 | 2e-27 | 17 | 17 | 0 1 | 2e-26 | 17 | 17 | 0 1 | 2e-25 | 17 | 17 | 0 1 | 2e-24 | 17 | 17 | 0 1 | 2e-23 | 17 | 17 | 0 1 | 2e-22 | 17 | 17 | 0 1 | 2e-21 | 17 | 17 | 0 1 | 2e-20 | 17 | 17 | 0 1 | 2e-19 | 17 | 17 | 0 1 | 2e-18 | 17 | 17 | 0 1 | 2e-17 | 17 | 17 | 0 1 | 2e-16 | 17 | 17 | 0 1 | 2e-15 | 17 | 17 | 0 1 | 2e-14 | 17 | 17 | 0 1 | 2e-13 | 17 | 17 | 0 1 | 2e-12 | 17 | 17 | 0 1 | 2e-11 | 17 | 17 | 0 1 | 0.0000000002 | 17 | 17 | 0 1 | 0.000000002 | 17 | 17 | 0 1 | 0.00000002 | 17 | 17 | 0 1 | 0.0000002 | 17 | 17 | 0 1 | 0.000002 | 17 | 17 | 0 1 | 0.00002 | 17 | 17 | 0 1 | 0.0002 | 17 | 17 | 0 1 | 0.002 | 17 | 17 | 0 1 | 0.02 | 17 | 17 | 0 1 | 0.2 | 17 | 17 | 0 1 | 2 | 18 | 16 | -2 1 | 20 | 17 | 16 | -1 1 | 200 | 18 | 16 | -2 1 | 2000 | 17 | 16 | -1 1 | 20000 | 18 | 16 | -2 1 | 200000 | 17 | 16 | -1 1 | 2000000 | 18 | 16 | -2 1 | 20000000 | 17 | 16 | -1 1 | 200000000 | 18 | 16 | -2 1 | 2000000000 | 17 | 16 | -1 1 | 20000000000 | 18 | 16 | -2 1 | 2e+11 | 17 | 16 | -1 1 | 2e+12 | 18 | 16 | -2 1 | 2e+13 | 17 | 16 | -1 1 | 2e+14 | 18 | 16 | -2 1 | 2e+15 | 17 | 16 | -1 1 | 2e+16 | 18 | 16 | -2 1 | 2e+17 | 17 | 16 | -1 1 | 2e+18 | 18 | 16 | -2 1 | 2e+19 | 17 | 16 | -1 1 | 2e+20 | 18 | 16 | -2 1 | 2e+21 | 17 | 16 | -1 1 | 2e+22 | 18 | 16 | -2 1 | 2e+23 | 17 | 16 | -1 1 | 2e+24 | 18 | 16 | -2 1 | 2e+25 | 17 | 16 | -1 1 | 2e+26 | 18 | 16 | -2 1 | 2e+27 | 17 | 16 | -1 1 | 2e+28 | 18 | 16 | -2 1 | 2e+29 | 17 | 16 | -1 1 | 2e+30 | 18 | 16 | -2 1 | 2e+31 | 17 | 16 | -1 1 | 2e+32 | 18 | 17 | -1 1 | 2e+33 | 17 | 17 | 0 1 | 2e+34 | 18 | 18 | 0 1 | 2e+35 | 18 | 18 | 0 1 | 2e+36 | 19 | 19 | 0 1 | 2e+37 | 19 | 19 | 0 1 | 2e+38 | 20 | 20 | 0 1 | 2e+39 | 20 | 20 | 0 1 | 2e+40 | 21 | 21 | 0 (81 rows) > You lost me here. In unpatched HEAD, sqrt(102::numeric) produces > 10.099504938362078, not 10.09950493836208 (with DEC_DIGITS = 4). And > wasn't your previous point that when DEC_DIGITS = 4, the new formula > is the same as the old one? My apologies, I failed to correctly copy/paste the output from my terminal, into the right DEC_DIGITS example, which made it look like the output changed for DEC_DIGITS == 4 even though it didn't. Note to myself to always write a test script to ensure results are reproducible, which they now are thanks to the new test_sweight_formulas.sh script. >> In passing, also add pow10[] values for DEC_DIGITS==2 and DEC_DIGITS==1, >> since otherwise it's not possible to compile such DEC_DIGITS values >> due to the assert: >> >> StaticAssertDecl(lengthof(pow10) == DEC_DIGITS, "mismatch with >> DEC_DIGITS"); >> > > That might be worth doing, to ensure that the code still compiles for > other DEC_DIGITS/NBASE values. I'm not sure how useful that really is > anymore though. As the comment at the top says, it's kept mostly for > historical reasons. Attached patch fix-pow10-assert.patch /Joel
fix-pow10-assert.patch
Description: Binary data
fix-sweight-v2.patch
Description: Binary data
DEC_DIGITS_1.patch
Description: Binary data
DEC_DIGITS_2.patch
Description: Binary data
#!/bin/sh current_dir=$(pwd) export PGDATA="$current_dir/pgtemp-data" export PGPORT=54321 export PATH="$current_dir/pgtemp/bin:$PATH" if [ ! -d "postgresql" ]; then git clone https://git.postgresql.org/git/postgresql.git mv postgresql-foo postgresql ( cd postgresql && \ git reset --hard && \ ./configure --prefix="$current_dir/pgtemp" && \ make -j4 && \ make install ) && \ initdb && \ pg_ctl -l "$current_dir/pg.log" start && \ createdb "$USER" || \ exit 1 cat <<-"EOF" | psql -X || exit CREATE OR REPLACE FUNCTION sig_figs(numeric) RETURNS int LANGUAGE sql AS $$ SELECT 1+coalesce(floor(log(abs(nullif($1,0))))::int,0)+scale($1) $$; CREATE TABLE test_results ( "DEC_DIGITS" int, pg_version text, n_double double precision, -- used for sorting, can't use numeric since that's what we are testing n_text text, sqrt text, sqrt_text text, sqrt_sig_figs int ); EOF else pg_ctl -l "$current_dir/pg.log" start || exit cat <<-"EOF" | psql || exit TRUNCATE TABLE test_results; EOF fi for pg_version in "HEAD" "fix-sweight-v2"; do for DEC_DIGITS in 4 2 1; do ( cd postgresql && \ git reset --hard && \ git apply "$current_dir/fix-pow10-assert.patch" ) || exit if [ "$pg_version" != "HEAD" ] then (cd postgresql && \ git apply "$current_dir/$pg_version.patch") || exit fi if [ $DEC_DIGITS != 4 ] then (cd postgresql && \ git apply "$current_dir/DEC_DIGITS_$DEC_DIGITS.patch") || exit fi (cd postgresql && make install && pg_ctl restart) || exit cat <<-EOF | psql -X || exit WITH nums AS ( SELECT exp, trim_scale(2::numeric*10::numeric^exp) n FROM generate_series(-40,40) exp ) INSERT INTO test_results ("DEC_DIGITS", pg_version, n_double, n_text, sqrt, sqrt_sig_figs) SELECT '$DEC_DIGITS', '$pg_version', n::double precision, CASE WHEN abs(exp) > 10 THEN format('2e%s%s',CASE WHEN exp > 0 THEN '+' END,exp) ELSE n::text END, sqrt(n), sig_figs(sqrt(n)) FROM nums; EOF done done for DEC_DIGITS in 4 2 1; do echo "--" echo "-- Comparison of sig_figs(sqrt(n)) for HEAD vs fix-sweight-v2" echo "-- when DEC_DIGITS == $DEC_DIGITS" echo "--" cat <<-EOF | psql -X || exit SELECT a."DEC_DIGITS", a.n_text AS n, a.sqrt_sig_figs AS "HEAD", b.sqrt_sig_figs AS "fix-sweight-v2", b.sqrt_sig_figs - a.sqrt_sig_figs AS diff FROM test_results a JOIN test_results b USING ("DEC_DIGITS",n_text) WHERE a.pg_version = 'HEAD' AND b.pg_version = 'fix-sweight-v2' AND a."DEC_DIGITS" = $DEC_DIGITS ORDER BY a.n_double EOF done pg_ctl stop