Re: [HACKERS] Reduce palloc's in numeric operations.
Thanks for comments, > Have to be careful to really not modify the > operands. numeric_out() and numeric_out_sci() are wrong; they > call get_str_from_var(), which modifies the argument. Same with > numeric_expr(): it passes the argument to > numericvar_to_double_no_overflow(), which passes it to > get_str_from_var(). numericvar_to_int8() also modifies its > argument, so all the functions that use that, directly or > indirectly, must make a copy. mmm. My carefulness was a bit short to pick up them... I overlooked that get_str_from_var() and numeric_to_int8() calls round_var() which rewrites the operand. I reverted numeric_out() and numeric_int8(), numeric_int2(). Altough, I couldn't find in get_str_from_var_sci() where the operand would be modified. The lines where var showing in get_str_from_var_sci() is listed below. | 2:get_str_from_var_sci(NumericVar *var, int rscale) | 21: if (var->ndigits > 0) | 23:exponent = (var->weight + 1) * DEC_DIGITS; | 29:exponent -= DEC_DIGITS - (int) log10(var->digits[0]); | 59: div_var(var, &denominator, &significand, rscale, true); The only suspect is div_var at this level, and do the same thing for var1 in div_var. | 2:div_var(NumericVar *var1, NumericVar *var2, NumericVar *result, | 20: int var1ndigits = var1->ndigits; | 35: if (var1ndigits == 0) | 47: if (var1->sign == var2->sign) | 51: res_weight = var1->weight - var2->weight; | 68: div_ndigits = Max(div_ndigits, var1ndigits); | 83: memcpy(dividend + 1, var1->digits, var1ndigits * sizeof(NumericDigit)); | 132: for (i = var1ndigits; i >= 0; i--) No line seems to modify var1 as far as I see so I've left numeric_out_sci() modified in this patch. Well, I found some other bugs in numeric_stddev_internal. vN was errorniously freed and vsumX2 in is used as work. They are fixed in this new patch. > Perhaps get_str_from_var(), and the other functions that > currently scribble on the arguments, should be modified to not > do so. They could easily make a copy of the argument within the > function. Then the callers could safely use > set_var_from_num_nocopy(). The performance would be the same, > you would have the same number of pallocs, but you would get > rid of the surprising argument-modifying behavior of those > functions. I agree with that. const qualifiers on parameters would rule this mechanically. I try this for the next version of this patch. > SELECT SUM(col) FROM numtest; > > The execution time of that query fell from about 5300 ms to 4300 ms, ie. > about 20%. Wow, it seems more promising than I expected. Thanks. regards, -- 堀口恭太郎 日本電信電話株式会社 NTTオープンソフトウェアセンタ Phone: 03-5860-5115 / Fax: 03-5463-5490 diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index 68c1f1d..fcff325 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c @@ -367,6 +367,7 @@ static void zero_var(NumericVar *var); static const char *set_var_from_str(const char *str, const char *cp, NumericVar *dest); static void set_var_from_num(Numeric value, NumericVar *dest); +static void set_var_from_num_nocopy(Numeric num, NumericVar *dest); static void set_var_from_var(NumericVar *value, NumericVar *dest); static char *get_str_from_var(NumericVar *var, int dscale); static char *get_str_from_var_sci(NumericVar *var, int rscale); @@ -617,11 +618,10 @@ numeric_out_sci(Numeric num, int scale) return pstrdup("NaN"); init_var(&x); - set_var_from_num(num, &x); + set_var_from_num_nocopy(num, &x); str = get_str_from_var_sci(&x, scale); - free_var(&x); return str; } @@ -696,7 +696,7 @@ numeric_send(PG_FUNCTION_ARGS) int i; init_var(&x); - set_var_from_num(num, &x); + set_var_from_num_nocopy(num, &x); pq_begintypsend(&buf); @@ -707,8 +707,6 @@ numeric_send(PG_FUNCTION_ARGS) for (i = 0; i < x.ndigits; i++) pq_sendint(&buf, x.digits[i], sizeof(NumericDigit)); - free_var(&x); - PG_RETURN_BYTEA_P(pq_endtypsend(&buf)); } @@ -1577,15 +1575,13 @@ numeric_add(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_var_from_num_nocopy(num1, &arg1); + set_var_from_num_nocopy(num2, &arg2); add_var(&arg1, &arg2, &result); res = make_result(&result); - free_var(&arg1); - free_var(&arg2); free_var(&result); PG_RETURN_NUMERIC(res); @@ -1620,15 +1616,13 @@ numeric_sub(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_var_from_num_nocopy(num1, &arg1); + set_var_from_num_nocopy(num2, &arg2); sub_var(&arg1, &arg2, &result); res = make_result(&result); - free_var(&arg1); - free_var(&arg2); free_var(&result); PG_RETURN_NUMERIC(res); @@ -1667,15 +1661,13 @@ numeric_mul(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_var_from_num_nocopy(num1, &arg1); + set_var_
Re: [HACKERS] Reduce palloc's in numeric operations.
Kyotaro HORIGUCHI wrote: > Hello, I will propose reduce palloc's in numeric operations. > > The numeric operations are slow by nature, but usually it is not > a problem for on-disk operations. Altough the slowdown is > enhanced on on-memory operations. > > I inspcted them and found some very short term pallocs. These > palloc's are used for temporary storage for digits of unpaked > numerics. This looks like a neat little patch. Some feedback has been provided by Heikki (thanks!) and since we're still waiting for an updated version, I have marked this Returned with Feedback for the time being. Please make sure to address the remaining issues and submit to the next commitfest. Thanks. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reduce palloc's in numeric operations.
On 14.09.2012 11:25, Kyotaro HORIGUCHI wrote: Hello, I will propose reduce palloc's in numeric operations. The numeric operations are slow by nature, but usually it is not a problem for on-disk operations. Altough the slowdown is enhanced on on-memory operations. I inspcted them and found some very short term pallocs. These palloc's are used for temporary storage for digits of unpaked numerics. The formats of numeric digits in packed and unpaked forms are same. So we can kicked out a part of palloc's using digits in packed numeric in-place to make unpakced one. In this patch, I added new function set_var_from_num_nocopy() to do this. And make use of it for operands which won't modified. Have to be careful to really not modify the operands. numeric_out() and numeric_out_sci() are wrong; they call get_str_from_var(), which modifies the argument. Same with numeric_expr(): it passes the argument to numericvar_to_double_no_overflow(), which passes it to get_str_from_var(). numericvar_to_int8() also modifies its argument, so all the functions that use that, directly or indirectly, must make a copy. Perhaps get_str_from_var(), and the other functions that currently scribble on the arguments, should be modified to not do so. They could easily make a copy of the argument within the function. Then the callers could safely use set_var_from_num_nocopy(). The performance would be the same, you would have the same number of pallocs, but you would get rid of the surprising argument-modifying behavior of those functions. The performance gain seems quite moderate 'SELECT SUM(numeric_column) FROM on_memory_table' for ten million rows and about 8 digits numeric runs for 3480 ms aganst original 3930 ms. It's 11% gain. 'SELECT SUM(int_column) FROM on_memory_table' needed 1570 ms. Similary 8% gain for about 30 - 50 digits numeric. Performance of avg(numeric) made no gain in contrast. Do you think this worth doing? Yes, I think this is worthwhile. I'm seeing an even bigger gain, with smaller numerics. I created a table with this: CREATE TABLE numtest AS SELECT a::numeric AS col FROM generate_series(1, 1000) a; And repeated this query with \timing: SELECT SUM(col) FROM numtest; The execution time of that query fell from about 5300 ms to 4300 ms, ie. about 20%. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Reduce palloc's in numeric operations.
Hello, I will propose reduce palloc's in numeric operations. The numeric operations are slow by nature, but usually it is not a problem for on-disk operations. Altough the slowdown is enhanced on on-memory operations. I inspcted them and found some very short term pallocs. These palloc's are used for temporary storage for digits of unpaked numerics. The formats of numeric digits in packed and unpaked forms are same. So we can kicked out a part of palloc's using digits in packed numeric in-place to make unpakced one. In this patch, I added new function set_var_from_num_nocopy() to do this. And make use of it for operands which won't modified. The performance gain seems quite moderate 'SELECT SUM(numeric_column) FROM on_memory_table' for ten million rows and about 8 digits numeric runs for 3480 ms aganst original 3930 ms. It's 11% gain. 'SELECT SUM(int_column) FROM on_memory_table' needed 1570 ms. Similary 8% gain for about 30 - 50 digits numeric. Performance of avg(numeric) made no gain in contrast. Do you think this worth doing? regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index 68c1f1d..8e88bd5 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c @@ -367,6 +367,7 @@ static void zero_var(NumericVar *var); static const char *set_var_from_str(const char *str, const char *cp, NumericVar *dest); static void set_var_from_num(Numeric value, NumericVar *dest); +static void set_var_from_num_nocopy(Numeric num, NumericVar *dest); static void set_var_from_var(NumericVar *value, NumericVar *dest); static char *get_str_from_var(NumericVar *var, int dscale); static char *get_str_from_var_sci(NumericVar *var, int rscale); @@ -540,12 +541,10 @@ numeric_out(PG_FUNCTION_ARGS) * from rounding. */ init_var(&x); - set_var_from_num(num, &x); + set_var_from_num_nocopy(num, &x); str = get_str_from_var(&x, x.dscale); - free_var(&x); - PG_RETURN_CSTRING(str); } @@ -617,11 +616,10 @@ numeric_out_sci(Numeric num, int scale) return pstrdup("NaN"); init_var(&x); - set_var_from_num(num, &x); + set_var_from_num_nocopy(num, &x); str = get_str_from_var_sci(&x, scale); - free_var(&x); return str; } @@ -696,7 +694,7 @@ numeric_send(PG_FUNCTION_ARGS) int i; init_var(&x); - set_var_from_num(num, &x); + set_var_from_num_nocopy(num, &x); pq_begintypsend(&buf); @@ -707,8 +705,6 @@ numeric_send(PG_FUNCTION_ARGS) for (i = 0; i < x.ndigits; i++) pq_sendint(&buf, x.digits[i], sizeof(NumericDigit)); - free_var(&x); - PG_RETURN_BYTEA_P(pq_endtypsend(&buf)); } @@ -1577,15 +1573,13 @@ numeric_add(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_var_from_num_nocopy(num1, &arg1); + set_var_from_num_nocopy(num2, &arg2); add_var(&arg1, &arg2, &result); res = make_result(&result); - free_var(&arg1); - free_var(&arg2); free_var(&result); PG_RETURN_NUMERIC(res); @@ -1620,15 +1614,13 @@ numeric_sub(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_var_from_num_nocopy(num1, &arg1); + set_var_from_num_nocopy(num2, &arg2); sub_var(&arg1, &arg2, &result); res = make_result(&result); - free_var(&arg1); - free_var(&arg2); free_var(&result); PG_RETURN_NUMERIC(res); @@ -1667,15 +1659,13 @@ numeric_mul(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_var_from_num_nocopy(num1, &arg1); + set_var_from_num_nocopy(num2, &arg2); mul_var(&arg1, &arg2, &result, arg1.dscale + arg2.dscale); res = make_result(&result); - free_var(&arg1); - free_var(&arg2); free_var(&result); PG_RETURN_NUMERIC(res); @@ -1711,8 +1701,8 @@ numeric_div(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_var_from_num_nocopy(num1, &arg1); + set_var_from_num_nocopy(num2, &arg2); /* * Select scale for division result @@ -1726,8 +1716,6 @@ numeric_div(PG_FUNCTION_ARGS) res = make_result(&result); - free_var(&arg1); - free_var(&arg2); free_var(&result); PG_RETURN_NUMERIC(res); @@ -1762,8 +1750,8 @@ numeric_div_trunc(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_var_from_num_nocopy(num1, &arg1); + set_var_from_num_nocopy(num2, &arg2); /* * Do the divide and return the result @@ -1772,8 +1760,6 @@ numeric_div_trunc(PG_FUNCTION_ARGS) res = make_result(&result); - free_var(&arg1); - free_var(&arg2); free_var(&result); PG_RETURN_NUMERIC(res); @@ -1802,15 +1788,13 @@ numeric_mod(PG_FUNCTION_ARGS) init_var(&arg2); init_var(&result); - set_var_from_num(num1, &arg1); - set_var_from_num(num2, &arg2); + set_va