On Wed, Sep 18, 2019 at 04:27:46PM +0900, Kyotaro Horiguchi wrote:
> Hello.
>
> At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <[email protected]> wrote in
> <[email protected]>
> > On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:
> > > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> > > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > > > > Folks,
> > > > >
> > > > > Please find attached a couple of patches intended to $subject.
> > > > >
> > > > > This patch set cut the time to copy ten million rows of randomly sized
> > > > > int8s (10 of them) by about a third, so at least for that case, it's
> > > > > pretty decent.
> > > >
> > > > Added int4 output, removed the sprintf stuff, as it didn't seem to
> > > > help in any cases I was testing.
> > >
> > > Found a couple of "whiles" that should have been "ifs."
> >
> > Factored out some inefficient functions and made the guts use the more
> > efficient function.
>
> I'm not sure this is on the KISS principle, but looked it and
> have several random comments.
>
> +numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
>
> I don't think that we are allowing that as project coding
> policy. It seems to have been introduced only to accept external
> code as-is.
Changed to fit current policy.
> - char str[23]; /* sign, 21 digits and '\0' */
> + char str[MAXINT8LEN];
>
> It's uneasy that MAXINT8LEN contains tailling NUL. MAXINT8BUFLEN
> can be so. I think MAXINT8LEN should be 20 and the definition
> should be str[MAXINT8LEN + 1].
Done.
> +static const char DIGIT_TABLE[200] = {
> + '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6',
> '0', '7', '0', '8', '0', '9',
>
> Wouldn't it be simpler if it were defined as a constant string?
>
> static const char DIGIT_TABLE[201] =
> "000102030405....19"
> "202122232425....39"
> ..
I thought this might be even clearer:
"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
> +pg_ltoa_n(int32 value, char *a)
> ...
> + /* Compute the result string. */
> + while (value >= 100000000)
>
> We have only two degits above the value. Isn't the stuff inside
> the while a waste of cycles?
Changed the while to an if.
> + /* Expensive 64-bit division. Optimize? */
>
> I believe compiler treats such trivial optimizations. (concretely
> converts into shifts and subtractons if faster.)
Comments removed.
> + memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
>
> Maybe it'd be easy to read if 'a + olength - i' is a single variable.
Done.
> + i += adjust;
> + return i;
>
> If 'a + olength - i' is replaced with a variable, the return
> statement is replacable with "return olength + adjust;".
I'm not sure I understand this.
> + return t + (v >= PowersOfTen[t]);
>
> I think it's better that if it were 't - (v < POT[t]) + 1; /*
> log10(v) + 1 */'. At least we need an explanation of the
> difference. (I'didn't checked the algorithm is truely right,
> though.)
Comments added.
> > void
> > pg_lltoa(int64 value, char *a)
> > {
> ..
> > memcpy(a, "-9223372036854775808", 21);
> ..
> > memcpy(a, "0", 2);
>
> The lines need a comment like "/* length contains trailing '\0'
> */"
Comments added.
> + if (value >= 0)
> ...
> + else
> + {
> + if (value == PG_INT32_MIN)
> + memcpy(str, "-2147483648", 11);
> + return str + 11;
> > }
> + *str++ = '-';
> + return pg_ltostr_zeropad(str, -value, minwidth - 1);
>
> If then block of the if statement were (values < 0), we won't
> need to reenter the functaion.
This is a tail-call recursion, so it's probably optimized already.
> + len = pg_ltoa_n(value, str);
> + if (minwidth <= len)
> + return str + len;
> +
> + memmove(str + minwidth - len, str, len);
>
> If the function had the parameters str with the room only for two
> digits and a NUL, 2 as minwidth but 1000 as value, the function
> would overrun the buffer. The original function just ignores
> overflowing digits.
I believe the original was incorrect.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
>From 30d8a2b1bd2f6094a5cb4dd8acb9cc36c837802a Mon Sep 17 00:00:00 2001
From: David Fetter <[email protected]>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v7] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"
This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit
- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter
diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
case INT8OID:
{
int64 num = DatumGetInt64(value);
- char str[23]; /* sign, 21 digits and '\0' */
+ char str[MAXINT8LEN + 1];
pg_lltoa(num, str);
pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
#include "utils/builtins.h"
-#define MAXINT8LEN 25
-
typedef struct
{
int64 current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..fef6da672b 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,64 @@
#include "common/int.h"
#include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+ uint32 t;
+ static uint64 PowersOfTen[] =
+ {1, 10, 100,
+ 1000, 10000, 100000,
+ 1000000, 10000000, 100000000,
+ 1000000000};
+ /*
+ * Compute base-10 logarithm by dividing the base-2 logarithm
+ * by a good-enough approximation of the base-2 logarithm of 10
+ */
+ t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+ return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+ uint32 t;
+ static uint64 PowersOfTen[] =
+ {1, 10, 100,
+ 1000, 10000, 100000,
+ 1000000, 10000000, 100000000,
+ 1000000000, 10000000000, 100000000000,
+ 1000000000000, 10000000000000, 100000000000000,
+ 1000000000000000, 10000000000000000, 100000000000000000,
+ 1000000000000000000};
+
+ /*
+ * Compute base-10 logarithm by dividing the base-2 logarithm
+ * by a good-enough approximation of the base-2 logarithm of 10
+ */
+ t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+ return t + (v >= PowersOfTen[t]);
+}
/*
* pg_atoi: convert string to integer
@@ -276,16 +334,17 @@ pg_itoa(int16 i, char *a)
}
/*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ltoa_n: converts a signed 32-bit integer to its string representation, not
+ * NUL-terminated, and returns the length of that string representation
*
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 11 bytes, counting a leading sign).
*/
-void
-pg_ltoa(int32 value, char *a)
+int32
+pg_ltoa_n(int32 value, char *a)
{
- char *start = a;
- bool neg = false;
+ uint32 olength;
+ uint32 i = 0, adjust = 0;
/*
* Avoid problems with the most negative integer not being representable
@@ -293,53 +352,114 @@ pg_ltoa(int32 value, char *a)
*/
if (value == PG_INT32_MIN)
{
- memcpy(a, "-2147483648", 12);
- return;
+ memcpy(a, "-2147483648", 11);
+ return 11;
}
- else if (value < 0)
+
+ /* Might as well handle this case, too */
+ if (value == 0)
+ {
+ memcpy(a, "0", 1);
+ return 1;
+ }
+
+ if (value < 0)
{
value = -value;
- neg = true;
- }
-
- /* Compute the result string backwards. */
- do
- {
- int32 remainder;
- int32 oldval = value;
-
- value /= 10;
- remainder = oldval - value * 10;
- *a++ = '0' + remainder;
- } while (value != 0);
-
- if (neg)
*a++ = '-';
+ adjust++;
+ }
+
+ olength = decimalLength32(value);
+
+ /* Compute the result string. */
+ if (value >= 100000000)
+ {
+ const uint32 value2 = value % 100000000;
+
+ value /= 100000000;
- /* Add trailing NUL byte, and back up 'a' to the last character. */
- *a-- = '\0';
+ const uint32 c = value2 % 10000;
+ const uint32 d = value2 / 10000;
+ const uint32 c0 = (c % 100) << 1;
+ const uint32 c1 = (c / 100) << 1;
+ const uint32 d0 = (d % 100) << 1;
+ const uint32 d1 = (d / 100) << 1;
- /* Reverse string. */
- while (start < a)
+ char *pos = a + olength - i;
+ memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+ memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+ memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+ memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+ i += 8;
+ }
+
+ if (value >= 10000)
{
- char swap = *start;
+ const uint32 c = value - 10000 * (value / 10000);
+
+ value /= 10000;
- *start++ = *a;
- *a-- = swap;
+ const uint32 c0 = (c % 100) << 1;
+ const uint32 c1 = (c / 100) << 1;
+
+ char *pos = a + olength - i;
+ memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+ memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+ i += 4;
}
+ if (value >= 100)
+ {
+ const uint32 c = (value % 100) << 1;
+
+ value /= 100;
+ char *pos = a + olength - i;
+ memcpy(pos - 2, DIGIT_TABLE + c, 2);
+ i += 2;
+ }
+ if (value >= 10)
+ {
+ const uint32 c = value << 1;
+
+ char *pos = a + olength - i;
+ memcpy(pos - 2, DIGIT_TABLE + c, 2);
+ i += 2;
+ }
+ else
+ {
+ *a = (char) ('0' + value);
+ i++;
+ }
+
+ i += adjust;
+ return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ltoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
+ */
+void
+pg_ltoa(int32 value, char *a)
+{
+ int32 len = pg_ltoa_n(value, a);
+ a[len] = '\0';
}
/*
* pg_lltoa: convert a signed 64-bit integer to its string representation
*
* Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
*/
void
pg_lltoa(int64 value, char *a)
{
- char *start = a;
- bool neg = false;
+ uint32 olength;
+ uint32 i = 0;
/*
* Avoid problems with the most negative integer not being representable
@@ -347,40 +467,91 @@ pg_lltoa(int64 value, char *a)
*/
if (value == PG_INT64_MIN)
{
- memcpy(a, "-9223372036854775808", 21);
+ /* Include terminating NUL */
+ memcpy(a, "-9223372036854775808", MAXINT8LEN + 1);
return;
}
- else if (value < 0)
+
+ /* Might as well handle this case, too */
+ if (value == 0)
+ {
+ /* Include terminating NUL */
+ memcpy(a, "0", 2);
+ return;
+ }
+
+ if (value < 0)
{
value = -value;
- neg = true;
- }
-
- /* Compute the result string backwards. */
- do
- {
- int64 remainder;
- int64 oldval = value;
-
- value /= 10;
- remainder = oldval - value * 10;
- *a++ = '0' + remainder;
- } while (value != 0);
-
- if (neg)
*a++ = '-';
+ }
+
+ olength = decimalLength64(value);
+
+ /* Compute the result string. */
+ while (value >= 100000000)
+ {
+ const uint64 q = value / 100000000; /* Most-significant digits */
+ uint32 value2 = (uint32) (value - 100000000 * q); /* Least-significant digits */
+
+ const uint32 c = value2 % 10000;
+ const uint32 d = value2 / 10000;
+ const uint32 c0 = (c % 100) << 1;
+ const uint32 c1 = (c / 100) << 1;
+ const uint32 d0 = (d % 100) << 1;
+ const uint32 d1 = (d / 100) << 1;
+
+ char *pos = a + olength - i;
+
+ value = q;
+
+ memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+ memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+ memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+ memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+ i += 8;
+ }
+
+ /* Switch to 32-bit for speed */
+ uint32 value2 = (uint32) value;
- /* Add trailing NUL byte, and back up 'a' to the last character. */
- *a-- = '\0';
+ if (value2 >= 10000)
+ {
+ const uint32 c = value2 - 10000 * (value2 / 10000);
+
+ value2 /= 10000;
+
+ const uint32 c0 = (c % 100) << 1;
+ const uint32 c1 = (c / 100) << 1;
+
+ char *pos = a + olength - i;
+ memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+ memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+ i += 4;
+ }
+ if (value2 >= 100)
+ {
+ const uint32 c = (value2 % 100) << 1;
- /* Reverse string. */
- while (start < a)
+ value2 /= 100;
+ char *pos = a + olength - i;
+ memcpy(pos - 2, DIGIT_TABLE + c, 2);
+ i += 2;
+ }
+ if (value2 >= 10)
{
- char swap = *start;
+ const uint32 c = value2 << 1;
- *start++ = *a;
- *a-- = swap;
+ char *pos = a + olength - i;
+ memcpy(pos - 2, DIGIT_TABLE + c, 2);
+ i += 2;
}
+ else
+ {
+ *a = (char) ('0' + value2);
+ }
+
+ a[olength] = '\0';
}
@@ -409,60 +580,44 @@ pg_lltoa(int64 value, char *a)
char *
pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
{
- char *start = str;
- char *end = &str[minwidth];
- int32 num = value;
+ int32 len;
Assert(minwidth > 0);
- /*
- * Handle negative numbers in a special way. We can't just write a '-'
- * prefix and reverse the sign as that would overflow for INT32_MIN.
- */
- if (num < 0)
+ if (value >= 0)
{
- *start++ = '-';
- minwidth--;
+ if (value < 100 && minwidth == 2) /* Short cut for common case */
+ {
+ const uint32 c = value << 1;
+ memcpy(str, DIGIT_TABLE + c, 2);
+ return str + 2;
+ }
+ len = pg_ltoa_n(value, str);
+ if (minwidth <= len)
+ return str + len;
+
+ memmove(str + minwidth - len, str, len);
+ for(int i = 0; i < minwidth - len; i++)
+ {
+ memcpy(str + i, DIGIT_TABLE, 1);
+ }
+ return str + minwidth;
+ }
+ else
+ {
/*
- * Build the number starting at the last digit. Here remainder will
- * be a negative number, so we must reverse the sign before adding '0'
- * in order to get the correct ASCII digit.
+ * Changing this number's sign would overflow PG_INT32_MAX,
+ * so special-case it.
*/
- while (minwidth--)
+ if (value == PG_INT32_MIN)
{
- int32 oldval = num;
- int32 remainder;
-
- num /= 10;
- remainder = oldval - num * 10;
- start[minwidth] = '0' - remainder;
+ memcpy(str, "-2147483648", 11);
+ return str + 11;
}
+ *str++ = '-';
+ return pg_ltostr_zeropad(str, -value, minwidth - 1);
}
- else
- {
- /* Build the number starting at the last digit */
- while (minwidth--)
- {
- int32 oldval = num;
- int32 remainder;
-
- num /= 10;
- remainder = oldval - num * 10;
- start[minwidth] = '0' + remainder;
- }
- }
-
- /*
- * If minwidth was not high enough to fit the number then num won't have
- * been divided down to zero. We punt the problem to pg_ltostr(), which
- * will generate a correct answer in the minimum valid width.
- */
- if (num != 0)
- return pg_ltostr(str, value);
-
- /* Otherwise, return last output character + 1 */
- return end;
}
/*
@@ -486,62 +641,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
char *
pg_ltostr(char *str, int32 value)
{
- char *start;
- char *end;
-
- /*
- * Handle negative numbers in a special way. We can't just write a '-'
- * prefix and reverse the sign as that would overflow for INT32_MIN.
- */
- if (value < 0)
- {
- *str++ = '-';
-
- /* Mark the position we must reverse the string from. */
- start = str;
-
- /* Compute the result string backwards. */
- do
- {
- int32 oldval = value;
- int32 remainder;
-
- value /= 10;
- remainder = oldval - value * 10;
- /* As above, we expect remainder to be negative. */
- *str++ = '0' - remainder;
- } while (value != 0);
- }
- else
- {
- /* Mark the position we must reverse the string from. */
- start = str;
-
- /* Compute the result string backwards. */
- do
- {
- int32 oldval = value;
- int32 remainder;
-
- value /= 10;
- remainder = oldval - value * 10;
- *str++ = '0' + remainder;
- } while (value != 0);
- }
-
- /* Remember the end+1 and back up 'str' to the last character. */
- end = str--;
-
- /* Reverse string. */
- while (start < str)
- {
- char swap = *start;
-
- *start++ = *str;
- *str-- = swap;
- }
-
- return end;
+ int32 len = pg_ltoa_n(value, str);
+ return str + len;
}
/*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..cd13536cf4 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
#include "nodes/nodes.h"
#include "utils/fmgrprotos.h"
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
/* bool.c */
extern bool parse_bool(const char *value, bool *result);
@@ -46,6 +48,7 @@ extern int32 pg_atoi(const char *s, int size, int c);
extern int16 pg_strtoint16(const char *s);
extern int32 pg_strtoint32(const char *s);
extern void pg_itoa(int16 i, char *a);
+int32 pg_ltoa_n(int32 l, char *a);
extern void pg_ltoa(int32 l, char *a);
extern void pg_lltoa(int64 ll, char *a);
extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
--------------2.21.0--