On Tue, Aug 4, 2015 at 12:41 PM, Robert Haas <robertmh...@gmail.com> wrote: > Some comments:
I attach a new version of the patch series that incorporates all your feedback. The patch series is now made cumulative in a way that makes it easy for someone to independently commit the unsigned integer comparison optimization for text, and nothing else. The macro that uses is in a dedicated header now, because I have another patch (SortSupport for the UUID type) that uses the same optimization for the same reason. It seems like something that will probably end up with a third or forth client before too long, so I think the byte swap macro wrapper belongs in sortsupport.h. BTW, I think that in practice the merge phase of a tape sort isn't much helped by comparison caching, contrary to comments appearing in the original version. The heap data structure used by polyphase merge has bad properties around locality (both temporal and spatial). I'm thinking about independently addressing that problem. I now make no claims about it in this patch. -- Peter Geoghegan
From f9a95e0b930c43a3d5c424df5cd5b3ddb2302763 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan <peter.geoghega...@gmail.com> Date: Sat, 3 Oct 2015 16:11:02 -0700 Subject: [PATCH 3/4] Add two text sort caching optimizations Firstly, cache strxfrm() blobs across calls made to the text SortSupport abbreviation routine. Secondly, cache strcoll() results across calls to the text comparison routine (SortSupport authoritative comparator only). The cost of doing this should be immeasurably small in cases where the optimization does not help, while the benefits in cases where the optimization is effective should be quite significant. --- src/backend/utils/adt/varlena.c | 75 ++++++++++++++++++++++++++++++++++++++--- 1 file changed, 71 insertions(+), 4 deletions(-) diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 63ca437..da55a84 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -61,6 +61,9 @@ typedef struct char *buf2; /* 2nd string, or abbreviation strxfrm() buf */ int buflen1; int buflen2; + int last_len1; /* Length of last buf1 string/strxfrm() blob */ + int last_len2; /* Length of last buf2 string/strxfrm() blob */ + int last_returned; /* Last comparison result (cache) */ bool collate_c; hyperLogLogState abbr_card; /* Abbreviated key cardinality state */ hyperLogLogState full_card; /* Full key cardinality state */ @@ -1831,9 +1834,20 @@ btsortsupport_worker(SortSupport ssup, Oid collid) tss->buflen1 = TEXTBUFLEN; tss->buf2 = palloc(TEXTBUFLEN); tss->buflen2 = TEXTBUFLEN; + /* Start with invalid values */ + tss->last_len1 = -1; + tss->last_len2 = -1; #ifdef HAVE_LOCALE_T tss->locale = locale; #endif + /* + * To avoid somehow confusing a strxfrm() blob and an original string + * within bttextfastcmp_locale(), initialize last returned cache to a + * sentinel value. A platform-specific actual strcoll() return value + * of INT_MIN seems unlikely, but if that occurs it will not cause + * wrong answers. + */ + tss->last_returned = INT_MIN; tss->collate_c = collate_c; ssup->ssup_extra = tss; @@ -1896,6 +1910,7 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup) { text *arg1 = DatumGetTextPP(x); text *arg2 = DatumGetTextPP(y); + bool arg1_match; TextSortSupport *tss = (TextSortSupport *) ssup->ssup_extra; /* working state */ @@ -1914,6 +1929,11 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup) /* Fast pre-check for equality, as discussed in varstr_cmp() */ if (len1 == len2 && memcmp(a1p, a2p, len1) == 0) { + /* + * No change in buf1 or buf2 contents, so avoid changing last_len1 or + * last_len2. Existing contents of buffers might still be used by next + * call. + */ result = 0; goto done; } @@ -1931,10 +1951,43 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup) tss->buf2 = MemoryContextAlloc(ssup->ssup_cxt, tss->buflen2); } - memcpy(tss->buf1, a1p, len1); - tss->buf1[len1] = '\0'; - memcpy(tss->buf2, a2p, len2); - tss->buf2[len2] = '\0'; + /* + * We're likely to be asked to compare the same strings repeatedly, and + * memcmp() is so much cheaper than strcoll() that it pays to try to cache + * comparisons, even though in general there is no reason to think that + * that will work out (every text datum may be unique). Caching does not + * slow things down measurably when it doesn't work out, and can speed + * things up by rather a lot when it does. In part, this is because the + * memcmp() compares data from cachelines that are needed in L1 cache even + * when the last comparison's result cannot be reused. + */ + arg1_match = true; + if (len1 != tss->last_len1 || memcmp(tss->buf1, a1p, len1) != 0) + { + arg1_match = false; + memcpy(tss->buf1, a1p, len1); + tss->buf1[len1] = '\0'; + tss->last_len1 = len1; + } + + /* + * If we're comparing the same two strings as last time, we can return the + * same answer without calling strcoll() again. This is more likely than + * it seems (at least with moderate to low cardinality sets), because + * quicksort compares the same pivot against many values. + */ + if (len2 != tss->last_len2 || memcmp(tss->buf2, a2p, len2) != 0) + { + memcpy(tss->buf2, a2p, len2); + tss->buf2[len2] = '\0'; + tss->last_len2 = len2; + } + else if (arg1_match && tss->last_returned != INT_MIN) + { + /* Use result cached following last actual strcoll() call */ + result = tss->last_returned; + goto done; + } #ifdef HAVE_LOCALE_T if (tss->locale) @@ -1951,6 +2004,8 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup) if (result == 0) result = strcmp(tss->buf1, tss->buf2); + /* Cache result, perhaps saving an expensive strcoll() call next time */ + tss->last_returned = result; done: /* We can't afford to leak memory here. */ if (PointerGetDatum(arg1) != x) @@ -2033,9 +2088,19 @@ bttext_abbrev_convert(Datum original, SortSupport ssup) tss->buf1 = palloc(tss->buflen1); } + /* Might be able to reuse strxfrm() blob from last call */ + if (tss->last_len1 == len && + memcmp(tss->buf1, authoritative_data, len) == 0) + { + memcpy(pres, tss->buf2, Min(sizeof(Datum), tss->last_len2)); + /* No change affecting cardinality, so no hashing required */ + goto done; + } + /* Just like strcoll(), strxfrm() expects a NUL-terminated string */ memcpy(tss->buf1, authoritative_data, len); tss->buf1[len] = '\0'; + tss->last_len1 = len; for (;;) { @@ -2047,6 +2112,7 @@ bttext_abbrev_convert(Datum original, SortSupport ssup) #endif bsize = strxfrm(tss->buf2, tss->buf1, tss->buflen2); + tss->last_len2 = bsize; if (bsize < tss->buflen2) break; @@ -2104,6 +2170,7 @@ bttext_abbrev_convert(Datum original, SortSupport ssup) addHyperLogLog(&tss->abbr_card, hash); +done: /* Byteswap on little-endian machines: */ res = ABBREV_STRING_UINT(res); /* Don't leak memory here */ -- 1.9.1
From 5847c49dcf4f98adb0d3bfc16a43fb3135c852b4 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan <peter.geoghega...@gmail.com> Date: Fri, 3 Jul 2015 13:48:08 -0700 Subject: [PATCH 2/4] Use unsigned integer abbreviated keys for text Arrange for text abbreviated keys to be represented such that a simple 3-way unsigned integer comparison using the new representation is correct. This is somewhat faster on at least some platforms, since unsigned integer comparisons are now used rather than memcmp() during sorting proper. On little-endian machines, this requires a byteswap at the end of each call to conversion routine (big-endian machines use the same bitwise representation as before). --- src/backend/utils/adt/varlena.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 2fbbf54..63ca437 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -1967,25 +1967,25 @@ done: static int bttextcmp_abbrev(Datum x, Datum y, SortSupport ssup) { - char *a = (char *) &x; - char *b = (char *) &y; - int result; - - result = memcmp(a, b, sizeof(Datum)); - /* - * When result = 0, the core system will call bttextfastcmp_c() or + * When 0 is returned, the core system will call bttextfastcmp_c() or * bttextfastcmp_locale(). Even a strcmp() on two non-truncated strxfrm() * blobs cannot indicate *equality* authoritatively, for the same reason * that there is a strcoll() tie-breaker call to strcmp() in varstr_cmp(). */ - return result; + if (x > y) + return 1; + else if (x == y) + return 0; + else + return -1; } /* * Conversion routine for sortsupport. Converts original text to abbreviated * key representation. Our encoding strategy is simple -- pack the first 8 - * bytes of a strxfrm() blob into a Datum. + * bytes of a strxfrm() blob into a Datum (on little-endian machines, the 8 + * bytes are stored in reverse order), and treat it as an unsigned integer. */ static Datum bttext_abbrev_convert(Datum original, SortSupport ssup) @@ -2104,6 +2104,8 @@ bttext_abbrev_convert(Datum original, SortSupport ssup) addHyperLogLog(&tss->abbr_card, hash); + /* Byteswap on little-endian machines: */ + res = ABBREV_STRING_UINT(res); /* Don't leak memory here */ if (PointerGetDatum(authoritative) != original) pfree(authoritative); -- 1.9.1
From b3005f3bce8c96191eb5e523ea2480766b496bd1 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan <peter.geoghega...@gmail.com> Date: Mon, 29 Jun 2015 23:53:05 -0400 Subject: [PATCH 1/4] Add BSWAP64() byte-swapping macro This includes detection of __builtin_bswap64() built-in availability, which BSWAP64() often becomes no more than a simple wrapper for. BSWAP64() can be used to convert 64-bit big-endian unsigned integers to 64-bit little-endian unsigned integers, and vice-versa. Add another macro, ABBREV_STRING_UINT(), that allows string-like types to represent abbreviated keys as unsigned integers (ABBREV_STRING_UINT() will use BSWAP64() on 64-bit little-endian platforms). In future commits, certain types can be made to use simple 3-way unsigned integer comparisons (sizeof(Datum)-wide unsigned integers) within their abbreviated key comparators; they need only first use ABBREV_STRING_UINT() within their SortSupport conversion routines to alter the abbreviated representation and make this safe on all supported platforms. --- config/c-compiler.m4 | 18 ++++++++++++++ configure | 24 +++++++++++++++++++ configure.in | 1 + src/include/c.h | 22 +++++++++++++++++ src/include/pg_config.h.in | 3 +++ src/include/pg_config.h.win32 | 3 +++ src/include/port/pg_crc32c.h | 10 -------- src/include/utils/sortsupport.h | 52 +++++++++++++++++++++++++++++++++++++++++ 8 files changed, 123 insertions(+), 10 deletions(-) diff --git a/config/c-compiler.m4 b/config/c-compiler.m4 index 9feec0c..550d034 100644 --- a/config/c-compiler.m4 +++ b/config/c-compiler.m4 @@ -214,6 +214,24 @@ fi])# PGAC_C_BUILTIN_BSWAP32 +# PGAC_C_BUILTIN_BSWAP64 +# ------------------------- +# Check if the C compiler understands __builtin_bswap64(), +# and define HAVE__BUILTIN_BSWAP64 if so. +AC_DEFUN([PGAC_C_BUILTIN_BSWAP64], +[AC_CACHE_CHECK(for __builtin_bswap64, pgac_cv__builtin_bswap64, +[AC_COMPILE_IFELSE([AC_LANG_SOURCE( +[static unsigned long int x = __builtin_bswap64(0xaabbccddeeff0011);] +)], +[pgac_cv__builtin_bswap64=yes], +[pgac_cv__builtin_bswap64=no])]) +if test x"$pgac_cv__builtin_bswap64" = xyes ; then +AC_DEFINE(HAVE__BUILTIN_BSWAP64, 1, + [Define to 1 if your compiler understands __builtin_bswap64.]) +fi])# PGAC_C_BUILTIN_BSWAP64 + + + # PGAC_C_BUILTIN_CONSTANT_P # ------------------------- # Check if the C compiler understands __builtin_constant_p(), diff --git a/configure b/configure index 99ef10f..b771a83 100755 --- a/configure +++ b/configure @@ -11259,6 +11259,30 @@ if test x"$pgac_cv__builtin_bswap32" = xyes ; then $as_echo "#define HAVE__BUILTIN_BSWAP32 1" >>confdefs.h fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_bswap64" >&5 +$as_echo_n "checking for __builtin_bswap64... " >&6; } +if ${pgac_cv__builtin_bswap64+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +static unsigned long int x = __builtin_bswap64(0xaabbccddeeff0011); + +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + pgac_cv__builtin_bswap64=yes +else + pgac_cv__builtin_bswap64=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_bswap64" >&5 +$as_echo "$pgac_cv__builtin_bswap64" >&6; } +if test x"$pgac_cv__builtin_bswap64" = xyes ; then + +$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h + +fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5 $as_echo_n "checking for __builtin_constant_p... " >&6; } if ${pgac_cv__builtin_constant_p+:} false; then : diff --git a/configure.in b/configure.in index 4143451..b5868b0 100644 --- a/configure.in +++ b/configure.in @@ -1317,6 +1317,7 @@ PGAC_C_FUNCNAME_SUPPORT PGAC_C_STATIC_ASSERT PGAC_C_TYPES_COMPATIBLE PGAC_C_BUILTIN_BSWAP32 +PGAC_C_BUILTIN_BSWAP64 PGAC_C_BUILTIN_CONSTANT_P PGAC_C_BUILTIN_UNREACHABLE PGAC_C_VA_ARGS diff --git a/src/include/c.h b/src/include/c.h index 8163b00..9be41dc 100644 --- a/src/include/c.h +++ b/src/include/c.h @@ -943,6 +943,28 @@ typedef NameData *Name; #define STATUS_FOUND (1) #define STATUS_WAITING (2) +/* byte swapping macros */ +#ifdef HAVE__BUILTIN_BSWAP32 +#define BSWAP32(x) __builtin_bswap32(x) +#else +#define BSWAP32(x) (((x << 24) & 0xff000000) | \ + ((x << 8) & 0x00ff0000) | \ + ((x >> 8) & 0x0000ff00) | \ + ((x >> 24) & 0x000000ff)) +#endif /* HAVE__BUILTIN_BSWAP32 */ + +#ifdef HAVE__BUILTIN_BSWAP64 +#define BSWAP64(x) __builtin_bswap64(x) +#else +#define BSWAP64(x) (((x << 56) & 0xff00000000000000UL) | \ + ((x << 40) & 0x00ff000000000000UL) | \ + ((x << 24) & 0x0000ff0000000000UL) | \ + ((x << 8) & 0x000000ff00000000UL) | \ + ((x >> 8) & 0x00000000ff000000UL) | \ + ((x >> 24) & 0x0000000000ff0000UL) | \ + ((x >> 40) & 0x000000000000ff00UL) | \ + ((x >> 56) & 0x00000000000000ffUL)) +#endif /* HAVE__BUILTIN_BSWAP64 */ /* * Append PG_USED_FOR_ASSERTS_ONLY to definitions of variables that are only diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index dda73a8..a20e0cd 100644 --- a/src/include/pg_config.h.in +++ b/src/include/pg_config.h.in @@ -660,6 +660,9 @@ /* Define to 1 if your compiler understands __builtin_bswap32. */ #undef HAVE__BUILTIN_BSWAP32 +/* Define to 1 if your compiler understands __builtin_bswap64. */ +#undef HAVE__BUILTIN_BSWAP64 + /* Define to 1 if your compiler understands __builtin_constant_p. */ #undef HAVE__BUILTIN_CONSTANT_P diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32 index 6f7a773..8566065 100644 --- a/src/include/pg_config.h.win32 +++ b/src/include/pg_config.h.win32 @@ -508,6 +508,9 @@ /* Define to 1 if your compiler understands __builtin_bswap32. */ /* #undef HAVE__BUILTIN_BSWAP32 */ +/* Define to 1 if your compiler understands __builtin_bswap64. */ +/* #undef HAVE__BUILTIN_BSWAP64 */ + /* Define to 1 if your compiler understands __builtin_constant_p. */ /* #undef HAVE__BUILTIN_CONSTANT_P */ diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h index c925c56..e5389aa 100644 --- a/src/include/port/pg_crc32c.h +++ b/src/include/port/pg_crc32c.h @@ -71,16 +71,6 @@ extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) #define COMP_CRC32C(crc, data, len) \ ((crc) = pg_comp_crc32c_sb8((crc), (data), (len))) #ifdef WORDS_BIGENDIAN - -#ifdef HAVE__BUILTIN_BSWAP32 -#define BSWAP32(x) __builtin_bswap32(x) -#else -#define BSWAP32(x) (((x << 24) & 0xff000000) | \ - ((x << 8) & 0x00ff0000) | \ - ((x >> 8) & 0x0000ff00) | \ - ((x >> 24) & 0x000000ff)) -#endif - #define FIN_CRC32C(crc) ((crc) = BSWAP32(crc) ^ 0xFFFFFFFF) #else #define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF) diff --git a/src/include/utils/sortsupport.h b/src/include/utils/sortsupport.h index 4258630..a8bedfd 100644 --- a/src/include/utils/sortsupport.h +++ b/src/include/utils/sortsupport.h @@ -267,6 +267,58 @@ ApplySortAbbrevFullComparator(Datum datum1, bool isNull1, return compare; } +/* + * Make unsigned integer (Datum) 3-way comparisons safe for string-like types. + * + * For some types, their abbreviated representations could reasonably be built + * such that comparisons are binary string comparisons. For example, for a + * string type, an abbreviated comparator like this might be used: + * + * static int + * btstringcmp_abbrev(Datum x, Datum y, SortSupport ssup) + * { + * return memcmp((char *) &x, (char *) &y, sizeof(Datum)); + * } + * + * Perhaps the conversion routine packs NUL bytes used as sentinel values + * representing termination in the event of short strings too, so short strings + * require no special treatment (assuming NUL bytes are generally disallowed). + * There might be slightly different specifics for each type, but this basic + * pattern recurs relatively often. + * + * On big-endian machines, compilers might be able to optimize this to just a + * few CPU instructions (involving unsigned integer comparisons). Opclass + * authors should not rely on that, though. Instead, they should use the + * ABBREV_STRING_UINT() macro within their conversion routine to make sure that + * it's safe to use a 3-way comparator that performs simple unsigned integer + * comparisons explicitly. The abbreviated comparator should then look like + * this instead: + * + * static int + * btstringcmp_abbrev(Datum x, Datum y, SortSupport ssup) + * { + * if (x > y) + * return 1; + * else if (x == y) + * return 0; + * else + * return -1; + * } + * + * Apart from ensuring even cheaper comparisons on big-endian machines, this + * technique also works on little-endian machines, since a byteswap can be + * performed within ABBREV_STRING_UINT(). + */ +#ifdef WORDS_BIGENDIAN +#define ABBREV_STRING_UINT(x) (x) +#else /* !WORDS_BIGENDIAN */ +#if SIZEOF_DATUM == 8 +#define ABBREV_STRING_UINT(x) BSWAP64(x) +#else /* SIZEOF_DATUM != 8 */ +#define ABBREV_STRING_UINT(x) BSWAP32(x) +#endif /* SIZEOF_DATUM == 8 */ +#endif /* WORDS_BIGENDIAN */ + /* Other functions in utils/sort/sortsupport.c */ extern void PrepareSortSupportComparisonShim(Oid cmpFunc, SortSupport ssup); extern void PrepareSortSupportFromOrderingOp(Oid orderingOp, SortSupport ssup); -- 1.9.1
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers