On Tue, Aug 18, 2020 at 6:53 AM Peter Geoghegan <p...@bowt.ie> wrote:
> I definitely think that we should have something like this, though.
> It's a relatively easy win. There are plenty of workloads that spend
> lots of time on pruning.
Alright then, here's an attempt to flesh the idea out a bit more, and replace the three other copies of qsort() while I'm at it.
From 8d0b569fcf6141b622c63fc4bc102c762f01ca9e Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Mon, 17 Aug 2020 21:31:56 +1200 Subject: [PATCH 1/4] Add sort_template.h for making fast sort functions. Move our qsort implementation into a header that can be used to define specialized functions for better performance. Discussion: https://postgr.es/m/CA%2BhUKGKMQFVpjr106gRhwk6R-nXv0qOcTreZuQzxgpHESAL6dw%40mail.gmail.com --- src/include/lib/sort_template.h | 428 ++++++++++++++++++++++++++++++++ 1 file changed, 428 insertions(+) create mode 100644 src/include/lib/sort_template.h diff --git a/src/include/lib/sort_template.h b/src/include/lib/sort_template.h new file mode 100644 index 0000000000..a279bcf959 --- /dev/null +++ b/src/include/lib/sort_template.h @@ -0,0 +1,428 @@ +/*------------------------------------------------------------------------- + * + * sort_template.h + * + * A template for a sort algorithm that supports varying degrees of + * specialization. + * + * Copyright (c) 2020, PostgreSQL Global Development Group + * + * Usage notes: + * + * To generate functions specialized for a type, the following parameter + * macros should be #define'd before this file is included. + * + * - ST_SORT - the name of a sort function to be generated + * - ST_ELEMENT_TYPE - type of the referenced elements + * - ST_DECLARE - if defined the functions and types are declared + * - ST_DEFINE - if defined the functions and types are defined + * - ST_SCOPE - scope (e.g. extern, static inline) for functions + * + * Instead of ST_ELEMENT_TYPE, ST_ELEMENT_TYPE_VOID can be defined. Then + * the generated functions will automatically gain an "element_size" + * parameter. This allows us to generate a traditional qsort function. + * + * One of the following macros must be defined, to show how to compare + * elements. 
The first two options are arbitrary expressions depending + * on whether an extra pass-through argument is desired, and the third + * option should be defined if the sort function should receive a + * function pointer at runtime. + * + * - ST_COMPARE(a, b) - a simple comparison expression + * - ST_COMPARE(a, b, arg) - variant that takes an extra argument + * - ST_COMPARE_RUNTIME_POINTER - sort function takes a function pointer + * + * To say that the comparator and therefore also sort function should + * receive an extra pass-through argument, specify the type of the + * argument. + * + * - ST_COMPARE_ARG_TYPE - type of extra argument + * + * The prototype of the generated sort function is: + * + * void ST_SORT(ST_ELEMENT_TYPE *data, size_t n, + * [size_t element_size,] + * [ST_SORT_compare_function compare,] + * [ST_COMPARE_ARG_TYPE *arg]); + * + * ST_SORT_compare_function is a function pointer of the following type: + * + * int (*)(const ST_ELEMENT_TYPE *a, const ST_ELEMENT_TYPE *b, + * [ST_COMPARE_ARG_TYPE *arg]) + * + * HISTORY + * + * Modifications from vanilla NetBSD source: + * - Add do ... while() macro fix + * - Remove __inline, _DIAGASSERTs, __P + * - Remove ill-considered "swap_cnt" switch to insertion sort, in favor + * of a simple check for presorted input. + * - Take care to recurse on the smaller partition, to bound stack usage + * - Convert into a header that can generate specialized functions + * + * IDENTIFICATION + * src/include/lib/sort_template.h + * + *------------------------------------------------------------------------- + */ + +/* $NetBSD: qsort.c,v 1.13 2003/08/07 16:43:42 agc Exp $ */ + +/*- + * Copyright (c) 1992, 1993 + * The Regents of the University of California. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* + * Qsort routine based on J. L. Bentley and M. D. McIlroy, + * "Engineering a sort function", + * Software--Practice and Experience 23 (1993) 1249-1265. + * + * We have modified their original by adding a check for already-sorted + * input, which seems to be a win per discussions on pgsql-hackers around + * 2006-03-21. + * + * Also, we recurse on the smaller partition and iterate on the larger one, + * which ensures we cannot recurse more than log(N) levels (since the + * partition recursed to is surely no more than half of the input). 
Bentley + * and McIlroy explicitly rejected doing this on the grounds that it's "not + * worth the effort", but we have seen crashes in the field due to stack + * overrun, so that judgment seems wrong. + */ + +#define ST_MAKE_PREFIX(a) CppConcat(a,_) +#define ST_MAKE_NAME(a,b) ST_MAKE_NAME_(ST_MAKE_PREFIX(a),b) +#define ST_MAKE_NAME_(a,b) CppConcat(a,b) + +/* + * If the element type is void, we'll also need an element_size argument + * because we don't know the size. + */ +#ifdef ST_ELEMENT_TYPE_VOID +#define ST_ELEMENT_TYPE void +#define ST_SORT_PROTO_SIZE , size_t element_size +#define ST_SORT_INVOKE_SIZE , element_size +#else +#define ST_SORT_PROTO_SIZE +#define ST_SORT_INVOKE_SIZE +#endif + +/* + * If the user wants to be able to pass in compare functions at runtime, + * we'll need to make that an argument of the sort and med3 functions. + */ +#ifdef ST_COMPARE_RUNTIME_POINTER +/* + * The type of the comparator function pointer that ST_SORT will take, unless + * you've already declared a type name manually and want to use that instead of + * having a new one defined. + */ +#ifndef ST_COMPARATOR_TYPE_NAME +#define ST_COMPARATOR_TYPE_NAME ST_MAKE_NAME(ST_SORT, compare_function) +#endif +#define ST_COMPARE compare +#ifndef ST_COMPARE_ARG_TYPE +#define ST_SORT_PROTO_COMPARE , ST_COMPARATOR_TYPE_NAME compare +#define ST_SORT_INVOKE_COMPARE , compare +#else +#define ST_SORT_PROTO_COMPARE , ST_COMPARATOR_TYPE_NAME compare +#define ST_SORT_INVOKE_COMPARE , compare +#endif +#else +#define ST_SORT_PROTO_COMPARE +#define ST_SORT_INVOKE_COMPARE +#endif + +/* + * If the user wants to use a compare function or expression that takes an + * extra argument, we'll need to make that an argument of the sort, compare and + * med3 functions. 
+ */ +#ifdef ST_COMPARE_ARG_TYPE +#define ST_SORT_PROTO_ARG , ST_COMPARE_ARG_TYPE *arg +#define ST_SORT_INVOKE_ARG , arg +#else +#define ST_SORT_PROTO_ARG +#define ST_SORT_INVOKE_ARG +#endif + +#ifdef ST_DECLARE + +#ifdef ST_COMPARE_RUNTIME_POINTER +typedef int (*ST_COMPARATOR_TYPE_NAME)(const ST_ELEMENT_TYPE *, + const ST_ELEMENT_TYPE * + ST_SORT_PROTO_ARG); +#endif + +/* Declare the sort function. Note optional arguments at end. */ +ST_SCOPE void ST_SORT(ST_ELEMENT_TYPE *first, size_t n + ST_SORT_PROTO_SIZE + ST_SORT_PROTO_COMPARE + ST_SORT_PROTO_ARG); + +#endif + +#ifdef ST_DEFINE + +/* sort private helper functions */ +#define ST_MED3 ST_MAKE_NAME(ST_SORT, med3) +#define ST_SWAP ST_MAKE_NAME(ST_SORT, swap) +#define ST_SWAPN ST_MAKE_NAME(ST_SORT, swapn) + +/* Users expecting to run very large sorts may need them to be interruptible. */ +#ifdef ST_CHECK_FOR_INTERRUPTS +#define DO_CHECK_FOR_INTERRUPTS() CHECK_FOR_INTERRUPTS() +#else +#define DO_CHECK_FOR_INTERRUPTS() +#endif + +/* + * Create wrapper macros that know how to invoke compare, med3 and sort with + * the right arguments. + */ +#ifdef ST_COMPARE_RUNTIME_POINTER +#define DO_COMPARE(a_, b_) ST_COMPARE((a_), (b_) ST_SORT_INVOKE_ARG) +#elif defined(ST_COMPARE_ARG_TYPE) +#define DO_COMPARE(a_, b_) ST_COMPARE((a_), (b_), arg) +#else +#define DO_COMPARE(a_, b_) ST_COMPARE((a_), (b_)) +#endif +#define DO_MED3(a_, b_, c_) \ + ST_MED3((a_), (b_), (c_) \ + ST_SORT_INVOKE_COMPARE \ + ST_SORT_INVOKE_ARG) +#define DO_SORT(a_, n_) \ + ST_SORT((a_), (n_) \ + ST_SORT_INVOKE_SIZE \ + ST_SORT_INVOKE_COMPARE \ + ST_SORT_INVOKE_ARG) + +/* + * If we're working with void pointers, we'll use pointer arithmetic based on + * uint8, and use the runtime element_size to step through the array and swap + * elements. Otherwise we'll work with ST_ELEMENT_TYPE. 
+ */ +#ifndef ST_ELEMENT_TYPE_VOID +#define ST_POINTER_TYPE ST_ELEMENT_TYPE +#define ST_POINTER_STEP 1 +#define DO_SWAPN(a_, b_, n_) ST_SWAPN((a_), (b_), (n_)) +#define DO_SWAP(a_, b_) ST_SWAP((a_), (b_)) +#else +#define ST_POINTER_TYPE uint8 +#define ST_POINTER_STEP element_size +#define DO_SWAPN(a_, b_, n_) ST_SWAPN((a_), (b_), (n_)) +#define DO_SWAP(a_, b_) DO_SWAPN((a_), (b_), element_size) +#endif + +/* + * Find the median of three values. Currently, performance seems to be best + * if the comparator is inlined here, but the med3 function is not inlined + * in the qsort function. + */ +static pg_noinline ST_ELEMENT_TYPE * +ST_MED3(ST_ELEMENT_TYPE *a, + ST_ELEMENT_TYPE *b, + ST_ELEMENT_TYPE *c + ST_SORT_PROTO_COMPARE + ST_SORT_PROTO_ARG) +{ + return DO_COMPARE(a, b) < 0 ? + (DO_COMPARE(b, c) < 0 ? b : (DO_COMPARE(a, c) < 0 ? c : a)) + : (DO_COMPARE(b, c) > 0 ? b : (DO_COMPARE(a, c) < 0 ? a : c)); +} + +static inline void +ST_SWAP(ST_POINTER_TYPE *a, ST_POINTER_TYPE *b) +{ + ST_POINTER_TYPE tmp = *a; + + *a = *b; + *b = tmp; +} + +static inline void +ST_SWAPN(ST_POINTER_TYPE *a, ST_POINTER_TYPE *b, size_t n) +{ + for (size_t i = 0; i < n; ++i) + ST_SWAP(&a[i], &b[i]); +} + +/* + * Sort an array. 
+ */ +ST_SCOPE void +ST_SORT(ST_ELEMENT_TYPE *data, size_t n + ST_SORT_PROTO_SIZE + ST_SORT_PROTO_COMPARE + ST_SORT_PROTO_ARG) +{ + ST_POINTER_TYPE *a = (ST_POINTER_TYPE *) data, + *pa, + *pb, + *pc, + *pd, + *pl, + *pm, + *pn; + size_t d1, + d2; + int r, + presorted; + +loop: + DO_CHECK_FOR_INTERRUPTS(); + if (n < 7) + { + for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP; + pm += ST_POINTER_STEP) + for (pl = pm; pl > a && DO_COMPARE(pl - ST_POINTER_STEP, pl) > 0; + pl -= ST_POINTER_STEP) + DO_SWAP(pl, pl - ST_POINTER_STEP); + return; + } + presorted = 1; + for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP; + pm += ST_POINTER_STEP) + { + DO_CHECK_FOR_INTERRUPTS(); + if (DO_COMPARE(pm - ST_POINTER_STEP, pm) > 0) + { + presorted = 0; + break; + } + } + if (presorted) + return; + pm = a + (n / 2) * ST_POINTER_STEP; + if (n > 7) + { + pl = a; + pn = a + (n - 1) * ST_POINTER_STEP; + if (n > 40) + { + size_t d = (n / 8) * ST_POINTER_STEP; + + pl = DO_MED3(pl, pl + d, pl + 2 * d); + pm = DO_MED3(pm - d, pm, pm + d); + pn = DO_MED3(pn - 2 * d, pn - d, pn); + } + pm = DO_MED3(pl, pm, pn); + } + DO_SWAP(a, pm); + pa = pb = a + ST_POINTER_STEP; + pc = pd = a + (n - 1) * ST_POINTER_STEP; + for (;;) + { + while (pb <= pc && (r = DO_COMPARE(pb, a)) <= 0) + { + if (r == 0) + { + DO_SWAP(pa, pb); + pa += ST_POINTER_STEP; + } + pb += ST_POINTER_STEP; + DO_CHECK_FOR_INTERRUPTS(); + } + while (pb <= pc && (r = DO_COMPARE(pc, a)) >= 0) + { + if (r == 0) + { + DO_SWAP(pc, pd); + pd -= ST_POINTER_STEP; + } + pc -= ST_POINTER_STEP; + DO_CHECK_FOR_INTERRUPTS(); + } + if (pb > pc) + break; + DO_SWAP(pb, pc); + pb += ST_POINTER_STEP; + pc -= ST_POINTER_STEP; + } + pn = a + n * ST_POINTER_STEP; + d1 = Min(pa - a, pb - pa); + DO_SWAPN(a, pb - d1, d1); + d1 = Min(pd - pc, pn - pd - ST_POINTER_STEP); + DO_SWAPN(pb, pn - d1, d1); + d1 = pb - pa; + d2 = pd - pc; + if (d1 <= d2) + { + /* Recurse on left partition, then iterate on right partition */ + if (d1 > ST_POINTER_STEP) 
+ DO_SORT(a, d1 / ST_POINTER_STEP); + if (d2 > ST_POINTER_STEP) + { + /* Iterate rather than recurse to save stack space */ + /* DO_SORT(pn - d2, d2 / ST_POINTER_STEP) */ + a = pn - d2; + n = d2 / ST_POINTER_STEP; + goto loop; + } + } + else + { + /* Recurse on right partition, then iterate on left partition */ + if (d2 > ST_POINTER_STEP) + DO_SORT(pn - d2, d2 / ST_POINTER_STEP); + if (d1 > ST_POINTER_STEP) + { + /* Iterate rather than recurse to save stack space */ + /* DO_SORT(a, d1 / ST_POINTER_STEP) */ + n = d1 / ST_POINTER_STEP; + goto loop; + } + } +} +#endif + +#undef DO_COMPARE +#undef DO_MED3 +#undef DO_SORT +#undef DO_SWAP +#undef DO_SWAPN +#undef ST_COMPARATOR_TYPE_NAME +#undef ST_COMPARE +#undef ST_COMPARE_ARG_TYPE +#undef ST_COMPARE_RUNTIME_POINTER +#undef ST_ELEMENT_TYPE +#undef ST_ELEMENT_TYPE_VOID +#undef ST_MAKE_NAME +#undef ST_MAKE_NAME_ +#undef ST_MAKE_PREFIX +#undef ST_MED3 +#undef ST_POINTER_STEP +#undef ST_POINTER_TYPE +#undef ST_SORT +#undef ST_SORT_INVOKE_ARG +#undef ST_SORT_INVOKE_COMPARE +#undef ST_SORT_INVOKE_SIZE +#undef ST_SORT_PROTO_ARG +#undef ST_SORT_PROTO_COMPARE +#undef ST_SORT_PROTO_SIZE +#undef ST_SWAP +#undef ST_SWAPN -- 2.20.1
From fc58e1ad97a1eaad7bd1277a9d493af239cc77ef Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Mon, 17 Aug 2020 20:59:32 +1200
Subject: [PATCH 2/4] Use sort_template.h for compactify_tuples().

Since compactify_tuples() is called often, a specialized sort function for
sorting tuple offsets can give measurable speed-up.

Discussion: https://postgr.es/m/CA%2BhUKGKMQFVpjr106gRhwk6R-nXv0qOcTreZuQzxgpHESAL6dw%40mail.gmail.com
---
 src/backend/storage/page/bufpage.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index d708117a40..f81fa8cf3d 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -421,13 +421,13 @@ typedef struct itemIdSortData
 } itemIdSortData;
 typedef itemIdSortData *itemIdSort;
 
-static int
-itemoffcompare(const void *itemidp1, const void *itemidp2)
-{
-	/* Sort in decreasing itemoff order */
-	return ((itemIdSort) itemidp2)->itemoff -
-		((itemIdSort) itemidp1)->itemoff;
-}
+/* Create a specialized sort function for descending offset order. */
+#define ST_SORT qsort_itemoff
+#define ST_ELEMENT_TYPE itemIdSortData
+#define ST_COMPARE(a, b) ((b)->itemoff - (a)->itemoff)
+#define ST_SCOPE static
+#define ST_DEFINE
+#include "lib/sort_template.h"
 
 /*
  * After removing or marking some line pointers unused, move the tuples to
@@ -441,8 +441,7 @@ compactify_tuples(itemIdSort itemidbase, int nitems, Page page)
 	int i;
 
 	/* sort itemIdSortData array into decreasing itemoff order */
-	qsort((char *) itemidbase, nitems, sizeof(itemIdSortData),
-		  itemoffcompare);
+	qsort_itemoff(itemidbase, nitems);
 
 	upper = phdr->pd_special;
 	for (i = 0; i < nitems; i++)
-- 
2.20.1
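For anyone who wants to see the principle behind the qsort_itemoff instantiation without reading all of sort_template.h, here is a rough standalone sketch. It is not the code from the patch: a function-like macro stands in for the #include-based template, insertion sort stands in for the quicksort, and MINI_SORT_DEFINE, demo_item, DEMO_CMP_DESC and sort_demo_desc are names made up for illustration. The point is only that the comparison expression is pasted into the generated function, so the compiler sees no function pointer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the template: stamp out a sort function
 * specialized for one element type and one comparison expression.
 * Insertion sort is used only to keep the sketch short.
 */
#define MINI_SORT_DEFINE(NAME, TYPE, COMPARE) \
static void \
NAME(TYPE *data, size_t n) \
{ \
    for (size_t i = 1; i < n; i++) \
        for (size_t j = i; \
             j > 0 && COMPARE(&data[j - 1], &data[j]) > 0; \
             j--) \
        { \
            TYPE tmp = data[j]; \
            data[j] = data[j - 1]; \
            data[j - 1] = tmp; \
        } \
}

/* Made-up element type, loosely modelled on an (offset, id) pair. */
typedef struct demo_item
{
    int offset;
    int id;
} demo_item;

/* Descending offset order, same shape as the ST_COMPARE in the patch. */
#define DEMO_CMP_DESC(a, b) ((b)->offset - (a)->offset)

/* Generates: static void sort_demo_desc(demo_item *data, size_t n) */
MINI_SORT_DEFINE(sort_demo_desc, demo_item, DEMO_CMP_DESC)

/* Small usage example: sort four items and check the resulting order. */
static void
sort_demo_selfcheck(void)
{
    demo_item items[] = {{10, 0}, {40, 1}, {20, 2}, {30, 3}};

    sort_demo_desc(items, 4);
    assert(items[0].offset == 40 && items[0].id == 1);
    assert(items[3].offset == 10 && items[3].id == 0);
}
```

The subtraction-style comparator is only safe here because the offsets are small non-negative values that cannot overflow an int; a general-purpose comparator should compare rather than subtract.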
From 66e33fbeb329bffe9d8dd2ef2c8594d20e1ad662 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Wed, 19 Aug 2020 19:34:45 +1200 Subject: [PATCH 3/4] Use sort_template.h for qsort() and qsort_arg(). Reduce duplication by using the new template. --- src/port/qsort.c | 227 ++---------------------------------------- src/port/qsort_arg.c | 228 ++----------------------------------------- 2 files changed, 15 insertions(+), 440 deletions(-) diff --git a/src/port/qsort.c b/src/port/qsort.c index fa992e2081..7879e6cd56 100644 --- a/src/port/qsort.c +++ b/src/port/qsort.c @@ -1,229 +1,16 @@ /* * qsort.c: standard quicksort algorithm - * - * Modifications from vanilla NetBSD source: - * Add do ... while() macro fix - * Remove __inline, _DIAGASSERTs, __P - * Remove ill-considered "swap_cnt" switch to insertion sort, - * in favor of a simple check for presorted input. - * Take care to recurse on the smaller partition, to bound stack usage. - * - * CAUTION: if you change this file, see also qsort_arg.c, gen_qsort_tuple.pl - * - * src/port/qsort.c - */ - -/* $NetBSD: qsort.c,v 1.13 2003/08/07 16:43:42 agc Exp $ */ - -/*- - * Copyright (c) 1992, 1993 - * The Regents of the University of California. All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * 3. Neither the name of the University nor the names of its contributors - * may be used to endorse or promote products derived from this software - * without specific prior written permission. 
- * - * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. */ #include "c.h" - -static char *med3(char *a, char *b, char *c, - int (*cmp) (const void *, const void *)); -static void swapfunc(char *, char *, size_t, int); - -/* - * Qsort routine based on J. L. Bentley and M. D. McIlroy, - * "Engineering a sort function", - * Software--Practice and Experience 23 (1993) 1249-1265. - * - * We have modified their original by adding a check for already-sorted input, - * which seems to be a win per discussions on pgsql-hackers around 2006-03-21. - * - * Also, we recurse on the smaller partition and iterate on the larger one, - * which ensures we cannot recurse more than log(N) levels (since the - * partition recursed to is surely no more than half of the input). Bentley - * and McIlroy explicitly rejected doing this on the grounds that it's "not - * worth the effort", but we have seen crashes in the field due to stack - * overrun, so that judgment seems wrong. 
- */ - -#define swapcode(TYPE, parmi, parmj, n) \ -do { \ - size_t i = (n) / sizeof (TYPE); \ - TYPE *pi = (TYPE *)(void *)(parmi); \ - TYPE *pj = (TYPE *)(void *)(parmj); \ - do { \ - TYPE t = *pi; \ - *pi++ = *pj; \ - *pj++ = t; \ - } while (--i > 0); \ -} while (0) - -#define SWAPINIT(a, es) swaptype = ((char *)(a) - (char *)0) % sizeof(long) || \ - (es) % sizeof(long) ? 2 : (es) == sizeof(long)? 0 : 1 - -static void -swapfunc(char *a, char *b, size_t n, int swaptype) -{ - if (swaptype <= 1) - swapcode(long, a, b, n); - else - swapcode(char, a, b, n); -} - -#define swap(a, b) \ - if (swaptype == 0) { \ - long t = *(long *)(void *)(a); \ - *(long *)(void *)(a) = *(long *)(void *)(b); \ - *(long *)(void *)(b) = t; \ - } else \ - swapfunc(a, b, es, swaptype) - -#define vecswap(a, b, n) if ((n) > 0) swapfunc(a, b, n, swaptype) - -static char * -med3(char *a, char *b, char *c, int (*cmp) (const void *, const void *)) -{ - return cmp(a, b) < 0 ? - (cmp(b, c) < 0 ? b : (cmp(a, c) < 0 ? c : a)) - : (cmp(b, c) > 0 ? b : (cmp(a, c) < 0 ? 
a : c)); -} - -void -pg_qsort(void *a, size_t n, size_t es, int (*cmp) (const void *, const void *)) -{ - char *pa, - *pb, - *pc, - *pd, - *pl, - *pm, - *pn; - size_t d1, - d2; - int r, - swaptype, - presorted; - -loop:SWAPINIT(a, es); - if (n < 7) - { - for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es) - for (pl = pm; pl > (char *) a && cmp(pl - es, pl) > 0; - pl -= es) - swap(pl, pl - es); - return; - } - presorted = 1; - for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es) - { - if (cmp(pm - es, pm) > 0) - { - presorted = 0; - break; - } - } - if (presorted) - return; - pm = (char *) a + (n / 2) * es; - if (n > 7) - { - pl = (char *) a; - pn = (char *) a + (n - 1) * es; - if (n > 40) - { - size_t d = (n / 8) * es; - - pl = med3(pl, pl + d, pl + 2 * d, cmp); - pm = med3(pm - d, pm, pm + d, cmp); - pn = med3(pn - 2 * d, pn - d, pn, cmp); - } - pm = med3(pl, pm, pn, cmp); - } - swap(a, pm); - pa = pb = (char *) a + es; - pc = pd = (char *) a + (n - 1) * es; - for (;;) - { - while (pb <= pc && (r = cmp(pb, a)) <= 0) - { - if (r == 0) - { - swap(pa, pb); - pa += es; - } - pb += es; - } - while (pb <= pc && (r = cmp(pc, a)) >= 0) - { - if (r == 0) - { - swap(pc, pd); - pd -= es; - } - pc -= es; - } - if (pb > pc) - break; - swap(pb, pc); - pb += es; - pc -= es; - } - pn = (char *) a + n * es; - d1 = Min(pa - (char *) a, pb - pa); - vecswap(a, pb - d1, d1); - d1 = Min(pd - pc, pn - pd - es); - vecswap(pb, pn - d1, d1); - d1 = pb - pa; - d2 = pd - pc; - if (d1 <= d2) - { - /* Recurse on left partition, then iterate on right partition */ - if (d1 > es) - pg_qsort(a, d1 / es, es, cmp); - if (d2 > es) - { - /* Iterate rather than recurse to save stack space */ - /* pg_qsort(pn - d2, d2 / es, es, cmp); */ - a = pn - d2; - n = d2 / es; - goto loop; - } - } - else - { - /* Recurse on right partition, then iterate on left partition */ - if (d2 > es) - pg_qsort(pn - d2, d2 / es, es, cmp); - if (d1 > es) - { - /* Iterate rather than recurse to save stack 
space */ - /* pg_qsort(a, d1 / es, es, cmp); */ - n = d1 / es; - goto loop; - } - } -} +#define ST_SORT pg_qsort +#define ST_ELEMENT_TYPE_VOID +#define ST_COMPARE_RUNTIME_POINTER +#define ST_SCOPE +#define ST_DECLARE +#define ST_DEFINE +#include "lib/sort_template.h" /* * qsort comparator wrapper for strcmp. diff --git a/src/port/qsort_arg.c b/src/port/qsort_arg.c index 6d54fbc2b4..fa7e11a3b8 100644 --- a/src/port/qsort_arg.c +++ b/src/port/qsort_arg.c @@ -1,226 +1,14 @@ /* * qsort_arg.c: qsort with a passthrough "void *" argument - * - * Modifications from vanilla NetBSD source: - * Add do ... while() macro fix - * Remove __inline, _DIAGASSERTs, __P - * Remove ill-considered "swap_cnt" switch to insertion sort, - * in favor of a simple check for presorted input. - * Take care to recurse on the smaller partition, to bound stack usage. - * - * CAUTION: if you change this file, see also qsort.c, gen_qsort_tuple.pl - * - * src/port/qsort_arg.c - */ - -/* $NetBSD: qsort.c,v 1.13 2003/08/07 16:43:42 agc Exp $ */ - -/*- - * Copyright (c) 1992, 1993 - * The Regents of the University of California. All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * 3. Neither the name of the University nor the names of its contributors - * may be used to endorse or promote products derived from this software - * without specific prior written permission. 
- * - * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. */ #include "c.h" - -static char *med3(char *a, char *b, char *c, - qsort_arg_comparator cmp, void *arg); -static void swapfunc(char *, char *, size_t, int); - -/* - * Qsort routine based on J. L. Bentley and M. D. McIlroy, - * "Engineering a sort function", - * Software--Practice and Experience 23 (1993) 1249-1265. - * - * We have modified their original by adding a check for already-sorted input, - * which seems to be a win per discussions on pgsql-hackers around 2006-03-21. - * - * Also, we recurse on the smaller partition and iterate on the larger one, - * which ensures we cannot recurse more than log(N) levels (since the - * partition recursed to is surely no more than half of the input). Bentley - * and McIlroy explicitly rejected doing this on the grounds that it's "not - * worth the effort", but we have seen crashes in the field due to stack - * overrun, so that judgment seems wrong. 
- */ - -#define swapcode(TYPE, parmi, parmj, n) \ -do { \ - size_t i = (n) / sizeof (TYPE); \ - TYPE *pi = (TYPE *)(void *)(parmi); \ - TYPE *pj = (TYPE *)(void *)(parmj); \ - do { \ - TYPE t = *pi; \ - *pi++ = *pj; \ - *pj++ = t; \ - } while (--i > 0); \ -} while (0) - -#define SWAPINIT(a, es) swaptype = ((char *)(a) - (char *)0) % sizeof(long) || \ - (es) % sizeof(long) ? 2 : (es) == sizeof(long)? 0 : 1 - -static void -swapfunc(char *a, char *b, size_t n, int swaptype) -{ - if (swaptype <= 1) - swapcode(long, a, b, n); - else - swapcode(char, a, b, n); -} - -#define swap(a, b) \ - if (swaptype == 0) { \ - long t = *(long *)(void *)(a); \ - *(long *)(void *)(a) = *(long *)(void *)(b); \ - *(long *)(void *)(b) = t; \ - } else \ - swapfunc(a, b, es, swaptype) - -#define vecswap(a, b, n) if ((n) > 0) swapfunc(a, b, n, swaptype) - -static char * -med3(char *a, char *b, char *c, qsort_arg_comparator cmp, void *arg) -{ - return cmp(a, b, arg) < 0 ? - (cmp(b, c, arg) < 0 ? b : (cmp(a, c, arg) < 0 ? c : a)) - : (cmp(b, c, arg) > 0 ? b : (cmp(a, c, arg) < 0 ? 
a : c)); -} - -void -qsort_arg(void *a, size_t n, size_t es, qsort_arg_comparator cmp, void *arg) -{ - char *pa, - *pb, - *pc, - *pd, - *pl, - *pm, - *pn; - size_t d1, - d2; - int r, - swaptype, - presorted; - -loop:SWAPINIT(a, es); - if (n < 7) - { - for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es) - for (pl = pm; pl > (char *) a && cmp(pl - es, pl, arg) > 0; - pl -= es) - swap(pl, pl - es); - return; - } - presorted = 1; - for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es) - { - if (cmp(pm - es, pm, arg) > 0) - { - presorted = 0; - break; - } - } - if (presorted) - return; - pm = (char *) a + (n / 2) * es; - if (n > 7) - { - pl = (char *) a; - pn = (char *) a + (n - 1) * es; - if (n > 40) - { - size_t d = (n / 8) * es; - - pl = med3(pl, pl + d, pl + 2 * d, cmp, arg); - pm = med3(pm - d, pm, pm + d, cmp, arg); - pn = med3(pn - 2 * d, pn - d, pn, cmp, arg); - } - pm = med3(pl, pm, pn, cmp, arg); - } - swap(a, pm); - pa = pb = (char *) a + es; - pc = pd = (char *) a + (n - 1) * es; - for (;;) - { - while (pb <= pc && (r = cmp(pb, a, arg)) <= 0) - { - if (r == 0) - { - swap(pa, pb); - pa += es; - } - pb += es; - } - while (pb <= pc && (r = cmp(pc, a, arg)) >= 0) - { - if (r == 0) - { - swap(pc, pd); - pd -= es; - } - pc -= es; - } - if (pb > pc) - break; - swap(pb, pc); - pb += es; - pc -= es; - } - pn = (char *) a + n * es; - d1 = Min(pa - (char *) a, pb - pa); - vecswap(a, pb - d1, d1); - d1 = Min(pd - pc, pn - pd - es); - vecswap(pb, pn - d1, d1); - d1 = pb - pa; - d2 = pd - pc; - if (d1 <= d2) - { - /* Recurse on left partition, then iterate on right partition */ - if (d1 > es) - qsort_arg(a, d1 / es, es, cmp, arg); - if (d2 > es) - { - /* Iterate rather than recurse to save stack space */ - /* qsort_arg(pn - d2, d2 / es, es, cmp, arg); */ - a = pn - d2; - n = d2 / es; - goto loop; - } - } - else - { - /* Recurse on right partition, then iterate on left partition */ - if (d2 > es) - qsort_arg(pn - d2, d2 / es, es, cmp, arg); - if (d1 > 
es) - { - /* Iterate rather than recurse to save stack space */ - /* qsort_arg(a, d1 / es, es, cmp, arg); */ - n = d1 / es; - goto loop; - } - } -} +#define ST_SORT qsort_arg +#define ST_ELEMENT_TYPE_VOID +#define ST_COMPARATOR_TYPE_NAME qsort_arg_comparator +#define ST_COMPARE_RUNTIME_POINTER +#define ST_COMPARE_ARG_TYPE void +#define ST_SCOPE +#define ST_DEFINE +#include "lib/sort_template.h" -- 2.20.1
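The ST_ELEMENT_TYPE_VOID plus ST_COMPARE_RUNTIME_POINTER combination used for pg_qsort() and qsort_arg() is the least specialized case: the element size, the comparator and the pass-through argument all arrive at runtime, the array is walked in byte steps, and elements are swapped byte by byte. A rough standalone sketch of that shape follows. It is not the template's code: mini_sort_arg, mini_cmp_arg, mini_swap_bytes and cmp_int_dir are made-up names, and a plain insertion sort replaces the quicksort to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Comparator with a pass-through argument, in the style of qsort_arg(). */
typedef int (*mini_cmp_arg) (const void *a, const void *b, void *arg);

/* Swap two elements one byte at a time, since the type is unknown. */
static void
mini_swap_bytes(unsigned char *a, unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        unsigned char tmp = a[i];

        a[i] = b[i];
        b[i] = tmp;
    }
}

/* Insertion sort over untyped elements, stepping by element_size bytes. */
static void
mini_sort_arg(void *data, size_t n, size_t element_size,
              mini_cmp_arg compare, void *arg)
{
    unsigned char *a = (unsigned char *) data;

    if (n < 2)
        return;
    for (unsigned char *pm = a + element_size;
         pm < a + n * element_size;
         pm += element_size)
        for (unsigned char *pl = pm;
             pl > a && compare(pl - element_size, pl, arg) > 0;
             pl -= element_size)
            mini_swap_bytes(pl, pl - element_size, element_size);
}

/* Example comparator: arg points to +1 (ascending) or -1 (descending). */
static int
cmp_int_dir(const void *a, const void *b, void *arg)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    int dir = *(const int *) arg;

    return dir * ((x > y) - (x < y));
}

/* Small usage example: the same comparator sorts in both directions. */
static void
mini_sort_arg_selfcheck(void)
{
    int values[] = {3, 1, 2};
    int dir = 1;

    mini_sort_arg(values, 3, sizeof(int), cmp_int_dir, &dir);
    assert(values[0] == 1 && values[1] == 2 && values[2] == 3);
    dir = -1;
    mini_sort_arg(values, 3, sizeof(int), cmp_int_dir, &dir);
    assert(values[0] == 3 && values[1] == 2 && values[2] == 1);
}
```

Compared with the specialized sketch style used for typed elements, every comparison here is an indirect call and every swap is a byte loop, which is the overhead the typed instantiations are meant to avoid.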
From 6a767c0e5ebf72aac2cf0883b1333a839a87567a Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Wed, 19 Aug 2020 20:25:12 +1200 Subject: [PATCH 4/4] Use sort_template.h for qsort_tuple() and qsort_ssup(). Replace the Perl code that previously generated specialized sort functions with an instantiation of sort_template.h. --- src/backend/utils/sort/.gitignore | 1 - src/backend/utils/sort/Makefile | 3 - src/backend/utils/sort/gen_qsort_tuple.pl | 271 ---------------------- src/backend/utils/sort/tuplesort.c | 21 +- 4 files changed, 20 insertions(+), 276 deletions(-) delete mode 100644 src/backend/utils/sort/.gitignore delete mode 100644 src/backend/utils/sort/gen_qsort_tuple.pl diff --git a/src/backend/utils/sort/.gitignore b/src/backend/utils/sort/.gitignore deleted file mode 100644 index f2958633e6..0000000000 --- a/src/backend/utils/sort/.gitignore +++ /dev/null @@ -1 +0,0 @@ -/qsort_tuple.c diff --git a/src/backend/utils/sort/Makefile b/src/backend/utils/sort/Makefile index 7ac3659261..39f132ef68 100644 --- a/src/backend/utils/sort/Makefile +++ b/src/backend/utils/sort/Makefile @@ -23,9 +23,6 @@ OBJS = \ tuplesort.o: qsort_tuple.c -qsort_tuple.c: gen_qsort_tuple.pl - $(PERL) $(srcdir)/gen_qsort_tuple.pl $< > $@ - include $(top_srcdir)/src/backend/common.mk maintainer-clean: diff --git a/src/backend/utils/sort/gen_qsort_tuple.pl b/src/backend/utils/sort/gen_qsort_tuple.pl deleted file mode 100644 index eb0f7c5814..0000000000 --- a/src/backend/utils/sort/gen_qsort_tuple.pl +++ /dev/null @@ -1,271 +0,0 @@ -#!/usr/bin/perl - -# -# gen_qsort_tuple.pl -# -# This script generates specialized versions of the quicksort algorithm for -# tuple sorting. The quicksort code is derived from the NetBSD code. The -# code generated by this script runs significantly faster than vanilla qsort -# when used to sort tuples. This speedup comes from a number of places. 
-# The major effects are (1) inlining simple tuple comparators is much faster -# than jumping through a function pointer and (2) swap and vecswap operations -# specialized to the particular data type of interest (in this case, SortTuple) -# are faster than the generic routines. -# -# Modifications from vanilla NetBSD source: -# Add do ... while() macro fix -# Remove __inline, _DIAGASSERTs, __P -# Remove ill-considered "swap_cnt" switch to insertion sort, -# in favor of a simple check for presorted input. -# Take care to recurse on the smaller partition, to bound stack usage. -# -# Instead of sorting arbitrary objects, we're always sorting SortTuples. -# Add CHECK_FOR_INTERRUPTS(). -# -# CAUTION: if you change this file, see also qsort.c and qsort_arg.c -# - -use strict; -use warnings; - -my $SUFFIX; -my $EXTRAARGS; -my $EXTRAPARAMS; -my $CMPPARAMS; - -emit_qsort_boilerplate(); - -$SUFFIX = 'tuple'; -$EXTRAARGS = ', SortTupleComparator cmp_tuple, Tuplesortstate *state'; -$EXTRAPARAMS = ', cmp_tuple, state'; -$CMPPARAMS = ', state'; -emit_qsort_implementation(); - -$SUFFIX = 'ssup'; -$EXTRAARGS = ', SortSupport ssup'; -$EXTRAPARAMS = ', ssup'; -$CMPPARAMS = ', ssup'; -print <<'EOM'; - -#define cmp_ssup(a, b, ssup) \ - ApplySortComparator((a)->datum1, (a)->isnull1, \ - (b)->datum1, (b)->isnull1, ssup) - -EOM -emit_qsort_implementation(); - -sub emit_qsort_boilerplate -{ - print <<'EOM'; -/* - * autogenerated by src/backend/utils/sort/gen_qsort_tuple.pl, do not edit! - * - * This file is included by tuplesort.c, rather than compiled separately. - */ - -/* $NetBSD: qsort.c,v 1.13 2003/08/07 16:43:42 agc Exp $ */ - -/*- - * Copyright (c) 1992, 1993 - * The Regents of the University of California. All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. 
Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * 3. Neither the name of the University nor the names of its contributors - * may be used to endorse or promote products derived from this software - * without specific prior written permission. - * - * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. - */ - -/* - * Qsort routine based on J. L. Bentley and M. D. McIlroy, - * "Engineering a sort function", - * Software--Practice and Experience 23 (1993) 1249-1265. - * - * We have modified their original by adding a check for already-sorted input, - * which seems to be a win per discussions on pgsql-hackers around 2006-03-21. - * - * Also, we recurse on the smaller partition and iterate on the larger one, - * which ensures we cannot recurse more than log(N) levels (since the - * partition recursed to is surely no more than half of the input). 
Bentley - * and McIlroy explicitly rejected doing this on the grounds that it's "not - * worth the effort", but we have seen crashes in the field due to stack - * overrun, so that judgment seems wrong. - */ - -static void -swapfunc(SortTuple *a, SortTuple *b, size_t n) -{ - do - { - SortTuple t = *a; - *a++ = *b; - *b++ = t; - } while (--n > 0); -} - -#define swap(a, b) \ - do { \ - SortTuple t = *(a); \ - *(a) = *(b); \ - *(b) = t; \ - } while (0) - -#define vecswap(a, b, n) if ((n) > 0) swapfunc(a, b, n) - -EOM - - return; -} - -sub emit_qsort_implementation -{ - print <<EOM; -static SortTuple * -med3_$SUFFIX(SortTuple *a, SortTuple *b, SortTuple *c$EXTRAARGS) -{ - return cmp_$SUFFIX(a, b$CMPPARAMS) < 0 ? - (cmp_$SUFFIX(b, c$CMPPARAMS) < 0 ? b : - (cmp_$SUFFIX(a, c$CMPPARAMS) < 0 ? c : a)) - : (cmp_$SUFFIX(b, c$CMPPARAMS) > 0 ? b : - (cmp_$SUFFIX(a, c$CMPPARAMS) < 0 ? a : c)); -} - -static void -qsort_$SUFFIX(SortTuple *a, size_t n$EXTRAARGS) -{ - SortTuple *pa, - *pb, - *pc, - *pd, - *pl, - *pm, - *pn; - size_t d1, - d2; - int r, - presorted; - -loop: - CHECK_FOR_INTERRUPTS(); - if (n < 7) - { - for (pm = a + 1; pm < a + n; pm++) - for (pl = pm; pl > a && cmp_$SUFFIX(pl - 1, pl$CMPPARAMS) > 0; pl--) - swap(pl, pl - 1); - return; - } - presorted = 1; - for (pm = a + 1; pm < a + n; pm++) - { - CHECK_FOR_INTERRUPTS(); - if (cmp_$SUFFIX(pm - 1, pm$CMPPARAMS) > 0) - { - presorted = 0; - break; - } - } - if (presorted) - return; - pm = a + (n / 2); - if (n > 7) - { - pl = a; - pn = a + (n - 1); - if (n > 40) - { - size_t d = (n / 8); - - pl = med3_$SUFFIX(pl, pl + d, pl + 2 * d$EXTRAPARAMS); - pm = med3_$SUFFIX(pm - d, pm, pm + d$EXTRAPARAMS); - pn = med3_$SUFFIX(pn - 2 * d, pn - d, pn$EXTRAPARAMS); - } - pm = med3_$SUFFIX(pl, pm, pn$EXTRAPARAMS); - } - swap(a, pm); - pa = pb = a + 1; - pc = pd = a + (n - 1); - for (;;) - { - while (pb <= pc && (r = cmp_$SUFFIX(pb, a$CMPPARAMS)) <= 0) - { - if (r == 0) - { - swap(pa, pb); - pa++; - } - pb++; - CHECK_FOR_INTERRUPTS(); 
- } - while (pb <= pc && (r = cmp_$SUFFIX(pc, a$CMPPARAMS)) >= 0) - { - if (r == 0) - { - swap(pc, pd); - pd--; - } - pc--; - CHECK_FOR_INTERRUPTS(); - } - if (pb > pc) - break; - swap(pb, pc); - pb++; - pc--; - } - pn = a + n; - d1 = Min(pa - a, pb - pa); - vecswap(a, pb - d1, d1); - d1 = Min(pd - pc, pn - pd - 1); - vecswap(pb, pn - d1, d1); - d1 = pb - pa; - d2 = pd - pc; - if (d1 <= d2) - { - /* Recurse on left partition, then iterate on right partition */ - if (d1 > 1) - qsort_$SUFFIX(a, d1$EXTRAPARAMS); - if (d2 > 1) - { - /* Iterate rather than recurse to save stack space */ - /* qsort_$SUFFIX(pn - d2, d2$EXTRAPARAMS); */ - a = pn - d2; - n = d2; - goto loop; - } - } - else - { - /* Recurse on right partition, then iterate on left partition */ - if (d2 > 1) - qsort_$SUFFIX(pn - d2, d2$EXTRAPARAMS); - if (d1 > 1) - { - /* Iterate rather than recurse to save stack space */ - /* qsort_$SUFFIX(a, d1$EXTRAPARAMS); */ - n = d1; - goto loop; - } - } -} -EOM - - return; -} diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c index 3c49476483..8119c41328 100644 --- a/src/backend/utils/sort/tuplesort.c +++ b/src/backend/utils/sort/tuplesort.c @@ -676,8 +676,27 @@ static void tuplesort_updatemax(Tuplesortstate *state); * reduces to ApplySortComparator(), that is single-key MinimalTuple sorts * and Datum sorts. 
*/ -#include "qsort_tuple.c" +#define ST_SORT qsort_tuple +#define ST_ELEMENT_TYPE SortTuple +#define ST_COMPARE_RUNTIME_POINTER +#define ST_COMPARE_ARG_TYPE Tuplesortstate +#define ST_CHECK_FOR_INTERRUPTS +#define ST_SCOPE static +#define ST_DECLARE +#define ST_DEFINE +#include "lib/sort_template.h" + +#define ST_SORT qsort_ssup +#define ST_ELEMENT_TYPE SortTuple +#define ST_COMPARE(a, b, ssup) \ + ApplySortComparator((a)->datum1, (a)->isnull1, \ + (b)->datum1, (b)->isnull1, (ssup)) +#define ST_COMPARE_ARG_TYPE SortSupportData +#define ST_CHECK_FOR_INTERRUPTS +#define ST_SCOPE static +#define ST_DEFINE +#include "lib/sort_template.h" /* * tuplesort_begin_xxx -- 2.20.1
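For anyone reading along without patch 1/4 at hand, here is a rough standalone sketch of the general idea behind instantiating a sort from a template: the element type and the comparison are supplied as macros, so the compiler can inline the comparison rather than call through a function pointer. Everything named DEMO_* or Demo* is hypothetical and made up for illustration; this is not the contents of sort_template.h, and it uses a plain insertion sort instead of the quicksort the patch moves into the header.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical mini-template (NOT the real sort_template.h).  It expands
 * to an insertion sort specialized for one element type and one comparison
 * macro.  "scope" plays a role similar to ST_SCOPE, "name" to ST_SORT and
 * "type" to ST_ELEMENT_TYPE; "cmp" must be a function-like macro taking
 * two element pointers and yielding <0, 0 or >0.
 */
#define DEMO_DEFINE_SORT(scope, name, type, cmp)                          \
    scope void name(type *a, size_t n)                                    \
    {                                                                     \
        assert(a != NULL || n == 0);                                      \
        for (size_t i = 1; i < n; i++)                                    \
        {                                                                 \
            /* Sink a[i] leftwards until its left neighbor is <= it. */   \
            for (size_t j = i; j > 0 && cmp(&a[j - 1], &a[j]) > 0; j--)   \
            {                                                             \
                type t = a[j];                                            \
                a[j] = a[j - 1];                                          \
                a[j - 1] = t;                                             \
            }                                                             \
        }                                                                 \
    }

/* Hypothetical element type, standing in for something like SortTuple. */
typedef struct DemoTuple
{
    int key;
    int payload;
} DemoTuple;

/* Comparison written as a macro so it is expanded in place, not called. */
#define DEMO_CMP_KEY(x, y) \
    (((x)->key > (y)->key) - ((x)->key < (y)->key))

/* Instantiate one specialized sort function: demo_sort_by_key(). */
DEMO_DEFINE_SORT(, demo_sort_by_key, DemoTuple, DEMO_CMP_KEY)

/* Small self-check: returns 1 if a sample array ends up in key order. */
int
demo_sorted_ok(void)
{
    DemoTuple t[] = {{5, 0}, {2, 1}, {9, 2}, {2, 3}, {1, 4}};
    size_t n = sizeof(t) / sizeof(t[0]);

    demo_sort_by_key(t, n);
    for (size_t i = 1; i < n; i++)
        if (t[i - 1].key > t[i].key)
            return 0;
    return 1;
}
```

Calling demo_sort_by_key(array, count) then sorts in place by key. The real header is driven by #define'd parameters followed by an #include, as in the tuplesort.c hunk above, which allows a much larger function body than is comfortable inside a single macro.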