Attached is revision of my patch with some clean-ups. In particular, I'm now using switch statements for greater readability, plus supporting fast path sorting of the time datatype. I've also updated the documentation on "Date/Time Types" to note the additional disadvantage of using the deprecated "store timestamp + friends as double precision floating-point numbers" compile time option.
There is one aspect to this optimisation that I haven't touched on, which is the effect on memory consumption. I think that much of the value that this patch will deliver will come from being able to release sort memory earlier. Consider that the substantial improvements in raw sorting speed (far more substantial than the improvements in query runtime) will sometimes result in a concomitant reduction in the time that the executor holds onto memory allocated for sorting. Maybe the effect will only be really noticeable for plans with a sort node as their root node, but that isn't exactly a rare occurrence, particularly among large, expensive sorts. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
From cffad91578edd698e89ef2bf7384d82aee26b57e Mon Sep 17 00:00:00 2001 From: peter2ndQuadrant <pe...@2ndquadrant.com> Date: Sun, 20 Nov 2011 20:36:25 +0000 Subject: [PATCH] Initial commit of optimization Stop directly using oid Added int8 quicksort fast path specialisation, which can also be used in place of F_TIMESTAMP_CMP for HAVE_INT64_TIMESTAMP builds. Rebased, revised patch for -hackers, with timestamp and int8 fast path sorting using the same int8 specialization. Remove unneeded line Rebase for -hackers Descriminate against cases where nKeys > 1 when selecting sort specialisation. We now support fast path sorting with multiple scanKeys. We now inline for one scanKey, but do not inline our comparetup_heap replacement when multiple scanKeys are used (although this is at the compiler's discretion). However, when multiple scanKeys are used, we still use a specialisation that will elide the SQL-function-call machinery for the first scanKey, which is almost as good for most cases. Further formatting clean-ups Move specialization selection logic to tuplesort_begin_heap(), per suggestion of Robert Haas Add simple switch for testing optimization. Improvements to template qsort_arg comments. Support fast path sorting of dates. Added float fast path sorting Patch mimimization. Additional clean-up. Indentation issue. Specialization clean-ups Use switch statement for sort specialization selection Further comment clean-up. Additional clean-up Make comparators consistent Added date fast path support, documentation updates Fix template_qsort_arg.h assertion Assert nScanKeys == 1 for inlining case --- doc/src/sgml/datatype.sgml | 6 + src/backend/utils/sort/tuplesort.c | 223 ++++++++++++++++++++++++- src/include/utils/template_qsort_arg.h | 288 ++++++++++++++++++++++++++++++++ src/port/qsort.c | 3 +- 4 files changed, 512 insertions(+), 8 deletions(-) create mode 100644 src/include/utils/template_qsort_arg.h diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml index fe59a1c..f5dfbbb 100644 --- a/doc/src/sgml/datatype.sgml +++ b/doc/src/sgml/datatype.sgml @@ -1619,6 +1619,12 @@ SELECT E'\\xDEADBEEF'; floating-point case, large <type>interval</type> values degrade in precision as the size of the interval increases. </para> + <para> + Note that if the server is built to represent these types as + floating point numbers, that will prevent it from using fast path + sort specializations for the types, and sorting columns of those + types will be noticeably slower. + </para> </note> <para> diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c index 3505236..0a14a90 100644 --- a/src/backend/utils/sort/tuplesort.c +++ b/src/backend/utils/sort/tuplesort.c @@ -107,11 +107,13 @@ #include "miscadmin.h" #include "pg_trace.h" #include "utils/datum.h" +#include "utils/fmgroids.h" #include "utils/logtape.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/pg_rusage.h" #include "utils/rel.h" +#include "utils/template_qsort_arg.h" #include "utils/tuplesort.h" @@ -226,6 +228,13 @@ struct Tuplesortstate Tuplesortstate *state); /* + * This specialization function pointer is sometimes used as an alternative + * to the standard qsort_arg, when it has been determined that we can + * benefit from various per-type performance optimizations. + */ + void (*qsort_arg_spec)(void *a, size_t n, size_t es, void *arg); + + /* * Function to copy a supplied input tuple into palloc'd space and set up * its SortTuple representation (ie, set tuple/datum1/isnull1). Also, * state->availMem must be decreased by the amount of space used for the @@ -457,6 +466,11 @@ static void tuplesort_heap_insert(Tuplesortstate *state, SortTuple *tuple, static void tuplesort_heap_siftup(Tuplesortstate *state, bool checkIndex); static unsigned int getlen(Tuplesortstate *state, int tapenum, bool eofOK); static void markrunend(Tuplesortstate *state, int tapenum); +static inline int32 inlineApplySortFunction(FmgrInfo *sortFunction, + int sk_flags, Oid collation, + Datum datum1, bool isNull1, + Datum datum2, bool isNull2); + static int comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state); static void copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup); @@ -576,6 +590,99 @@ tuplesort_begin_common(int workMem, bool randomAccess) return state; } +/* + * Manufacture type specific sorting specialisations + * with inline comparators + */ +static inline int +compare_int4(Datum first, Datum second) +{ + int32 a = DatumGetInt32(first); + int32 b = DatumGetInt32(second); + if (a > b) + return 1; + else if (a == b) + return 0; + else + return -1; +} + +static inline int +compare_int8(Datum first, Datum second) +{ + int64 a = DatumGetInt64(first); + int64 b = DatumGetInt64(second); + + if (a > b) + return 1; + else if (a == b) + return 0; + else + return -1; +} + +static inline int +compare_float4(Datum first, Datum second) +{ + float4 a = DatumGetFloat4(first); + float4 b = DatumGetFloat4(second); + + if (isnan(a)) + { + if (isnan(b)) + return 0; /* NAN = NAN */ + else + return 1; /* NAN > non-NAN */ + } + else if (isnan(b)) + { + return -1; /* non-NAN < NAN */ + } + else + { + if (a > b) + return 1; + else if (a < b) + return -1; + else + return 0; + } +} + +static inline int +compare_float8(Datum first, Datum second) +{ + float8 a = DatumGetFloat8(first); + float8 b = DatumGetFloat8(second); + + if (isnan(a)) + { + if (isnan(b)) + return 0; /* NAN = NAN */ + else + return 1; /* NAN > non-NAN */ + } + else if (isnan(b)) + { + return -1; /* non-NAN < NAN */ + } + else + { + if (a > b) + return 1; + else if (a < b) + return -1; + else + return 0; + } +} + +/* Instantiate sorting specializations */ +TEMPLATE_QSORT_ARG(int4, compare_int4); +TEMPLATE_QSORT_ARG(int8, compare_int8); +TEMPLATE_QSORT_ARG(float4, compare_float4); +TEMPLATE_QSORT_ARG(float8, compare_float8); + Tuplesortstate * tuplesort_begin_heap(TupleDesc tupDesc, int nkeys, AttrNumber *attNums, @@ -650,6 +757,83 @@ tuplesort_begin_heap(TupleDesc tupDesc, (Datum) 0); } + /* Select a qsort specialization, if possible */ + if (state->scanKeys) + { + if (state->nKeys == 1) + { + /* Single scankey, use an inlining specialization */ + switch (state->scanKeys->sk_func.fn_oid) + { + /* Some comparison that has an underlying int4 representation */ + case F_BTINT4CMP: + state->qsort_arg_spec = int4inlqsort_arg; + break; + case F_DATE_CMP: + state->qsort_arg_spec = int4inlqsort_arg; + break; + /* Some comparison that has an underlying int8 representation */ + case F_BTINT8CMP: + state->qsort_arg_spec = int8inlqsort_arg; + break; +#ifdef HAVE_INT64_TIMESTAMP + case F_TIMESTAMP_CMP: + state->qsort_arg_spec = int8inlqsort_arg; + break; + case F_TIME_CMP: + state->qsort_arg_spec = int8inlqsort_arg; + break; +#endif + /* floating point types */ + case F_BTFLOAT4CMP: + state->qsort_arg_spec = float4inlqsort_arg; + break; + case F_BTFLOAT8CMP: + state->qsort_arg_spec = float8inlqsort_arg; + break; + default: + state->qsort_arg_spec = NULL; + } + } + else + { + /* + * Multiple scankeys, use a specialisation to elide + * SQL function call machinery for the first + */ + switch (state->scanKeys->sk_func.fn_oid) + { + /* Some comparison that has an underlying int4 representation */ + case F_BTINT4CMP: + state->qsort_arg_spec = int4regqsort_arg; + break; + case F_DATE_CMP: + state->qsort_arg_spec = int4regqsort_arg; + break; + /* Some comparison that has an underlying int8 representation */ + case F_BTINT8CMP: + state->qsort_arg_spec = int8regqsort_arg; + break; +#ifdef HAVE_INT64_TIMESTAMP + case F_TIMESTAMP_CMP: + state->qsort_arg_spec = int8regqsort_arg; + break; + case F_TIME_CMP: + state->qsort_arg_spec = int8regqsort_arg; + break; +#endif + /* floating point types */ + case F_BTFLOAT4CMP: + state->qsort_arg_spec = float4regqsort_arg; + break; + case F_BTFLOAT8CMP: + state->qsort_arg_spec = float8regqsort_arg; + break; + default: + state->qsort_arg_spec = NULL; + } + } + } MemoryContextSwitchTo(oldcontext); return state; @@ -1225,7 +1409,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple) } /* - * All tuples have been provided; finish the sort. + * All tuples have been provided; perform the sort. */ void tuplesort_performsort(Tuplesortstate *state) @@ -1247,11 +1431,28 @@ tuplesort_performsort(Tuplesortstate *state) * amount of memory. Just qsort 'em and we're done. */ if (state->memtupcount > 1) - qsort_arg((void *) state->memtuples, - state->memtupcount, - sizeof(SortTuple), - (qsort_arg_comparator) state->comparetup, - (void *) state); + { + /* Consider a sorting specialization */ + if (state->qsort_arg_spec) + { + state->qsort_arg_spec((void *) state->memtuples, + state->memtupcount, + sizeof(SortTuple), + (void *) state); + } + else + { + /* + * Fall back on type-neutral qsort with + * function-pointer comparator + */ + qsort_arg((void *) state->memtuples, + state->memtupcount, + sizeof(SortTuple), + (qsort_arg_comparator) state->comparetup, + (void *) state); + } + } state->current = 0; state->eof_reached = false; state->markpos_offset = 0; @@ -2657,6 +2858,9 @@ myFunctionCall2Coll(FmgrInfo *flinfo, Oid collation, Datum arg1, Datum arg2) * and return a 3-way comparison result. This takes care of handling * reverse-sort and NULLs-ordering properly. We assume that DESC and * NULLS_FIRST options are encoded in sk_flags the same way btree does it. + * + * Note that this logic is partially duplicated in template_qsort_arg.h, + * from which it is "instantiated" per-type. */ static inline int32 inlineApplySortFunction(FmgrInfo *sortFunction, int sk_flags, Oid collation, @@ -2708,10 +2912,15 @@ ApplySortFunction(FmgrInfo *sortFunction, int sortFlags, Oid collation, } + + /* * Routines specialized for HeapTuple (actually MinimalTuple) case + * + * N.B. : Some of this code is partially duplicated for the purposes + * of heap comparison specializations. Take care to keep the code + * within template_qsort_arg.h consistent with this code. */ - static int comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state) { diff --git a/src/include/utils/template_qsort_arg.h b/src/include/utils/template_qsort_arg.h new file mode 100644 index 0000000..9fee892 --- /dev/null +++ b/src/include/utils/template_qsort_arg.h @@ -0,0 +1,288 @@ +/*------------------------------------------------------------------------- + * template_qsort_arg.h: "template" version of qsort_arg.c + * + * This version of qsort_arg is exclusively used within tuplesort.c to more + * efficiently sort common types such as integers and floats. In providing + * this version, we seek to take advantage of compile-time optimizations, + * in particular, the inlining of comparators, as well as reducing the + * overhead of comparisons that the general SQL-callable-function API imposes. + * + * The TEMPLATE_QSORT_ARG() macro generates an inlining variant (for sorts + * with a single scanKey) and non-inlining variant for sorts with multiple + * scanKeys. + * + * We rely on various function declarations, and indeed partially duplicate + * some code from tuplesort.c, so this file should be considered private to + * that module, rather than a generic piece of infrastructure. + * + * CAUTION: if you change this file, see also qsort_arg.c as well as + * qsort.c. qsort_arg.c should be considered authoratative. + * + * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/utils/template_qsort_arg.h + *------------------------------------------------------------------------- + */ + +#include "c.h" + +#define swapcode(TYPE, parmi, parmj, n) \ +do { \ + size_t i = (n) / sizeof (TYPE); \ + TYPE *pi = (TYPE *)(void *)(parmi); \ + TYPE *pj = (TYPE *)(void *)(parmj); \ + do { \ + TYPE t = *pi; \ + *pi++ = *pj; \ + *pj++ = t; \ + } while (--i > 0); \ +} while (0) + +#define SWAPINIT(a, es) swaptype = ((char *)(a) - (char *)0) % sizeof(long) || \ + (es) % sizeof(long) ? 2 : (es) == sizeof(long)? 0 : 1; + +#define thistype_vecswap(TYPE, a, b, n, SPEC_VAR) \ + if ((n) > 0) TYPE##SPEC_VAR##swapfunc((a), (b), (size_t)(n), swaptype) + +#define thistype_swap(TYPE, a, b, SPEC_VAR) \ +if (swaptype == 0) { \ + long t = *(long *)(void *)(a); \ + *(long *)(void *)(a) = *(long *)(void *)(b);\ + *(long *)(void *)(b) = t; \ + } else \ + TYPE##SPEC_VAR##swapfunc(a, b, es, swaptype) + +/* + * This macro manufactures a type-specific implementation of qsort_arg with + * the comparator, COMPAR, known at compile time. COMPAR is typically an + * inline function. + * + * COMPAR should take as its arguments two Datums, and return an int, in + * line with standard qsort convention. + * + * We have void* parameters for TYPE##AppSort just to shut up the compiler. + * They could be SortTuple pointers instead, but that would make it more + * difficult to keep template_qsort_arg.h consistent with tuplesort.c. + * + */ + +#define DO_TEMPLATE_QSORT_ARG(TYPE, COMPAR, SPEC_VAR, REG_ADDITIONAL_CODE) \ +void TYPE##SPEC_VAR##qsort_arg(void *a, size_t n, size_t es, void *arg); \ + \ +static inline int32 \ +TYPE##SPEC_VAR##AppSort(const void *a, const void *b, Tuplesortstate *state) \ +{ \ + int32 compare; \ + const SortTuple* aT = a; \ + const SortTuple* bT = b; \ + \ + /* Allow interrupting long sorts */ \ + CHECK_FOR_INTERRUPTS(); \ + \ + Assert(state->scanKeys); \ + \ + if ( aT->isnull1) \ + { \ + if (bT->isnull1) \ + compare = 0; /* NULL "=" NULL */ \ + else if (state->scanKeys->sk_flags & SK_BT_NULLS_FIRST) \ + compare = -1; /* NULL "<" NOT_NULL */ \ + else \ + compare = 1; /* NULL ">" NOT_NULL */ \ + } \ + else if (bT->isnull1) \ + { \ + if (state->scanKeys->sk_flags & SK_BT_NULLS_FIRST) \ + compare = 1; /* NOT_NULL ">" NULL */ \ + else \ + compare = -1; /* NOT_NULL "<" NULL */ \ + } \ + else \ + { \ + compare = COMPAR(aT->datum1, bT->datum1); \ + \ + if (state->scanKeys->sk_flags & SK_BT_DESC) \ + compare = -compare; \ + } \ + if (compare != 0) \ + return compare; \ + /* Additional code for variants where we have more than one scanKey */ \ + REG_ADDITIONAL_CODE \ + return 0; \ +} \ + \ +static inline void \ +TYPE##SPEC_VAR##swapfunc(char *a, char *b, size_t n, int swaptype) \ +{ \ + if (swaptype <= 1) \ + swapcode(long, a, b, n); \ + else \ + swapcode(char, a, b, n); \ +} \ + \ +static inline char * \ +TYPE##SPEC_VAR##med3(char *a, char *b, char *c, void *arg) \ +{ \ + return TYPE##SPEC_VAR##AppSort(a, b, arg) < 0 ? \ + (TYPE##SPEC_VAR##AppSort(b, c, arg) < 0 ? \ + b : (TYPE##SPEC_VAR##AppSort(a, c, arg) < 0 ? c : a)) \ + : (TYPE##SPEC_VAR##AppSort(b, c, arg) > 0 ? \ + b : (TYPE##SPEC_VAR##AppSort(a, c, arg) < 0 ? a : c)); \ +} \ + \ +void \ +TYPE##SPEC_VAR##qsort_arg(void *a, size_t n, size_t es, void *arg) \ +{ \ + char *pa, \ + *pb, \ + *pc, \ + *pd, \ + *pl, \ + *pm, \ + *pn; \ + int d, \ + r, \ + swaptype, \ + presorted; \ + \ +loop:SWAPINIT(a, es); \ + if (n < 7) \ + { \ + for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es) \ + for (pl = pm; pl > (char *) a && \ + TYPE##SPEC_VAR##AppSort(pl - es, pl, arg) > 0; \ + pl -= es) \ + thistype_swap(TYPE, pl, pl - es, SPEC_VAR); \ + return; \ + } \ + presorted = 1; \ + for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es) \ + { \ + if (TYPE##SPEC_VAR##AppSort(pm - es, pm, arg) > 0) \ + { \ + presorted = 0; \ + break; \ + } \ + } \ + if (presorted) \ + return; \ + pm = (char *) a + (n / 2) * es; \ + if (n > 7) \ + { \ + pl = (char *) a; \ + pn = (char *) a + (n - 1) * es; \ + if (n > 40) \ + { \ + d = (n / 8) * es; \ + pl = TYPE##SPEC_VAR##med3(pl, pl + d, pl + 2 * d, arg); \ + pm = TYPE##SPEC_VAR##med3(pm - d, pm, pm + d, arg); \ + pn = TYPE##SPEC_VAR##med3(pn - 2 * d, pn - d, pn, arg); \ + } \ + pm = TYPE##SPEC_VAR##med3(pl, pm, pn, arg); \ + } \ + thistype_swap(TYPE, a, pm, SPEC_VAR); \ + pa = pb = (char *) a + es; \ + pc = pd = (char *) a + (n - 1) * es; \ + for (;;) \ + { \ + while (pb <= pc && \ + (r = TYPE##SPEC_VAR##AppSort(pb, a, arg)) <= 0) \ + { \ + if (r == 0) \ + { \ + thistype_swap(TYPE, pa, pb, SPEC_VAR); \ + pa += es; \ + } \ + pb += es; \ + } \ + while (pb <= pc && \ + (r = TYPE##SPEC_VAR##AppSort(pc, a, arg)) >= 0) \ + { \ + if (r == 0) \ + { \ + thistype_swap(TYPE, pc, pd, SPEC_VAR); \ + pd -= es; \ + } \ + pc -= es; \ + } \ + if (pb > pc) \ + break; \ + thistype_swap(TYPE, pb, pc, SPEC_VAR); \ + pb += es; \ + pc -= es; \ + } \ + pn = (char *) a + n * es; \ + r = Min(pa - (char *) a, pb - pa); \ + thistype_vecswap(TYPE, a, pb - r, r, SPEC_VAR); \ + r = Min(pd - pc, pn - pd - es); \ + thistype_vecswap(TYPE, pb, pn - r, r, SPEC_VAR); \ + if ((r = pb - pa) > es) \ + TYPE##SPEC_VAR##qsort_arg(a, r / es, es, arg); \ + if ((r = pd - pc) > es) \ + { \ + /* Iterate rather than recurse to save stack space */ \ + a = pn - r; \ + n = r / es; \ + goto loop; \ + } \ +} + +/* + * This code becomes part of the comparator meta-function for the "reg" + * specialization variant of each datatype-specific specialization. + * + * For "reg", we can handle multiple scankeys, but the function generally + * will not be inlined. Inlining has been found to offer additional performance + * benefits beyond eliding the regular SQL function-call machinery. + * + * We do not elide the regular SQL function-call machinery in the "reg" case + * for the second and subsequent scanKeys. The higher the cardinality of the + * first scanKey column, the less frequently we'll need to fall back. + */ + +#define REG_ADDITIONAL_CODE \ +{ \ + ScanKey scanKey = state->scanKeys; \ + HeapTupleData ltup; \ + HeapTupleData rtup; \ + TupleDesc tupDesc; \ + int nkey; \ + \ + Assert(state->nKeys > 1); \ + /* Compare additional sort keys */ \ + ltup.t_len = ((MinimalTuple) aT->tuple)->t_len + MINIMAL_TUPLE_OFFSET; \ + ltup.t_data = (HeapTupleHeader) ((char *) aT->tuple - MINIMAL_TUPLE_OFFSET); \ + rtup.t_len = ((MinimalTuple) bT->tuple)->t_len + MINIMAL_TUPLE_OFFSET; \ + rtup.t_data = (HeapTupleHeader) ((char *) bT->tuple - MINIMAL_TUPLE_OFFSET); \ + tupDesc = state->tupDesc; \ + scanKey++; \ + for (nkey = 1; nkey < state->nKeys; nkey++, scanKey++) \ + { \ + AttrNumber attno = scanKey->sk_attno; \ + Datum datum1, \ + datum2; \ + bool isnull1, \ + isnull2; \ + \ + datum1 = heap_getattr(<up, attno, tupDesc, &isnull1); \ + datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2); \ + \ + compare = inlineApplySortFunction(&scanKey->sk_func, \ + scanKey->sk_flags, \ + scanKey->sk_collation, \ + datum1, isnull1, \ + datum2, isnull2); \ + if (compare != 0) \ + return compare; \ + } \ +} +#define INL_ADDITIONAL_CODE Assert(state->nKeys == 1); +/* + * Manufacture inlining variant for nKeys=1 case, and non-inlining variant + * for nKeys > 1 case + */ +#define TEMPLATE_QSORT_ARG(TYPE, COMPAR) \ +DO_TEMPLATE_QSORT_ARG(TYPE, COMPAR, inl, INL_ADDITIONAL_CODE) \ +DO_TEMPLATE_QSORT_ARG(TYPE, COMPAR, reg, REG_ADDITIONAL_CODE) \ + diff --git a/src/port/qsort.c b/src/port/qsort.c index 8e2c6d9..c949c8e 100644 --- a/src/port/qsort.c +++ b/src/port/qsort.c @@ -7,7 +7,8 @@ * Remove ill-considered "swap_cnt" switch to insertion sort, * in favor of a simple check for presorted input. * - * CAUTION: if you change this file, see also qsort_arg.c + * CAUTION: if you change this file, see also qsort_arg.c and + * template_qsort_arg.h * * src/port/qsort.c */ -- 1.7.7.3
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers