Sokolov Yura <funny.fal...@postgrespro.ru> writes:
> [ 0001-Improve-compactify_tuples.patch, v5 or thereabouts ]
I started to review this patch.  I spent a fair amount of time on
beautifying the code, because I found it rather ugly and drastically
undercommented.  Once I had it to the point where it seemed readable,
I went to check the shellsort algorithm against Wikipedia's entry,
and found that this appears to be an incorrect implementation of
shellsort: where pg_shell_sort_pass has

    for (_i = off; _i < _n; _i += off) \

it seems to me that we need to have

    for (_i = off; _i < _n; _i += 1) \

or maybe just _i++.  As-is, this isn't h-sorting the whole file, but
just the subset of entries that have multiple-of-h indexes (ie, the
first of the h distinct subfiles that should get sorted).  The bug is
masked by the final pass of plain insertion sort, but we are not
getting the benefit we should get from the earlier passes.

However, I'm a bit dubious that it's worth fixing that; instead my
inclination would be to rip out the shellsort implementation entirely.
The code is only using it for the nitems <= 48 case (which makes the
first three offset steps certainly no-ops) and I am really unconvinced
that it's worth expending the code space for a shellsort rather than
plain insertion sort in that case, especially when we have good reason
to think that the input data is nearly sorted.

BTW, the originally given test case shows no measurable improvement
on my box.  I was eventually able to convince myself by profiling
that the patch makes us spend less time in compactify_tuples, but
this test case isn't a very convincing one.

So, quite aside from the bug, I'm not excited about committing the
attached as-is.  I think we should remove pg_shell_sort and just use
pg_insertion_sort.  If somebody can show a test case that provides a
measurable speed improvement from the extra code, I could be persuaded
to reconsider.  I also wonder if the nitems <= 48 cutoff needs to be
reconsidered in light of this.  But since I can hardly measure any
benefit from the patch at all, I'm not in the best position to test
different values for that cutoff.

Have not looked at the 0002 patch yet.

			regards, tom lane
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 41642eb..1af1b85 100644
*** a/src/backend/storage/page/bufpage.c
--- b/src/backend/storage/page/bufpage.c
***************
*** 18,23 ****
--- 18,24 ----
  #include "access/itup.h"
  #include "access/xlog.h"
  #include "storage/checksum.h"
+ #include "utils/inline_sort.h"
  #include "utils/memdebug.h"
  #include "utils/memutils.h"
*************** typedef struct itemIdSortData
*** 425,439 ****
  } itemIdSortData;
  typedef itemIdSortData *itemIdSort;
  
! static int
  itemoffcompare(const void *itemidp1, const void *itemidp2)
  {
-     /* Sort in decreasing itemoff order */
      return ((itemIdSort) itemidp2)->itemoff -
          ((itemIdSort) itemidp1)->itemoff;
  }
  
  /*
   * After removing or marking some line pointers unused, move the tuples to
   * remove the gaps caused by the removed items.
   */
--- 426,542 ----
  } itemIdSortData;
  typedef itemIdSortData *itemIdSort;
  
! /* Comparator for sorting in decreasing itemoff order */
! static inline int
  itemoffcompare(const void *itemidp1, const void *itemidp2)
  {
      return ((itemIdSort) itemidp2)->itemoff -
          ((itemIdSort) itemidp1)->itemoff;
  }
  
  /*
+  * Sort an array of itemIdSort's on itemoff, descending.
+  *
+  * This uses Shell sort.  Given that array is small and itemoffcompare
+  * can be inlined, it is much faster than general-purpose qsort.
+  */
+ static void
+ sort_itemIds_small(itemIdSort itemidbase, int nitems)
+ {
+     pg_shell_sort(itemIdSortData, itemidbase, nitems, itemoffcompare);
+ }
+ 
+ /*
+  * Sort an array of itemIdSort's on itemoff, descending.
+  *
+  * This uses bucket sort:
+  * - single pass of stable prefix sort on high 8 bits of itemoffs
+  * - then insertion sort on buckets larger than 1 element
+  */
+ static void
+ sort_itemIds(itemIdSort itemidbase, int nitems)
+ {
+ /* number of buckets to use: */
+ #define NSPLIT 256
+ /* divisor to scale input values into 0..NSPLIT-1: */
+ #define PREFDIV (BLCKSZ / NSPLIT)
+     /* per-bucket counts; we need two extra elements, see below */
+     uint16      count[NSPLIT + 2];
+     itemIdSortData copy[Max(MaxIndexTuplesPerPage, MaxHeapTuplesPerPage)];
+     int         i,
+                 max,
+                 total,
+                 pos,
+                 highbits;
+ 
+     Assert(nitems <= lengthof(copy));
+ 
+     /*
+      * Count how many items in each bucket.  We assume all itemoff values are
+      * less than BLCKSZ, therefore dividing by PREFDIV gives a value less than
+      * NSPLIT.
+      */
+     memset(count, 0, sizeof(count));
+     for (i = 0; i < nitems; i++)
+     {
+         highbits = itemidbase[i].itemoff / PREFDIV;
+         count[highbits]++;
+     }
+ 
+     /*
+      * Now convert counts to bucket position info, placing the buckets in
+      * decreasing order.  After this loop, count[k+1] is start of bucket k
+      * (for 0 <= k < NSPLIT), count[k] is end+1 of bucket k, and therefore
+      * count[k] - count[k+1] is length of bucket k.
+      *
+      * Also detect whether any buckets have more than one element.  For this
+      * purpose, "max" is set to the OR of all the counts (not really the max).
+      */
+     max = total = count[NSPLIT - 1];
+     for (i = NSPLIT - 2; i >= 0; i--)
+     {
+         max |= count[i];
+         total += count[i];
+         count[i] = total;
+     }
+     Assert(count[0] == nitems);
+ 
+     /*
+      * Now copy the data to be sorted into appropriate positions in the copy[]
+      * array.  We increment each bucket-start pointer as we insert data into
+      * its bucket; hence, after this loop count[k+1] is the end+1 of bucket k,
+      * count[k+2] is the start of bucket k, and count[k+1] - count[k+2] is the
+      * length of bucket k.
+      */
+     for (i = 0; i < nitems; i++)
+     {
+         highbits = itemidbase[i].itemoff / PREFDIV;
+         pos = count[highbits + 1]++;
+         copy[pos] = itemidbase[i];
+     }
+     Assert(count[1] == nitems);
+ 
+     /*
+      * If any buckets are larger than 1 item, we must sort them.  They should
+      * be small enough to make insertion sort effective.
+      */
+     if (max > 1)
+     {
+         /* i is bucket number plus 1 */
+         for (i = NSPLIT; i > 0; i--)
+         {
+             pg_insertion_sort(itemIdSortData,
+                               copy + count[i + 1],
+                               count[i] - count[i + 1],
+                               itemoffcompare);
+         }
+     }
+ 
+     /* And transfer the sorted data back to the caller */
+     memcpy(itemidbase, copy, sizeof(itemIdSortData) * nitems);
+ }
+ 
+ /*
   * After removing or marking some line pointers unused, move the tuples to
   * remove the gaps caused by the removed items.
   */
*************** compactify_tuples(itemIdSort itemidbase,
*** 445,452 ****
      int         i;
  
      /* sort itemIdSortData array into decreasing itemoff order */
!     qsort((char *) itemidbase, nitems, sizeof(itemIdSortData),
!           itemoffcompare);
  
      upper = phdr->pd_special;
      for (i = 0; i < nitems; i++)
--- 548,558 ----
      int         i;
  
      /* sort itemIdSortData array into decreasing itemoff order */
!     /* empirically, bucket sort is worth the trouble above 48 items */
!     if (nitems > 48)
!         sort_itemIds(itemidbase, nitems);
!     else
!         sort_itemIds_small(itemidbase, nitems);
  
      upper = phdr->pd_special;
      for (i = 0; i < nitems; i++)
diff --git a/src/include/utils/inline_sort.h b/src/include/utils/inline_sort.h
index ...c97a248 .
*** a/src/include/utils/inline_sort.h
--- b/src/include/utils/inline_sort.h
***************
*** 0 ****
--- 1,88 ----
+ /*-------------------------------------------------------------------------
+  *
+  * inline_sort.h
+  *      Macros to perform specialized types of sorts.
+  *
+  *
+  * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * src/include/utils/inline_sort.h
+  *
+  *-------------------------------------------------------------------------
+  */
+ #ifndef INLINE_SORT_H
+ #define INLINE_SORT_H
+ 
+ /*
+  * pg_shell_sort - sort for small arrays with inlinable comparator.
+  *
+  * This is best used with arrays smaller than 200 elements, and could be
+  * safely used with up to 1000 elements.  But it degrades fast after that.
+  *
+  * Since this is implemented as a macro, it can be optimized together with
+  * the comparison function; using a macro or inlinable function for the
+  * comparator is recommended.
+  *
+  * Arguments:
+  *      elem_t - type of array elements (for declaring temporary variables)
+  *      array - pointer to elements to be sorted
+  *      nitems - number of elements to be sorted
+  *      cmp - comparison function that accepts addresses of 2 elements
+  *            (same API as qsort comparison function).
+  * cmp argument should be a function or macro name.
+  * array and nitems arguments are evaluated only once.
+  *
+  * This uses Shellsort (see e.g. Wikipedia's entry), with gaps selected as
+  * "gap(i) = smallest prime number below e^i".  These are close to the gaps
+  * recommended by Incerpi & Sedgewick, but appear to be better on average.
+  */
+ #define pg_shell_sort(elem_t, array, nitems, cmp) \
+     do { \
+         elem_t     *_arr = (array); \
+         int         _n = (nitems); \
+         static const int _offsets[] = {401, 139, 53, 19, 7, 3}; \
+         int         _noff; \
+         for (_noff = 0; _noff < lengthof(_offsets); _noff++) \
+         { \
+             int         _off = _offsets[_noff]; \
+             pg_shell_sort_pass(elem_t, cmp, _off, _arr, _n); \
+         } \
+         pg_shell_sort_pass(elem_t, cmp, 1, _arr, _n); \
+     } while (0)
+ 
+ /*
+  * pg_insertion_sort - plain insertion sort.
+  * Useful for very small arrays, or if the array was almost sorted already.
+  * Same API as pg_shell_sort.
+  */
+ #define pg_insertion_sort(elem_t, array, nitems, cmp) \
+     do { \
+         elem_t     *_arr = (array); \
+         int         _n = (nitems); \
+         pg_shell_sort_pass(elem_t, cmp, 1, _arr, _n); \
+     } while (0)
+ 
+ /*
+  * One pass of Shellsort: simple insertion sort of the subset of entries
+  * at stride "off".  Not intended to be used outside of above macros.
+  */
+ #define pg_shell_sort_pass(elem_t, cmp, off, _arr, _n) \
+     do { \
+         int         _i; \
+         for (_i = off; _i < _n; _i += off) \
+         { \
+             if (cmp(_arr + _i - off, _arr + _i) > 0) \
+             { \
+                 elem_t      _temp = _arr[_i]; \
+                 int         _j = _i; \
+ \
+                 do \
+                 { \
+                     _arr[_j] = _arr[_j - off]; \
+                     _j -= off; \
+                 } while (_j >= off && cmp(_arr + _j - off, &_temp) > 0); \
+                 _arr[_j] = _temp; \
+             } \
+         } \
+     } while (0)
+ 
+ #endif                          /* INLINE_SORT_H */