On 03/03/2026 19:31, David Geier wrote:
Attached are the patches rebased on latest master.
I've removed the ASCII fast-path patch 0006 as it turned out to be more
complicated to make work than expected.
I kept the radix sort patch because it gives a decent speedup but I
would like to focus for now on getting patches 0001 - 0004 merged.
They're all simple and, the way I see it, uncontroversial.
I remeasured the savings of 0001 - 0004. They come on top of the
already committed patch that inlined the comparison function, which gave
another ~5%:
Data set            | Patched (ms) | Master (ms) | Speedup
--------------------|--------------|-------------|--------
movies(plot)        |        8,058 |      10,311 | 1.27x
lineitem(l_comment) |      223,233 |     256,986 | 1.19x
I've also registered the change at the commit fest, see
https://commitfest.postgresql.org/patch/6418/.
Attached is v5 that removes an incorrect assertion from the radix sort code.
v5-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch
v5-0002-Optimize-generate_trgm-with-sort_template.h.patch
v5-0003-Make-btint4cmp-branchless.patch
v5-0004-Faster-qunique-comparator-in-generate_trgm.patch
v5-0005-Optimize-generate_trgm-with-radix-sort.patch
Pushed 0001 as commit 6f5ad00ab7.
I squashed 0002 and 0004 into one commit, and did some more refactoring:
I created a trigram_qsort() helper function that calls the signed or
unsigned variant, so that that logic doesn't need to be duplicated in
the callers. For symmetry, I also added a trigram_qunique() helper
function which just calls qunique() with the new, faster CMPTRGM_EQ
comparator. Pushed these as commit 9f3755ea07.
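
To illustrate the shape of that refactoring, here is a minimal standalone sketch. It is not the committed code: it uses plain qsort() instead of sort_template.h, trgm_t and the two comparator names are hypothetical stand-ins, and passing the signedness as a char_is_signed parameter is an assumption made only to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for pg_trgm's 3-byte trigram type. */
typedef struct
{
	char		c[3];
} trgm_t;

/* Order trigrams treating each byte as signed char. */
static int
cmp_trgm_signed(const void *pa, const void *pb)
{
	const signed char *a = pa;
	const signed char *b = pb;

	for (int i = 0; i < 3; i++)
		if (a[i] != b[i])
			return (a[i] > b[i]) - (a[i] < b[i]);
	return 0;
}

/* Order trigrams treating each byte as unsigned char. */
static int
cmp_trgm_unsigned(const void *pa, const void *pb)
{
	const unsigned char *a = pa;
	const unsigned char *b = pb;

	for (int i = 0; i < 3; i++)
		if (a[i] != b[i])
			return (a[i] > b[i]) - (a[i] < b[i]);
	return 0;
}

/* Pick the variant in one place so callers need not duplicate the choice. */
static void
trigram_qsort(trgm_t *arr, size_t n, bool char_is_signed)
{
	qsort(arr, n, sizeof(trgm_t),
		  char_is_signed ? cmp_trgm_signed : cmp_trgm_unsigned);
}

/*
 * Remove adjacent duplicates from a sorted array and return the new length.
 * Equality does not depend on signedness, so a bytewise compare is enough.
 */
static size_t
trigram_qunique(trgm_t *arr, size_t n)
{
	size_t		last = 0;

	assert(arr != NULL || n == 0);
	if (n <= 1)
		return n;
	for (size_t i = 1; i < n; i++)
		if (memcmp(&arr[i], &arr[last], sizeof(trgm_t)) != 0)
			arr[++last] = arr[i];
	return last + 1;
}
```

A caller would run trigram_qsort(arr, n, ...) followed by n = trigram_qunique(arr, n). The point of the equality-only comparator is that deduplication never needs to know which byte ordering was used for the sort.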
Patch 0003 gives me pause. It's a tiny patch:
@@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS)
 	int32		a = PG_GETARG_INT32(0);
 	int32		b = PG_GETARG_INT32(1);
-	if (a > b)
-		PG_RETURN_INT32(A_GREATER_THAN_B);
-	else if (a == b)
-		PG_RETURN_INT32(0);
-	else
-		PG_RETURN_INT32(A_LESS_THAN_B);
+	PG_RETURN_INT32(pg_cmp_s32(a, b));
 }
But the comments on the pg_cmp functions say:
 * NB: If the comparator function is inlined, some compilers may produce
 * worse code with these helper functions than with code with the
 * following form:
 *
 *	if (a < b)
 *		return -1;
 *	if (a > b)
 *		return 1;
 *	return 0;
 *
So, uh, is that really a universal improvement? Is that comment about
producing worse code outdated?
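
For anyone who wants to look at the generated code, here is a minimal standalone sketch of the two forms. The function names are hypothetical, and the branchless body is my assumption of what pg_cmp_s32() boils down to, so check common/int.h before relying on it. Note that a plain "a - b" would not be a valid alternative, since it can overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branching form, as spelled out in the comment quoted above. */
static inline int
cmp_s32_branching(int32_t a, int32_t b)
{
	if (a < b)
		return -1;
	if (a > b)
		return 1;
	return 0;
}

/*
 * Branchless form. Assumption: this mirrors the body of pg_cmp_s32().
 * Each comparison yields 0 or 1, so the difference is -1, 0 or 1.
 */
static inline int
cmp_s32_branchless(int32_t a, int32_t b)
{
	return (a > b) - (a < b);
}

/*
 * Check that both forms agree on a few boundary values, and return the
 * number of pairs compared.
 */
static int
check_cmp_forms_agree(void)
{
	static const int32_t vals[] = {INT32_MIN, -2, -1, 0, 1, 2, INT32_MAX};
	const size_t nvals = sizeof(vals) / sizeof(vals[0]);
	int			npairs = 0;

	for (size_t i = 0; i < nvals; i++)
	{
		for (size_t j = 0; j < nvals; j++)
		{
			assert(cmp_s32_branching(vals[i], vals[j]) ==
				   cmp_s32_branchless(vals[i], vals[j]));
			npairs++;
		}
	}
	return npairs;
}
```

This only shows that the two forms return the same results. It says nothing about which one a given compiler turns into better code once inlined; that would need a look at the assembly at the optimization level we actually build with, plus a benchmark.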
- Heikki