On 03/03/2026 19:31, David Geier wrote:
Attached are the patches rebased on latest master.

I've removed the ASCII fast-path patch 0006 as it turned out to be more
complicated to make work than expected.

I kept the radix sort patch because it gives a decent speedup but I
would like to focus for now on getting patches 0001 - 0004 merged.
They're all simple and, the way I see it, uncontroversial.

I remeasured the savings of 0001 - 0004. They come on top of the
already committed patch that inlined the comparison function, which gave
another ~5%:

Data set            | Patched (ms) | Master (ms)  | Speedup
--------------------|--------------|--------------|----------
movies(plot)        |   8,058      |  10,311      | 1.27x
lineitem(l_comment) | 223,233      | 256,986      | 1.19x

I've also registered the change at the commit fest, see
https://commitfest.postgresql.org/patch/6418/.

Attached is v5 that removes an incorrect assertion from the radix sort code.

v5-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch
v5-0002-Optimize-generate_trgm-with-sort_template.h.patch
v5-0003-Make-btint4cmp-branchless.patch
v5-0004-Faster-qunique-comparator-in-generate_trgm.patch
v5-0005-Optimize-generate_trgm-with-radix-sort.patch

Pushed 0001 as commit 6f5ad00ab7.

I squashed 0002 and 0004 into one commit, and did some more refactoring: I created a trigram_qsort() helper function that calls the signed or unsigned variant, so that that logic doesn't need to be duplicated in the callers. For symmetry, I also added a trigram_qunique() helper function which just calls qunique() with the new, faster CMPTRGM_EQ comparator. Pushed these as commit 9f3755ea07.
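
To illustrate the shape of those two helpers, here is a minimal standalone sketch. It is not the committed code: the sketch_* and cmp_* names are made up for this example, it uses plain qsort() rather than sort_template.h, and it assumes a trigram is simply three bytes compared either as signed or as unsigned chars.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a trigram: three raw bytes. */
typedef unsigned char trgm[3];

/* Order trigrams byte by byte, treating each byte as unsigned. */
static int
cmp_trgm_unsigned(const void *a, const void *b)
{
    return memcmp(a, b, 3);
}

/* Order trigrams byte by byte, treating each byte as signed. */
static int
cmp_trgm_signed(const void *a, const void *b)
{
    const signed char *x = a;
    const signed char *y = b;

    for (int i = 0; i < 3; i++)
    {
        if (x[i] != y[i])
            return (x[i] > y[i]) - (x[i] < y[i]);
    }
    return 0;
}

/* One entry point, so callers need not repeat the signedness check. */
static void
sketch_trgm_sort(trgm *arr, size_t n, bool char_is_signed)
{
    qsort(arr, n, sizeof(trgm),
          char_is_signed ? cmp_trgm_signed : cmp_trgm_unsigned);
}

/*
 * Remove adjacent duplicates from a sorted array and return the new
 * length.  Only equality is needed here, so no ordering is computed.
 */
static size_t
sketch_trgm_unique(trgm *arr, size_t n)
{
    size_t last = 0;

    if (n <= 1)
        return n;
    for (size_t i = 1; i < n; i++)
    {
        if (memcmp(arr[i], arr[last], 3) != 0)
        {
            last++;
            if (last != i)
                memcpy(arr[last], arr[i], 3);
        }
    }
    return last + 1;
}
```

The point of the dedup helper is that it only has to answer "equal or not", which is cheaper than a full three-way comparison.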

Patch 0003 gives me pause. It's a tiny patch:

@@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS)
        int32           a = PG_GETARG_INT32(0);
        int32           b = PG_GETARG_INT32(1);
-       if (a > b)
-               PG_RETURN_INT32(A_GREATER_THAN_B);
-       else if (a == b)
-               PG_RETURN_INT32(0);
-       else
-               PG_RETURN_INT32(A_LESS_THAN_B);
+       PG_RETURN_INT32(pg_cmp_s32(a, b));
 }

But the comments on the pg_cmp functions say:

 * NB: If the comparator function is inlined, some compilers may produce
 * worse code with these helper functions than with code with the
 * following form:
 *
 *     if (a < b)
 *         return -1;
 *     if (a > b)
 *         return 1;
 *     return 0;
 *

So, uh, is that really a universal improvement? Is that comment about producing worse code outdated?
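
For reference, the two forms in question look roughly like this. This is a standalone sketch with made-up function names; as far as I can tell the first one matches what pg_cmp_s32() does, but check common/int.h rather than taking my word for it.

```c
#include <stdint.h>

/*
 * Branchless form: each comparison evaluates to 0 or 1, so the
 * difference is -1, 0 or 1 with no conditional jump in the source.
 */
static inline int
cmp_s32_branchless(int32_t a, int32_t b)
{
    return (a > b) - (a < b);
}

/* Branching form, as quoted in the comment above. */
static inline int
cmp_s32_branching(int32_t a, int32_t b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}
```

Both return the same result for every input, including the INT32_MIN and INT32_MAX extremes where a naive "a - b" would overflow. Which one compiles to better code once inlined into a sort loop depends on the compiler, which is exactly what the comment is warning about.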

- Heikki


