Hi Heikki!
Thanks for looking at the patch set.
On 06.01.2026 18:00, Heikki Linnakangas wrote:
> On 05/01/2026 17:01, David Geier wrote:
>> - v1-0002-Optimized-comparison-functions.patch: Use FunctionCallInvoke()
>> instead of FunctionCall2Coll(). This saves a bunch of per-comparison
>> setup code, such as calling InitFunctionCallInfoData().
>
> You lose the check for NULL result with this. That's probably still
> worth checking.
It seems like existing code where all args are not null, has that safety
check. Added it for consistency.
>> - v1-0004-Avoid-dedup-and-sort-in-ginExtractEntries.patch
>> ginExtractEntries() deduplicates and sorts the entries returned from the
>> extract value function. In case of pg_trgm, that is completely redundant
>> because the trigrams are already deduplicated and sorted. The current
>> version of this patch is just to demonstrate the potential. We need to
>> think about what we want here. Ideally, we would require the extraction
>> function to provide the entries deduplicated and sorted. Alternatively,
>> we could indicate to ginExtractEntries() if the entries are already
>> deduplicated and sorted. If we don't want to alter the signature of the
>> extract value function, we could e.g. use the MSB of the nentries
>> argument.
>
> Yeah, this seems wrong as it is. You're assuming that if the extract
> function returns nullFlags == NULL, the array is already sorted and
> deduped.
As said, that was just for demonstration purposes of the possible gains.
I've changed the code now such that the extractValue function of the GIN
index can indicate via the third argument uniqueAndSorted, if the
returned keys are already unique and sorted.
Unfortunately, it seems like this patch regresses performance. See
measurements below. I haven't had the time to investigate why that is.
It's pretty counter intuitive, given that this patch effectively only
removes code. Maybe you could re-test patch 0004 and share your runtimes?
>> v1-0007-Faster-qunique-comparator.patch: qunique() doesn't require a
>> full sort comparator (-1 = less, 0 = equal, 1 = greater) but only a
>> equal/unequal comparator (e.g. 0 = equal and 1 = unequal). The same
>> optimization can be done in plenty of other places in our code base.
>> Likely, in most of them the gains are insignificant.
>
> Makes sense. I'm a little disappointed the compiler won't do that
> optimization for us..
I thought the same.
>
> Perhaps we should introduce a new qunique_eq() function with a different
> callback signature:
>
> /* like qunique(), but the callback function returns true/false rather
> than int */
> static inline size_t
> qunique_eq(void *array, size_t elements, size_t width,
> bool (*equal) (const void *, const void *))
>
I would prefer to change qunique() instead. That would enforce using an
adequate comparison function from the get go. There are only ~15 calls
to qunique(). So refactoring this should also be a fairly small patch. I
can do that if there's agreement for that approach.
>> v1-0008-Add-ASCII-fastpath-to-generate_trgm_only.patch: Typically lots
>> of text is actually ASCII. Hence, we provide a fast path for this case
>> which is exercised if the MSB of the current character is unset.
>
> This uses pg_ascii_tolower() when for ASCII characters when built with
> the IGNORECASE. I don't think that's correct, if the proper collation
> would do something more complicated for than what pg_ascii_tolower() does.
Oh, that's evil. I had tested that specifically. But it only worked
because the code in master uses str_tolower() with
DEFAULT_COLLATION_OID. So using a different locale like in the following
example does something different than when creating a database with the
same locale.
postgres=# select lower('III' COLLATE "tr_TR");
lower
-------
ııı
postgres=# select show_trgm('III' COLLATE "tr_TR");
show_trgm
-------------------------
{" i"," ii","ii ",iii}
(1 row)
But when using tr_TR as default locale of the database the following
happens:
postgres=# select lower('III' COLLATE "tr_TR");
lower
-------
ııı
postgres=# select show_trgm('III');sü
show_trgm
---------------------------------------
{0xbbd8dd,0xf26fab,0xf31e1a,0x2af4f1}
I'm wondering if that's intentional to begin with. Shouldn't the code
instead pass PG_GET_COLLATION() to str_tolower()? Might require some
research to see how other index types handle locales.
Coming back to the original problem: the lengthy comment at the top of
pg_locale_libc.c, suggests that in some cases ASCII characters are
handled the pg_ascii_tolower() way for the default locale. See for
example tolower_libc_mb(). So a character by character conversion using
that function will yield a different result than strlower_libc_mb(). I'm
wondering why that is.
Anyways, we could limit the optimization to only kick in when the used
locale follows the same rules as pg_ascii_tolower(). We could test that
when creating the locale and store that info in pg_locale_struct.
Thoughts?
> Did you measure how big is the impact from each individual patch?
> Patches 1 and 2 seem pretty much ready to be committed, but I wonder if
> they make any difference on their own.
Here is the impact of each patch. I ran again CREATE INDEX three times
and took the fastest run. The run of each patch includes all previous
patches as well. For example, the timings for patch 0003 were measured
with a binary that also had patch 0002 and 0001 applied. To get the
impact of each patch in isolation, the delta to the previous run was taken.
Code | movies |delta | lineitem | delta
------------------------------------|--------|-------|------------------
master | 10,311 | 0 | 256,986 | 0
v1-0001-Inline-ginCompareAttEntries | 9,694 | 617 | 239,778 | 17,208
v1-0002-Optimized-comparison-func | 9,510 | 184 | 238,094 | 1,684
v1-0003-Use-sort_template.h | 8,661 | 849 | 231,190 | 6,904
v1-0004-Avoid-dedup-and-sort-in | 9,305 | -644 | 232,472 | -1,282
v1-0005-Make-btint4cmp-branchless | 8,240 | 1,065 | 228,387 | 4,085
v1-0006-Use-radix-sort | 6,976 | 1,264 | 207,687 | 20,700
v1-0007-Faster-qunique-comparator | 5,911 | 1,065 | 203,744 | 3,943
v1-0008-Add-ASCII-fastpath | 3,409 | 2,502 | 161,469 | 42,275
Attached is v2 of the patch set with the aforementioned changes. I've
also fixed the white space errors in 0003, 0004 and 0008, as reported by
Kirill.
--
David GeierFrom 00c75bd58b7982c8bfe62d1260e9366766bc7f34 Mon Sep 17 00:00:00 2001
From: David Geier <[email protected]>
Date: Fri, 14 Nov 2025 11:37:40 +0100
Subject: [PATCH v2 8/8] Add ASCII fastpath to generate_trgm_only()
---
contrib/pg_trgm/trgm_op.c | 124 ++++++++++++++++++++------------------
1 file changed, 65 insertions(+), 59 deletions(-)
diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c
index 39b586f5b9a..d2087b3a45e 100644
--- a/contrib/pg_trgm/trgm_op.c
+++ b/contrib/pg_trgm/trgm_op.c
@@ -226,32 +226,6 @@ show_limit(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT4(similarity_threshold);
}
-/*
- * Finds first word in string, returns pointer to the word,
- * endword points to the character after word
- */
-static char *
-find_word(char *str, int lenstr, char **endword, int *charlen)
-{
- char *beginword = str;
-
- while (beginword - str < lenstr && !ISWORDCHR(beginword))
- beginword += pg_mblen(beginword);
-
- if (beginword - str >= lenstr)
- return NULL;
-
- *endword = beginword;
- *charlen = 0;
- while (*endword - str < lenstr && ISWORDCHR(*endword))
- {
- *endword += pg_mblen(*endword);
- (*charlen)++;
- }
-
- return beginword;
-}
-
/*
* Reduce a trigram (three possibly multi-byte characters) to a trgm,
* which is always exactly three bytes. If we have three single-byte
@@ -337,58 +311,90 @@ make_trigrams(trgm *tptr, char *str, int bytelen, int charlen)
static int
generate_trgm_only(trgm *trg, char *str, int slen, TrgmBound *bounds)
{
- trgm *tptr;
- char *buf;
- int charlen,
- bytelen;
- char *bword,
- *eword;
+ trgm *tptr = trg;
+ char *buf;
if (slen + LPADDING + RPADDING < 3 || slen == 0)
return 0;
- tptr = trg;
-
- /* Allocate a buffer for case-folded, blank-padded words */
- buf = (char *) palloc(slen * pg_database_encoding_max_length() + 4);
+ buf = palloc_array(char, slen * pg_database_encoding_max_length() + 4 + 1);
+ memset(buf, ' ', LPADDING);
- if (LPADDING > 0)
+ for (int i = 0; i < slen; )
{
- *buf = ' ';
- if (LPADDING > 1)
- *(buf + 1) = ' ';
- }
+ int num_bytes = LPADDING;
+ int num_chars = LPADDING;
+ char *word;
- eword = str;
- while ((bword = find_word(eword, slen - (eword - str), &eword, &charlen)) != NULL)
- {
+ /* Extract next word */
+ while (i < slen)
+ {
+ if ((str[i] & 0x80) == 0) /* Fast path for ASCII-only */
+ {
+ if (isalnum(str[i]))
+ {
#ifdef IGNORECASE
- bword = str_tolower(bword, eword - bword, DEFAULT_COLLATION_OID);
- bytelen = strlen(bword);
+ buf[num_bytes++] = pg_ascii_tolower(str[i++]);
#else
- bytelen = eword - bword;
+ buf[num_bytes++] = str[i++];
#endif
+ }
+ else
+ {
+ i++;
+ break;
+ }
+ }
+ else
+ {
+ const int mblen = pg_mblen(str + i);
+ Assert(mblen >= 2); /* Otherwise, it would be ASCII */
+
+ if (ISWORDCHR(str + i))
+ {
+ memcpy(buf + num_bytes, str + i, mblen);
+ num_bytes += mblen;
+ i += mblen;
+ }
+ else
+ {
+ i += mblen;
+ break;
+ }
+ }
+
+ num_chars++;
+ }
- memcpy(buf + LPADDING, bword, bytelen);
+ if (num_chars > LPADDING)
+ {
+ memset(buf + num_bytes, ' ', RPADDING);
+ num_bytes += RPADDING;
+ num_chars += RPADDING;
+ word = buf;
#ifdef IGNORECASE
- pfree(bword);
+ if (num_chars != num_bytes)
+ {
+ word = str_tolower(buf, num_bytes, DEFAULT_COLLATION_OID);
+ num_bytes = strlen(word); /* String can get shorter from lower-casing */
+ }
#endif
- buf[LPADDING + bytelen] = ' ';
- buf[LPADDING + bytelen + 1] = ' ';
+ if (bounds)
+ bounds[tptr - trg] |= TRGM_BOUND_LEFT;
+
+ tptr = make_trigrams(tptr, word, num_bytes, num_chars);
+
+ if (bounds)
+ bounds[tptr - trg - 1] |= TRGM_BOUND_RIGHT;
- /* Calculate trigrams marking their bounds if needed */
- if (bounds)
- bounds[tptr - trg] |= TRGM_BOUND_LEFT;
- tptr = make_trigrams(tptr, buf, bytelen + LPADDING + RPADDING,
- charlen + LPADDING + RPADDING);
- if (bounds)
- bounds[tptr - trg - 1] |= TRGM_BOUND_RIGHT;
+ if (word != buf)
+ pfree(word);
+ }
}
pfree(buf);
-
return tptr - trg;
}
--
2.51.0
From 457a3d0d57a834a80237d628d353408f3e2f4378 Mon Sep 17 00:00:00 2001
From: David Geier <[email protected]>
Date: Wed, 12 Nov 2025 14:27:13 +0100
Subject: [PATCH v2 7/8] Faster qunique() comparator
---
contrib/pg_trgm/trgm_op.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c
index 039c273f6a1..39b586f5b9a 100644
--- a/contrib/pg_trgm/trgm_op.c
+++ b/contrib/pg_trgm/trgm_op.c
@@ -139,6 +139,14 @@ CMPTRGM_UNSIGNED(const void *a, const void *b)
: CMPPCHAR_UNS(a, b, 2));
}
+static inline int
+CMPTRGM_EQ(const void *a, const void *b)
+{
+ char *aa = (char *)a;
+ char *bb = (char *)b;
+ return aa[0] != bb[0] || aa[1] != bb[1] || aa[2] != bb[2] ? 1 : 0;
+}
+
/*
* This gets called on the first call. It replaces the function pointer so
* that subsequent calls are routed directly to the chosen implementation.
@@ -482,15 +490,11 @@ generate_trgm(char *str, int slen)
if (len > 1)
{
if (GetDefaultCharSignedness())
- {
radix_sort_trigrams_signed((trgm *)GETARR(trg), len);
- len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED);
- }
else
- {
trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm));
- len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_UNSIGNED);
- }
+
+ len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_EQ);
}
SET_VARSIZE(trg, CALCGTSIZE(ARRKEY, len));
@@ -1038,15 +1042,11 @@ generate_wildcard_trgm(const char *str, int slen)
if (len > 1)
{
if (GetDefaultCharSignedness())
- {
radix_sort_trigrams_signed((trgm *)GETARR(trg), len);
- len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED);
- }
else
- {
trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm));
- len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_UNSIGNED);
- }
+
+ len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_EQ);
}
SET_VARSIZE(trg, CALCGTSIZE(ARRKEY, len));
--
2.51.0
From 447ce313602e768f29c43d4a5359f4c909b8db46 Mon Sep 17 00:00:00 2001
From: David Geier <[email protected]>
Date: Tue, 11 Nov 2025 13:18:59 +0100
Subject: [PATCH v2 6/8] Use radix sort
---
contrib/pg_trgm/trgm_op.c | 64 +++++++++++++++++++++++++++++++++------
1 file changed, 54 insertions(+), 10 deletions(-)
diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c
index 6af120fa1ad..039c273f6a1 100644
--- a/contrib/pg_trgm/trgm_op.c
+++ b/contrib/pg_trgm/trgm_op.c
@@ -155,14 +155,6 @@ CMPTRGM_CHOOSE(const void *a, const void *b)
}
/* Define our specialized sort function name */
-#define ST_SORT trigram_qsort_signed
-#define ST_ELEMENT_TYPE_VOID
-#define ST_COMPARE(a, b) CMPTRGM_SIGNED(a, b)
-#define ST_SCOPE static
-#define ST_DEFINE
-#define ST_DECLARE
-#include "lib/sort_template.h"
-
#define ST_SORT trigram_qsort_unsigned
#define ST_ELEMENT_TYPE_VOID
#define ST_COMPARE(a, b) CMPTRGM_UNSIGNED(a, b)
@@ -392,6 +384,58 @@ generate_trgm_only(trgm *trg, char *str, int slen, TrgmBound *bounds)
return tptr - trg;
}
+/*
+ * Needed to properly handle negative numbers in case char is signed.
+ */
+static inline unsigned char FlipSign(char x)
+{
+ return x^0x80;
+}
+
+static void radix_sort_trigrams_signed(trgm *trg, int count)
+{
+ trgm *buffer = palloc_array(trgm, count);
+ trgm *starts[256];
+ trgm *from = trg;
+ trgm *to = buffer;
+ int freqs[3][256];
+
+ /*
+ * Compute frequencies to partition the buffer.
+ */
+ memset(freqs, 0, sizeof(freqs));
+
+ for (int i=0; i<count; i++)
+ for (int j=0; j<3; j++)
+ freqs[j][FlipSign(trg[i][j])]++;
+
+ /*
+ * Do the sorting. Start with last character because that's the is "LSB"
+ * in a trigram. Avoid unnecessary copies by ping-ponging between the buffers.
+ */
+ for (int i=2; i>=0; i--)
+ {
+ trgm *old_from = from;
+ trgm *next = to;
+
+ for (int j=0; j<256; j++)
+ {
+ starts[j] = next;
+ next += freqs[i][j];
+ }
+
+ for (int j=0; j<count; j++)
+ memcpy(starts[FlipSign(from[j][i])]++, from[j], sizeof(trgm));
+
+ from = to;
+ to = old_from;
+ }
+
+ Assert(to == buffer);
+ memcpy(trg, buffer, sizeof(trgm) * count);
+ pfree(buffer);
+}
+
/*
* Guard against possible overflow in the palloc requests below. (We
* don't worry about the additive constants, since palloc can detect
@@ -439,7 +483,7 @@ generate_trgm(char *str, int slen)
{
if (GetDefaultCharSignedness())
{
- trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm));
+ radix_sort_trigrams_signed((trgm *)GETARR(trg), len);
len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED);
}
else
@@ -995,7 +1039,7 @@ generate_wildcard_trgm(const char *str, int slen)
{
if (GetDefaultCharSignedness())
{
- trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm));
+ radix_sort_trigrams_signed((trgm *)GETARR(trg), len);
len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED);
}
else
--
2.51.0
From c68aa700a245f1eb05f701a227776e24d0c0afe7 Mon Sep 17 00:00:00 2001
From: David Geier <[email protected]>
Date: Mon, 10 Nov 2025 15:40:11 +0100
Subject: [PATCH v2 5/8] Make btint4cmp() branchless
---
src/backend/access/nbtree/nbtcompare.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c
index bffc4b7709c..5ae27c22621 100644
--- a/src/backend/access/nbtree/nbtcompare.c
+++ b/src/backend/access/nbtree/nbtcompare.c
@@ -60,6 +60,7 @@
#include "utils/fmgrprotos.h"
#include "utils/skipsupport.h"
#include "utils/sortsupport.h"
+#include "common/int.h"
#ifdef STRESS_SORT_INT_MIN
#define A_LESS_THAN_B INT_MIN
@@ -202,12 +203,7 @@ btint4cmp(PG_FUNCTION_ARGS)
int32 a = PG_GETARG_INT32(0);
int32 b = PG_GETARG_INT32(1);
- if (a > b)
- PG_RETURN_INT32(A_GREATER_THAN_B);
- else if (a == b)
- PG_RETURN_INT32(0);
- else
- PG_RETURN_INT32(A_LESS_THAN_B);
+ PG_RETURN_INT32(pg_cmp_s32(a, b));
}
Datum
--
2.51.0
From 0cdc87640cf2a2d8c946aff1669706f99280c555 Mon Sep 17 00:00:00 2001
From: David Geier <[email protected]>
Date: Mon, 10 Nov 2025 14:40:37 +0100
Subject: [PATCH v2 4/8] Avoid dedup and sort in ginExtractEntries
---
contrib/pg_trgm/trgm_gin.c | 2 ++
doc/src/sgml/gin.sgml | 7 ++++++-
src/backend/access/gin/ginutil.c | 32 +++++++++++++++++++-------------
3 files changed, 27 insertions(+), 14 deletions(-)
diff --git a/contrib/pg_trgm/trgm_gin.c b/contrib/pg_trgm/trgm_gin.c
index 66ff6adde99..862e650efec 100644
--- a/contrib/pg_trgm/trgm_gin.c
+++ b/contrib/pg_trgm/trgm_gin.c
@@ -36,10 +36,12 @@ gin_extract_value_trgm(PG_FUNCTION_ARGS)
{
text *val = (text *) PG_GETARG_TEXT_PP(0);
int32 *nentries = (int32 *) PG_GETARG_POINTER(1);
+ bool *uniqueAndSorted = (bool *) PG_GETARG_POINTER(3);
Datum *entries = NULL;
TRGM *trg;
int32 trglen;
+ *uniqueAndSorted = true;
*nentries = 0;
trg = generate_trgm(VARDATA_ANY(val), VARSIZE_ANY_EXHDR(val));
diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
index 82410b1fbdf..b96478731f8 100644
--- a/doc/src/sgml/gin.sgml
+++ b/doc/src/sgml/gin.sgml
@@ -167,7 +167,7 @@
<variablelist>
<varlistentry>
<term><function>Datum *extractValue(Datum itemValue, int32 *nkeys,
- bool **nullFlags)</function></term>
+ bool **nullFlags, bool *uniqueAndSorted)</function></term>
<listitem>
<para>
Returns a palloc'd array of keys given an item to be indexed. The
@@ -177,6 +177,11 @@
<literal>*nullFlags</literal>, and set these null flags as needed.
<literal>*nullFlags</literal> can be left <symbol>NULL</symbol> (its initial value)
if all keys are non-null.
+ If the returned keys do not contain duplicates and are sorted w.r.t. the comparison
+ function of the GIN type's operator class, store <symbol>true</symbol> in
+ <literal>uniqueAndSorted</literal>. <literal>uniqueAndSorted</literal> can be left
+ <symbol>false</symbol> (its initial value) if the keys are either unsorted or contain
+ duplicates. In that case, duplicate removal and sorting is performed by the GIN index.
The return value can be <symbol>NULL</symbol> if the item contains no keys.
</para>
</listitem>
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 75a18f457bc..22b588483d0 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -464,6 +464,7 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
{
Datum *entries;
bool *nullFlags;
+ bool uniqueAndSorted = false;
int32 i;
/*
@@ -483,11 +484,12 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
/* OK, call the opclass's extractValueFn */
nullFlags = NULL; /* in case extractValue doesn't set it */
entries = (Datum *)
- DatumGetPointer(FunctionCall3Coll(&ginstate->extractValueFn[attnum - 1],
+ DatumGetPointer(FunctionCall4Coll(&ginstate->extractValueFn[attnum - 1],
ginstate->supportCollation[attnum - 1],
value,
PointerGetDatum(nentries),
- PointerGetDatum(&nullFlags)));
+ PointerGetDatum(&nullFlags),
+ PointerGetDatum(&uniqueAndSorted)));
/*
* Generate a placeholder if the item contained no keys.
@@ -502,13 +504,6 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
return entries;
}
- /*
- * If the extractValueFn didn't create a nullFlags array, create one,
- * assuming that everything's non-null.
- */
- if (nullFlags == NULL)
- nullFlags = (bool *) palloc0(*nentries * sizeof(bool));
-
/*
* If there's more than one key, sort and unique-ify.
*
@@ -516,11 +511,18 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
* pretty bad too. For small numbers of keys it'd likely be better to use
* a simple insertion sort.
*/
- if (*nentries > 1)
+ if (*nentries > 1 && !uniqueAndSorted)
{
keyEntryData *keydata;
cmpEntriesArg arg;
+ /*
+ * If the extractValueFn didn't create a nullFlags array, create one,
+ * assuming that everything's non-null.
+ */
+ if (nullFlags == NULL)
+ nullFlags = (bool *) palloc0(*nentries * sizeof(bool));
+
keydata = palloc_array(keyEntryData, *nentries);
for (i = 0; i < *nentries; i++)
{
@@ -568,9 +570,13 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
/*
* Create GinNullCategory representation from nullFlags.
*/
- *categories = (GinNullCategory *) palloc0(*nentries * sizeof(GinNullCategory));
- for (i = 0; i < *nentries; i++)
- (*categories)[i] = (nullFlags[i] ? GIN_CAT_NULL_KEY : GIN_CAT_NORM_KEY);
+ StaticAssertStmt(GIN_CAT_NORM_KEY == 0, "Assuming GIN_CAT_NORM_KEY is 0");
+ *categories = palloc0_array(GinNullCategory, *nentries);
+
+ if (nullFlags != NULL)
+ for (i = 0; i < *nentries; i++)
+ if (nullFlags[i])
+ (*categories)[i] = GIN_CAT_NULL_KEY;
return entries;
}
--
2.51.0
From c38f3517d05d3cefa0fc6d994f4e8f6ac14273d9 Mon Sep 17 00:00:00 2001
From: David Geier <[email protected]>
Date: Mon, 10 Nov 2025 13:35:11 +0100
Subject: [PATCH v2 3/8] Use sort_template.h
---
contrib/pg_trgm/trgm_op.c | 47 ++++++++++++++++++++++++++++++---------
1 file changed, 37 insertions(+), 10 deletions(-)
diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c
index 81182a15e07..6af120fa1ad 100644
--- a/contrib/pg_trgm/trgm_op.c
+++ b/contrib/pg_trgm/trgm_op.c
@@ -154,6 +154,23 @@ CMPTRGM_CHOOSE(const void *a, const void *b)
return CMPTRGM(a, b);
}
+/* Define our specialized sort function name */
+#define ST_SORT trigram_qsort_signed
+#define ST_ELEMENT_TYPE_VOID
+#define ST_COMPARE(a, b) CMPTRGM_SIGNED(a, b)
+#define ST_SCOPE static
+#define ST_DEFINE
+#define ST_DECLARE
+#include "lib/sort_template.h"
+
+#define ST_SORT trigram_qsort_unsigned
+#define ST_ELEMENT_TYPE_VOID
+#define ST_COMPARE(a, b) CMPTRGM_UNSIGNED(a, b)
+#define ST_SCOPE static
+#define ST_DEFINE
+#define ST_DECLARE
+#include "lib/sort_template.h"
+
/*
* Deprecated function.
* Use "pg_trgm.similarity_threshold" GUC variable instead of this function.
@@ -209,12 +226,6 @@ show_limit(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT4(similarity_threshold);
}
-static int
-comp_trgm(const void *a, const void *b)
-{
- return CMPTRGM(a, b);
-}
-
/*
* Finds first word in string, returns pointer to the word,
* endword points to the character after word
@@ -426,8 +437,16 @@ generate_trgm(char *str, int slen)
*/
if (len > 1)
{
- qsort(GETARR(trg), len, sizeof(trgm), comp_trgm);
- len = qunique(GETARR(trg), len, sizeof(trgm), comp_trgm);
+ if (GetDefaultCharSignedness())
+ {
+ trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm));
+ len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED);
+ }
+ else
+ {
+ trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm));
+ len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_UNSIGNED);
+ }
}
SET_VARSIZE(trg, CALCGTSIZE(ARRKEY, len));
@@ -974,8 +993,16 @@ generate_wildcard_trgm(const char *str, int slen)
*/
if (len > 1)
{
- qsort(GETARR(trg), len, sizeof(trgm), comp_trgm);
- len = qunique(GETARR(trg), len, sizeof(trgm), comp_trgm);
+ if (GetDefaultCharSignedness())
+ {
+ trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm));
+ len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED);
+ }
+ else
+ {
+ trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm));
+ len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_UNSIGNED);
+ }
}
SET_VARSIZE(trg, CALCGTSIZE(ARRKEY, len));
--
2.51.0
From ae80c2dc190493f8c9811c1949c7ebced265abd5 Mon Sep 17 00:00:00 2001
From: David Geier <[email protected]>
Date: Mon, 10 Nov 2025 13:35:01 +0100
Subject: [PATCH v2 2/8] Optimized comparison functions
---
src/backend/access/gin/ginutil.c | 23 ++++++++++++++++-------
src/include/access/gin_private.h | 19 +++++++++++++++----
2 files changed, 31 insertions(+), 11 deletions(-)
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..75a18f457bc 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -114,6 +114,7 @@ initGinState(GinState *state, Relation index)
for (i = 0; i < origTupdesc->natts; i++)
{
Form_pg_attribute attr = TupleDescAttr(origTupdesc, i);
+ FunctionCallInfoBaseData *fci = &state->compareFnCallInfo[i].fcinfo;
if (state->oneCol)
state->tupdesc[i] = state->origTupdesc;
@@ -222,6 +223,10 @@ initGinState(GinState *state, Relation index)
state->supportCollation[i] = index->rd_indcollation[i];
else
state->supportCollation[i] = DEFAULT_COLLATION_OID;
+
+ InitFunctionCallInfoData(*fci, &state->compareFn[i], 2, state->supportCollation[i], NULL, NULL);
+ fci->args[0].isnull = false;
+ fci->args[1].isnull = false;
}
}
@@ -402,8 +407,7 @@ typedef struct
typedef struct
{
- FmgrInfo *cmpDatumFunc;
- Oid collation;
+ FunctionCallInfoBaseData *cmpFuncInfo;
bool haveDups;
} cmpEntriesArg;
@@ -425,9 +429,15 @@ cmpEntries(const void *a, const void *b, void *arg)
else if (bb->isnull)
res = -1; /* not-NULL "<" NULL */
else
- res = DatumGetInt32(FunctionCall2Coll(data->cmpDatumFunc,
- data->collation,
- aa->datum, bb->datum));
+ {
+ FunctionCallInfo fci = data->cmpFuncInfo;
+ fci->args[0].value = aa->datum;
+ fci->args[1].value = bb->datum;
+ res = DatumGetInt32(FunctionCallInvoke(fci));
+
+ if (fci->isnull)
+ elog(ERROR, "function %u returned NULL", fci->flinfo->fn_oid);
+ }
/*
* Detect if we have any duplicates. If there are equal keys, qsort must
@@ -518,8 +528,7 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
keydata[i].isnull = nullFlags[i];
}
- arg.cmpDatumFunc = &ginstate->compareFn[attnum - 1];
- arg.collation = ginstate->supportCollation[attnum - 1];
+ arg.cmpFuncInfo = &ginstate->compareFnCallInfo[attnum - 1].fcinfo;
arg.haveDups = false;
qsort_arg(keydata, *nentries, sizeof(keyEntryData),
cmpEntries, &arg);
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index e155045ce8a..7cf19c8a5dc 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -51,6 +51,11 @@ typedef struct GinOptions
#define GIN_SHARE BUFFER_LOCK_SHARE
#define GIN_EXCLUSIVE BUFFER_LOCK_EXCLUSIVE
+typedef union CompareFuncCallInfoData
+{
+ FunctionCallInfoBaseData fcinfo;
+ char fcinfo_data[SizeForFunctionCallInfo(2)];
+} CompareFuncCallInfoData;
/*
* GinState: working data structure describing the index being worked on
@@ -77,6 +82,10 @@ typedef struct GinState
/*
* Per-index-column opclass support functions
*/
+
+
+ CompareFuncCallInfoData compareFnCallInfo[INDEX_MAX_KEYS];
+
FmgrInfo compareFn[INDEX_MAX_KEYS];
FmgrInfo extractValueFn[INDEX_MAX_KEYS];
FmgrInfo extractQueryFn[INDEX_MAX_KEYS];
@@ -504,6 +513,8 @@ ginCompareEntries(GinState *ginstate, OffsetNumber attnum,
Datum a, GinNullCategory categorya,
Datum b, GinNullCategory categoryb)
{
+ FunctionCallInfoBaseData *fci;
+
/* if not of same null category, sort by that first */
if (categorya != categoryb)
return (categorya < categoryb) ? -1 : 1;
@@ -512,10 +523,10 @@ ginCompareEntries(GinState *ginstate, OffsetNumber attnum,
if (categorya != GIN_CAT_NORM_KEY)
return 0;
- /* both not null, so safe to call the compareFn */
- return DatumGetInt32(FunctionCall2Coll(&ginstate->compareFn[attnum - 1],
- ginstate->supportCollation[attnum - 1],
- a, b));
+ fci = &ginstate->compareFnCallInfo[attnum - 1].fcinfo;
+ fci->args[0].value = a;
+ fci->args[1].value = b;
+ return DatumGetInt32(FunctionCallInvoke(fci));
}
/*
--
2.51.0
From a484e7c69bec62474b041eb4e533601e4883dab3 Mon Sep 17 00:00:00 2001
From: David Geier <[email protected]>
Date: Thu, 6 Nov 2025 09:42:27 +0100
Subject: [PATCH v2 1/8] Inline ginCompareAttEntries
---
src/backend/access/gin/ginutil.c | 38 ---------------------------
src/include/access/gin_private.h | 44 +++++++++++++++++++++++++++-----
2 files changed, 38 insertions(+), 44 deletions(-)
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index a546cac18d3..d205093e21d 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -387,44 +387,6 @@ GinInitMetabuffer(Buffer b)
((char *) metadata + sizeof(GinMetaPageData)) - (char *) page;
}
-/*
- * Compare two keys of the same index column
- */
-int
-ginCompareEntries(GinState *ginstate, OffsetNumber attnum,
- Datum a, GinNullCategory categorya,
- Datum b, GinNullCategory categoryb)
-{
- /* if not of same null category, sort by that first */
- if (categorya != categoryb)
- return (categorya < categoryb) ? -1 : 1;
-
- /* all null items in same category are equal */
- if (categorya != GIN_CAT_NORM_KEY)
- return 0;
-
- /* both not null, so safe to call the compareFn */
- return DatumGetInt32(FunctionCall2Coll(&ginstate->compareFn[attnum - 1],
- ginstate->supportCollation[attnum - 1],
- a, b));
-}
-
-/*
- * Compare two keys of possibly different index columns
- */
-int
-ginCompareAttEntries(GinState *ginstate,
- OffsetNumber attnuma, Datum a, GinNullCategory categorya,
- OffsetNumber attnumb, Datum b, GinNullCategory categoryb)
-{
- /* attribute number is the first sort key */
- if (attnuma != attnumb)
- return (attnuma < attnumb) ? -1 : 1;
-
- return ginCompareEntries(ginstate, attnuma, a, categorya, b, categoryb);
-}
-
-
/*
* Support for sorting key datums in ginExtractEntries
*
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index b33f7cec5b4..e155045ce8a 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -97,12 +97,6 @@ extern Buffer GinNewBuffer(Relation index);
extern void GinInitBuffer(Buffer b, uint32 f);
extern void GinInitPage(Page page, uint32 f, Size pageSize);
extern void GinInitMetabuffer(Buffer b);
-extern int ginCompareEntries(GinState *ginstate, OffsetNumber attnum,
- Datum a, GinNullCategory categorya,
- Datum b, GinNullCategory categoryb);
-extern int ginCompareAttEntries(GinState *ginstate,
- OffsetNumber attnuma, Datum a, GinNullCategory categorya,
- OffsetNumber attnumb, Datum b, GinNullCategory categoryb);
extern Datum *ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
Datum value, bool isNull,
int32 *nentries, GinNullCategory **categories);
@@ -502,6 +496,44 @@ ginCompareItemPointers(ItemPointer a, ItemPointer b)
return pg_cmp_u64(ia, ib);
}
+/*
+ * Compare two keys of the same index column
+ */
+static inline int
+ginCompareEntries(GinState *ginstate, OffsetNumber attnum,
+ Datum a, GinNullCategory categorya,
+ Datum b, GinNullCategory categoryb)
+{
+ /* if not of same null category, sort by that first */
+ if (categorya != categoryb)
+ return (categorya < categoryb) ? -1 : 1;
+
+ /* all null items in same category are equal */
+ if (categorya != GIN_CAT_NORM_KEY)
+ return 0;
+
+ /* both not null, so safe to call the compareFn */
+ return DatumGetInt32(FunctionCall2Coll(&ginstate->compareFn[attnum - 1],
+ ginstate->supportCollation[attnum - 1],
+ a, b));
+}
+
+/*
+ * Compare two keys of possibly different index columns
+ */
+static inline int
+ginCompareAttEntries(GinState *ginstate,
+ OffsetNumber attnuma, Datum a, GinNullCategory categorya,
+ OffsetNumber attnumb, Datum b, GinNullCategory categoryb)
+{
+ /* attribute number is the first sort key */
+ if (attnuma != attnumb)
+ return (attnuma < attnumb) ? -1 : 1;
+
+ return ginCompareEntries(ginstate, attnuma, a, categorya, b, categoryb);
+
+}
+
extern int ginTraverseLock(Buffer buffer, bool searchMode);
#endif /* GIN_PRIVATE_H */
--
2.51.0