Re: Why do indexes and sorts use the database collation?

Jeff Davis Tue, 14 Nov 2023 16:14:10 -0800

On Wed, 2023-11-15 at 00:52 +0100, Matthias van de Meent wrote:
> That doesn't really answer the question for me. Why would you have a
> primary key that has different collation rules (which include
> equality rules)


The equality rules for all deterministic collations are the same: if
the bytes are identical, the values are considered equal; and if the
bytes are not identical, the values are considered unequal.

That's the basis for this entire thread. The "C" collation provides the
same equality semantics as every other deterministic collation, but
with better performance and lower risk. (As long as you don't actually
need range scans or path keys from the index.)

See varstr_cmp() or varstrfastcmp_locale(). Those functions first check
for identical bytes and return 0 if so. If the bytes differ, the strings
are passed to the collation provider; if the provider returns 0, a final
memcmp() breaks the tie. You can also see this in hashtext(), which for
deterministic collations just calls hash_any() on the bytes.
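
As a rough illustration of the comparison order described above, here is
a minimal C sketch. It is not the PostgreSQL source: deterministic_cmp(),
bytewise_cmp(), and toy_ci_provider() are hypothetical names, and the toy
provider (ASCII case folding) only stands in for a real collation
provider that may report two different byte strings as equal.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a collation provider's comparison routine. */
typedef int (*provider_cmp_fn)(const char *a, size_t alen,
                               const char *b, size_t blen);

/* Plain byte-order comparison; the shorter string sorts first on a tie. */
int bytewise_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t minlen = alen < blen ? alen : blen;
    int r = memcmp(a, b, minlen);

    if (r != 0)
        return r;
    return (alen > blen) - (alen < blen);
}

/*
 * Comparison under a deterministic collation, in three steps:
 *   1. identical bytes      -> equal, provider is never consulted
 *   2. otherwise            -> ask the provider for the ordering
 *   3. provider says "tie"  -> break the tie by byte order
 * As a result, 0 is returned only when the bytes are identical.
 */
int deterministic_cmp(const char *a, size_t alen,
                      const char *b, size_t blen,
                      provider_cmp_fn provider_cmp)
{
    int r;

    if (alen == blen && memcmp(a, b, alen) == 0)
        return 0;

    r = provider_cmp(a, alen, b, blen);
    if (r == 0)
        r = bytewise_cmp(a, alen, b, blen);
    return r;
}

/* Toy provider (assumption): ASCII case-insensitive ordering. */
int toy_ci_provider(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t minlen = alen < blen ? alen : blen;

    for (size_t i = 0; i < minlen; i++)
    {
        int ca = tolower((unsigned char) a[i]);
        int cb = tolower((unsigned char) b[i]);

        if (ca != cb)
            return (ca > cb) - (ca < cb);
    }
    return (alen > blen) - (alen < blen);
}
```

With the toy provider, "abc" and "ABC" tie at step 2 but are still
reported as unequal after the byte-order tie-break, so the equality
result matches what a plain bytewise ("C"-style) comparison would give.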

None of this works for non-deterministic collations (e.g. case
insensitive), but that would be easy to block where necessary.

Regards,
        Jeff Davis
