On Wed, 2023-11-15 at 00:52 +0100, Matthias van de Meent wrote:
> That doesn't really answer the question for me. Why would you have a
> primary key that has different collation rules (which include
> equality rules)

The equality rules for all deterministic collations are the same: if
the bytes are identical, the values are considered equal; and if the
bytes are not identical, the values are considered unequal.

That's the basis for this entire thread. The "C" collation provides the
same equality semantics as every other deterministic collation, but
with better performance and lower risk. (As long as you don't actually
need range scans or path keys from the index.)

See varstr_cmp() or varstrfastcmp_locale(). Those functions first check
for identical bytes and return 0 if so. If the bytes aren't identical,
they pass the strings to the collation provider, and if the collation
provider returns 0, we do a final memcmp() to break the tie.

You can also see this in hashtext(), where for deterministic collations
it just calls hash_any() on the bytes.

None of this works for non-deterministic collations (e.g.
case-insensitive ones), but that would be easy to block where
necessary.

Regards,
	Jeff Davis
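
For readers who don't have the source open, the comparison path
described above can be sketched roughly as follows. This is a
simplified illustration, not the actual PostgreSQL code:
sketch_deterministic_cmp(), provider_cmp_fn and toy_provider_cmp() are
hypothetical names, and the toy provider is deliberately
case-insensitive so that it returns 0 for strings whose bytes differ.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for a collation provider's comparison call. */
typedef int (*provider_cmp_fn) (const char *a, const char *b);

/*
 * Toy provider (an assumption for illustration): compares while
 * ignoring ASCII case, so it can report 0 for non-identical bytes.
 */
static int
toy_provider_cmp(const char *a, const char *b)
{
	for (;; a++, b++)
	{
		int			ca = tolower((unsigned char) *a);
		int			cb = tolower((unsigned char) *b);

		if (ca != cb)
			return (ca < cb) ? -1 : 1;
		if (ca == 0)
			return 0;
	}
}

/*
 * Sketch of a deterministic comparison: identical bytes are equal,
 * and a provider result of 0 is never trusted on its own.
 */
static int
sketch_deterministic_cmp(const char *a, const char *b,
						 provider_cmp_fn provider)
{
	size_t		alen = strlen(a);
	size_t		blen = strlen(b);
	int			result;

	/* Fast path: identical bytes mean equal, no provider call needed. */
	if (alen == blen && memcmp(a, b, alen) == 0)
		return 0;

	/* Otherwise let the provider decide the ordering. */
	result = provider(a, b);

	/* Provider says "equal" but the bytes differ: break the tie. */
	if (result == 0)
	{
		size_t		minlen = (alen < blen) ? alen : blen;

		result = memcmp(a, b, minlen);
		if (result == 0 && alen != blen)
			result = (alen < blen) ? -1 : 1;
	}

	return result;
}
```

With this shape, sketch_deterministic_cmp("abc", "ABC",
toy_provider_cmp) is nonzero even though the toy provider alone would
call the two strings equal. Whatever the provider does, the result is 0
only for identical bytes, which is the same equality rule "C" gives
you.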