Bruce Momjian wrote:

> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
> 
>       https://momjian.us/pgsql_docs/release-17.html

About the changes in collations:

<quote>
  Create a "builtin" collation provider similar to libc's C locale
  (Jeff Davis)

  It uses a "C" locale which is identical but independent of
  libc, but it allows the use of non-"C" collations like "en_US"
  and "C.UTF-8" with the "C" locale, which libc does not. MORE?
</quote>

The new builtin provider has two collations:
* ucs_basic, which behaves identically to "C". It was introduced
several versions ago, and the only change in v17 is that its
pg_collation.collprovider goes from 'c' to 'b'.

* pg_c_utf8, which sorts like "C" but is otherwise Unicode-aware
(case mapping and character classification), which makes it quite
different from "C".
It also differs from the other UTF-8 collations that could be used
before v17 in that it does not depend on an external library, so it
is not exposed to the risk of collation changes on OS upgrades.

The part that is concretely of interest to users is the introduction
of pg_c_utf8. As described in [1]:

<quote>
pg_c_utf8

 This collation sorts by Unicode code point values rather than
 natural language order. For the functions lower, initcap, and
 upper, it uses Unicode simple case mapping. For pattern
 matching (including regular expressions), it uses the POSIX
 Compatible variant of Unicode Compatibility Properties. Behavior
 is efficient and stable within a Postgres major version. This
 collation is only available for encoding UTF8.
</quote>
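
To make "sorts by Unicode code point values" concrete, here is a
small Python sketch. It is not PostgreSQL code and the word list is
made up for illustration. It only relies on the fact that, in UTF-8,
comparing the encoded bytes gives the same order as comparing code
points, which is why a collation can sort like "C" and still be
described in terms of code points:

```python
# Illustration only: the sample strings are arbitrary, not from PostgreSQL.
words = ["zebra", "Zebra", "apple", "Apple", "éclair", "eclair", "😀", "ｚ"]

# Python compares str values code point by code point.
by_code_point = sorted(words)

# Byte-wise comparison of the UTF-8 encoding, the way a memcmp-style
# "C" comparison sees the strings.
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))

print(by_code_point)
print(by_code_point == by_utf8_bytes)
```

Uppercase ASCII comes before lowercase ASCII, and "éclair" lands
after "zebra" rather than next to "eclair", which is the visible
difference from a natural-language order.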

I'd suggest that the release note entry be a condensed version of
that description, without mentioning en_US or C.UTF-8, whose
existence and semantics are OS-dependent, unlike those of pg_c_utf8.


[1] https://www.postgresql.org/docs/devel/collation.html

Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite

