While working on [0], I was wondering why the collations ucs_basic and
unicode are not in pg_collation.dat. I traced this back through
history, and I think this was just lost in a game of telephone.

The initial commit for pg_collation.h (414c5a2ea6) has only the default
collation in pg_collation.h (pre .dat), with initdb handling everything
else. Over time, additional collations "C" and "POSIX" were moved to
pg_collation.h, and other logic was moved from initdb to
pg_import_system_collations(). But ucs_basic was untouched. Commit
0b13b2a771 rearranged the relative order of operations in initdb and
added the current comment "We don't want to pin these", but looking at
the email [1], I think this was more a guess about the previous intent.

I suggest we fix this now; see attached patch.
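
For context on the "pin" remark, here is a minimal sketch of why the
place where a collation is defined affects whether it is pinned. It
assumes the OID range boundaries from src/include/access/transam.h in
recent releases (10000, 12000, 16384). The macro names and the function
is_pinned_sketch() are hypothetical and written only for this
illustration; the server's real check has extra special cases that are
ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/*
 * Assumed OID range boundaries (believed to match transam.h in recent
 * releases; names here are illustrative, not the server's macros).
 */
#define SKETCH_FIRST_GENBKI_OID    10000	/* below: hand-assigned in .dat files */
#define SKETCH_FIRST_UNPINNED_OID  12000	/* from here: initdb post-bootstrap SQL */
#define SKETCH_FIRST_NORMAL_OID    16384	/* from here: ordinary user objects */

/*
 * Simplified pin test: an object counts as pinned if its OID lies below
 * the first unpinned OID.  This ignores the special cases the server
 * makes for a few catalogs.
 */
static bool
is_pinned_sketch(Oid oid)
{
	return oid < SKETCH_FIRST_UNPINNED_OID;
}
```

Under these assumptions, the OIDs 962 and 963 that the patch assigns in
pg_collation.dat fall in the pinned range, whereas rows inserted by
initdb's post-bootstrap SQL via pg_nextoid() would receive OIDs of 12000
or above and stay unpinned.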

[0]: https://www.postgresql.org/message-id/flat/1293e382-2093-a2bf-a397-c04e8f83d3c2%40enterprisedb.com
[1]: https://www.postgresql.org/message-id/28195.1498172402%40sss.pgh.pa.us
From 0d2c6b92a3340833f13bab395e0556ce1f045226 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pe...@eisentraut.org>
Date: Tue, 28 Mar 2023 12:04:34 +0200
Subject: [PATCH] Move definition of standard collations from initdb to pg_collation.dat
---
src/bin/initdb/initdb.c | 15 +--------------
src/include/catalog/pg_collation.dat | 7 +++++++
2 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index bae97539fc..9ccbf998ec 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1695,20 +1695,7 @@ setup_description(FILE *cmdfd)
static void
setup_collation(FILE *cmdfd)
{
- /*
- * Add SQL-standard names. We don't want to pin these, so they don't go
- * in pg_collation.dat. But add them before reading system collations, so
- * that they win if libc defines a locale with the same name.
- */
- PG_CMD_PRINTF("INSERT INTO pg_collation (oid, collname, collnamespace, collowner, collprovider, collisdeterministic, collencoding, colliculocale)"
- "VALUES (pg_nextoid('pg_catalog.pg_collation', 'oid', 'pg_catalog.pg_collation_oid_index'), 'unicode', 'pg_catalog'::regnamespace, %u, '%c', true, -1, 'und');\n\n",
- BOOTSTRAP_SUPERUSERID, COLLPROVIDER_ICU);
-
- PG_CMD_PRINTF("INSERT INTO pg_collation (oid, collname, collnamespace, collowner, collprovider, collisdeterministic, collencoding, collcollate, collctype)"
- "VALUES (pg_nextoid('pg_catalog.pg_collation', 'oid', 'pg_catalog.pg_collation_oid_index'), 'ucs_basic', 'pg_catalog'::regnamespace, %u, '%c', true, %d, 'C', 'C');\n\n",
- BOOTSTRAP_SUPERUSERID, COLLPROVIDER_LIBC, PG_UTF8);
-
- /* Now import all collations we can find in the operating system */
+ /* Import all collations we can find in the operating system */
PG_CMD_PUTS("SELECT pg_import_system_collations('pg_catalog');\n\n");
}
diff --git a/src/include/catalog/pg_collation.dat b/src/include/catalog/pg_collation.dat
index f4bda1c769..14df398ad2 100644
--- a/src/include/catalog/pg_collation.dat
+++ b/src/include/catalog/pg_collation.dat
@@ -23,5 +23,12 @@
descr => 'standard POSIX collation',
collname => 'POSIX', collprovider => 'c', collencoding => '-1',
collcollate => 'POSIX', collctype => 'POSIX' },
+{ oid => '962',
+ descr => 'sorts using the Unicode Collation Algorithm with default settings',
+ collname => 'unicode', collprovider => 'i', collencoding => '-1',
+ colliculocale => 'und' },
+{ oid => '963', descr => 'sorts by Unicode code point',
+ collname => 'ucs_basic', collprovider => 'c', collencoding => '6',
+ collcollate => 'C', collctype => 'C' },
]
--
2.40.0