On Mon, Aug 7, 2017 at 3:23 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The thing that I'm particularly thinking about is that if someone wants
> an ICU variant collation that we didn't make initdb provide, they'll do
> a CREATE COLLATION and go use it. At update time, pg_dump or pg_upgrade
> will export/import that via CREATE COLLATION, and the only way it fails
> is if ICU rejects the collation name as garbage. (Which, as we already
> established upthread, it's quite unlikely to do.)

Actually, it's *impossible* for ICU to reject any string as an invalid
locale within CREATE COLLATION, because CollationCreate() simply doesn't
sanitize ICU names. Unlike initdb (within pg_import_system_collations()),
it doesn't do something like call get_icu_language_tag().

If I add such a test to CollationCreate(), it does a reasonable job of
sanitizing, while preserving the spirit of the BCP 47 language tag format:
it doesn't reject a locale merely because it hasn't heard of it, since the
user may have specified a brand new one. All four of the following are
accepted with unmodified master. With the test added, two of them are
rejected:

postgres=# CREATE COLLATION test1 (provider = icu, locale = 'en-x-icu');
CREATE COLLATION
postgres=# CREATE COLLATION test2 (provider = icu, locale = 'foo bar baz');
ERROR:  XX000: could not convert locale name "foo bar baz" to language tag: U_ILLEGAL_ARGUMENT_ERROR
LOCATION:  get_icu_language_tag, collationcmds.c:454
postgres=# CREATE COLLATION test3 (provider = icu, locale = 'en-gb-icu');
ERROR:  XX000: could not convert locale name "en-gb-icu" to language tag: U_ILLEGAL_ARGUMENT_ERROR
LOCATION:  get_icu_language_tag, collationcmds.c:454
postgres=# CREATE COLLATION test4 (provider = icu, locale = 'not-a-country');
CREATE COLLATION

If it's mandatory for get_icu_language_tag() to not throw an error during
initdb import when passed strings like these (that are generated
mechanically), why should we not do the same with CREATE COLLATION?

While the choice to preserve BCP 47's tolerance of missing collations is
debatable, not doing at least this much up-front is a bug IMV.

--
Peter Geoghegan
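
To illustrate the difference between "well-formed" and "known", here is a
minimal standalone sketch in C. It is not ICU's algorithm and not what
get_icu_language_tag() does (that goes through ICU); bcp47_well_formed() and
MAX_SUBTAGS are hypothetical names made up for this example. It checks only
the structural grammar of an RFC 5646 language tag, looks nothing up in any
registry, and ignores grandfathered tags. Under those assumptions it gives
the same accept/reject split as the four CREATE COLLATION statements above.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SUBTAGS 32          /* arbitrary cap for this sketch */

/* True if s[0..len) is non-empty and every byte satisfies pred. */
static bool
all_of(const char *s, size_t len, int (*pred) (int))
{
    if (len == 0)
        return false;
    for (size_t i = 0; i < len; i++)
        if (!pred((unsigned char) s[i]))
            return false;
    return true;
}

/* Is this the one-character private-use introducer "x"? */
static bool
is_x(const char *s, size_t len)
{
    return len == 1 && (s[0] == 'x' || s[0] == 'X');
}

/*
 * Hypothetical helper: structural check only (RFC 5646 "langtag" and
 * "privateuse" forms). Unknown but well-formed tags pass by design.
 */
bool
bcp47_well_formed(const char *tag)
{
    const char *sub[MAX_SUBTAGS];
    size_t      len[MAX_SUBTAGS];
    int         n = 0;
    int         i = 0;

    /* Split on '-': each subtag must be 1-8 letters or digits (C locale). */
    for (const char *p = tag;;)
    {
        const char *q = strchr(p, '-');
        size_t      l = q ? (size_t) (q - p) : strlen(p);

        if (n == MAX_SUBTAGS || l > 8 || !all_of(p, l, isalnum))
            return false;
        sub[n] = p;
        len[n] = l;
        n++;
        if (q == NULL)
            break;
        p = q + 1;
    }

    if (!is_x(sub[0], len[0]))
    {
        /* language: 2-8 letters */
        if (len[0] < 2 || !all_of(sub[0], len[0], isalpha))
            return false;
        i = 1;

        /* up to three 3-letter extlangs, only after a 2-3 letter language */
        if (len[0] <= 3)
            for (int k = 0; k < 3 && i < n && len[i] == 3 &&
                 all_of(sub[i], 3, isalpha); k++)
                i++;

        /* optional script: 4 letters */
        if (i < n && len[i] == 4 && all_of(sub[i], 4, isalpha))
            i++;

        /* optional region: 2 letters or 3 digits */
        if (i < n && ((len[i] == 2 && all_of(sub[i], 2, isalpha)) ||
                      (len[i] == 3 && all_of(sub[i], 3, isdigit))))
            i++;

        /* variants: 5-8 alphanumerics, or 4 starting with a digit */
        while (i < n && (len[i] >= 5 ||
                         (len[i] == 4 && isdigit((unsigned char) sub[i][0]))))
            i++;

        /* extensions: a singleton other than "x", then 1+ subtags of 2-8 */
        while (i < n && len[i] == 1 && !is_x(sub[i], 1))
        {
            int         body = 0;

            for (i++; i < n && len[i] >= 2; i++)
                body++;
            if (body == 0)
                return false;
        }
    }

    /* optional private use: "x", then 1+ subtags of 1-8 alphanumerics */
    if (i < n && is_x(sub[i], len[i]))
    {
        if (i + 1 == n)
            return false;
        i = n;
    }

    return i == n;              /* leftover subtags mean malformed */
}
```

With this sketch, 'en-x-icu' passes (language plus private use),
'foo bar baz' fails (spaces, and longer than 8 characters), 'en-gb-icu'
fails ('icu' is not a valid variant after a region), and 'not-a-country'
passes ('a' parses as an extension singleton). Nothing here knows whether
"not" is a real language, which is the tolerance being debated.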