On Mon, Sep 25, 2017 at 08:26:21AM +0000, Noah Misch wrote: > On Tue, Sep 19, 2017 at 07:01:47PM -0700, Peter Geoghegan wrote: > > On Tue, Sep 19, 2017 at 5:52 PM, Peter Eisentraut > > <peter.eisentr...@2ndquadrant.com> wrote: > > > On 9/18/17 18:46, Peter Geoghegan wrote: > > >> As I pointed out a couple of times already [1], we don't currently > > >> sanitize ICU's BCP 47 language tags within CREATE COLLATION. > > > > > > There is no requirement that the locale strings for ICU need to be BCP > > > 47. ICU locale names like 'de@collation=phonebook' are also acceptable. > > > > Right. But, we only document that BCP 47 is supported by Postgres. > > Maybe we can use get_icu_language_tag()/uloc_toLanguageTag() to ensure > > that we end up with BCP 47, even when the user happens to specify the > > legacy syntax. Should we be "canonicalizing" legacy ICU locale strings > > as BCP 47, too? > > > > > The reason they are not validated is that, as you know, ICU accepts any > > > locale string as valid. You appear to have found a way to do some > > > validation, but I would like to see that code. > > > > As I mentioned, I'm simply calling > > get_icu_language_tag()/uloc_toLanguageTag() to do that sanitization. > > The code to get the extra sanitization is completely trivial. > > > > I didn't post the patch that generates the errors in my opening e-mail > > because I'm not confident it's correct just yet. And, I think that I > > see a bigger problem: we pass a string that is almost certainly a BCP > > 47 string to ucol_open() from within pg_newlocale_from_collation(). We > > do so despite the fact that ucol_open() apparently doesn't accept BCP > > 47 syntax locale strings until ICU 54 [1]. Seems entirely possible > > that this accounts for the problems you saw on ICU 4.2, back when we > > were still creating keyword variants (I guess that the keyword > > variants seem very "BCP 47-ish" to me). > > > > I really think we need to add some kind of debug mode that makes ICU > > optionally spit out a locale display name at key points. We need this > > to gain confidence that the behavior that ICU provides actually > > matches what is expected across ICU different versions for different > > locale formats. We talked about this as a user-facing feature before, > > which can wait till v11; I just want this to debug problems like this > > one. > > > > [1] > > https://ssl.icu-project.org/apiref/icu4c/ucol_8h.html#a4721e4c0a519bb0139a874e191223590 > > [Action required within three days. This is a generic notification.] > > The above-described topic is currently a PostgreSQL 10 open item. Peter, > since you committed the patch believed to have created it, you own this open > item. If some other commit is more relevant or if this does not belong as a > v10 open item, please let us know. Otherwise, please observe the policy on > open item ownership[1] and send a status update within three calendar days of > this message. Include a date for your subsequent status update. Testers may > discover new open items at any time, and I want to plan to get them all fixed > well in advance of shipping v10. Consequently, I will appreciate your efforts > toward speedy resolution. Thanks. > > [1] > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
This PostgreSQL 10 open item is past due for your status update. On the worst week to be violating open item policies. Kindly send a status update within 24 hours, and include a date for your subsequent status update. Refer to the policy on open item ownership: https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers