On Fri, Apr 21, 2023 at 3:25 PM Jeff Davis <pg...@j-davis.com> wrote:
> I am also having second thoughts about accepting "C" or "POSIX" as an
> ICU locale and transforming it to "en-US-u-va-posix" in v16. It's not
> terribly useful (why not just use memcmp()?), it's not fast in my
> measurements (en-US is faster), so maybe it's better to just throw an
> error and tell the user to use C (or provider=none as I suggest
> above)?

I mean, to renew a complaint I've made previously, how the heck is anyone supposed to understand what's going on here?

We have no meaningful documentation of how to select an ICU locale that works for you. We have a couple of examples and a suggestion that you should use BCP 47. But when I asked before for documentation references, the ones you provided were basically incomprehensible. In follow-up discussion, you admitted you'd had to consult the source code to figure certain things out.

And the fact that "C" or "POSIX" gets transformed into "en-US-u-va-posix" is also completely undocumented. That string appears twice in the code, but zero times in the documentation. There's code to do it, but users shouldn't have to read code, and it wouldn't help much if they did, because the code comments don't really explain the rationale behind this choice either.

I find the fact that people are having trouble here completely predictable. Of course if people ask for "C" and the system tells them that it's using "en-US-u-va-posix" instead, they're going to be confused and ask questions, exactly as is happening here. glibc collations aren't particularly well-documented either, but people have some experience with them, they can get a list of values that have a chance of working from /usr/share/locale, and they know what "C" means. Nobody knows what "en-US-u-va-posix" is. It's not even Googleable, really, whereas "C locale" is.

My opinion is that the switch to using ICU by default is ill-advised and should be reverted. The compatibility break isn't worth whatever advantages ICU may have, the documentation to allow people to transition to ICU with reasonable effort doesn't exist, and the fact that within weeks of feature freeze people who know a lot about PostgreSQL are struggling to get the behavior they want is a really bad sign.

--
Robert Haas
EDB: http://www.enterprisedb.com