On 22.03.24 18:26, Jeff Davis wrote:
> On Fri, 2024-03-22 at 15:51 +0100, Peter Eisentraut wrote:
>> I think this might be too big of a compatibility break.  So far,
>> initcap('123abc') has always returned '123abc'.  If the new collation
>> returns '123Abc' now, then that's quite a change.  These are not some
>> obscure Unicode special case characters, after all.
>
> It's a new collation, so I'm not sure it's a compatibility break. But
> you are right that it is against documentation and expectations for
> INITCAP().
>
>> What is the ICU configuration incantation for this?  Maybe we could
>> have the builtin provider understand some of that, too.
>
> https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/stringoptions_8h.html#a4975f537b9960f0330b233061ef0608d
> https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/stringoptions_8h.html#afc65fa226cac9b8eeef0e877b8a7744e
>
>> Or we should create a function separate from initcap.
>
> If we create a new function, that also gives us the opportunity to
> accept optional arguments to control the behavior rather than relying
> on collation for every decision.

Right. I thought when you said there is an ICU configuration for it, that it might be like collation options that you specify in the locale string. But it appears it is only an internal API setting. So that, in my mind, reinforces the opinion that we should leave initcap() as is and make a new function that exposes the new functionality. (This does not have to be part of this patch set.)
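
To make the two behaviors concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions, not the PostgreSQL or ICU implementation: words are taken to be runs of alphanumeric characters, and both function names are hypothetical.

```python
def initcap_first_char(s: str) -> str:
    """Uppercase the first character of each alphanumeric run and
    lowercase the rest. A leading digit counts as the first character,
    so the letters that follow it stay lowercase."""
    out = []
    at_word_start = True
    for ch in s:
        if ch.isalnum():
            out.append(ch.upper() if at_word_start else ch.lower())
            at_word_start = False
        else:
            out.append(ch)
            at_word_start = True
    return "".join(out)


def initcap_first_cased(s: str) -> str:
    """Uppercase the first *cased* letter of each alphanumeric run and
    lowercase the rest. Leading digits are skipped when looking for the
    character to capitalize."""
    out = []
    in_word = False
    capitalized = False
    for ch in s:
        if ch.isalnum():
            if not in_word:
                in_word = True
                capitalized = False
            # Treat a character as cased if upper and lower forms differ.
            is_cased = ch.upper() != ch.lower()
            if is_cased and not capitalized:
                out.append(ch.upper())
                capitalized = True
            else:
                out.append(ch.lower())
        else:
            out.append(ch)
            in_word = False
    return "".join(out)


print(initcap_first_char("123abc"))   # 123abc
print(initcap_first_cased("123abc"))  # 123Abc
```

The first function matches the result initcap('123abc') has returned so far; the second matches the '123Abc' result that prompted the concern.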


