On 12.01.24 03:02, Jeff Davis wrote:
> New version attached. Changes:
>
>   * Named collation object PG_C_UTF8, which seems like a good idea to
> prevent name conflicts with existing collations. The locale name is
> still C.UTF-8, which still makes sense to me because it matches the
> behavior of the libc locale of the same name so closely.

I am catching up on this thread. The discussions have been very complicated, so maybe I didn't get it all.

The patches look pretty sound, but I'm questioning how useful this feature is and where you plan to take it.

Earlier in the thread, the aim was summarized as

> If the Postgres default was bytewise sorting+locale-agnostic
> ctype functions directly derived from Unicode data files,
> as opposed to libc/$LANG at initdb time, the main
> annoyance would be that "ORDER BY textcol" would no
> longer be the human-favored sort.

I think that would be a terrible direction to take, because it would regress the default sort order from "correct" to "useless", quite aside from the overall message this sends about how PostgreSQL cares about locales and Unicode and such.
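
To make "bytewise sorting" concrete, here is a minimal Python sketch, not PostgreSQL code. The helper name bytewise_sort and the word list are made up for illustration. It orders strings by their UTF-8 bytes, which is what a C/C.UTF-8-style collation does:

```python
def bytewise_sort(words):
    """Sort strings by their UTF-8 encoding (hypothetical helper, illustration only)."""
    return sorted(words, key=lambda s: s.encode("utf-8"))


words = ["apple", "Zebra", "éclair", "banana", "Apple"]

# Uppercase ASCII (0x41-0x5A) sorts before lowercase ASCII (0x61-0x7A),
# and "é" (UTF-8 bytes 0xC3 0xA9) sorts after every ASCII letter.
print(bytewise_sort(words))
# ['Apple', 'Zebra', 'apple', 'banana', 'éclair']
```

A linguistic collation would typically place "éclair" near the other "e" words and keep "apple" and "Apple" adjacent, which is the difference users would notice with ORDER BY.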

Maybe you don't intend for this to be the default provider? But then who would really use it? I mean, sure, some people would, but how would you even explain, in practice, the particular niche of users or use cases?

Maybe if this new provider were called "minimal", the name would describe its purpose better.

I could see a use for this builtin provider if it also included the default UCA collation (what COLLATE UNICODE does now). Then it would provide a "common" default behavior out of the box, and if you want more fine-tuning, you can go to ICU. There would still be some questions about making sure the builtin behavior and the ICU behavior are consistent (different Unicode versions, stock UCA vs CLDR, etc.). But for practical purposes, it might work.

There would still be a risk with that approach, since it would permanently marginalize ICU functionality, in the sense that only some locales would need ICU, and so we might not pay the same amount of attention to the ICU functionality.

I would be curious what your overall vision is here. Is switching the default to ICU still your goal? Or do you want the builtin provider to be the default? Or something else?
