On Tue, Jun 7, 2022 at 1:16 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> This is not the concern that I have.  I agree that if we tell a user
> that collation X changed behavior and he'd better reindex his indexes
> that use collation X, but none of them actually contain any cases that
> changed behavior, that's not a "false positive" --- that's "it's cheaper
> to reindex than to try to identify whether there's a problem".  What
> I mean by "false positive" is telling every macOS user that they'd better
> reindex everything every year, when in point of fact Apple changes those
> collations almost never.

That does seem like a meaningful distinction. I'm sorry if I
misrepresented your position on this.

We're talking about macOS here, which is hardly a paragon of lean
software. I think that it's worth revisiting the assumption that the C
standard library collations are the most useful set of collations, and
we shouldn't presume to know better than the operating system.
Couldn't individual packagers establish their own system for managing
collations across multiple ICU versions, as I outlined up-thread?

I think that it's okay (maybe unavoidable) that we keep "lib C
collations are authoritative" as a generic assumption when Postgres is
built from source. We can still have defacto standards that apply on
all mainstream platforms when users install standard packages for
production databases -- I don't see why we can't do both. Maybe the
best place to solve this problem is at the level of each individual
package ecosystem.

There can be some outsourcing to package managers this way, without
relying on the underlying OS, or lib C collations, or ICU in general.
This scheme wouldn't technically be under our direct control, but
would still be something that we could influence. We could have a back
and forth conversation about what's not working in the field.

-- 
Peter Geoghegan


Reply via email to