On Tue, Jun 7, 2022 at 1:16 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> This is not the concern that I have. I agree that if we tell a user
> that collation X changed behavior and he'd better reindex his indexes
> that use collation X, but none of them actually contain any cases that
> changed behavior, that's not a "false positive" --- that's "it's cheaper
> to reindex than to try to identify whether there's a problem". What
> I mean by "false positive" is telling every macOS user that they'd better
> reindex everything every year, when in point of fact Apple changes those
> collations almost never.

That does seem like a meaningful distinction. I'm sorry if I
misrepresented your position on this.

We're talking about macOS here, which is hardly a paragon of lean
software. I think that it's worth revisiting two assumptions: that the
C standard library collations are the most useful set of collations,
and that we shouldn't presume to know better than the operating system.
Couldn't individual packagers establish their own system for managing
collations across multiple ICU versions, as I outlined up-thread?

I think that it's okay (maybe unavoidable) that we keep "lib C
collations are authoritative" as a generic assumption when Postgres is
built from source. We can still have de facto standards that apply on
all mainstream platforms when users install standard packages for
production databases -- I don't see why we can't do both.

Maybe the best place to solve this problem is at the level of each
individual package ecosystem. There can be some outsourcing to package
managers this way, without relying on the underlying OS, or lib C
collations, or ICU in general. This scheme wouldn't technically be
under our direct control, but would still be something that we could
influence. We could have a back-and-forth conversation about what's not
working in the field.

--
Peter Geoghegan