Re: [19] Proposal: function markers to indicate collation/ctype sensitivity

Peter Eisentraut Wed, 11 Jun 2025 00:04:07 -0700

On 05.06.25 21:56, Jeff Davis wrote:

On Thu, 2025-06-05 at 10:12 +0200, Peter Eisentraut wrote:

The reason we don't do it at parse time is that we don't have the
information which functions care about collations, which is exactly
what you are proposing here to add.


Currently, we have:

    create table c(x text collate "C", y text collate "en_US");
    insert into c values ('x', 'y');
    select x < y from c; -- fails (runtime check)
    select x || y from c; -- succeeds

Surely, "<" would be marked as ordering-sensitive, and we could move
the error to parse-time.

But what about UDFs? If we assume that all UDFs are ordering-sensitive
unless marked otherwise, then a user-defined version of "||" that
previously worked would now start failing, until they add the ordering-
insensitive mark.

I think no matter how we slice it, there is going to be some case that
will be degraded until some update is applied. I would be content to
accept this particular variant, because it doesn't seem very realistic.
Why would a user define their own concatenation function? There already
is one. Unless your concatenation function does something special, in
which case you should probably think about this collations topic.

More generally, I think there are only so many operations you can do on
character strings without considering the collation/ctype/etc. These
are essentially all the operations that you can do without looking at
the characters, like length(), ||, repeat(). Everything beyond that
looks at the characters and needs to take collation/ctype/etc. into
account.
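
As a rough illustration of that distinction, here is a sketch that reuses
the table c from the example earlier in this message. The function name
my_concat is hypothetical, and the outcomes noted in the comments are what
one would expect from the current runtime-check behavior described in this
thread, not verified output.

```sql
-- Assumes the table from the earlier example:
--   create table c(x text collate "C", y text collate "en_US");
--   insert into c values ('x', 'y');

-- Hypothetical user-defined concatenation; it never inspects the
-- characters, so it has no real need for a collation.
create function my_concat(a text, b text) returns text
    language sql immutable
    as $$ select a || b $$;

-- Operations that do not look at the characters: expected to succeed
-- today even though x and y have conflicting implicit collations.
select length(x || y) from c;
select repeat(x || y, 2) from c;
select my_concat(x, y) from c;

-- Operations that do look at the characters: expected to fail at run
-- time today, because no collation can be determined.
select upper(x || y) from c;
select x < y from c;
```

Under the "ordering-sensitive unless marked otherwise" assumption, the
my_concat() call is the one that would start failing at parse time until
the function is marked as insensitive.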

We'd need some kind of migration path where we could retain the runtime
checks and disable the parse time checks until people have a chance to
add the right marks to their UDFs. Migration paths like that are not
great because they take several releases to work out, and we're never
quite sure when to finally remove the deprecated behavior.


Perhaps pg_dump can apply some properties during upgrades?

If we make the opposite assumption, that none are ordering-sensitive
unless we mark them so, that would allow properly-marked functions to
fail at parse time, and the rest to fail at runtime. But this
assumption doesn't work as well for recording dependencies, because
we'd miss the dependencies for UDFs that aren't properly marked.


That feels like the worst of both worlds.
