On Fri, Apr 24, 2026 at 11:53 PM Tom Lane <[email protected]> wrote: > Richard Guo <[email protected]> writes: > > My first thought was to fix this by: > > > + if (!IndexCollMatchesExprColl(ind->indexcollations[c], > > + exprInputCollation((Node *) > > rinfo->clause))) > > + continue; > > > However, this caused an unexpected plan diff in join.out where a > > left-join removal over (name, text) stopped working, because name and > > text use different collations. So this check is too strict: a > > mismatch between two deterministic collations should be OK for > > uniqueness proof, as a deterministic collation treats two strings as > > equal iff they are byte-wise equal (see CREATE COLLATION).
> Yes, we'd be taking a serious performance hit if we insisted on > exact collation matches for this purpose. I agree that disallowing > non-matching non-deterministic collations is the right fix. Thanks for taking a look! > > Hence, I got attached patch. Thoughts? > I don't love doing it like this, for two reasons: > > 1. I think there are other places in the planner that will need > substantially this same logic. I recommend breaking out a > subroutine defined more or less as "do these collations have > equivalent notions of equality". Right. I just found several other places that need this same logic, which are in query_is_distinct_for(). Without it, we produce wrong results: select * from t t1 join (select distinct a from t) t2 on t1.a = t2.a COLLATE "ci"; a | a ---+--- A | a a | a (2 rows) select * from t t1 join (select a from t group by a) t2 on t1.a = t2.a COLLATE "ci"; a | a ---+--- A | a a | a (2 rows) > 2. I find the test next to unreadable as written --- for example, > it's more difficult than it should be to figure out what happens > if one collation is deterministic and the other not. Using a > subroutine would help here by letting you break down the test > into multiple steps. Agreed. Will wrap the logic in a subroutine. - Richard
