On 2019-09-17 01:13, Tom Lane wrote:
> Whilst poking at the leakproofness-of-texteq issue, I realized
> that there's an independent problem caused by the nondeterminism
> patch.  To wit, that the text_pattern_ops btree opclass uses
> texteq as its equality operator, even though that operator is
> no longer guaranteed to be bitwise equality.  That means that
> depending on which collation happens to get attached to the
> operator, equality might be inconsistent with the other members
> of the opclass, leading to who-knows-what bad results.

You can't create a text_pattern_ops index on a column with
nondeterministic collation:

    create collation c1 (provider = icu, locale = 'und', deterministic = false);
    create table t1 (a int, b text collate c1);
    create index on t1 (b text_pattern_ops);
    ERROR:  nondeterministic collations are not supported for operator class "text_pattern_ops"

There is some discussion in internal_text_pattern_compare().

Are there other cases we need to consider?

I notice that there is a hash opclass text_pattern_ops, which I'd
actually never heard of until now, and I don't see documented.  What
would we need to do about that?

> The obvious fix for this is to invent separate new equality operators,
> but that's actually rather disastrous for performance, because
> text_pattern_ops indexes would no longer be able to use WHERE clauses
> using plain equality.  That also feeds into whether equality clauses
> deduced from equivalence classes will work for them (nope, not any
> more).  People using such indexes are just about certain to be
> bitterly unhappy.

Would it help if one created COLLATE "C" indexes instead of
text_pattern_ops?  What are the tradeoffs between the two approaches?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services