On Mon, Jan 17, 2011 at 2:56 AM, Peter Eisentraut <pete...@gmx.net> wrote:
> On mån, 2011-01-17 at 07:35 +0100, Magnus Hagander wrote:
>> For text, I think locales may make that impossible. Aren't there
>> locale rules where two different characters can "behave the same" when
>> comparing them? I know in Swedish at least w and v behave the same
>> when sorting (but not when comparing) in some variants of the locale.
>>
>> In fact, aren't there cases where the *length test* also fails? I
>> don't know this for sure, but unless we know for certain that two
>> different length strings can never be the same *independent of
>> locale*, this whole patch has a big problem...
>
> Currently, two text values are only equal if strcoll() considers them
> equal and the bits are the same. So this patch is safe in that regard.
>
> There is, however, some desire to loosen this. Possible applications
> are case-insensitive comparison and Unicode normalization. It's not
> going to happen soon, but it may be worth considering not putting in an
> optimization that we'll end up having to rip out again in a year
> perhaps.

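To make the two cases above concrete, here is a rough Python sketch. It
is not PostgreSQL code: the function names are made up for illustration,
UTF-8 is an assumed encoding, and the "loosened" rule (NFC normalization
plus case folding) is only one simplified guess at what such a comparison
might look like. The strcoll() half of the current rule is left out,
since byte-identical strings always collate as equal.

```python
import unicodedata


def text_eq_bitwise(a: str, b: str) -> bool:
    """Equality as described above: the encoded bytes must be identical."""
    a_bytes = a.encode("utf-8")
    b_bytes = b.encode("utf-8")
    # Fast path: identical bytes imply identical length, so a length
    # mismatch proves inequality without looking at the contents.
    if len(a_bytes) != len(b_bytes):
        return False
    return a_bytes == b_bytes


def text_eq_loosened(a: str, b: str) -> bool:
    """Hypothetical loosened equality: NFC normalization plus case folding."""
    def canon(s: str) -> str:
        return unicodedata.normalize("NFC", s).casefold()

    return canon(a) == canon(b)


# The same visible character spelled two ways, with different byte lengths.
composed = "\u00e9"      # e-acute as a single code point (2 bytes in UTF-8)
decomposed = "e\u0301"   # "e" followed by a combining acute (3 bytes in UTF-8)
```

Under the bitwise rule the two spellings are unequal, and the length
test alone is enough to say so. Under the loosened rule they would
compare equal despite the different lengths, which is the case where a
length shortcut would give the wrong answer.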
Hmm. I hate to give up on this - it's a nice optimization for the
cases to which it applies. Would it be possible to jigger things so
that we can still do it byte-for-byte when case-insensitive comparison
or Unicode normalization AREN'T in use?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers