Re: [HACKERS] Unicode Normalization

Andrew Dunstan Thu, 24 Sep 2009 08:59:38 -0700


David E. Wheeler wrote:

On Sep 24, 2009, at 6:24 AM, [email protected] wrote:
In a context using normalization, wouldn't you typically want to store a normalized-text type that could perhaps (depending on locale) take advantage of simpler, more-efficient comparison functions?

That might be nice, but I'd be wary of a geometric multiplication of text types. We already have TEXT and CITEXT; what if we had your NTEXT (normalized text) but I wanted it to also be case-insensitive?

Actually, I don't think it's necessarily a good idea at all. If a user inputs a perfectly valid piece of UTF8 text, we should be able to give it back to them exactly, whether or not it's in normalized form. The normalized forms are useful for certain comparison purposes, but they don't affect the validity of the text. CITEXT doesn't mangle what is stored, just how it's compared.
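
To illustrate the distinction, here is a minimal sketch in Python (chosen only for brevity; this is not PostgreSQL code, and nfc_equal is a hypothetical helper name). It assumes NFC is the form used for comparison. The stored value is left exactly as entered, and normalization is applied only when two values are compared:

```python
import unicodedata


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings by their NFC forms without altering either one.

    Hypothetical helper for illustration; NFC is an assumed choice of form.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The same visible word in two valid encodings.
composed = "caf\u00e9"      # precomposed U+00E9
decomposed = "cafe\u0301"   # "e" followed by combining acute accent U+0301

# What the user entered is kept untouched.
stored = decomposed

# A plain comparison sees two different code point sequences...
assert stored != composed

# ...while a normalizing comparison treats them as equal.
assert nfc_equal(stored, composed)

# The stored value can still be handed back exactly as it was entered.
assert stored == "cafe\u0301"
```

The point of the sketch is only that normalization can live in the comparison, much as CITEXT puts case folding there, rather than in what gets stored.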



cheers

andrew

--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
