Re: [HACKERS] Unicode support

Gregory Stark Mon, 13 Apr 2009 20:25:58 -0700

<crossroads0...@googlemail.com> writes:

>>> The original post seemed to be a contrived attempt to say "you should
>>> use ICU".
>>
>> Indeed.  The OP should go read all the previous arguments about ICU
>> in our archives.
>
> Not at all. I just was making a suggestion. You may use any other
> library or implement it yourself (I even said that in my original
> post). www.unicode.org - the official website of the Unicode
> consortium, have a complete database of all Unicode characters which
> can be used as a basis.
>
> But if you want to ignore the normalization/multiple code point issue,
> point 2--the collation problem--still remains. And given that even a
> crappy database as MySQL supports Unicode collation, this isn't
> something to be ignored, IMHO.


Sure, supporting multiple collations in a database is definitely a known
missing feature. It takes a lot of work to do. A patch for it arrived too late
to make it into 8.4 and still needed more work, so hopefully the issues will
be worked out for 8.5.

I suggest you read the old threads and contribute whatever suggestions you can
on how to solve the problems that arose.


>> I don't believe that the standard forbids the use of combining chars at all.
>> RFC 3629 says:
>>
>>  ... This issue is amenable to solutions based on Unicode Normalization
>>  Forms, see [UAX15].

This is the relevant part. Tom was claiming that the UTF-8 encoding requires
normalizing the string of Unicode code points before encoding. I'm not sure
that's true though, is it?
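
For what it's worth, here is a minimal sketch of why I doubt it. It uses
Python's standard unicodedata module purely for illustration (nothing in this
thread depends on Python, and the variable names are made up). It encodes the
same visible character in precomposed and decomposed form and checks that
UTF-8 accepts both, leaving normalization as a separate, explicit step:

```python
import unicodedata

# The same visible character "e with acute accent", written two ways:
composed = "\u00e9"      # one precomposed code point, U+00E9
decomposed = "e\u0301"   # "e" followed by combining acute accent, U+0301

# UTF-8 encodes each code point as given; it does not normalize.
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

# Both byte strings are valid UTF-8 and decode back to what went in.
round_trip_ok = (
    composed_bytes.decode("utf-8") == composed
    and decomposed_bytes.decode("utf-8") == decomposed
)

# Normalization (here NFC, per UAX #15) is a separate, explicit step.
nfc_equal = unicodedata.normalize("NFC", decomposed) == composed

print(composed_bytes, decomposed_bytes, round_trip_ok, nfc_equal)
```

At least in this implementation, the two forms produce different byte
sequences and compare equal only after an explicit normalization pass, which
is consistent with the RFC text quoted above treating normalization as a
possible solution rather than a requirement of the encoding.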


-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
