On 6/7/2010 9:57 AM, Ryan Chan wrote:
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html

Since MySQL only supports the BMP, wouldn't 16 bits actually be enough?

I imagine they were thinking they'd extend the support to full Unicode in the future and didn't want you to have to dump and reload your databases when that happened. The Unicode Consortium has stated that Unicode will never require more than 21 bits per character[*], and 24 bits is the next multiple of 8 up from that.

[*] Why 21? Because that's the maximum number of code point bits you can fit in 4 bytes with UTF-8 encoding. If Unicode were allowed to use the full 31-bit range (2^31 code points) as originally envisioned, it would require up to 6 bytes per character in UTF-8 encoding. This promise makes UTF-8 code easier to write and easier to future-proof without a serious performance penalty.
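
If you want to check the arithmetic, here is a rough Python sketch. The helper name utf8_payload_bits is made up for illustration. It assumes the classic UTF-8 bit layout, where the lead byte of a multi-byte sequence spends one bit per byte of the sequence plus a terminating zero bit, and every continuation byte carries 6 bits:

```python
def utf8_payload_bits(n_bytes):
    """Code point bits that an n-byte UTF-8 sequence can carry.

    Hypothetical helper, covering the original 1..6 byte design.
    """
    if not 1 <= n_bytes <= 6:
        raise ValueError("UTF-8 sequences are 1 to 6 bytes in the original design")
    if n_bytes == 1:
        return 7  # 0xxxxxxx
    # Lead byte: n_bytes one-bits, a zero bit, then (7 - n_bytes) payload bits.
    # Each continuation byte (10xxxxxx) adds 6 more.
    return (7 - n_bytes) + 6 * (n_bytes - 1)


for n in range(1, 7):
    print(n, "byte(s):", utf8_payload_bits(n), "bits")

# Cross-check against Python's own encoder: the last BMP code point
# takes 3 bytes, the last Unicode code point (U+10FFFF) takes 4.
print(len(chr(0xFFFF).encode("utf-8")), len(chr(0x10FFFF).encode("utf-8")))
```

So 3 bytes cover the 16 bits of the BMP, 4 bytes cover 21 bits, and only the 6-byte form reaches 31 bits.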

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org
