Hi,
On 22/04/2016 04:50, Martin Mueller wrote:
MySQL has a bewildering variety of unicode collation choices. Most of them are language specific, but what is
the difference between "utf8-general-ci", "utf8-unicode-ci", and
"utf8-unicode-520-ci." Do they differ in the range of characters they can handle or is it just a
matter of the cort order. I understand that utf8-bin is different because it is case sensitive, but the other
differences elude me.
Under what circumstances does it make a difference to use on or the other? I work with a
lot of Early Modern print data and the weird symbols of various kinds they use. I've had
trouble at times with the "utf8-general-ci" setting, but it may have been more
a matter of settings on my front end tool than of the choice of this rather than unicode
collation.
Under character sets, there is just one utf8 setting. The simplest way to make
sense of the choices would be to say that given a character set (utf8) the
collation only makes a difference to the sort but makes no difference to what
can be displayed. Is that correct.
A collation contains definitions for sorting order and comparison. For
most purposes one wants "crème brûlée" to be the same as "creme brulee".
For unicode characters these rules can be complex. A character set (in
your case UTF-8) defines which character can be stored.
utf8-general-ci contains a simplified version of those conversion rules.
It works for a lot of Western European languages very well, but in some
cases there are problems. For Asian languages there are a lot more
problems. For example, 'ß' isn't considered the same as 'ss'.
utf8-unicode-ci has more complex rules and works fine for more
languages. Due to the more complex rule set it is a bit slower than
utf8-general-ci.
utf8-unicode-520-ci uses a newer version of the rule set that is used in
utf8-unicode-ci.
Other utf8-* collations may contain specific rules for specific languages
utf8-general-ci is the default collation for utf-8 in MySQL. If you use
literal strings MySQL may assume that these have the default collation
and comparing them to columns with other collations or performing things
like cast operations may produce errors about invalid combinations of
collations.
--
Met vriendelijke groet,
Jigal van Hemert.
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql