MySQL has a bewildering variety of Unicode collation choices. Most of them are 
language specific, but what is the difference between "utf8_general_ci", 
"utf8_unicode_ci", and "utf8_unicode_520_ci"? Do they differ in the range of 
characters they can handle, or is it just a matter of the sort order? I 
understand that utf8_bin is different because it is case sensitive, but the 
other differences elude me. 

Under what circumstances does it make a difference to use one or the other? I 
work with a lot of Early Modern print data and the weird symbols of various 
kinds it uses. I've had trouble at times with the "utf8_general_ci" setting, 
but it may have been more a matter of settings in my front-end tool than of 
choosing this collation rather than one of the unicode ones. 

Under character sets, there is just one utf8 setting. The simplest way to make 
sense of the choices would be to say that, given a character set (utf8), the 
collation only makes a difference to the sort but makes no difference to what 
can be displayed. Is that correct? 
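
To make concrete what I mean by "affects the sort but not what is stored," 
here is a rough sketch in Python. It is only an analogy: it uses Python's own 
string handling, not MySQL's collation code, and the two helper functions are 
names I made up for illustration. The same four strings come back in a 
different order under each comparison rule, but the strings themselves are 
unchanged.

```python
# Analogy only: Python string comparison, not MySQL's collation implementation.
words = ["zebra", "Apple", "apple", "Zebra"]

def binary_order(items):
    # Compare by code point, roughly what I understand a *_bin collation to do.
    return sorted(items)

def case_insensitive_order(items):
    # Compare by a case-folded key, a crude stand-in for a *_ci collation.
    return sorted(items, key=str.casefold)

binary_sorted = binary_order(words)
ci_sorted = case_insensitive_order(words)

print(binary_sorted)  # ['Apple', 'Zebra', 'apple', 'zebra']
print(ci_sorted)      # ['Apple', 'apple', 'zebra', 'Zebra']
```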
