There is a section on German Sharp-s in http://mysql.rjweb.org/doc.php/charcoll I agree with "dirty". As I understand it, about 5.1.24, they said "Oops, sharp-s is collated wrong, let's fix it". The fix broke things, but they stuck by the "correct" sorting. Eventually they said "Oh, let's add another, compatible, collation".
They were caught between a rock and a hard place. > -----Original Message----- > From: Johan De Meersman [mailto:vegiv...@tuxera.be] > Sent: Monday, August 27, 2012 7:43 AM > To: mysql > Subject: MySQL, UTF8 and collations > > Hey, > > We're upgrading MySQLs from 5.0 to 5.5, and running into the > predictable utf8_general_ci bu^Wwe-meant-to-do-that with german ß and > similar characters. The server swallowed the existing datafiles (thank > god for that) so we're up and running, but "check table" does whine > about needing a full rebuild, which is clearly that issue. > > Now, I'm aware that this should be fixed by using the utf8_unicode_ci > collation; but somehow it isn't... > > The its-not-a-bug-but-heres-the-fix collation utf8_general_mysql500_ci > does work, but honestly I feel that that's kind of dirty. > > Any ideas why utf8_unicode_ci doesn't work as expected? The new server > version is 5.5.24-1~dotdeb.1-log. > > Thx, > Johan > > > mysql> set names UTF8; > Query OK, 0 rows affected (0.00 sec) > > mysql> show create table search_index; > +--------------+------------------------------------------------------- > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > -------------------------------------------------------------+ > | Table | Create Table | > +--------------+------------------------------------------------------- > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > -------------------------------------------------------------+ > | search_index | CREATE TABLE `search_index` ( > `word` varchar(50) NOT NULL DEFAULT '', > `sid` int(10) unsigned NOT NULL DEFAULT '0', > `type` varchar(16) DEFAULT NULL, > `score` float DEFAULT NULL, > UNIQUE KEY `word_sid_type` (`word`,`sid`,`type`), > KEY `sid_type` (`sid`,`type`), > KEY `word` (`word`) > ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | > +--------------+------------------------------------------------------- > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > -------------------------------------------------------------+ > 1 row in set (0.00 sec) > > mysql> show table status like "search_index"; > +--------------+--------+---------+------------+---------+------------- > ---+-------------+-----------------+--------------+-----------+-------- > --------+---------------------+-------------+------------+------------- > ----+----------+----------------+---------+ > | Name | Engine | Version | Row_format | Rows | Avg_row_length | > Data_length | Max_data_length | Index_length | Data_free | > Auto_increment | Create_time | Update_time | Check_time | Collation | > Checksum | Create_options | Comment | > +--------------+--------+---------+------------+---------+------------- > ---+-------------+-----------------+--------------+-----------+-------- > --------+---------------------+-------------+------------+------------- > ----+----------+----------------+---------+ > | search_index | InnoDB | 10 | Compact | 6893296 | 58 | 405536768 | 0 | > 725172224 | 255852544 | NULL | 2012-08-27 14:53:20 | NULL | NULL | > utf8_general_ci | NULL | | | > +--------------+--------+---------+------------+---------+------------- > ---+-------------+-----------------+--------------+-----------+-------- > --------+---------------------+-------------+------------+------------- > ----+----------+----------------+---------+ > 1 row in set (0.07 sec) > > mysql> CREATE TABLE `search_johan` ( `word` varchar(50) NOT NULL > DEFAULT '', `count` float DEFAULT NULL, PRIMARY KEY (`word`) ) > ENGINE=InnoDB DEFAULT CHARSET=utf8 collate=utf8_unicode_ci; > Query OK, 0 rows affected (0.00 sec) > > mysql> insert into search_johan select * from search_total; > ERROR 1062 (23000): Duplicate entry 'cœur' for key 'PRIMARY' > > > > > -- > > Linux Bier Wanderung 2012, now also available in Belgium! > August, 12 to 19, Diksmuide, Belgium - http://lbw2012.tuxera.be