Hey, We're upgrading MySQLs from 5.0 to 5.5, and running into the predictable utf8_general_ci bu^Wwe-meant-to-do-that with german ß and similar characters. The server swallowed the existing datafiles (thank god for that) so we're up and running, but "check table" does whine about needing a full rebuild, which is clearly that issue.
Now, I'm aware that this should be fixed by using the utf8_unicode_ci collation; but somehow it isn't... The its-not-a-bug-but-heres-the-fix collation utf8_general_mysql500_ci does work, but honestly I feel that that's kind of dirty. Any ideas why utf8_unicode_ci doesn't work as expected? The new server version is 5.5.24-1~dotdeb.1-log. Thx, Johan mysql> set names UTF8; Query OK, 0 rows affected (0.00 sec) mysql> show create table search_index; +--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | search_index | CREATE TABLE `search_index` ( `word` varchar(50) NOT NULL DEFAULT '', `sid` int(10) unsigned NOT NULL DEFAULT '0', `type` varchar(16) DEFAULT NULL, `score` float DEFAULT NULL, UNIQUE KEY `word_sid_type` (`word`,`sid`,`type`), KEY `sid_type` (`sid`,`type`), KEY `word` (`word`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | +--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> show table status like "search_index"; +--------------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+ | Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment | +--------------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+ | search_index | InnoDB | 10 | Compact | 6893296 | 58 | 405536768 | 0 | 725172224 | 255852544 | NULL | 2012-08-27 14:53:20 | NULL | NULL | utf8_general_ci | NULL | | | +--------------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+ 1 row in set (0.07 sec) mysql> CREATE TABLE `search_johan` ( `word` varchar(50) NOT NULL DEFAULT '', `count` float DEFAULT NULL, PRIMARY KEY (`word`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 collate=utf8_unicode_ci; Query OK, 0 rows affected (0.00 sec) mysql> insert into search_johan select * from search_total; ERROR 1062 (23000): Duplicate entry 'cœur' for key 'PRIMARY' -- Linux Bier Wanderung 2012, now also available in Belgium! August, 12 to 19, Diksmuide, Belgium - http://lbw2012.tuxera.be