Hey, 

We're upgrading MySQLs from 5.0 to 5.5, and running into the predictable 
utf8_general_ci bu^Wwe-meant-to-do-that with german ß and similar characters. 
The server swallowed the existing datafiles (thank god for that) so we're up 
and running, but "check table" does whine about needing a full rebuild, which 
is clearly that issue. 

Now, I'm aware that this should be fixed by using the utf8_unicode_ci 
collation; but somehow it isn't... 

The its-not-a-bug-but-heres-the-fix collation utf8_general_mysql500_ci does 
work, but honestly I feel that that's kind of dirty. 

Any ideas why utf8_unicode_ci doesn't work as expected? The new server version 
is 5.5.24-1~dotdeb.1-log. 

Thx, 
Johan 


mysql> set names UTF8; 
Query OK, 0 rows affected (0.00 sec) 

mysql> show create table search_index; 
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 
| Table | Create Table | 
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 
| search_index | CREATE TABLE `search_index` ( 
`word` varchar(50) NOT NULL DEFAULT '', 
`sid` int(10) unsigned NOT NULL DEFAULT '0', 
`type` varchar(16) DEFAULT NULL, 
`score` float DEFAULT NULL, 
UNIQUE KEY `word_sid_type` (`word`,`sid`,`type`), 
KEY `sid_type` (`sid`,`type`), 
KEY `word` (`word`) 
) ENGINE=InnoDB DEFAULT CHARSET=utf8 | 
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 
1 row in set (0.00 sec) 

mysql> show table status like "search_index"; 
+--------------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
 
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | 
Max_data_length | Index_length | Data_free | Auto_increment | Create_time | 
Update_time | Check_time | Collation | Checksum | Create_options | Comment | 
+--------------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
 
| search_index | InnoDB | 10 | Compact | 6893296 | 58 | 405536768 | 0 | 
725172224 | 255852544 | NULL | 2012-08-27 14:53:20 | NULL | NULL | 
utf8_general_ci | NULL | | | 
+--------------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
 
1 row in set (0.07 sec) 

mysql> CREATE TABLE `search_johan` ( `word` varchar(50) NOT NULL DEFAULT '', 
`count` float DEFAULT NULL, PRIMARY KEY (`word`) ) ENGINE=InnoDB DEFAULT 
CHARSET=utf8 collate=utf8_unicode_ci; 
Query OK, 0 rows affected (0.00 sec) 

mysql> insert into search_johan select * from search_total; 
ERROR 1062 (23000): Duplicate entry 'cœur' for key 'PRIMARY' 




-- 

Linux Bier Wanderung 2012, now also available in Belgium! 
August, 12 to 19, Diksmuide, Belgium - http://lbw2012.tuxera.be 

Reply via email to