Hello!

I recently discovered that my database was using latin1 as its default encoding. I switched it to utf-8 (which also meant running SQL CONVERTs and friends). Everything seemed to work fine (apart from some strangely encoded titles), but then I discovered some serious indexing problems.

I reindexed the whole site following the instructions in bibindex's documentation. However, searches still go wrong for accented names such as this one: http://infoscience.epfl.ch/search?as=0&sc=1&p=süsstrunk&f=&Submit=Search (on my failover server the same search returns something like 115 results). In verbose mode, I can see that Süsstrunk is actually converted to susstrunk before being searched. Looking at the index tables, there is a difference:
Failover (latin-1 db):

select term from idxWORD04F where term like "S%sstrunk";
+-----------+
| term      |
+-----------+
| susstrunk |
+-----------+
1 row in set (0.01 sec)

Production (utf8 db):

select term from idxWORD04F where term like "S%sstrunk";
+-----------+
| term      |
+-----------+
| s?sstrunk |
| susstrunk |
+-----------+
2 rows in set (0.00 sec)

My SQL variables are:

| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
| collation_connection     | utf8_general_ci            |
| collation_database       | utf8_swedish_ci            |
| collation_server         | latin1_swedish_ci          |

show create table idxWORD04F;

CREATE TABLE `idxWORD04F` (
  `id` mediumint(9) unsigned NOT NULL auto_increment,
  `term` varchar(50) default NULL,
  `hitlist` longblob,
  PRIMARY KEY (`id`),
  UNIQUE KEY `term` (`term`)
) ENGINE=MyISAM AUTO_INCREMENT=52242 DEFAULT CHARSET=utf8

Any suggestion on how to solve this?

Best regards,
Greg

____________________________________________________________________
Gregory Favre
Coordinateur Infoscience
École Polytechnique Fédérale de Lausanne
KIS - DIT
Case Postale 121
CH-1015 Lausanne
+41 21 693 22 88
+41 79 599 09 06
[email protected]
http://plan.epfl.ch/?sciper=128933
____________________________________________________________________
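
For reference, here is a minimal Python sketch of how two different index terms could arise from the same name. It is only an illustration under assumptions: `strip_accents` and `lossy_ascii` are hypothetical helpers, not bibindex's actual code, and the "?" in the production table might just as well be a display issue in the MySQL client. The hex strings at the end are what the lowercased name looks like in each encoding, which may help when comparing against the output of MySQL's HEX(term) on the two servers.

```python
import unicodedata

def strip_accents(word):
    """Lowercase a word and drop combining marks.

    Hypothetical helper: it mimics the "Süsstrunk -> susstrunk" conversion
    seen in verbose mode, but is not taken from bibindex.
    """
    decomposed = unicodedata.normalize("NFKD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def lossy_ascii(word):
    """Lowercase a word and replace every non-ASCII character with '?'.

    Hypothetical helper: it shows one way a term like "s?sstrunk" can be
    produced when a character cannot be represented in the target charset.
    """
    return word.lower().encode("ascii", errors="replace").decode("ascii")

name = "S\u00fcsstrunk"  # "Süsstrunk" with a precomposed u-umlaut

print(strip_accents(name))                    # susstrunk
print(lossy_ascii(name))                      # s?sstrunk
print(name.lower().encode("utf-8").hex())     # 73c3bc73737472756e6b
print(name.lower().encode("latin-1").hex())   # 73fc73737472756e6b
```

If HEX(term) for the odd row shows 3F in second position, the "?" is really stored in the index; if it shows C3BC or FC, the term is stored with the umlaut and only the display or the connection charset is off.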
