> Hi.
>
> Using mySQL 4.1.22, I'd like to carry out an SQL query to find a
> string containing acute vowels.
>
> mytable:
> - item1:
> --- firstname: Antonio
> --- lastname: Fernández
> --- comments: he's from Spain
>
> My SQL query:
> ------
> SELECT id FROM mytable WHERE MATCH(firstname, lastname, comments)
> AGAINST ('+"fernandez"' IN BOOLEAN MODE) ORDER BY firstname, lastname
> --------
>
> If mytable is in latin1, it works fine. But if mytable is in utf8
> (it's the case) it doesn't work. How can I change my SQL query to
> match 'Fernández' in UTF8 within the table?

If the data in the UTF-8 table consists of UTF-8 characters then it should
work fine. If it this is the case but it doesn't work, I suspect the issue
is that whatever client you are using to send your query is using a
different character set. (I got stuck on this a few months back). The
solution for my app was to issue a SET NAMES 'utf8' call when I first
connected to ensure everything was using UTF-8. See
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html.

Looking at your application, I am guessing you are going to be getting the
search terms from a user. This could be annoying for them if they want to
find this record, but just provided 'fernandez' (i.e. no accent).

I resolved this issue by adding extra columns in my DB containing
'standardised' versions of the 'real' data (e.g. firstname_std). Then, when
processing a search, I 'standardise' the user input and match this with the
standardised column. If you're interested, this is the PHP code I use to
'standardise':

        /**
         * Standardise a string. This converts accented characters to the
non-accented equivalents
         * makes it lowercase and removes any characters that are not [a-z], 
[0-9]
or @.
         *
         * These standardised strings are stored in the database alongside their
'normal' counterparts.
         * Searches are performed by standardising the query and comparing the
standardised forms. This
         * enables us to match regardless of punctation, whitespace, accented 
chars
etc.
         *
         * @param string Input string to standardise
         * @return string Standardised version of the input string
         */
        public static function StandardiseString($strInput) {
                $strInput = Search::TranslateAccentedChars($strInput);
                $strInput = strtolower($strInput);
                return preg_replace('/[EMAIL PROTECTED]/', '', $strInput);
        }

        /**
         * Translate accented characters to their non-accented counterparts
         *
         * @param string Input string
         * @return string String with accented characters replaced
         */
        public static function TranslateAccentedChars($strInput) {
                $strAccentedChars    =
"ŠŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýÿ";
                $strNonAccentedChars =
"SOZsozYYuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyy";
                return strtr($strInput, $strAccentedChars, 
$strNonAccentedChars);
        }

HTH,
Edward


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to