I want to find rows that contain a word matching a search term,
accent-insensitively. I am using the utf8_general_ci collation everywhere.

attempt 1:
SELECT * FROM t WHERE txt LIKE '%que%'
Matches 'que' and 'qué', but also matches 'queue'.

attempt 1.5:
SELECT * FROM t WHERE txt LIKE '% que %' OR txt LIKE 'que %' OR txt LIKE '% que';
Almost, but misses 'que!' or 'que...'

attempt 2:
SELECT * FROM t WHERE txt REGEXP '[[:<:]]que[[:>:]]'
Matches que, not queue, but doesn't match qué.

attempt 3:
SELECT * FROM t WHERE txt REGEXP '[[:<:]]q[uùúûüũūŭůűųǔǖǘǚǜ][eèéêëēĕėęě][[:>:]]'
Matches que, queue, qué.  (I have no idea why this matches queue, but
the regex behavior seems bizarre with Unicode characters.)
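
To pin down the behavior I am after, here is a minimal sketch in Python
rather than MySQL. It only illustrates the expected matches and is not a
proposed fix on the database side. The helper names strip_accents and
contains_word are made up for this example, and it assumes that "accent
insensitive" means dropping Unicode combining marks after NFD
normalization.

```python
import re
import unicodedata


def strip_accents(text):
    """Return text with combining marks removed (e.g. 'qué' -> 'que')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def contains_word(text, term):
    """True if term occurs in text as a whole word, ignoring accents and case."""
    # Hypothetical helper: both sides are accent-stripped before matching.
    pattern = r"\b" + re.escape(strip_accents(term)) + r"\b"
    return re.search(pattern, strip_accents(text), re.IGNORECASE) is not None


rows = ["que", "qué", "queue", "que!", "que...", "por qué no"]
matches = [r for r in rows if contains_word(r, "que")]
```

With these sample rows, everything except 'queue' should end up in
matches, which is the result I would like the SQL query to produce.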

Does anyone know why the final regex acts weird?  Is there a good solution?

Thanks in advance,
John Campbell
