* Dave Callaghan > I want to use SOUNDEX for name queries, but I don't > seem to be doing it correctly. > > As a test, I made a small table and popluated it > with my name. My goal is to be able to find my name > 'Callaghan', with its most common misspelling > 'Callahan'.
That will be difficult using soundex(), because the two names have different soundex values: mysql> select soundex('Callaghan'),soundex('Callahan'); +----------------------+---------------------+ | soundex('Callaghan') | soundex('Callahan') | +----------------------+---------------------+ | C425 | C450 | +----------------------+---------------------+ 1 row in set (0.00 sec) > I've read the MySQL doc, so I know how to actually > invoke soundex at the mysql prompt. I read an > article that said soundex queries move more quickly > if you add a column for the soundex. That makes sense, yes. > Thus: > > CREATE TABLE sound ( > id tinyint(3) NOT NULL auto_increment, > name varchar(100) default NULL, > sound varchar(100) default NULL, > PRIMARY KEY (id) > ) TYPE=MyISAM; > > select soundex('Callaghan'); > result = C425 > > INSERT INTO sound VALUES (1, 'Callaghan', 'C425'); > > Simple enough. Now, I expected all three of the > following queries to return a value. > > select id from sound where sound = > soundex('Callaghan'); > returns 1 records > > select id from sound where sound = > soundex('Callahan'); > returns 0 records You are looking for a row with sound="C450", which does not exist. > select id from sound where name = > soundex('Callaghan'); > returns 0 records You are looking for a row with name="C425", which does not exist. > I'm missing something obvious here, but I've read > the doc and I'm not sure what's wrong. You expected soundex() to catch the common misspelling for your name, but soundex() is far from perfect. Sometimes it does not find a name that is very similar, and sometimes it will return names that are very far from similar... All these should find your row: select id from sound where sound = soundex('Calaghan'); select id from sound where sound = soundex('Callagan'); select id from sound where sound = soundex('Clgn'); select id from sound where sound = soundex('Clxm'); select id from sound where sound = soundex('Celikan'); select id from sound where sound = soundex('Cilligen'); Note that a soundex() code is always 4 characters or more in mysql. Zeroes are appended for the missing characters: mysql> select soundex('C'); +--------------+ | soundex('C') | +--------------+ | C000 | +--------------+ 1 row in set (0.00 sec) Also, when there are very many letters in a word, the soundex code can become rather long...: mysql> select soundex('Circumstantial'); +---------------------------+ | soundex('Circumstantial') | +---------------------------+ | C62523534 | +---------------------------+ 1 row in set (0.00 sec) Some soundex() implementations I have seen cut such a code, so that you end up with just C625 or C6252, but with mysql you have to do this cutting yourself, if you want it. -- Roger sql --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php