* Dave Callaghan
> I want to use SOUNDEX for name queries, but I don't
> seem to be doing it correctly.
>
> As a test, I made a small table and popluated it
> with my name. My goal is to be able to find my name
> 'Callaghan', with its most common misspelling
> 'Callahan'.

That will be difficult using soundex(), because the two names have different
soundex values:

mysql> select soundex('Callaghan'),soundex('Callahan');
+----------------------+---------------------+
| soundex('Callaghan') | soundex('Callahan') |
+----------------------+---------------------+
| C425                 | C450                |
+----------------------+---------------------+
1 row in set (0.00 sec)

> I've read the MySQL doc, so I know how to actually
> invoke soundex at the mysql prompt. I read an
> article that said soundex queries move more quickly
> if you add a column for the soundex.

That makes sense, yes.

> Thus:
>
> CREATE TABLE sound (
>   id tinyint(3) NOT NULL auto_increment,
>   name varchar(100) default NULL,
>   sound varchar(100) default NULL,
>   PRIMARY KEY  (id)
> ) TYPE=MyISAM;
>
> select soundex('Callaghan');
> result = C425
>
> INSERT INTO sound VALUES (1, 'Callaghan', 'C425');
>
> Simple enough. Now, I expected all three of the
> following queries to return a value.
>
> select id from sound where sound =
> soundex('Callaghan');
> returns 1 records
>
> select id from sound where sound =
> soundex('Callahan');
> returns 0 records

You are looking for a row with sound="C450", which does not exist.

> select id from sound where name =
> soundex('Callaghan');
> returns 0 records

You are looking for a row with name="C425", which does not exist.

> I'm missing something obvious here, but I've read
> the doc and I'm not sure what's wrong.

You expected soundex() to catch the common misspelling for your name, but
soundex() is far from perfect. Sometimes it does not find a name that is
very similar, and sometimes it will return names that are very far from
similar...

All these should find your row:

select id from sound where sound = soundex('Calaghan');
select id from sound where sound = soundex('Callagan');
select id from sound where sound = soundex('Clgn');
select id from sound where sound = soundex('Clxm');
select id from sound where sound = soundex('Celikan');
select id from sound where sound = soundex('Cilligen');

Note that a soundex() code is always 4 characters or more in mysql. Zeroes
are appended for the missing characters:

mysql> select soundex('C');
+--------------+
| soundex('C') |
+--------------+
| C000         |
+--------------+
1 row in set (0.00 sec)

Also, when there are very many letters in a word, the soundex code can
become rather long...:

mysql> select soundex('Circumstantial');
+---------------------------+
| soundex('Circumstantial') |
+---------------------------+
| C62523534                 |
+---------------------------+
1 row in set (0.00 sec)

Some soundex() implementations I have seen cut such a code, so that you end
up with just C625 or C6252, but with mysql you have to do this cutting
yourself, if you want it.

--
Roger
sql


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to