On 27.11.2006, at 04:26, Curtis Hatter wrote:
>> Yep. In the same way as 'bag', 'pack', 'back', 'poke' and 'pike' all
>> become 'PK'. I think the accuracy of this particular phonetic
>> algorithm is disputable.
>
> true.. and had I not been introduced to Guitar Hero 2 this weekend I think I
> might have realized that myself.
Sounds like a lot of FN.
> I think I'll try this approach first and add in a phonetic algorithm if
> necessary.
It depends on what you want to achieve. If you want to compensate for
typos, FuzzyQuery is probably better. A simple letter-swap can easily
trick a phonetic algorithm:

  metaphone => MTFN
  metahpone => MTPN

FuzzyQuery will catch this even at a relatively low sensitivity of
0.75 ('metahpone~0.75').
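
To make the comparison concrete, here is a minimal Ruby sketch of a Lucene-style fuzzy score: plain Levenshtein distance divided by the length of the shorter term. The method names are made up for illustration, and I'm assuming Ferret scores fuzzy matches in roughly this way; its actual implementation may differ in details such as prefix handling.

```ruby
# Plain Levenshtein distance (insert, delete, substitute; no transposition).
def levenshtein(a, b)
  # prev[j] is the distance between the processed prefix of a and b[0, j].
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Hypothetical similarity score, ASSUMED to resemble what a fuzzy query uses:
# 1.0 means identical, lower values mean more edits relative to term length.
def fuzzy_similarity(a, b)
  shorter = [a.length, b.length].min
  return (a == b ? 1.0 : 0.0) if shorter.zero?
  1.0 - levenshtein(a, b).to_f / shorter
end

distance = levenshtein("metaphone", "metahpone")
score    = fuzzy_similarity("metaphone", "metahpone")
puts "distance=#{distance} similarity=#{format('%.2f', score)}"
```

Under this assumption the swapped letters count as two edits over nine characters, so the score is about 0.78, which is still above a 0.75 threshold, while the phonetic codes for the two spellings no longer match at all.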
I used FuzzyQuery to build duplicate detection into my Rails app.
When a user creates a new Person record, a fuzzy search is run in the
background as the form fields are filled out. Possible duplicates are
displayed next to the form in a "Did you mean..." fashion.
For example, if the user enters "Rachel Welsh", the duplicate detection
would find "Raquel Welch". Before Ferret I tried to achieve this with
MySQL's SOUNDEX function, which didn't work quite as well. (Although
SOUNDEX, which is based on the algorithm of the same name, still
works way better than metaphone.)
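
Roughly, the fuzzy search for such a form could be built like the sketch below. This is not the code from my app: the helper name, the field names and the default similarity of 0.6 are all assumptions chosen for illustration. It only builds the query string, using the 'term~similarity' syntax shown above.

```ruby
# Hypothetical helper: turns form fields into a fuzzy query string.
# Field names and the default minimum similarity are assumptions.
def fuzzy_duplicate_query(fields, min_similarity = 0.6)
  clauses = fields.map do |field, value|
    # Keep only letters and digits so user input cannot inject query syntax.
    term = value.to_s.downcase.gsub(/[^\p{Alnum}]/, "")
    next if term.empty? # skip form fields that are still blank
    "#{field}:#{term}~#{min_similarity}"
  end
  clauses.compact.join(" AND ")
end

query = fuzzy_duplicate_query(first_name: "Rachel", last_name: "Welsh")
puts query
```

The resulting string would then be handed to the index search (something like index.search_each(query), if I remember the API correctly), and the hits shown next to the form. The right threshold depends on your data, so treat 0.6 as a starting point to tune.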
> At least I discovered how to write filters for Ferret, which was much easier
> than I would have imagined.
Yep. It's great that Ferret can be extended and customized in so many
ways.
> Thanks for the information, it's nice to learn a bit more about the things
> Ferret can already do so well.
Ferret is the single best Ruby library I've come across in the past
two years. It just rocks. Period. Thanks David for giving it to us!
Cheers,
Andreas
_______________________________________________
Ferret-talk mailing list
http://rubyforge.org/mailman/listinfo/ferret-talk