Double Metaphone is a good idea, but not that useful. Searchers just don't type in full phonetic versions of their query. Nobody types "ratatooie"; they type "rata" and then stop rather than risk making a mistake.
So, not that important.

wunder

On Apr 27, 2013, at 5:57 PM, Mark Bennett wrote:

> As I understand Wikipedia, Double Metaphone improves over Metaphone in 2
> areas:
> 1: Better linguistic matching
> 2: Can output a secondary token for words like Schmidt
>
> A quick look at the Apache commons codec and Lucene filter, it doesn't seem
> like that secondary token is supported? There is "save" code for whether
> inject is true/false, but that's not the same thing, and doesn't seem to have
> been extended.
>
> Either I'm reading it wrong? Or it somehow produces a compound token in
> those cases?
>
> Looking on the web, one author claims that only 10% of names need a second
> token anyway, so not a big deal, but still good to know.
>
> Thanks
>
> --
> Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
> Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513