Doh!

Turns out there are TWO ways to invoke Double Metaphone:

lucene/analysis/phonetic/src/java/org/apache/lucene/analysis/phonetic/PhoneticFilterFactory.java
(and Factory) - generic; the encoder is chosen with a setting
lucene/analysis/phonetic/src/java/org/apache/lucene/analysis/phonetic/DoubleMetaphoneFilter.java
(and Factory) - Double Metaphone only

And the second, more specific one has this in its comments:
"... DoubleMetaphone (supporting secondary codes)..."

In my defense, it wasn't in the wiki ;-)  TODO to add it
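
For the wiki, here is a rough sketch of how the two routes might look in
schema.xml. The field type names and the tokenizer are placeholders I made
up for illustration; the attribute values are just one plausible setting,
not a recommendation.

```xml
<!-- Route 1: generic phonetic filter, encoder picked via the "encoder" setting.
     Field type name and tokenizer are hypothetical. -->
<fieldType name="text_phonetic_generic" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PhoneticFilterFactory"
            encoder="DoubleMetaphone" inject="false"/>
  </analyzer>
</fieldType>

<!-- Route 2: the Double Metaphone specific filter, which per its comments
     supports secondary codes. Field type name and tokenizer are hypothetical. -->
<fieldType name="text_phonetic_dm" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.DoubleMetaphoneFilterFactory"
            inject="false" maxCodeLength="4"/>
  </analyzer>
</fieldType>
```

Running a surname like Schmidt through both field types on the admin
Analysis page should show whether a secondary token comes out of each.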

Hi Walter!

Thanks for the reply.  In my case it's a special app that deals with surnames
already in a database.  Not everybody is interactively searching for movie
rentals y'know ;-)

--
Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513


On Sat, Apr 27, 2013 at 5:57 PM, Mark Bennett <mbenn...@ideaeng.com> wrote:

> As I understand Wikipedia, Double Metaphone improves over Metaphone in 2
> areas:
> 1: Better linguistic matching
> 2: Can output a secondary token for words like Schmidt
>
> From a quick look at Apache Commons Codec and the Lucene filter, it doesn't
> seem like that secondary token is supported?  There is "save" code for
> whether inject is true/false, but that's not the same thing, and doesn't
> seem to have been extended.
>
> Either I'm reading it wrong?  Or it somehow produces a compound token in
> those cases?
>
> Looking on the web, one author claims that only 10% of names need a second
> token anyway, so not a big deal, but still good to know.
>
> Thanks
>
> --
> Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
> Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513
>
