On Jul 2, 2007, at 8:07 AM, Darren Hartford wrote:
Thank you for the link to the previous thread, lot of information
there!
*Synonym use of nicknames - that sounds quite feasible. Do you
specifically mean the WordNet module in the Sandbox, or something
different?
No, I think I was thinking along the lines of the SynonymAnalyzer in
Lucene In Action whereby you add the nicknames as tokens at the same
position as the original, that way searches on the nicknames would
still match. Don't know that it solves your need for "weight" in the
relevance, but maybe it would.
-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: Friday, June 29, 2007 12:30 PM
To: java-user@lucene.apache.org
Subject: Re: Geneology, nicknames, levenstein, soundex/metaphone, etc
You may find this thread useful: http://www.gossamer-threads.com/
lists/lucene/java-user/47824?search_string=record%20linkage;#47824
although it doesn't answer all your ?'s
*nickname: would it be feasible to create an Analyzer that
will tie
to an external/internal nickname datasource (datasource would vary
dramatically based on nationality). Usecase: Jon, John, Johnny,
Jonathan would have 'weight' in the relevance. Similarly 'Dick',
'Chuck', and 'Charles'.
Maybe you could inject these as synonyms?
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp
Read the Lucene Java FAQ at http://wiki.apache.org/lucene-java/LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]