Have you tried stemming? It is simpler and available in core Lucene. Look
at PorterStemFilter or use your favourite search engine to find more
info and options.
If instead you go the synonym route, there is sample code in the book
Lucene in Action and a WordNet contrib module you might find useful.
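
To make the synonym route concrete, here is a minimal sketch of
query-time expansion done outside Lucene. It is plain Java with no Lucene
dependency, and it is not the code from the book or the contrib module.
The class name NicknameExpander and the two-entry nickname table are
hypothetical, made up for illustration. The output uses the OR and
parenthesis grouping of Lucene's classic query parser syntax, and assumes
the indexed names were lowercased at index time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class NicknameExpander {
    // Hypothetical hand-built table; a real system would load a much larger list.
    private static final Map<String, List<String>> NICKNAMES = Map.of(
            "dan", List.of("daniel"),
            "will", List.of("william"));

    /** Rewrites each whitespace-separated term as an OR group when variants are known. */
    public static String expand(String query) {
        List<String> parts = new ArrayList<>();
        for (String raw : query.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            // Assumption: the index was built with lowercased tokens.
            String term = raw.toLowerCase(Locale.ROOT);
            List<String> variants = NICKNAMES.get(term);
            if (variants == null) {
                parts.add(term); // no known variants, keep the term as-is
            } else {
                parts.add("(" + term + " OR " + String.join(" OR ", variants) + ")");
            }
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        System.out.println(expand("Dan Smith")); // (dan OR daniel) smith
        System.out.println(expand("Will"));      // (will OR william)
    }
}
```

The table here is one-directional (Dan finds Daniel, but not the other way
round); add the reverse entries if you want both directions.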
--
Ian.
Thanks for the detailed response, Sujit. UIMA especially looks like an
interesting option.
On 3/24/11 3:57 PM, Sujit Pal <sujit@comcast.net> wrote:
I don't know if there is already an analyzer available for this, but you
could use GATE or UIMA for Named Entity Extraction against names and
On 3/25/11 5:57 AM, Ian Lea <ian@gmail.com> wrote:
Have you tried stemming? Simpler and available in core lucene. Look
at PorterStemFilter or use your favourite search engine to find more
info and options.
Ian,
I did try PorterStemFilter and couldn't get the result I wanted. (Dan ==
Hi,
I would like to build a search system where a search for Dan would also
search for Daniel, and a search for Will would also search for William.
Any ideas on how to go about implementing that? I can think of writing a
custom Analyzer that would map these partial tokens to their full first
names or last names. But
I don't know if there is already an analyzer available for this, but you
could use GATE or UIMA for Named Entity Extraction against names and
expand the query to include the extra names that are used synonymously.
You could do this outside Lucene or inline using a custom Lucene
tokenizer that