Re: How to promote an unstemmed match over a stemmed match in an index that's stemmed...

Erick Erickson Mon, 11 Feb 2008 10:59:38 -0800

You have to bet a bit clever. You can certainly inject the original with an
increment of 0. See SynonymAnalyzer in Lucene In Action. This will not
break phrase queries since your two tokens occupy the same position.

But you'll have to do something like add a $ to the original at index time.
That way, for exact matches you can search on olive$, boosted however
you  want. When you want the stemmed version you can search for olive.
Or you could add a clause with the unstemmed version boosted. Or
something like that <G>.... Note that whether you add the $ to the stemmed
or unstemmed version is up to you.......

Watch what analyzer you use to be sure it doesn't strip out the special
symbol....

Best
Erick

On Feb 11, 2008 12:56 PM, Michael Stoppelman <[EMAIL PROTECTED]> wrote:

> Hi all,
> I've got an index with tokens that are stemmed. Sometimes I really need to
> boost the unstemmed
> version of a query word to get the most relevant documents.
>
> Example:
> Query: [olives].
>
> I don't want to match documents with the words: oliver, oliver's, etc...
>
> Since I'm stemming when creating the index is there a way to store both
> versions (stemmed/unstemmed) with
> setIncrementPosition()? Is that the correct way to deal with this? I was
> reading old archives and this didn't seem
> to be a great way decision since it breaks PhraseQuery [1].
>
> It seems like it would be useful if at query scoring time if I could see
> the
> original string values of the tokens in this case
> at least.
>
> Thanks in advance,
>
> -M
>
> [1]
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg07416.html
>

Re: How to promote an unstemmed match over a stemmed match in an index that's stemmed...

Reply via email to