Ah, very cool. Thanks for the tip.

-M

On Feb 11, 2008 10:58 AM, Erick Erickson <[EMAIL PROTECTED]> wrote:

> You have to be a bit clever. You can certainly inject the original with a
> position increment of 0. See SynonymAnalyzer in Lucene In Action. This will
> not break phrase queries since your two tokens occupy the same position.
>
> But you'll have to do something like add a $ to the original at index time.
> That way, for exact matches you can search on olive$, boosted however
> you want. When you want the stemmed version you can search for olive.
> Or you could add a clause with the unstemmed version boosted. Or
> something like that <G>. Note that whether you add the $ to the stemmed
> or unstemmed version is up to you.
>
> Watch which analyzer you use, to be sure it doesn't strip out the special
> symbol.
>
> Best
> Erick
>
> On Feb 11, 2008 12:56 PM, Michael Stoppelman <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
> > I've got an index with tokens that are stemmed. Sometimes I really need to
> > boost the unstemmed version of a query word to get the most relevant
> > documents.
> >
> > Example:
> > Query: [olives].
> >
> > I don't want to match documents with the words: oliver, oliver's, etc...
> >
> > Since I'm stemming when creating the index, is there a way to store both
> > versions (stemmed/unstemmed) with setPositionIncrement()? Is that the
> > correct way to deal with this? I was reading old archives and this didn't
> > seem to be a great decision since it breaks PhraseQuery [1].
> >
> > It seems like it would be useful if, at query scoring time, I could see the
> > original string values of the tokens, in this case at least.
> >
> > Thanks in advance,
> >
> > -M
> >
> > [1]
> > http://www.mail-archive.com/[EMAIL PROTECTED]/msg07416.html
> >
>

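A minimal sketch of the approach Erick describes above, in plain Java with no Lucene dependency, so none of the names below are Lucene APIs. The `Tok` class, `toyStem`, `analyze`, `positions`, and `buildQuery` are all hypothetical, and the stemmer is a toy stand-in for a real one such as Porter. It emits the stemmed term with a position increment of 1 and the unstemmed term plus a `$` marker with an increment of 0, so both share a position, then builds a query string that boosts the exact form.

```java
import java.util.ArrayList;
import java.util.List;

class StemMarkerSketch {
    // Hypothetical token: term text plus a position increment
    // (the same idea as a Lucene token's position increment).
    static final class Tok {
        final String term;
        final int posInc;
        Tok(String term, int posInc) { this.term = term; this.posInc = posInc; }
    }

    // Marker appended to the unstemmed form; '$' follows the thread's example.
    static final String EXACT_MARKER = "$";

    // Toy stemmer (assumption): strips a possessive and one common suffix.
    static String toyStem(String word) {
        String w = word.endsWith("'s") ? word.substring(0, word.length() - 2) : word;
        for (String suffix : new String[] {"es", "er", "e", "s"}) {
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Index-time analysis: stemmed token advances the position,
    // marked original is stacked on the same position (increment 0).
    static List<Tok> analyze(String text) {
        List<Tok> out = new ArrayList<>();
        for (String raw : text.toLowerCase().split("\\s+")) {
            if (raw.isEmpty()) continue;
            out.add(new Tok(toyStem(raw), 1));
            out.add(new Tok(raw + EXACT_MARKER, 0));
        }
        return out;
    }

    // Absolute positions implied by the increments.
    static List<Integer> positions(List<Tok> toks) {
        List<Integer> pos = new ArrayList<>();
        int p = -1;
        for (Tok t : toks) {
            p += t.posInc;
            pos.add(p);
        }
        return pos;
    }

    // Query-time: boosted exact clause OR plain stemmed clause.
    static String buildQuery(String word, float boost) {
        String w = word.toLowerCase();
        return w + EXACT_MARKER + "^" + boost + " OR " + toyStem(w);
    }

    public static void main(String[] args) {
        List<Tok> toks = analyze("olives oliver");
        List<Integer> pos = positions(toks);
        for (int i = 0; i < toks.size(); i++) {
            System.out.println(pos.get(i) + " " + toks.get(i).term);
        }
        System.out.println(buildQuery("olives", 2.0f));
    }
}
```

With this layout, a document containing "oliver" still matches the stemmed clause but only a document containing "olives" matches the boosted `olives$` clause. In a real index, the query-time analyzer must also leave the `$` marker intact, as noted in the thread.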