Re: Don't snowball depending on terms

Rob Brown Wed, 30 Nov 2011 04:39:51 -0800

I guess I could do a bit of pre-processing: look for any quoted words
and search for those in a different (unstemmed) field.

How is a query like this formulated?

q=unstemmed:perl or java&q=stemmed:manager
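
For what it's worth, here is a rough sketch of the pre-processing I have in
mind, not a tested solution. It assumes two fields named "stemmed" and
"unstemmed" (as in the example above), that quoted words should go to the
unstemmed field, and that the two groups are combined with AND in a single
q parameter. The build_query helper is hypothetical, and terms are not
escaped for query-syntax special characters.

```python
import re
from urllib.parse import urlencode


def build_query(user_input, stemmed_field="stemmed", unstemmed_field="unstemmed"):
    """Route quoted words to the unstemmed field, everything else to the stemmed one.

    Field names and the AND/OR choices are assumptions for illustration.
    """
    # Words the user wrapped in double quotes must match exactly.
    quoted = re.findall(r'"([^"]+)"', user_input)
    # Whatever is left after removing the quoted parts may be stemmed.
    plain = re.sub(r'"[^"]*"', " ", user_input).split()

    clauses = []
    if quoted:
        exact = " OR ".join('"%s"' % term for term in quoted)
        clauses.append("%s:(%s)" % (unstemmed_field, exact))
    if plain:
        clauses.append("%s:(%s)" % (stemmed_field, " OR ".join(plain)))
    return " AND ".join(clauses)


q = build_query('"perl" "java" manager')
# q is: unstemmed:("perl" OR "java") AND stemmed:(manager)
query_string = urlencode({"q": q})  # URL-encoded form for the request
```

The point is that both field clauses end up in one q value joined by a
boolean operator, rather than in two separate q parameters.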

-- 

IntelCompute
Web Design and Online Marketing

http://www.intelcompute.com

-----Original Message-----
From: Tomas Zerolo <tomas.zer...@axelspringer.de>
Reply-to: solr-user@lucene.apache.org
To: solr-user@lucene.apache.org
Subject: Re: Don't snowball depending on terms
Date: Wed, 30 Nov 2011 08:49:37 +0100

On Tue, Nov 29, 2011 at 01:53:44PM -0500, François Schiettecatte wrote:
> It won't and depending on how your analyzer is set up the terms are most 
> likely stemmed at index time.
> 
> You could create a separate field for unstemmed terms though, or use a less 
> aggressive stemmer such as EnglishMinimalStemFilterFactory.

This is surprising to me. Snowball introduces new homonyms, meaning it
will lump e.g. "management" and "manage" into one index entry. Thus,
I'd expect a handful of "false positives" (but usually not too many).

That's a "lossy index" (loosely speaking) and could be fixed by
post-filtering (instead of introducing another index, which in
most cases would seem a waste of resources).

Is there no way in Solr of filtering the results *after* the index
scan? I'd be disappointed!
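
To make concrete what I mean by post-filtering, here is a minimal
client-side sketch. It is not a Solr feature: it assumes the results have
already been fetched as a list of dicts with the raw text in a field
called "text" (a made-up name), and the post_filter helper is
hypothetical.

```python
import re


def post_filter(docs, exact_terms, text_field="text"):
    """Keep only documents whose raw text contains every exact term as a whole word.

    The document layout and field name are assumptions for illustration.
    """
    patterns = [
        re.compile(r"\b%s\b" % re.escape(term), re.IGNORECASE)
        for term in exact_terms
    ]
    return [
        doc for doc in docs
        if all(p.search(doc.get(text_field, "")) for p in patterns)
    ]


# Both documents would match a stemmed index lookup for "manage".
docs = [
    {"id": 1, "text": "How to manage a team"},
    {"id": 2, "text": "Change management basics"},
]
kept = post_filter(docs, ["manage"])  # drops the "management" false positive
```

One obvious drawback of doing this outside the server: hit counts and
paging reported by the search no longer match what is left after the
filter.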

Regards
-- tomás
