Re: handling stopwords for special scenarios

2020-04-09 Thread Walter Underwood
Agreed, leave the stopwords alone. I ran into this same problem
thirteen years ago at Netflix. Even before that, I wasn’t removing
stopwords, but I accidentally left the stopword filter in the Solr 1.3 config.

https://observer.wunderwood.org/2007/05/31/do-all-stopword-queries-matter/

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 9, 2020, at 7:34 AM, Erick Erickson wrote:
> 
> 1> why use stopwords at all? They’re largely a holdover from the
> bad old days when memory was limited. I usually recommend
> people just start by not using stopwords at all.
> 
> 2> assuming <1> doesn’t work for you, why doesn’t it look feasible
> to remove “here” from the stopword list? True, you have to re-index.
> 
> But what you’re asking for is not possible. Stopwords are simply gone
> from the index without a trace; there’s absolutely no way to reconstruct
> them.
> 
> I’d also argue that this is an N+1 situation. Sure, you’ll solve the “here”
> problem by removing it from the stopword list, but then you’ll have
> the same problem with “there”…
> 
> Best,
> Erick
> 
>> On Apr 9, 2020, at 9:10 AM, rashi gandhi wrote:
>> 
>> Hi All,
>> 
>> We are using the stopword filter factory at both index and search time to
>> omit stopwords.
>> 
>> However, in one particular case, we are getting "here" as a search query,
>> and "here" is one of the words in the title/name representing our client.
>> We are returning zero results because "here" is one of the English
>> language stopwords, which are omitted at both index and search
>> time.
>> 
>> One solution would be to remove "here" from the list of stopwords;
>> however, that does not look feasible.
>> 
>> Is there any way to handle cases like this, where
>> stopwords are meant to be actual search terms?
>> 
>> Any leads would be appreciated.
> 



Re: handling stopwords for special scenarios

2020-04-09 Thread Erick Erickson
1> why use stopwords at all? They’re largely a holdover from the
 bad old days when memory was limited. I usually recommend
 people just start by not using stopwords at all.

2> assuming <1> doesn’t work for you, why doesn’t it look feasible
 to remove “here” from the stopword list? True, you have to re-index.

But what you’re asking for is not possible. Stopwords are simply gone
from the index without a trace; there’s absolutely no way to reconstruct
them.

I’d also argue that this is an N+1 situation. Sure, you’ll solve the “here”
problem by removing it from the stopword list, but then you’ll have
the same problem with “there”…
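
A minimal Python sketch of the points above, using a toy in-memory inverted index rather than Solr's actual analysis chain. The helper names (`analyze`, `build_index`, `search`), the sample titles, and the short stopword set are all made up for illustration, and returning no hits for an all-stopword query is an assumption that mirrors the zero results described in the question:

```python
import re
from collections import defaultdict


def analyze(text, stopwords):
    """Lowercase, split into word tokens, and drop stopwords (toy analyzer)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stopwords]


def build_index(docs, stopwords):
    """Map each surviving term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, title in docs.items():
        for term in analyze(title, stopwords):
            index[term].add(doc_id)
    return index


def search(index, query, stopwords):
    """Return ids of documents that contain every analyzed query term."""
    terms = analyze(query, stopwords)
    if not terms:
        # The whole query was stopwords, so nothing is left to match
        # (assumed behavior for this sketch).
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


# Hypothetical titles, one of which contains the stopword "here".
docs = {1: "Here Technologies", 2: "Over There Inc"}

# Case 1: "here" is removed at both index and query time.
stop_with_here = {"a", "the", "here", "there"}
index_1 = build_index(docs, stop_with_here)
hits_1 = search(index_1, "here", stop_with_here)
here_was_indexed = "here" in index_1  # the term never reached the index

# Case 2: "here" taken off the list, then a full re-index.
stop_without_here = stop_with_here - {"here"}
index_2 = build_index(docs, stop_without_here)
hits_2_here = search(index_2, "here", stop_without_here)
hits_2_there = search(index_2, "there", stop_without_here)  # the N+1 case

# Case 3: no stopword removal at all.
index_3 = build_index(docs, set())
hits_3_there = search(index_3, "there", set())

print(hits_1, here_was_indexed, hits_2_here, hits_2_there, hits_3_there)
```

In case 1 the term is simply absent from the index, so no query-time trick can bring it back. Case 2 only helps after the index is rebuilt, and "there" still fails. Case 3 avoids the whole class of problem.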

Best,
Erick

> On Apr 9, 2020, at 9:10 AM, rashi gandhi wrote:
> 
> Hi All,
> 
> We are using the stopword filter factory at both index and search time to
> omit stopwords.
> 
> However, in one particular case, we are getting "here" as a search query,
> and "here" is one of the words in the title/name representing our client.
> We are returning zero results because "here" is one of the English
> language stopwords, which are omitted at both index and search
> time.
> 
> One solution would be to remove "here" from the list of stopwords;
> however, that does not look feasible.
> 
> Is there any way to handle cases like this, where
> stopwords are meant to be actual search terms?
> 
> Any leads would be appreciated.



handling stopwords for special scenarios

2020-04-09 Thread rashi gandhi
Hi All,

We are using the stopword filter factory at both index and search time to
omit stopwords.

However, in one particular case, we are getting "here" as a search query,
and "here" is one of the words in the title/name representing our client.
We are returning zero results because "here" is one of the English
language stopwords, which are omitted at both index and search
time.

One solution would be to remove "here" from the list of stopwords;
however, that does not look feasible.

Is there any way to handle cases like this, where
stopwords are meant to be actual search terms?

Any leads would be appreciated.