Re: handling stopwords for special scenarios
Agreed, leave the stopwords alone.

I ran into this same problem thirteen years ago at Netflix. Even before that, I wasn’t removing stopwords, but I accidentally left them in the Solr 1.3 config.

https://observer.wunderwood.org/2007/05/31/do-all-stopword-queries-matter/

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Apr 9, 2020, at 7:34 AM, Erick Erickson wrote:
>
> 1> Why use stopwords at all? They’re largely a holdover from the
> bad old days when memory was limited. I usually recommend
> people just start by not using stopwords at all.
>
> 2> Assuming <1> doesn’t work for you, why doesn’t it look feasible
> to remove “here” from the stopword list? True, you have to re-index.
>
> But what you’re asking for is not possible. Stopwords are simply gone
> from the index without a trace; there’s absolutely no way to reconstruct
> one.
>
> I’d also argue that this is an N+1 situation. Sure, you’ll solve the “here”
> problem by removing it from the stopword list, but then you’ll have
> the same problem with “there”…
>
> Best,
> Erick
>
>> On Apr 9, 2020, at 9:10 AM, rashi gandhi wrote:
>>
>> Hi All,
>>
>> We are using the stopword filter factory at both index and search time
>> to omit stopwords.
>>
>> However, in one particular case, we are getting "here" as a search query,
>> and "here" is one of the words in the title/name representing our client.
>> We are returning zero results because "here" is one of the English-language
>> stopwords, so it is omitted at both index and search time.
>>
>> One solution could be to remove "here" from the list of stopwords;
>> however, that does not look feasible.
>>
>> Is there any way to handle these kinds of cases, where
>> stopwords are meant to be actual search terms?
>>
>> Any leads would be appreciated.
Re: handling stopwords for special scenarios
1> Why use stopwords at all? They’re largely a holdover from the bad old days when memory was limited. I usually recommend people just start by not using stopwords at all.

2> Assuming <1> doesn’t work for you, why doesn’t it look feasible to remove “here” from the stopword list? True, you have to re-index.

But what you’re asking for is not possible. Stopwords are simply gone from the index without a trace; there’s absolutely no way to reconstruct one.

I’d also argue that this is an N+1 situation. Sure, you’ll solve the “here” problem by removing it from the stopword list, but then you’ll have the same problem with “there”…

Best,
Erick

> On Apr 9, 2020, at 9:10 AM, rashi gandhi wrote:
>
> Hi All,
>
> We are using the stopword filter factory at both index and search time
> to omit stopwords.
>
> However, in one particular case, we are getting "here" as a search query,
> and "here" is one of the words in the title/name representing our client.
> We are returning zero results because "here" is one of the English-language
> stopwords, so it is omitted at both index and search time.
>
> One solution could be to remove "here" from the list of stopwords;
> however, that does not look feasible.
>
> Is there any way to handle these kinds of cases, where
> stopwords are meant to be actual search terms?
>
> Any leads would be appreciated.
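For anyone reading this in the archive, here is a rough sketch of why the term cannot be recovered and why re-indexing is needed after changing the list. It is a toy inverted index in Python, not Solr or Lucene code; the function names (analyze, build_index, search), the stopword sets, and the sample titles are all made up for illustration. The only assumption carried over from the thread is that the same stop filter runs at both index and query time.

```python
# Toy model of an inverted index with a stop filter applied at both index
# and query time. Illustration only: this is not how Solr is implemented,
# and every name and sample title here is hypothetical.

def analyze(text, stopwords):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in stopwords]

def build_index(docs, stopwords):
    """Map each surviving token to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for tok in analyze(text, stopwords):
            index.setdefault(tok, set()).add(doc_id)
    return index

def search(index, query, stopwords):
    """Return ids of documents that contain every surviving query token."""
    tokens = analyze(query, stopwords)
    if not tokens:
        return set()  # the whole query was stopwords, nothing left to match
    result = None
    for tok in tokens:
        hits = index.get(tok, set())
        result = hits if result is None else result & hits
    return result

docs = {1: "Here and Now Cafe", 2: "Maps and Navigation"}  # made-up titles

# Case A: "here" is a stopword, so it never reaches the index.
with_here = {"here", "there", "and"}
index_a = build_index(docs, with_here)
print("here" in index_a)                       # False
print(search(index_a, "here", with_here))      # set()

# Case B: "here" removed from the list, and the documents re-indexed.
without_here = {"there", "and"}
index_b = build_index(docs, without_here)
print(search(index_b, "here", without_here))   # {1}
```

Note that searching index_a with the shorter stopword list still finds nothing, because the token was never written to that index. That is the re-index requirement mentioned above. The same sketch also shows the N+1 issue: "there" is still in the list, so a title containing it would hit the same problem.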
handling stopwords for special scenarios
Hi All,

We are using the stopword filter factory at both index and search time to omit stopwords.

However, in one particular case, we are getting "here" as a search query, and "here" is one of the words in the title/name representing our client. We are returning zero results because "here" is one of the English-language stopwords, so it is omitted at both index and search time.

One solution could be to remove "here" from the list of stopwords; however, that does not look feasible.

Is there any way to handle these kinds of cases, where stopwords are meant to be actual search terms?

Any leads would be appreciated.