Re: Stopwords impact on search

2020-04-26 Thread Steven White
Thanks Walter. Much appreciated. To the Solr dev team, it would be a great help if Walter's IDF summary were made part of the stop-filter section: https://lucene.apache.org/solr/guide/8_5/filter-descriptions.html#stop-filter Steve On Fri, Apr 24, 2020 at 8:49 PM Walter Underwood wrote: > IDF and sto

Re: Stopwords impact on search

2020-04-24 Thread Walter Underwood
IDF and stopword removal are different approaches to the same thing. Removing stopwords is a binary decision about how important common words are for search. It says some words are completely useless. IDF is a proportional measure of how important common words are for search. Instead of removing a
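
A minimal sketch of the contrast Walter describes, not taken from the thread. It assumes the IDF formula documented for Lucene's BM25Similarity, ln(1 + (N - n + 0.5) / (n + 0.5)). The document frequencies and the stopword list are made-up illustrative values, and the helper names are hypothetical.

```python
import math

def bm25_idf(doc_freq: int, num_docs: int) -> float:
    """Proportional weight: rarer terms score higher, common terms score near zero.

    Assumed formula (Lucene BM25Similarity): ln(1 + (N - n + 0.5) / (n + 0.5)).
    """
    return math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))

def stopword_weight(term: str, stopwords: set) -> float:
    """Binary weight: a term is either dropped entirely (0.0) or kept (1.0)."""
    return 0.0 if term in stopwords else 1.0

# Hypothetical corpus statistics, for illustration only.
NUM_DOCS = 1_000_000
DOC_FREQ = {"and": 950_000, "are": 900_000, "solr": 1_200, "lucene": 800}
STOPWORDS = {"and", "are"}

for term, df in DOC_FREQ.items():
    binary = stopword_weight(term, STOPWORDS)
    proportional = bm25_idf(df, NUM_DOCS)
    print(f"{term:>7}  stopword filter: {binary:.1f}  idf: {proportional:.3f}")
```

With the stopword filter, "and" and "are" contribute nothing and cannot be matched at all. With IDF they stay in the index and still match, but contribute only a small positive weight compared with "solr" or "lucene".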

Re: Stopwords impact on search

2020-04-24 Thread Steven White
Hi everyone, I get why and when not indexing stopwords is a bad idea and can give you 0 or incomplete results. But what about the quality of search results when stopwords are indexed vs. not indexed? 1) Stopwords are removed and I do a word search, not a phrase search, for "solr and lucene are so c

Re: Stopwords impact on search

2020-04-24 Thread Walter Underwood
I’m astonished that the default still has that. It was a bad idea in Solr 1.3, when it bit my ass. We help people with this about once a month and the advice is always the same. Imagine all the poor people who never ask about it and run with that default! wunder Walter Underwood wun...@wunderwoo

Re: Stopwords impact on search

2020-04-24 Thread Jan Høydahl
Turns out there is already a JIRA for this, SOLR-10992, where both you and I have commented :) But it’s 3 years old... > On 24 Apr 2020, at 16:34, Erick Erickson wrote: > > +1 to removing stopword filters. > >> On Apr 24, 2020, at 10:28 AM, Jan

Re: Stopwords impact on search

2020-04-24 Thread Rohan Kasat
So do we use the stopword filter as part of the query analyzer to avoid highlighting these stop words? Regards, Rohan On Fri, Apr 24, 2020 at 7:45 AM Walter Underwood wrote: > Agreed. Here is an article from 13 years ago when I accidentally turned on > stopword removal at Netflix. It caused bad p

Re: Stopwords impact on search

2020-04-24 Thread Walter Underwood
Agreed. Here is an article from 13 years ago when I accidentally turned on stopword removal at Netflix. It caused bad problems. https://observer.wunderwood.org/2007/05/31/do-all-stopword-queries-matter/ Infoseek was not removing stopwords when I joined them in 1996. Since then, I’ve always left

Re: Stopwords impact on search

2020-04-24 Thread Erick Erickson
+1 to removing stopword filters. > On Apr 24, 2020, at 10:28 AM, Jan Høydahl wrote: > > I tend to agree. Should we simply remove the stopword filters from the > default configsets shipping with Solr? > > Jan > >> On 24 Apr 2020, at 14:44, David Hastings wrote: >> >> you should never use the s

Re: Stopwords impact on search

2020-04-24 Thread Jan Høydahl
I tend to agree. Should we simply remove the stopword filters from the default configsets shipping with Solr? Jan > On 24 Apr 2020, at 14:44, David Hastings wrote: > > you should never use the stopword filter unless you have a very specific > purpose > > On Fri, Apr 24, 2020 at 8:33 AM Steven W

Re: Stopwords impact on search

2020-04-24 Thread David Hastings
you should never use the stopword filter unless you have a very specific purpose On Fri, Apr 24, 2020 at 8:33 AM Steven White wrote: > Hi everyone, > > What is, if any, the impact of stopwords on my search ranking quality? > Will my ranking improve if I do not index stopwords? > > I'm trying