Hi Steven,

The stopword list for a language is not fixed; there is no single universal list of stop words used by all natural language processing tools. Ideally, stop words should be defined by search merchandisers based on their domain rather than taken from the default list.
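As a rough illustration of a domain-driven list, here is a minimal Python sketch that suggests stopword candidates from document frequency. The function names, the 0.8 threshold, and the simple tokenizer are my own assumptions for the example, not anything Solr ships, and the output should be reviewed by hand before use.

```python
import re
from collections import Counter


def suggest_stopwords(documents, min_doc_ratio=0.8):
    """Return terms found in at least min_doc_ratio of the documents.

    Hypothetical helper: the threshold and tokenizer are illustrative only.
    """
    doc_freq = Counter()
    for text in documents:
        # Use a set so each term counts once per document (document frequency).
        terms = set(re.findall(r"[a-z0-9']+", text.lower()))
        doc_freq.update(terms)
    threshold = min_doc_ratio * len(documents)
    return sorted(term for term, df in doc_freq.items() if df >= threshold)


def write_stopwords_file(terms, path):
    """Write one term per line, the plain-text layout of a stopwords file."""
    with open(path, "w", encoding="utf-8") as fh:
        for term in terms:
            fh.write(term + "\n")


if __name__ == "__main__":
    catalog = ["the red shoe", "the blue shoe", "the green hat"]
    print(suggest_stopwords(catalog, min_doc_ratio=0.6))
```

In a shoe catalog, a term such as "shoe" can turn up as a candidate even though no generic English list contains it, which is the reason to tune the list per domain.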
https://en.wikipedia.org/wiki/Stop_words

You can add your own lang/stopwords_<languagecode>.txt and reference it from the StopFilterFactory in the field type:

<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="lang/stopwords_en.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true" synonyms="synonyms.txt" ignoreCase="true"/>
    <filter class="solr.StopFilterFactory" words="lang/stopwords_en.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

Regards,
Srinivas Meenavalli

-----Original Message-----
From: Steven White [mailto:swhite4...@gmail.com]
Sent: Friday, August 26, 2016 4:02 AM
To: solr-user@lucene.apache.org
Subject: Default stopword list

Hi everyone,

I'm curious, the current "default" stopword list, for English and other languages, how was it determined? And for English, why is "I" not in the stopword list?

Thanks in advance.

Steve