Trey Jones created LUCENE-8416:
----------------------------------

             Summary: Add tokenized version of o.o. to Stempel stopwords
                 Key: LUCENE-8416
                 URL: https://issues.apache.org/jira/browse/LUCENE-8416
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/analysis
            Reporter: Trey Jones


The Stempel stopword list ( 
lucene-solr/lucene/analysis/stempel/src/resources/org/apache/lucene/analysis/pl/stopwords.txt
 ) contains "o.o.", which is a good stopword: it is part of the abbreviation for 
"limited liability company", which is 
"[sp. z o.o.|https://en.wiktionary.org/wiki/sp._z_o.o.]". However, the standard 
tokenizer changes "o.o." to "o.o", so the stopword entry never matches and the 
filter has no effect on it.

Add "o.o" to the stopword list. (It's probably okay to leave "o.o." in the 
list, though, in case a different tokenizer is used.)
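
To illustrate the mismatch, here is a minimal, self-contained sketch. It does not use Lucene: the tokenize method is a hypothetical stand-in that only mimics the trailing-period behaviour described above, and the class and method names are made up for this example. It shows that a list containing only "o.o." lets the token "o.o" through, while a list with both forms removes it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopwordMismatchSketch {

    // Hypothetical stand-in for a tokenizer (NOT Lucene's implementation):
    // splits on whitespace and drops trailing periods, so "o.o." becomes "o.o".
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw.replaceAll("\\.+$", "");
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Keeps only the tokens that are not exact matches for a stopword.
    static List<String> removeStopwords(List<String> tokens, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopwords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("sp. z o.o.");

        // Only the untokenized form: the token "o.o" does not match.
        Set<String> current = new HashSet<>(Arrays.asList("o.o."));
        // Both forms, as proposed in this issue.
        Set<String> proposed = new HashSet<>(Arrays.asList("o.o.", "o.o"));

        System.out.println(removeStopwords(tokens, current));  // [sp, z, o.o]
        System.out.println(removeStopwords(tokens, proposed)); // [sp, z]
    }
}
```

Keeping "o.o." alongside "o.o" costs nothing in this sketch and still covers a tokenizer that preserves the trailing period.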



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
