Robert Muir created LUCENE-5818:
-----------------------------------

             Summary: Fix hunspell zero-string overgeneration
                 Key: LUCENE-5818
                 URL: https://issues.apache.org/jira/browse/LUCENE-5818
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Robert Muir


Currently, it's allowed to strip suffixes/prefixes all the way down to the empty 
string. This is not really allowed, though, and it creates overgeneration in 
some cases (especially where endings can also be standalone words; these are 
typically stopwords, so it causes a lot of damage).

An example is Czech 'už', which should just stem to itself, but today it also 
stems to 'úžit' because it has a flag compatible with that.
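
To illustrate the rule described above, here is a minimal sketch of a guard 
that refuses to undo a suffix rule when doing so would leave an empty string. 
This is not Lucene's actual Stemmer code: the class name, the method name, its 
parameters, and the English sample words are all hypothetical and chosen only 
for illustration.

```java
import java.util.Optional;

public class ZeroStringGuardSketch {

    /**
     * Hypothetical helper (not a Lucene API): tries to undo a suffix rule that
     * removed `strip` from the stem and appended `add` to form `word`.
     * Returns the candidate stem, or empty if the rule must not apply.
     */
    public static Optional<String> removeSuffix(String word, String add, String strip) {
        if (!word.endsWith(add)) {
            return Optional.empty(); // the rule's ending is not present
        }
        String remainder = word.substring(0, word.length() - add.length());
        if (remainder.isEmpty()) {
            // The whole word was the affix: stripping it leaves the empty
            // string, which is the overgeneration case this issue describes.
            return Optional.empty();
        }
        return Optional.of(remainder + strip);
    }

    public static void main(String[] args) {
        // Ordinary case: part of the word survives, so a candidate is produced.
        System.out.println(removeSuffix("walked", "ed", ""));
        // Zero-string case: the word is nothing but the ending, so no candidate.
        System.out.println(removeSuffix("ed", "ed", "it"));
    }
}
```

With a guard like this, a standalone word that happens to equal an ending 
yields no extra stem candidate from that rule, so it can only stem to itself.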



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
