DictionaryCompoundWordTokenFilter does not properly add tokens from the end compound word.
------------------------------------------------------------------------------------------
                 Key: LUCENE-3417
                 URL: https://issues.apache.org/jira/browse/LUCENE-3417
             Project: Lucene - Java
          Issue Type: Bug
          Components: modules/analysis
    Affects Versions: 3.3, 4.0
            Reporter: Njal Karevoll

Due to an off-by-one error, a subword placed at the end of a compound word will not get a token added to the token stream.

Example:
Dictionary: {"ab", "cd", "ef"}
word: "abcdef"

Created tokens: {"abcdef", "ab", "cd"}
Expected tokens: {"abcdef", "ab", "cd", "ef"}

Additionally, it could produce tokens that were shorter than the minSubwordSize due to another off-by-one error.
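
The two off-by-one errors described in the report can be illustrated with a small standalone sketch. This is not the Lucene source: the class CompoundSketch, the method decompose, and the fixed flag are hypothetical names introduced here, and the exact placement of the errors (an exclusive upper bound on the start offset, and a candidate length that begins at minSubwordSize - 1) is an assumption chosen because it reproduces the behaviour in the example. It also assumes minSubwordSize >= 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: class and method names are hypothetical, not Lucene's API.
class CompoundSketch {

    // Returns the original word followed by every dictionary subword found in it.
    // fixed == false uses the two assumed off-by-one bounds; fixed == true corrects them.
    static List<String> decompose(String word, Set<String> dictionary,
                                  int minSubwordSize, int maxSubwordSize, boolean fixed) {
        List<String> tokens = new ArrayList<>();
        tokens.add(word);
        int len = word.length();

        // A subword of minSubwordSize characters can still start at len - minSubwordSize,
        // so that offset must be included. The assumed bug stops one position early.
        int lastStart = fixed ? len - minSubwordSize : len - minSubwordSize - 1;

        // Candidate lengths should begin at minSubwordSize. The assumed bug begins one
        // below it, which lets subwords shorter than the minimum through.
        int firstLength = fixed ? minSubwordSize : minSubwordSize - 1;

        for (int start = 0; start <= lastStart; start++) {
            for (int n = firstLength; n <= maxSubwordSize; n++) {
                if (start + n > len) {
                    break; // candidate would run past the end of the word
                }
                String candidate = word.substring(start, start + n);
                if (dictionary.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("ab", "cd", "ef"));
        System.out.println(decompose("abcdef", dictionary, 2, 15, false)); // [abcdef, ab, cd]
        System.out.println(decompose("abcdef", dictionary, 2, 15, true));  // [abcdef, ab, cd, ef]
    }
}
```

With the inclusive start bound, the trailing "ef" is reached at offset 4. With the corrected starting length, a one-character dictionary entry such as "a" is no longer emitted when minSubwordSize is 2.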