[ https://issues.apache.org/jira/browse/LUCENE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Njal Karevoll updated LUCENE-3417:
----------------------------------

    Attachment: LUCENE-3417.patch

Adds two unit tests, one showing each behavior, and a fix for both issues.

> DictionaryCompoundWordTokenFilter does not properly add tokens from the end compound word.
> ------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3417
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3417
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: modules/analysis
>    Affects Versions: 3.3, 4.0
>            Reporter: Njal Karevoll
>         Attachments: LUCENE-3417.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> Due to an off-by-one error, a subword placed at the end of a compound word
> will not get a token added to the token stream.
> Example:
> Dictionary: {"ab", "cd", "ef"}
> word: "abcdef"
> Created tokens: {"abcdef", "ab", "cd"}
> Expected tokens: {"abcdef", "ab", "cd", "ef"}
> Additionally, it could produce tokens that were shorter than the
> minSubwordSize due to another off-by-one error.
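
For readers without the patch at hand, the following is a minimal, self-contained sketch of how the two off-by-one errors described in the issue can arise in a dictionary-based decompounding loop. It is not the Lucene source and not the attached patch: the class DecompoundSketch, the method decompose, and the loop layout are hypothetical, chosen only to reproduce the "abcdef" example from the report. The comments mark the two bounds that are assumed to be the sensitive ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the DictionaryCompoundWordTokenFilter code.
class DecompoundSketch {

    /** Returns the original word followed by every dictionary subword found in it. */
    static List<String> decompose(String word, Set<String> dictionary,
                                  int minSubwordSize, int maxSubwordSize) {
        List<String> tokens = new ArrayList<>();
        tokens.add(word); // the compound itself is always emitted first
        int len = word.length();

        // Bound 1: "<=" rather than "<". With "<", the last start offset that
        // still leaves room for minSubwordSize characters is never visited,
        // so a subword at the very end of the compound ("ef") is skipped.
        for (int start = 0; start <= len - minSubwordSize; start++) {
            // Bound 2: candidate length starts at minSubwordSize. Starting at
            // minSubwordSize - 1 would admit subwords shorter than the minimum.
            for (int size = minSubwordSize; size <= maxSubwordSize; size++) {
                if (start + size > len) {
                    break; // candidate would run past the end of the word
                }
                String candidate = word.substring(start, start + size);
                if (dictionary.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Example from the issue report; 2 and 15 are assumed size limits.
        System.out.println(decompose("abcdef", Set.of("ab", "cd", "ef"), 2, 15));
    }
}
```

With the bounds as written, the example from the report yields the expected tokens "abcdef", "ab", "cd", "ef". Changing the outer condition to a strict "<" drops "ef", which matches the reported symptom.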