[ https://issues.apache.org/jira/browse/SOLR-13233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763700#comment-16763700 ]
Alan Woodward commented on SOLR-13233:
--------------------------------------

I'm honestly not sure what the correct fix here is - possibly we should change WordDelimiterGraphFilter to emit its original token first? And check our other TokenFilters to ensure that they all have this behaviour?

> SpellCheckCollator ignores stacked tokens
> -----------------------------------------
>
>                 Key: SOLR-13233
>                 URL: https://issues.apache.org/jira/browse/SOLR-13233
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Alan Woodward
>            Priority: Major
>
> When building collations, SpellCheckCollator ignores any tokens with a
> position increment of 0, assuming that they've been injected and may
> therefore have incorrect offsets (injected terms generally keep the offsets
> of the terms they're replacing, as they don't themselves appear anywhere in
> the original source). However, this assumption is not necessarily correct -
> for example, WordDelimiterGraphFilter emits stacked tokens *before* the
> original token, because it needs to iterate through all stacked tokens to
> correctly set the original token's position length.
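
To make the problem in the issue description concrete, here is a rough sketch of the heuristic, written as a standalone Python model rather than actual Solr or Lucene code. The `Token` class, the `original` flag, the `tokens_kept_for_collation` helper, and both token streams are hypothetical and chosen only for illustration; they are not the real output of WordDelimiterGraphFilter or the real SpellCheckCollator logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    term: str
    pos_inc: int    # position increment; 0 means "stacked" on the previous token
    start: int      # start offset in the source text
    end: int        # end offset in the source text
    original: bool  # hypothetical flag: did this term appear in the source text?


def tokens_kept_for_collation(tokens):
    """Model of the described heuristic: drop every token whose
    position increment is 0, on the assumption that it was injected."""
    return [t for t in tokens if t.pos_inc != 0]


# Hypothetical stream in which the original token is emitted first.
original_first = [
    Token("wi-fi", 1, 0, 5, True),
    Token("wifi", 0, 0, 5, False),
]

# Hypothetical stream in which the injected token is emitted first,
# so the original token is the one carrying a position increment of 0.
stacked_first = [
    Token("wifi", 1, 0, 5, False),
    Token("wi-fi", 0, 0, 5, True),
]

kept_a = tokens_kept_for_collation(original_first)
kept_b = tokens_kept_for_collation(stacked_first)

print([t.term for t in kept_a])
print([t.term for t in kept_b])
```

In this model the first ordering keeps the original term, while the second keeps only the injected one and silently discards the original. That is the failure mode described above, and it is why emitting the original token first would make the position-increment check behave as intended.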