I am getting an error using the SpellChecker component with the query "another-test":

    java.lang.StringIndexOutOfBoundsException: String index out of range: -7
This appears to be related to this issue<https://issues.apache.org/jira/browse/SOLR-1630>, which has been marked as fixed. My configuration and the test case below appear to reproduce the error I am seeing: both "another" and "test" get turned into tokens with a start offset of 0 and an end offset of 12.

    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>

    &spellcheck=true&spellcheck.collate=true

Is this an issue with my configuration/test, or is there an issue with the SpellingQueryConverter? Is there a recommended workaround, such as the WhitespaceTokenizer as mentioned in the issue comments?

Thank you for your help.

    package org.apache.solr.spelling;

    import static org.junit.Assert.assertTrue;

    import java.util.Collection;

    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.util.Version;
    import org.apache.solr.common.util.NamedList;
    import org.junit.Test;

    public class SimpleQueryConverterTest {

      @Test
      public void testSimpleQueryConversion() {
        SpellingQueryConverter converter = new SpellingQueryConverter();
        converter.init(new NamedList());
        converter.setAnalyzer(new StandardAnalyzer(Version.LUCENE_35));
        String original = "another-test";
        Collection<Token> tokens = converter.convert(original);
        assertTrue("Token offsets do not match",
            isOffsetCorrect(original, tokens));
      }

      private boolean isOffsetCorrect(String s, Collection<Token> tokens) {
        for (Token token : tokens) {
          int start = token.startOffset();
          int end = token.endOffset();
          if (!s.substring(start, end).equals(token.toString()))
            return false;
        }
        return true;
      }
    }
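In case it is useful context: my reading of the SOLR-1630 comments is that the suggested workaround is to give the spellchecker a whitespace-based analyzer, so hyphenated input like "another-test" stays a single token and the offsets line up with the original query string. Something like the following schema fragment is what I had in mind (the field type name here is just a placeholder, not from the issue):

```xml
<!-- Sketch of the workaround discussed in the SOLR-1630 comments:
     a whitespace-based analyzer for the spellcheck field, so tokens
     are not split on intra-word punctuation such as hyphens.
     The "textSpell" name is a placeholder of my own. -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Is that the intended approach, or is there a better-supported configuration?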