Hi guys,
We are trying to implement search and have run into a strange
situation. When our text contains an apostrophe followed by a single
character AND our search query consists of exactly two letters
followed by the fuzzy operator (~) AND we use highlighting, we get an exception:
*java.lang.IllegalArgumentException: boost must be a positive float, got -1.0*
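The problem seems to be at FuzzyTermsEnum.java:271 (float similarity =
1.0f - (float) ed / (float) minTermLength). When that line is reached with
ed=2, the computed similarity is negative and ends up being set as the boost.
Given the -1.0 in the message, minTermLength must be 1 at that point.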
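To make the arithmetic concrete, here is a minimal standalone sketch of that expression. It is not the Lucene source. The class and method names are made up for illustration, and I am assuming minTermLength is the shorter of the query term and the candidate term.

```java
public class SimilaritySketch {
    // Same expression as quoted above, isolated from Lucene.
    // ed = edit distance, minTermLength = assumed to be the shorter term's length.
    static float similarity(int ed, int minTermLength) {
        return 1.0f - (float) ed / (float) minTermLength;
    }

    public static void main(String[] args) {
        // Two edits against a one-character term: the value we see in the exception.
        System.out.println(similarity(2, 1)); // -1.0
        // Two edits against a two-character term stays non-negative.
        System.out.println(similarity(2, 2)); // 0.0
    }
}
```

So any time ed is larger than minTermLength, the result is negative, which would explain the rejected boost.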
I was able to reproduce the error with the following code:
import java.io.IOException;
import java.nio.file.Path;
import org.apache.commons.io.FileUtils;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.search.highlight.TokenSources;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.junit.jupiter.api.Test;
class FindSqlHighlightTest {

    @Test
    void reproduceHighlightProblem() throws IOException, ParseException, InvalidTokenOffsetsException {
        String text = "doesn't";
        String field = "text";
        // NOK: se~, se~2 and any higher number
        // OK: sel~, s~, se~1
        String uQuery = "se~";
        int maxStartOffset = -1;
        Analyzer analyzer = new SimpleAnalyzer();

        Path indexLocation = Path.of("temp", "reproduceHighlightProblem").toAbsolutePath();
        if (indexLocation.toFile().exists()) {
            FileUtils.deleteDirectory(indexLocation.toFile());
        }
        Directory indexDir = FSDirectory.open(indexLocation);

        // Create index
        IndexWriterConfig dimsIndexWriterConfig = new IndexWriterConfig(analyzer);
        dimsIndexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
        IndexWriter idxWriter = new IndexWriter(indexDir, dimsIndexWriterConfig);

        // Add doc
        Document doc = new Document();
        doc.add(new TextField(field, text, Field.Store.NO));
        idxWriter.addDocument(doc);

        // Commit
        idxWriter.commit();
        idxWriter.close();

        // Search & highlight
        Query query = new QueryParser(field, analyzer).parse(uQuery);
        Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), new QueryScorer(query));
        TokenStream tokenStream = TokenSources.getTokenStream(field, null, text, analyzer, maxStartOffset);
        String highlighted = highlighter.getBestFragment(tokenStream, text);
        System.out.println(highlighted);
    }
}
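Our guess as to why the apostrophe matters: SimpleAnalyzer splits text at non-letter characters and lowercases it, so "doesn't" would produce a one-letter token. The sketch below only approximates that behaviour in plain Java, without Lucene. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class LetterSplitSketch {
    // Rough stand-in for SimpleAnalyzer: keep runs of letters, lowercase them,
    // and treat every non-letter (such as an apostrophe) as a token boundary.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(letterTokens("doesn't")); // [doesn, t]
    }
}
```

If that is right, the two-letter query "se" is compared against the one-letter term "t", which needs two edits and gives a shorter-term length of 1.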
Could you please confirm whether this is a bug in Lucene or whether we are
doing something that is not allowed?
Thanks a lot!
Best,
Juraj