Hi Lucene experts,

The following is a simple piece of Lucene code that throws a StringIndexOutOfBoundsException. I am using the official Lucene 2.2.0 release. Can anyone tell me what is wrong with this code? Is this a bug or a feature of Lucene? Any comments/hints are highly welcome!
In a nutshell, I have a document with two (or four) fields: 1) all, 2-4) small. I use [all] for searching and [small] for highlighting.

[package and imports truncated...]

    public class MemoryIndexCase {
        static public void main(String[] arg) {
            Document doc = new Document();
            doc.add(new Field("all", "example long text",
                    Field.Store.NO, Field.Index.TOKENIZED));
            doc.add(new Field("small", "example",
                    Field.Store.YES, Field.Index.UN_TOKENIZED,
                    Field.TermVector.WITH_POSITIONS_OFFSETS));
            doc.add(new Field("small", "long",
                    Field.Store.YES, Field.Index.UN_TOKENIZED,
                    Field.TermVector.WITH_POSITIONS_OFFSETS));
            doc.add(new Field("small", "text",
                    Field.Store.YES, Field.Index.UN_TOKENIZED,
                    Field.TermVector.WITH_POSITIONS_OFFSETS));
            try {
                Directory idx = new RAMDirectory();
                IndexWriter writer = new IndexWriter(idx, new StandardAnalyzer(), true);
                writer.addDocument(doc);
                writer.optimize();
                writer.close();

                Searcher searcher = new IndexSearcher(idx);
                QueryParser qp = new QueryParser("all", new StandardAnalyzer());
                Query query = qp.parse("example text");
                Hits hits = searcher.search(query);

                Highlighter highlighter = new Highlighter(new QueryScorer(query));
                IndexReader ir = IndexReader.open(idx);
                for (int i = 0; i < hits.length(); i++) {
                    String text = hits.doc(i).get("small");
                    TermFreqVector tfv = ir.getTermFreqVector(hits.id(i), "small");
                    TokenStream tokenStream = TokenSources.getTokenStream((TermPositionVector) tfv);
                    String result = highlighter.getBestFragment(tokenStream, text);
                    System.out.println(result);
                }
            } catch (Throwable e) {
                e.printStackTrace();
            }
        }
    }

The exception is:

    java.lang.StringIndexOutOfBoundsException: String index out of range: 11
        at java.lang.String.substring(String.java:1935)
        at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:235)
        at org.apache.lucene.search.highlight.Highlighter.getBestFragments(Highlighter.java:175)
        at org.apache.lucene.search.highlight.Highlighter.getBestFragment(Highlighter.java:101)
        at org.lucenetest.MemoryIndexCase.main(MemoryIndexCase.java:70)

Best regards,
Lukas