I'm trying to highlight a field and I'm getting an exception thrown, only on certain search terms though. I am fairly certain that the cause of the problem is through having synonyms on the highlighted field as I have had highlighting working in the past on other fields.
The added complication is that the field that I am highlighting also has ngramming and stemming. I think what is happening is that the highlighting cannot match the criteria (which happens to be a synonym) against the actual string retrieved from the index and crashes, I think if the string found is greater than a certain number of characters. I wonder if anyone has experienced this problem and knows how to get around it? My field definition is: <!-- An edge nGrammed and stemmed field for the document tags. --> <fieldType name="tagphrase_nGram" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="../../common/tag_synonyms.txt" ignoreCase="true" expand="true"/> <!--<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />--> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/> <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front" /> </analyzer> <analyzer type="query"> <tokenizer class="solr.WhitespaceTokenizerFactory" /> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/> </analyzer> </fieldType> My query is: sort=tagcount+desc&hl.snippets=1&start=0&q=(+%2Btagsearch:asset)+||+(+%2Btagsearchnostem:asset)+&hl.fl=tagsearch&wt=javabin&hl=true&rows=100&version=1 The exception being thrown is: 09-Dec-2010 11:59:26 org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token profo exceeds length of provided text sized 26 at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:342) at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447) at java.lang.Thread.run(Thread.java:619) Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token profo exceeds length of provided text sized 26 at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:254) at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:335) ... 18 more Thanks Shaun