Random failures with highlighting
---------------------------------
Key: SOLR-1404
URL: https://issues.apache.org/jira/browse/SOLR-1404
Project: Solr
Issue Type: Bug
Components: Analysis, highlighter
Affects Versions: 1.4
Reporter: Anders Melchiorsen
After updating to a recent Solr nightly, we started getting errors when highlighting.
I have not been able to reduce our real setup to a minimal failing one,
but the same error seems to show up with the configuration below. Note that the
QUERY fails most of the time, but it does work sometimes. Notably, after running
"java -jar start.jar", the QUERY works the first time, but then starts
failing for a while. It seems that something is not being reset properly.
The example uses the deprecated HTMLStripWhitespaceTokenizerFactory but the
problem apparently also exists with other tokenizers; I was just unable to
create a minimal example with other configurations.
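To make the "not being reset" suspicion concrete, here is a minimal sketch of how stale tokenizer state could produce the reported message. It is a toy model, not Solr or Lucene code: `WhitespaceTokenizer`, its `base` counter, `check_offsets`, and `InvalidTokenOffsets` are hypothetical names, and the offset check only mirrors what the error text implies (a token offset larger than the length of the stored text). The actual cause in Solr may be different.

```python
class InvalidTokenOffsets(Exception):
    """Hypothetical stand-in for Lucene's InvalidTokenOffsetsException."""


class WhitespaceTokenizer:
    """Toy tokenizer that keeps a running character offset between calls."""

    def __init__(self):
        self.base = 0  # carried-over offset; the assumed source of the bug

    def reset(self):
        """Clear the carried-over offset, as a correct reuse path would."""
        self.base = 0

    def tokenize(self, text):
        """Return (term, start, end) tuples for whitespace-separated terms."""
        tokens = []
        pos = 0
        for word in text.split():
            start = text.index(word, pos)
            end = start + len(word)
            tokens.append((word, self.base + start, self.base + end))
            pos = end
        self.base += len(text)  # stays set unless reset() is called
        return tokens


def check_offsets(tokens, text):
    """Reject any token whose offsets fall outside the provided text."""
    for term, start, end in tokens:
        if start > len(text) or end > len(text):
            raise InvalidTokenOffsets(
                f"Token {term} exceeds length of provided text sized {len(text)}"
            )
```

In this model the first `check_offsets(tok.tokenize("test"), "test")` on a fresh tokenizer passes, while a second call on the same instance without `reset()` yields offsets 4 to 8 and raises with the same wording as the error in this report. That matches the "works the first time, then fails" pattern, but only under the assumptions stated above.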
SCHEMA
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.2">
  <types>
    <fieldType name="string" class="solr.StrField" />
    <fieldtype name="testtype" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory" />
      </analyzer>
    </fieldtype>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="false" />
    <field name="test" type="testtype" indexed="false" stored="true" />
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
INDEX
URL=http://localhost:8983/solr/update
curl $URL --data-binary '<add><doc><field name="id">1</field><field name="test">test</field></doc></add>' -H 'Content-type:text/xml; charset=utf-8'
curl $URL --data-binary '<commit/>' -H 'Content-type:text/xml; charset=utf-8'
QUERY
curl 'http://localhost:8983/solr/select/?hl.fl=test&hl=true&q=id:1'
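Because the failure is intermittent, a single run of the QUERY says little. The sketch below repeats the same request and counts outcomes. It assumes that a failing request comes back with a non-200 HTTP status (for example 500) rather than a 200 response with an error body; `fetch_status` and `tally` are hypothetical helper names, and the URL is the one from the QUERY above.

```python
import urllib.error
import urllib.request

QUERY_URL = "http://localhost:8983/solr/select/?hl.fl=test&hl=true&q=id:1"


def fetch_status(url=QUERY_URL):
    """Send one request and return its HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses (4xx/5xx) are raised as exceptions; keep the code.
        return err.code


def tally(attempts, fetch=fetch_status):
    """Run the query `attempts` times and return (ok_count, failed_count)."""
    ok = failed = 0
    for _ in range(attempts):
        if fetch() == 200:  # assumption: anything else counts as a failure
            ok += 1
        else:
            failed += 1
    return ok, failed
```

Calling `tally(20)` right after a fresh "java -jar start.jar" and again a little later gives a rough failure rate for each phase. The `fetch` parameter exists only so the counting logic can be exercised without a running server.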
ERROR
org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
org.apache.solr.common.SolrException: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:328)
at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:254)
at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:321)
... 23 more