[ 
https://issues.apache.org/jira/browse/SOLR-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754162#action_12754162
 ] 

Uwe Schindler commented on SOLR-1404:
-------------------------------------

bq. Will LUCENE-1906 fix it (in an alternate way)?

It should fix it. Lucene Tokenizers no longer have separate methods for 
CharStream; a CharStream is simply handled as a Reader. That should remove the 
trap of overriding the wrong method. The offset correction is now applied 
conditionally, only if the Reader is a CharStream subclass.
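
To illustrate the idea, here is a minimal, self-contained sketch of such a 
conditional offset correction. It is not the actual Lucene code: the names 
OffsetMappingReader (a stand-in for CharStream), SkippedPrefixReader and 
OffsetCorrectionSketch are hypothetical, and the fixed "skipped" count only 
simulates characters removed by a filter such as an HTML stripper.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class OffsetCorrectionSketch {

    /** Hypothetical stand-in for CharStream: a Reader that can map an offset back to the original text. */
    static abstract class OffsetMappingReader extends Reader {
        abstract int correctOffset(int currentOff);
    }

    /** Hypothetical reader that pretends a fixed number of leading characters were stripped. */
    static final class SkippedPrefixReader extends OffsetMappingReader {
        private final Reader in;
        private final int skipped;

        SkippedPrefixReader(Reader in, int skipped) {
            this.in = in;
            this.skipped = skipped;
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            return in.read(cbuf, off, len);
        }

        @Override
        public void close() throws IOException {
            in.close();
        }

        @Override
        int correctOffset(int currentOff) {
            // Shift the offset by the number of characters that were removed.
            return currentOff + skipped;
        }
    }

    /** Single entry point: correct the offset only if the reader knows how to. */
    static int correctOffset(Reader input, int currentOff) {
        return (input instanceof OffsetMappingReader)
            ? ((OffsetMappingReader) input).correctOffset(currentOff)
            : currentOff;
    }

    public static void main(String[] args) {
        Reader plain = new StringReader("test");
        Reader stripped = new SkippedPrefixReader(new StringReader("test"), 3);
        System.out.println(correctOffset(plain, 4));    // plain Reader: offset unchanged
        System.out.println(correctOffset(stripped, 4)); // offset-mapping Reader: offset shifted by 3
    }
}
```

Because there is only one code path taking a Reader, a subclass cannot 
accidentally override the "wrong" variant and skip the correction.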

> Random failures with highlighting
> ---------------------------------
>
>                 Key: SOLR-1404
>                 URL: https://issues.apache.org/jira/browse/SOLR-1404
>             Project: Solr
>          Issue Type: Bug
>          Components: Analysis, highlighter
>    Affects Versions: 1.4
>            Reporter: Anders Melchiorsen
>             Fix For: 1.4
>
>         Attachments: SOLR-1404.patch
>
>
> With a recent Solr nightly, we started getting errors when highlighting.
> I have not been able to reduce our real setup to a minimal one that is 
> failing, but the same error seems to pop up with the configuration below. 
> Note that the QUERY will mostly fail, but it will work sometimes. Notably, 
> after running "java -jar start.jar", the QUERY will work the first time, but 
> then start failing for a while. Seems that something is not being reset 
> properly.
> The example uses the deprecated HTMLStripWhitespaceTokenizerFactory but the 
> problem apparently also exists with other tokenizers; I was just unable to 
> create a minimal example with other configurations.
> SCHEMA
> <?xml version="1.0" encoding="UTF-8" ?>
> <schema name="example" version="1.2">
>   <types>
>     <fieldType name="string" class="solr.StrField" />
>     <fieldtype name="testtype" class="solr.TextField">
>       <analyzer>
>         <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory" />
>       </analyzer>
>     </fieldtype>
>  </types>
>  <fields>
>    <field name="id" type="string" indexed="true" stored="false" />
>    <field name="test" type="testtype" indexed="false" stored="true" />
>  </fields>
>  <uniqueKey>id</uniqueKey>
> </schema>
> INDEX
> URL=http://localhost:8983/solr/update
> curl $URL --data-binary '<add><doc><field name="id">1</field><field name="test">test</field></doc></add>' -H 'Content-type:text/xml; charset=utf-8'
> curl $URL --data-binary '<commit/>' -H 'Content-type:text/xml; charset=utf-8'
> QUERY
> curl 'http://localhost:8983/solr/select/?hl.fl=test&hl=true&q=id:1'
> ERROR
> org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
> org.apache.solr.common.SolrException: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
>       at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:328)
>       at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
>       at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>       at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>       at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
>       at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>       at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>       at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>       at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>       at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>       at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>       at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>       at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>       at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>       at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>       at org.mortbay.jetty.Server.handle(Server.java:285)
>       at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>       at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
>       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
>       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
>       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>       at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>       at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test exceeds length of provided text sized 4
>       at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:254)
>       at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:321)
>       ... 23 more

