Re: PostingHighlighter complains about no offsets

Michael Sokolov Sat, 03 May 2014 10:57:07 -0700

For posterity, in case anybody follows this thread, I tracked the problem down to WordDelimiterFilter; apparently it creates an offset of -1 in some cases, which PostingsHighlighter rejects.
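
A minimal sketch of the kind of sanity check involved, assuming each token is reduced to a (term, start, end) triple. TokenOffsets and OffsetSanityCheck are hypothetical names made up for illustration; they are not Lucene classes, and this is not the highlighter's actual source. It only models the idea that a negative start offset (or an end before the start) makes a token unusable for offset-based highlighting.

```java
import java.util.List;

/** Hypothetical stand-in for one token's offsets; not a Lucene class. */
final class TokenOffsets {
    final String term;
    final int start;
    final int end;

    TokenOffsets(String term, int start, int end) {
        this.term = term;
        this.start = start;
        this.end = end;
    }
}

/** Hypothetical helper that scans a token list for unusable offsets. */
final class OffsetSanityCheck {
    /**
     * Returns the index of the first token whose start offset is negative
     * or whose end offset precedes its start, or -1 if every token looks sane.
     */
    static int firstBadOffset(List<TokenOffsets> tokens) {
        for (int i = 0; i < tokens.size(); i++) {
            TokenOffsets t = tokens.get(i);
            if (t.start < 0 || t.end < t.start) {
                return i;
            }
        }
        return -1;
    }
}
```

Running something like this over the offsets shown on the analysis admin page, stage by stage, would be one way to narrow down which filter in the chain first emits a bad value.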


-Mike



On 5/2/2014 10:20 AM, Michael Sokolov wrote:

I checked using the analysis admin page, and I believe there are offsets being generated (I assume start/end = offsets). So IDK; I am going to try reindexing again. Maybe I neglected to reload the config before I indexed last time.
-Mike

On 05/02/2014 09:34 AM, Michael Sokolov wrote:
I've been wanting to try out the PostingsHighlighter, so I added storeOffsetsWithPositions to my field definition, enabled the highlighter in solrconfig.xml, reindexed and tried it out. When I issue a query I'm getting this error:
field 'text' was indexed without offsets, cannot highlight

java.lang.IllegalArgumentException: field 'text' was indexed without offsets, cannot highlight
        at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightDoc(PostingsHighlighter.java:545)
        at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightField(PostingsHighlighter.java:467)
        at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFieldsAsObjects(PostingsHighlighter.java:392)
        at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFields(PostingsHighlighter.java:293)

I've been trying to figure out why the field wouldn't have offsets indexed, but I just can't see it. Is there something in the analysis chain that could be stripping out offsets?
This is the field definition:
<field name="text" type="text_en" indexed="true" stored="true" multiValued="false" termVectors="true" termPositions="true" termOffsets="true" storeOffsetsWithPositions="true" />
(Yes I know PH doesn't require term vectors; I'm keeping them around for now while I experiment)
    <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">

        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>

        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="1" protected="protwords.txt"/>
        
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" expand="true" ignoreCase="true"/>
        <filter class="solr.HunspellStemFilterFactory" dictionary="en_US.dic" affix="en_US.aff" ignoreCase="true"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>

        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" protected="protwords.txt"/>
        <filter class="solr.HunspellStemFilterFactory" dictionary="en_US.dic" affix="en_US.aff" ignoreCase="true"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
