Is it possible to share the sequence of steps, along with the data, that causes this issue?

On Wed, Nov 5, 2014 at 4:51 PM, Alan Woodward <a...@flax.co.uk> wrote:
> Hi Min,
>
> Do you have the specific bit of text that caused this exception to be
> thrown?
>
> Alan Woodward
> www.flax.co.uk
>
>
> On 4 Nov 2014, at 23:15, Min L wrote:
>
> > Hi All:
> >
> > I am using Solr 4.9.1 and trying to use PostingsSolrHighlighter, but I
> > got errors during indexing. I thought LUCENE-5111 had fixed the issues
> > with WordDelimiterFilter. The error is as below:
> >
> > Caused by: java.lang.IllegalArgumentException: startOffset must be
> > non-negative, and endOffset must be >= startOffset, and offsets must
> > not go backwards startOffset=31,endOffset=44,lastStartOffset=37 for
> > field 'description_texts'
> >     at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:630)
> >     at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:342)
> >     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:301)
> >     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:241)
> >     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:451)
> >     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1539)
> >     at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:240)
> >     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
> >
> >
> > My schema.xml looks like below:
> >
> > <dynamicField name="*_texts" stored="true" type="text" multiValued="true"
> >               indexed="true" storeOffsetsWithPositions="true"/>
> >
> > <fieldType name="text" class="solr.TextField" omitNorms="false">
> >   <analyzer type="index">
> >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.StemmerOverrideFilterFactory"
> >             dictionary="stemdict_en.txt" />
> >     <filter class="solr.PatternReplaceFilterFactory"
> >             pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
> >     <filter class="solr.KStemFilterFactory"/>
> >     <filter class="solr.StopFilterFactory" words="stopwords_english.txt"
> >             ignoreCase="true" enablePositionIncrements="true" />
> >     <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
> >             splitOnNumerics="0" catenateWords="1" />
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.StopFilterFactory" words="stopwords_english.txt"
> >             ignoreCase="true" enablePositionIncrements="true" />
> >     <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
> >             splitOnNumerics="0" catenateWords="1" />
> >     <filter class="solr.StemmerOverrideFilterFactory"
> >             dictionary="stemdict_en.txt" />
> >     <filter class="solr.KStemFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> >
> > Any help is appreciated.
> >
> > Thanks.
> >
> > Min
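
For reference, the condition named in the exception message can be sketched roughly as below. This is a simplified Python illustration of the rule as the message states it, not Lucene's actual code. The function name find_offset_violation and the first three offset pairs are hypothetical; only startOffset=31, endOffset=44 and lastStartOffset=37 come from the reported error.

```python
def find_offset_violation(offsets):
    """Return the index of the first (start, end) pair that breaks the rule
    quoted in the exception message, or None if every pair is acceptable."""
    last_start = 0
    for i, (start, end) in enumerate(offsets):
        # The three conditions from the message: start is non-negative,
        # end is not before start, and start does not go backwards.
        if start < 0 or end < start or start < last_start:
            return i
        last_start = start
    return None


# Hypothetical token stream: the first three pairs are invented for
# illustration; the last pair (31, 44) following a start of 37 mirrors
# the values in the reported error.
tokens = [(0, 5), (6, 12), (37, 44), (31, 44)]
print(find_offset_violation(tokens))  # prints 3
```

Running a check like this over the offsets emitted for a failing document may help narrow down which token in 'description_texts' steps backwards.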