Stefan, this is a duplicate of http://lucene.472066.n3.nabble.com/Highlighting-Problem-td2746022.html, no? If so, please stick with one of them.

Regards
Stefan

On Tue, Mar 29, 2011 at 10:30 AM, Stefan Mueller <solru...@yahoo.com> wrote:
> dear solr users,
>
> my data looks like this:
>
> j]s(dh)fjk [hf]sjkadh asdj(kfh) [skdjfh aslkfjhalwe uigfrhj bsd bsdfga sjfg
> asdlfj.
>
> if I want to query for the first "word", the following queries must match:
>
> j]s(dh)fjk
> j]s(dh)fjk
> j]sdhfjk
> jsdhfjk
> dhf
>
> So the matching should ignore some characters like ( ) [ ] and should match
> substrings.
>
> So far I have the following field definition in the schema.xml:
>
> <fieldType name="text_ngram" class="solr.TextField"
>     positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="[\[\]\(\)]"
>         replacement="" replace="all" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>         mapping="mapping-ISOLatin1Accent.txt"/>
>     <filter class="solr.NGramFilterFactory" minGramSize="2"
>         maxGramSize="2" />
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="[\[\]\(\)]"
>         replacement="" replace="all" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>         mapping="mapping-ISOLatin1Accent.txt"/>
>     <filter class="solr.NGramFilterFactory" minGramSize="2"
>         maxGramSize="2" />
>   </analyzer>
> </fieldType>
>
> With this definition the matching works as planned. But not for highlighting,
> there the special characters seem to move the <em> tags to wrong positions,
> for example searching for "jsdhfjk" misses the last 3 letters of the words (
> = 3 special characters from PatternReplaceFilterFactory)
>
> <em>j]s(dh)</em>fjk
>
> Solr has so many bells and whistles - what must I do to get a correctly
> working highlighting?
>
> kind regards,
> Stefan
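
For anyone following the thread, here is a minimal Python sketch of why the <em> tags can drift in the way described above. It is a toy model and not Solr's highlighter: the function names `naive_highlight` and `offset_aware_highlight` are hypothetical, and the assumption is simply that offsets computed against the stripped token get applied to the original, unstripped text.

```python
import re

# Same character class as the PatternReplaceFilterFactory pattern in the schema.
STRIP = re.compile(r"[\[\]()]")


def naive_highlight(text, query):
    """Toy model (assumption): match inside the stripped token, then apply
    those offsets directly to the original text."""
    stripped = STRIP.sub("", text)
    start = stripped.find(query)
    if not query or start < 0:
        return text
    end = start + len(query)
    return text[:start] + "<em>" + text[start:end] + "</em>" + text[end:]


def offset_aware_highlight(text, query):
    """Toy model (assumption): remember where each surviving character sat in
    the original text and map the match back through that table."""
    kept = [(i, ch) for i, ch in enumerate(text) if not STRIP.match(ch)]
    stripped = "".join(ch for _, ch in kept)
    pos = stripped.find(query)
    if not query or pos < 0:
        return text
    start = kept[pos][0]
    end = kept[pos + len(query) - 1][0] + 1
    return text[:start] + "<em>" + text[start:end] + "</em>" + text[end:]


print(naive_highlight("j]s(dh)fjk", "jsdhfjk"))
print(offset_aware_highlight("j]s(dh)fjk", "jsdhfjk"))
```

The first call reproduces the reported output, with the last three letters left outside the tags; the second wraps the whole word. In Solr terms, a char filter such as solr.PatternReplaceCharFilterFactory does its replacement before tokenizing and records offset corrections, so it may be worth trying in place of the token filter. I have not verified that against this schema or together with the n-gram filter.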