Hi,
In my schema.xml, the text field type includes this char filter:
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping-ISOLatin1Accent.txt"/>
(See below for the complete fieldType definition.) This correctly folds
all accented characters, umlauts, etc. to their unaccented form.
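To make the folding concrete, here is a rough Python sketch of the kind of character mapping I mean. It is only an illustration: the small table is a hypothetical stand-in for the full mapping-ISOLatin1Accent.txt, and the function name is made up; this is not how Solr implements the char filter.

```python
# Illustrative only: a tiny stand-in for mapping-ISOLatin1Accent.txt,
# not Solr's MappingCharFilter implementation.
FOLD = str.maketrans({
    "Ä": "A", "Ö": "O", "Ü": "U",
    "ä": "a", "ö": "o", "ü": "u",
    "é": "e", "è": "e",
})


def fold_accents(text: str) -> str:
    """Replace each mapped accented character with its unaccented form."""
    return text.translate(FOLD)


if __name__ == "__main__":
    # The example word from this post.
    print(fold_accents("Ärzte"))
```

Characters that are not in the table pass through unchanged, which matches what I see for plain ASCII words.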
The problem: when I search for any word containing such a character
(e.g. "Ärzte", which becomes "Arzte" internally), highlighting doesn't
work and no highlighted strings are returned. No error message is issued
and, as far as I can tell, no exception occurs.
If I search for e.g. "?rzte" (without the quotes), highlighting works fine
again for the match "Ärzte". If I comment out the
solr.MappingCharFilterFactory in the text type, highlighting also works
perfectly.
The problem exists in every version I tested: 1.4, 3.5, and 3.6.
Googling didn't turn up anything useful. Does anyone have any clues or
suggestions? Any help would be much appreciated!
Cheers,
remus
-------
Complete fieldType definition:
<fieldType name="text" class="solr.TextField" indexed="true"
           stored="true" multiValued="true" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <charFilter class="solr.HTMLStripCharFilterFactory" />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="0"
            splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>