Thanks for the answer. I examined the ICUFoldingFilterFactory, but it seems to me that it can't be customized the way I need. We have some special foldings, e.g. ä->ae. With the CharFilter, I can add such a rule to the file referenced by mapping="mapping-FoldToASCII.txt". There seems to be nothing like this mapping file for the ICUFoldingFilter? Exclusion is not enough ....
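
For illustration, a minimal sketch of the kind of setup described above. The field type name "text_folded", the tokenizer and the lowercase filter are assumptions chosen for the example, not taken from the actual schema; only the MappingCharFilterFactory and its mapping file come from this thread.

```xml
<!-- Hypothetical field type; the name and the tokenizer/filter choice are illustrative only. -->
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- CharFilters run on the raw text before tokenization, wherever they are listed. -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- A custom rule in mapping-FoldToASCII.txt is a single line of the form:
     "ä" => "ae"
-->
```

The custom folding lives in the mapping file, not in the schema, which is the part that seems to have no counterpart in the ICUFoldingFilter.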
>>> Shawn Heisey <apa...@elyograg.org> 7/18/2019 3:08 PM >>>
On 7/18/2019 3:01 AM, Doris Peter wrote:
> So, the mappingCharFilter seems to be executed at first, no matter which
> position it has in the configuration?

CharFilters are always executed first. Then one Tokenizer, then Filters. This will always be the case, even if you order the config so that the Tokenizer and one or more Filters are listed before CharFilter entries. It's one of the quirks of analysis definitions.

The fix for this would be to see if there is a regular Filter that does what the CharFilter you're using does and use that filter instead. If it were me, I would likely use ICUFoldingFilterFactory rather than MappingCharFilterFactory. The ICU analysis components do require installing contrib jars into Solr.

https://lucene.apache.org/solr/guide/8_1/filter-descriptions.html#icu-folding-filter

Thanks,
Shawn