[ https://issues.apache.org/jira/browse/LUCENE-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir resolved LUCENE-8129.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 7.3
                   trunk

Thanks [~emaijala]!

> Support for defining a Unicode set filter when using ICUFoldingFilter
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-8129
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8129
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Ere Maijala
>            Priority: Minor
>              Labels: ICUFoldingFilterFactory, patch-available, patch-with-test
>             Fix For: trunk, 7.3
>
>         Attachments: LUCENE-8129.patch, LUCENE-8129.patch
>
>
> ICUNormalizer2FilterFactory supports a filter attribute for defining a
> Unicode set filter, but ICUFoldingFilterFactory does not. Such a filter
> makes it possible, for example, to exclude a set of characters from being
> folded. For Finnish and Swedish, the filter could be defined like this:
>           <filter class="solr.ICUFoldingFilterFactory" filter="[^åäöÅÄÖ]"/>
> Note: An additional MappingCharFilterFactory or solr.LowerCaseFilterFactory
> would be needed for lowercasing the characters excluded from folding. This is
> similar to what Elasticsearch provides (see
> https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-folding.html).
> I'll add a patch that does this, similar to ICUNormalizer2FilterFactory.
> Applies at least to master and branch_7x.
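
To illustrate what excluding characters from folding means, here is a minimal, self-contained sketch in plain Java. It is not the ICU folding algorithm and does not use the Lucene or ICU4J APIs. It is a simplified stand-in built on `java.text.Normalizer` that strips combining marks and lowercases every character except those in a protected set. The class name `FoldingWithExclusions`, the `KEEP` set, and the `fold` method are hypothetical names chosen for this example. The protected set mirrors the `[^åäöÅÄÖ]` filter from the issue description.

```java
import java.text.Normalizer;
import java.util.Set;

public class FoldingWithExclusions {
    // Hypothetical protected set: å ä ö Å Ä Ö, written as Unicode escapes
    // so the source compiles regardless of file encoding.
    static final Set<Integer> KEEP = Set.of(
            (int) '\u00E5', (int) '\u00E4', (int) '\u00F6',
            (int) '\u00C5', (int) '\u00C4', (int) '\u00D6');

    // Simplified stand-in for folding: drop diacritics and lowercase,
    // but pass protected characters through untouched.
    static String fold(String input) {
        StringBuilder out = new StringBuilder();
        // Compose first so each protected letter is a single code point.
        String composed = Normalizer.normalize(input, Normalizer.Form.NFC);
        composed.codePoints().forEach(cp -> {
            if (KEEP.contains(cp)) {
                // Excluded from folding: case and diacritic are both preserved.
                out.appendCodePoint(cp);
            } else {
                // Decompose, remove non-spacing marks, then lowercase the rest.
                String decomposed = Normalizer.normalize(
                        new String(Character.toChars(cp)), Normalizer.Form.NFD);
                decomposed.codePoints()
                        .filter(c -> Character.getType(c) != Character.NON_SPACING_MARK)
                        .forEach(c -> out.appendCodePoint(Character.toLowerCase(c)));
            }
        });
        return out.toString();
    }
}
```

With this sketch, `fold("\u00C5ngstr\u00F6m Caf\u00E9")` leaves the protected `Å` and `ö` intact while folding `C` to `c` and `é` to `e`. The uppercase `Å` is not lowercased, which is why the note in the issue recommends a separate lowercasing step for the excluded characters.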