[jira] [Updated] (SOLR-11811) Support for defining a Unicode set filter when using ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ere Maijala updated SOLR-11811:
---
    Labels: ICUFoldingFilterFactory patch-available patch-with-test  (was: ICUFoldingFilterFactory)

> Support for defining a Unicode set filter when using ICUFoldingFilter
> -
>
>                 Key: SOLR-11811
>                 URL: https://issues.apache.org/jira/browse/SOLR-11811
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Ere Maijala
>            Priority: Minor
>              Labels: ICUFoldingFilterFactory, patch-available, patch-with-test
>         Attachments: SOLR-11811.patch
>
>
> While ICUNormalizer2FilterFactory supports a filter attribute to define a
> Unicode set filter, ICUFoldingFilterFactory does not support it. A filter
> allows one to e.g. exclude a set of characters from being folded. E.g. for
> Finnish and Swedish the filter could be defined like this:
>
> Note: An additional MappingCharFilterFactory or solr.LowerCaseFilterFactory
> would be needed for lowercasing the characters excluded from folding. This is
> similar to what ElasticSearch provides (see
> https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-folding.html).
> I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
> Applies at least to master and branch_7x.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
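
The effect the issue asks for (folding everything except a chosen set of characters, then lowercasing the excluded characters in a separate step) can be sketched in plain Python. This is only an illustration, not the patch and not ICU: the folding here is a rough stand-in built on NFKD decomposition from the standard unicodedata module, and the protected set "åäöÅÄÖ" and the function names are assumptions made for the example, since the filter definition itself did not survive in the notification text.

```python
import unicodedata

# Assumption: the characters to exclude from folding for Finnish and Swedish.
PROTECTED = frozenset("åäöÅÄÖ")


def fold_char(ch: str) -> str:
    """Rough stand-in for ICU folding: decompose the character,
    drop combining marks, and lowercase what is left."""
    decomposed = unicodedata.normalize("NFKD", ch)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower()


def fold_with_filter(text: str, protected: frozenset = PROTECTED) -> str:
    """Fold every character except those in `protected`.

    The text is first composed (NFC) so that protected letters are single
    code points and the membership check works as intended.
    """
    composed = unicodedata.normalize("NFC", text)
    return "".join(ch if ch in protected else fold_char(ch) for ch in composed)


def lowercase_step(text: str) -> str:
    """Separate lowercasing pass, playing the role of the additional
    lowercasing filter mentioned in the note."""
    return text.lower()


if __name__ == "__main__":
    sample = "Ääkkönen café"
    folded = fold_with_filter(sample)
    print(folded)                  # protected letters keep their diacritics and case
    print(lowercase_step(folded))  # protected letters are lowercased afterwards
    print(fold_with_filter(sample, frozenset()))  # no filter: everything is folded
```

The sketch also shows why the note in the description matters: characters excluded from folding skip the case folding as well, so an uppercase "Ä" stays uppercase until a separate lowercasing step runs.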
[jira] [Updated] (SOLR-11811) Support for defining a Unicode set filter when using ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ere Maijala updated SOLR-11811:
---
    Description:
While ICUNormalizer2FilterFactory supports a filter attribute to define a Unicode set filter, ICUFoldingFilterFactory does not support it. A filter allows one to e.g. exclude a set of characters from being folded. E.g. for Finnish and Swedish the filter could be defined like this:

Note: An additional MappingCharFilterFactory or solr.LowerCaseFilterFactory would be needed for lowercasing the characters excluded from folding. This is similar to what ElasticSearch provides (see https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-folding.html).

I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
Applies at least to master and branch_7x.

  was:
While ICUNormalizer2FilterFactory supports a filter attribute to define a Unicode set filter, ICUFoldingFilterFactory does not support it. A filter allows one to e.g. exclude a set of characters from being folded. E.g. for Finnish and Swedish the filter could be defined like this:

(Note: An additional MappingCharFilterFactory for lowercasing the characters excluded from folding would be needed for perfect results.)

I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
Applies at least to master and branch_7x.

> Support for defining a Unicode set filter when using ICUFoldingFilter
> -
>
>                 Key: SOLR-11811
>                 URL: https://issues.apache.org/jira/browse/SOLR-11811
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Ere Maijala
>            Priority: Minor
>              Labels: ICUFoldingFilterFactory
>         Attachments: SOLR-11811.patch
>
>
> While ICUNormalizer2FilterFactory supports a filter attribute to define a
> Unicode set filter, ICUFoldingFilterFactory does not support it. A filter
> allows one to e.g. exclude a set of characters from being folded. E.g. for
> Finnish and Swedish the filter could be defined like this:
>
> Note: An additional MappingCharFilterFactory or solr.LowerCaseFilterFactory
> would be needed for lowercasing the characters excluded from folding. This is
> similar to what ElasticSearch provides (see
> https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-folding.html).
> I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
> Applies at least to master and branch_7x.
[jira] [Updated] (SOLR-11811) Support for defining a Unicode set filter when using ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ere Maijala updated SOLR-11811:
---
    Attachment: SOLR-11811.patch

Added a patch that includes a test for the filter parameter.

> Support for defining a Unicode set filter when using ICUFoldingFilter
> -
>
>                 Key: SOLR-11811
>                 URL: https://issues.apache.org/jira/browse/SOLR-11811
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Ere Maijala
>            Priority: Minor
>              Labels: ICUFoldingFilterFactory
>         Attachments: SOLR-11811.patch
>
>
> While ICUNormalizer2FilterFactory supports a filter attribute to define a
> Unicode set filter, ICUFoldingFilterFactory does not support it. A filter
> allows one to e.g. exclude a set of characters from being folded. E.g. for
> Finnish and Swedish the filter could be defined like this:
>
> (Note: An additional MappingCharFilterFactory for lowercasing the characters
> excluded from folding would be needed for perfect results.)
> I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
> Applies at least to master and branch_7x.
[jira] [Updated] (SOLR-11811) Support for defining a Unicode set filter when using ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ere Maijala updated SOLR-11811:
---
    Attachment:     (was: SOLR-11811.patch)

> Support for defining a Unicode set filter when using ICUFoldingFilter
> -
>
>                 Key: SOLR-11811
>                 URL: https://issues.apache.org/jira/browse/SOLR-11811
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Ere Maijala
>            Priority: Minor
>              Labels: ICUFoldingFilterFactory
>         Attachments: SOLR-11811.patch
>
>
> While ICUNormalizer2FilterFactory supports a filter attribute to define a
> Unicode set filter, ICUFoldingFilterFactory does not support it. A filter
> allows one to e.g. exclude a set of characters from being folded. E.g. for
> Finnish and Swedish the filter could be defined like this:
>
> (Note: An additional MappingCharFilterFactory for lowercasing the characters
> excluded from folding would be needed for perfect results.)
> I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
> Applies at least to master and branch_7x.
[jira] [Updated] (SOLR-11811) Support for defining a Unicode set filter when using ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ere Maijala updated SOLR-11811:
---
    Description:
While ICUNormalizer2FilterFactory supports a filter attribute to define a Unicode set filter, ICUFoldingFilterFactory does not support it. A filter allows one to e.g. exclude a set of characters from being folded. E.g. for Finnish and Swedish the filter could be defined like this:

(Note: An additional MappingCharFilterFactory for lowercasing the characters excluded from folding would be needed for perfect results.)

I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
Applies at least to master and branch_7x.

  was:
ne a Unicode set filter, ICUFoldingFilterFactory does not support it. A filter allows one to e.g. exclude a set of characters from being folded. E.g. for Finnish and Swedish the filter could be defined like this:

(Note: An additional MappingCharFilterFactory for lowercasing the characters excluded from folding would be needed for perfect results.)

I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
Applies at least to master and branch_7x.

> Support for defining a Unicode set filter when using ICUFoldingFilter
> -
>
>                 Key: SOLR-11811
>                 URL: https://issues.apache.org/jira/browse/SOLR-11811
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Ere Maijala
>            Priority: Minor
>              Labels: ICUFoldingFilterFactory
>         Attachments: SOLR-11811.patch
>
>
> While ICUNormalizer2FilterFactory supports a filter attribute to define a
> Unicode set filter, ICUFoldingFilterFactory does not support it. A filter
> allows one to e.g. exclude a set of characters from being folded. E.g. for
> Finnish and Swedish the filter could be defined like this:
>
> (Note: An additional MappingCharFilterFactory for lowercasing the characters
> excluded from folding would be needed for perfect results.)
> I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
> Applies at least to master and branch_7x.
[jira] [Updated] (SOLR-11811) Support for defining a Unicode set filter when using ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ere Maijala updated SOLR-11811:
---
         Flags: Patch
    Attachment: SOLR-11811.patch

Attached a patch.

> Support for defining a Unicode set filter when using ICUFoldingFilter
> -
>
>                 Key: SOLR-11811
>                 URL: https://issues.apache.org/jira/browse/SOLR-11811
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Ere Maijala
>            Priority: Minor
>              Labels: ICUFoldingFilterFactory
>         Attachments: SOLR-11811.patch
>
>
> ne a Unicode set filter, ICUFoldingFilterFactory does not support it. A
> filter allows one to e.g. exclude a set of characters from being folded. E.g.
> for Finnish and Swedish the filter could be defined like this:
>
> (Note: An additional MappingCharFilterFactory for lowercasing the characters
> excluded from folding would be needed for perfect results.)
> I'll add a patch that does this similar to ICUNormalizer2FilterFactory.
> Applies at least to master and branch_7x.