[jira] [Commented] (JENA-1058) add ASCIIFoldingLowerCaseKeywordAnalyzer to jena-text

Claude Warren (JIRA) Thu, 29 Oct 2015 08:39:41 -0700

    [ 
https://issues.apache.org/jira/browse/JENA-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980650#comment-14980650
 ]


Claude Warren commented on JENA-1058:
-------------------------------------

Wouldn't it make more sense to have a set of filters that you can string 
together to get the filter you want?  For example, a "LowerCaseKeywordAnalyzer" 
and an "ASCIIFoldingFilter".  Then you could call the ASCIIFoldingFilter and 
pass its output to LowerCaseKeywordAnalyzer.  I think this would make the 
ecosystem smaller but able to handle more cases.  However, I am not familiar 
with the jena-text framework.
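
To make the chaining idea concrete, here is a minimal sketch in plain Java. It 
does not use Lucene or jena-text classes: the accent folding is a stand-in built 
on java.text.Normalizer (NFD decomposition, then dropping combining marks), not 
Lucene's ASCIIFoldingFilter, and the names FilterChainSketch, FOLD_ACCENTS, 
LOWER_CASE, chain and NORMALIZE are hypothetical, chosen only for illustration.

```java
import java.text.Normalizer;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class FilterChainSketch {
    // Stand-in for accent folding (NOT Lucene's ASCIIFoldingFilter):
    // decompose to NFD, then remove the combining marks.
    static final UnaryOperator<String> FOLD_ACCENTS = s ->
            Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");

    // Locale-independent lower-casing.
    static final UnaryOperator<String> LOWER_CASE = s -> s.toLowerCase(Locale.ROOT);

    // Compose any number of small filters into one, applied in list order.
    static UnaryOperator<String> chain(List<UnaryOperator<String>> filters) {
        return s -> {
            String out = s;
            for (UnaryOperator<String> f : filters) {
                out = f.apply(out);
            }
            return out;
        };
    }

    // The combination discussed in this issue: fold accents, then lower-case.
    static final UnaryOperator<String> NORMALIZE =
            chain(List.of(FOLD_ACCENTS, LOWER_CASE));

    // Two strings "match" when their normalized forms are equal.
    static boolean matches(String a, String b) {
        return NORMALIZE.apply(a).equals(NORMALIZE.apply(b));
    }

    public static void main(String[] args) {
        // "D\u00e9j\u00e0 Vu" is "Déjà Vu" written with Unicode escapes.
        System.out.println(NORMALIZE.apply("D\u00e9j\u00e0 Vu"));
        System.out.println(matches("deja vu", "D\u00e9j\u00e0 Vu"));
    }
}
```

With this shape, adding another normalization step means adding one more small 
filter to the list rather than writing a new combined class for every 
combination.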

> add ASCIIFoldingLowerCaseKeywordAnalyzer to jena-text
> -----------------------------------------------------
>
>                 Key: JENA-1058
>                 URL: https://issues.apache.org/jira/browse/JENA-1058
>             Project: Apache Jena
>          Issue Type: New Feature
>          Components: Text
>            Reporter: Osma Suominen
>            Assignee: Osma Suominen
>
> I'd like to have an Analyzer for jena-text which is otherwise like the 
> LowerCaseKeywordAnalyzer that I implemented before, but also includes the 
> ASCIIFoldingFilter from Lucene. This means that the comparison will ignore 
> accents, so that for example "deja vu" will match "déjà vu".
> For some background on why I need this, see 
> https://github.com/NatLibFi/Skosmos/issues/313
> I already have an implementation of this ready, will make a PR shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
