Osma Suominen created JENA-776:
----------------------------------

             Summary: LowerCaseKeywordAnalyzer for jena-text
                 Key: JENA-776
                 URL: https://issues.apache.org/jira/browse/JENA-776
             Project: Apache Jena
          Issue Type: Improvement
          Components: Text
            Reporter: Osma Suominen


I liked the option to specify an Analyzer for jena-text, as implemented in 
JENA-654. But I'd like to use an analyzer that is otherwise like 
KeywordAnalyzer but case-insensitive, for use in an autocomplete/typeahead UI 
widget. Lucene doesn't include such an analyzer, but there are several 
implementations of the same idea, e.g. in neo4j [1] and stargate [2].

I created my own implementation of such an analyzer and added code to use it 
from the assembler. Patch attached.
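
To illustrate the intended behaviour, here is a minimal sketch in plain Java. It is not the attached patch and does not use the Lucene API. The class and method names (LowerCaseKeywordSketch, analyze, complete) are hypothetical, and lower-casing with Locale.ROOT is an assumption standing in for whatever the real analyzer's filter does. It only shows the two properties that matter for typeahead: the whole field value stays one token, and matching ignores case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in, not Lucene code: models what a lower-casing
// keyword analyzer does to a field value.
final class LowerCaseKeywordSketch {

    // Keyword analysis: the entire input becomes exactly one token
    // (no splitting on whitespace or punctuation); the only change
    // applied is lower-casing.
    static String analyze(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Typeahead lookup: the typed prefix and every stored label go through
    // the same analysis, so "new" matches "New York" but not "York".
    static List<String> complete(String typed, List<String> labels) {
        String prefix = analyze(typed);
        List<String> hits = new ArrayList<>();
        for (String label : labels) {
            if (analyze(label).startsWith(prefix)) {
                hits.add(label); // return the label in its original case
            }
        }
        return hits;
    }
}
```

With the labels "New York", "new zealand" and "York", completing "new" returns the first two and completing "york" returns only "York", because the label is indexed as a single token rather than split into words.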

The analyzer lives in a new package, org.apache.jena.query.text.analyzer, in 
case other analyzers for jena-text appear in the future. If you don't like 
the new package, the class can of course be moved to org.apache.jena.query.text.

I also added a test for case-insensitivity. To avoid lots of duplicate 
boilerplate code, I slightly modified and subclassed the existing test for 
KeywordAnalyzer.

I'd love to see this in the next version of jena-text and Fuseki. Of course 
I'll rework the patch if necessary. I can also tweak the web documentation to 
mention this analyzer.

-Osma


[1] 
https://github.com/apatry/neo4j-lucene4-index/blob/master/src/main/java/org/neo4j/index/impl/lucene/LowerCaseKeywordAnalyzer.java

[2] 
https://github.com/tuplejump/stargate-core/blob/master/src/main/java/com/tuplejump/stargate/lucene/CaseInsensitiveKeywordAnalyzer.java




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
