[jira] [Commented] (JENA-776) LowerCaseKeywordAnalyzer for jena-text

Osma Suominen (JIRA) Wed, 03 Sep 2014 07:40:40 -0700

    [ https://issues.apache.org/jira/browse/JENA-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119800#comment-14119800 ]


Osma Suominen commented on JENA-776:
------------------------------------

Documentation update for 
http://jena.apache.org/documentation/query/text-query.html#configuring-an-analyzer

Before:
"Other analyzer types that may be specified are SimpleAnalyzer and 
KeywordAnalyzer, neither of which has any configuration parameters. See the 
Lucene documentation for details of what these analyzers do."

After:
"Other analyzer types that may be specified are SimpleAnalyzer and 
KeywordAnalyzer, neither of which has any configuration parameters. See the 
Lucene documentation for details of what these analyzers do. In addition, a 
LowerCaseKeywordAnalyzer is available, which is a case-insensitive version of 
KeywordAnalyzer."
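
To illustrate what "case-insensitive version of KeywordAnalyzer" means in 
practice, here is a minimal sketch in plain Java. It does not use the Lucene 
API and is not the code from the attached patch; the class and method names 
(KeywordAnalysisSketch, keywordToken, lowerCaseKeywordToken, and the two 
matches* helpers) are hypothetical and exist only for this illustration. The 
sketch assumes that a keyword-style analyzer keeps the whole input as a single 
token, and that the lower-case variant additionally lower-cases that token 
(approximated here with Locale.ROOT).

```java
import java.util.Locale;

// Hypothetical illustration only: not the class from the attached patch
// and not part of the Lucene or jena-text API.
final class KeywordAnalysisSketch {
    private KeywordAnalysisSketch() {}

    // KeywordAnalyzer-like behaviour: the entire input is kept as one
    // unmodified token (no splitting on whitespace, no case folding).
    static String keywordToken(String input) {
        return input;
    }

    // LowerCaseKeywordAnalyzer-like behaviour: still a single token,
    // but lower-cased. Locale.ROOT is an assumption made for this sketch.
    static String lowerCaseKeywordToken(String input) {
        return input.toLowerCase(Locale.ROOT);
    }

    // Exact match when the indexed value and the query both go through
    // the case-preserving keyword analysis.
    static boolean matchesCaseSensitive(String indexed, String query) {
        return keywordToken(indexed).equals(keywordToken(query));
    }

    // Exact match when both sides go through the lower-casing analysis.
    static boolean matchesCaseInsensitive(String indexed, String query) {
        return lowerCaseKeywordToken(indexed).equals(lowerCaseKeywordToken(query));
    }
}
```

With these helpers, an indexed label "Apache Jena" and a query "apache jena" 
do not match under the case-preserving variant but do match under the 
lower-casing one, which is the behaviour wanted for an autocomplete widget.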


By the way, the table of contents in 
http://jena.apache.org/documentation/query/text-query.html has an incorrect 
link to the "Configuring an Analyzer" section. The correct URL for that section 
is above (i.e. #configuring-an-analyzer).

> LowerCaseKeywordAnalyzer for jena-text
> --------------------------------------
>
>                 Key: JENA-776
>                 URL: https://issues.apache.org/jira/browse/JENA-776
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Text
>            Reporter: Osma Suominen
>         Attachments: jena-text-lowercase-keyword-analyzer.patch
>
>
> I liked the option to specify Analyzer for jena-text, as implemented in 
> JENA-654. But I'd like to use an analyzer that is otherwise like 
> KeywordAnalyzer but case-insensitive, for use in an autocomplete/typeahead UI 
> widget. Lucene doesn't include such an analyzer, but there are several 
> implementations of the same idea, e.g. in neo4j [1] and stargate [2].
>
> I created my own implementation of such an analyzer and added code to use it 
> from the assembler. Patch attached.
>
> This analyzer is now in a new package org.apache.jena.query.text.analyzer, in 
> case other analyzers for jena-text will appear in the future. If you don't 
> like the new package, the class can of course be moved to 
> org.apache.jena.query.text.
>
> I also added a test for case-insensitivity. To avoid lots of duplicate 
> boilerplate code, I slightly modified and subclassed the existing test for 
> KeywordAnalyzer.
>
> I'd love to see this in the next version of jena-text and Fuseki. Of course 
> I'll rework the patch if necessary. I can also tweak the web documentation to 
> mention this analyzer.
>
> -Osma
>
> [1] 
> https://github.com/apatry/neo4j-lucene4-index/blob/master/src/main/java/org/neo4j/index/impl/lucene/LowerCaseKeywordAnalyzer.java
> [2] 
> https://github.com/tuplejump/stargate-core/blob/master/src/main/java/com/tuplejump/stargate/lucene/CaseInsensitiveKeywordAnalyzer.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
