[jira] [Updated] (LUCENE-5369) Add an UpperCaseFilter

Ryan McKinley (JIRA) Fri, 13 Dec 2013 13:56:49 -0800

     [ https://issues.apache.org/jira/browse/LUCENE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Ryan McKinley updated LUCENE-5369:
----------------------------------

    Attachment: LUCENE-5369-uppercase-filter.patch

Here is a patch that adds an UpperCaseFilter.

There are a few others out there:
http://svn.apache.org/repos/asf/uima/addons/trunk/Lucas/src/main/java/org/apache/uima/lucas/indexer/analysis/UpperCaseFilter.java

https://github.ugent.be/Universiteitsbibliotheek/lludss-solr-java/blob/master/src/main/java/lludss/solr/analysis/UpperCaseFilter.java
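
For anyone skimming, here is a rough sketch of the per-token work a filter like this does. It is an illustration only, not the contents of the attached patch: the class and method names are hypothetical, and it leaves out the Lucene TokenFilter plumbing entirely, operating on a plain char buffer instead.

```java
// Hypothetical helper, not part of Lucene or of the attached patch.
final class UpperCaseSketch {

    /**
     * Upper-cases buffer[offset, offset + length) in place.
     * Works one code point at a time so surrogate pairs are not split.
     */
    static void toUpperCase(char[] buffer, int offset, int length) {
        int end = offset + length;
        for (int i = offset; i < end; ) {
            // Read a full code point (one or two chars), bounded by end.
            int codePoint = Character.codePointAt(buffer, i, end);
            int upper = Character.toUpperCase(codePoint);
            // Write it back; toChars returns the number of chars written.
            i += Character.toChars(upper, buffer, i);
        }
    }
}
```

Note this uses the simple one-to-one case mapping, so the token length does not change; full mappings that expand a character into several are out of scope for the sketch.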

--------

Given that we would want to steer people toward LowerCase by default, perhaps 
this should live in a different package.
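
One concrete case of the mapping problem mentioned in the issue description is Greek sigma: both the regular and the word-final lowercase forms share a single uppercase letter. The snippet below is just a small demonstration using the JDK's simple case mappings; the class and method names are made up for illustration.

```java
// Hypothetical demo class; uses only java.lang.Character.
final class SigmaCaseDemo {
    static final char SMALL_SIGMA = '\u03C3';   // Greek small letter sigma
    static final char FINAL_SIGMA = '\u03C2';   // Greek small letter final sigma
    static final char CAPITAL_SIGMA = '\u03A3'; // Greek capital letter sigma

    /** True if both characters map to the same uppercase character. */
    static boolean sameUpperCase(char a, char b) {
        return Character.toUpperCase(a) == Character.toUpperCase(b);
    }

    /** True if upper-casing and then lower-casing returns the original character. */
    static boolean roundTrips(char c) {
        return Character.toLowerCase(Character.toUpperCase(c)) == c;
    }
}
```

Here the two lowercase sigmas collapse to one uppercase letter, and the final form does not survive a round trip through uppercase.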

I'll wait for a +1 from someone who knows more about this than I do :)




> Add an UpperCaseFilter
> ----------------------
>
>                 Key: LUCENE-5369
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5369
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>            Priority: Minor
>         Attachments: LUCENE-5369-uppercase-filter.patch
>
>
> We should offer a standard way to force upper-case tokens.  I understand that 
> lower-case is safer for general search quality because some upper-case 
> characters can represent multiple lower-case ones.
> However, having upper-case tokens is often nice for faceting (consider 
> normalizing to standard acronyms).



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
