[ 
https://issues.apache.org/jira/browse/LUCENE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974919#comment-13974919
 ] 

Robert Muir commented on LUCENE-5620:
-------------------------------------

I don't like the keyword + repeat stuff: that mechanism was really geared at 
blocking stemming, and suddenly it applies to a new filter and breaks existing 
analysis chains.

It also still means modifying each and every filter for this stuff. I don't 
like that; it makes no sense when they can be completely unaware of it. 

I seriously can't imagine a situation where LowerCaseFilter is doing anything 
more than lowercasing, sorry.

Just add a preserve-snapshot/restore pair. You put the snapshot somewhere in 
your chain (e.g. before the lowercase filter); it saves the term text and maybe 
the position increment into an attribute. Then the restore (e.g. after the 
lowercase filter) checks whether the position increment is the same (i.e. 
nothing was deleted in between) and whether the term text now differs. If it 
differs, it adds the original text as a synonym. 
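
To make the idea concrete, here is a rough sketch of the snapshot/restore flow
in plain Java. It is only a model under stated assumptions: it does not use the
Lucene TokenStream API, and every name in it (SnapshotRestoreSketch, Token,
snapshot, lowercase, restore, analyze) is hypothetical. The "attribute" is
modeled as two saved fields that travel with each token, and "add as a synonym"
is modeled as emitting the original text with a position increment of 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java model of the proposal; NOT the Lucene API. All names are hypothetical.
final class SnapshotRestoreSketch {

    /** Stand-in for a token: term text, position increment, and the saved snapshot. */
    static final class Token {
        final String term;
        final int posInc;
        final String savedTerm; // null until the snapshot stage has run
        final int savedPosInc;

        Token(String term, int posInc) {
            this(term, posInc, null, 0);
        }

        Token(String term, int posInc, String savedTerm, int savedPosInc) {
            this.term = term;
            this.posInc = posInc;
            this.savedTerm = savedTerm;
            this.savedPosInc = savedPosInc;
        }

        @Override
        public String toString() {
            return term + "(+" + posInc + ")";
        }
    }

    /** Snapshot stage: remember the term text and position increment as they are now. */
    static List<Token> snapshot(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.term, t.posInc, t.term, t.posInc));
        }
        return out;
    }

    /** The filter in between: only lowercases, unaware of the snapshot. */
    static List<Token> lowercase(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.term.toLowerCase(Locale.ROOT), t.posInc, t.savedTerm, t.savedPosInc));
        }
        return out;
    }

    /** Restore stage: if the position is unchanged but the text differs, add the original. */
    static List<Token> restore(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.term, t.posInc));
            boolean samePosition = t.savedTerm != null && t.savedPosInc == t.posInc;
            if (samePosition && !t.savedTerm.equals(t.term)) {
                out.add(new Token(t.savedTerm, 0)); // same position, i.e. a synonym
            }
        }
        return out;
    }

    /** Runs snapshot -> lowercase -> restore over the given terms (each with posInc 1). */
    static List<String> analyze(String... terms) {
        List<Token> tokens = new ArrayList<>();
        for (String s : terms) {
            tokens.add(new Token(s, 1));
        }
        List<String> rendered = new ArrayList<>();
        for (Token t : restore(lowercase(snapshot(tokens)))) {
            rendered.add(t.toString());
        }
        return rendered;
    }
}
```

In this model, analyze("Quick", "fox") yields quick(+1), Quick(+0), fox(+1):
only the token whose text changed gets its original stacked at the same
position, and the lowercase stage itself needs no changes.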

> LowerCaseFilter.preserveOriginal
> --------------------------------
>
>                 Key: LUCENE-5620
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5620
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Mike Sokolov
>         Attachments: LUCENE-5620.patch
>
>
> Following closely the model of LUCENE-5437 (which worked on 
> ASCIIFoldingFilter), this patch adds the ability to preserve the original 
> token to LowerCaseFilter.  This is useful if you want an all-lowercase search 
> term to match without regard to case, while search terms with uppercase 
> letters match in a case-sensitive manner. 


