[ https://issues.apache.org/jira/browse/LUCENE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974919#comment-13974919 ]
Robert Muir commented on LUCENE-5620:
-------------------------------------

I don't like the keyword + repeat stuff, because it was really geared at blocking stemming; suddenly it applies to a new filter and breaks existing analysis chains.

It also still means modifying each and every filter for this stuff. I don't like that; it makes no sense when the filters can be completely unaware of it. I seriously can't imagine a situation where LowerCaseFilter is doing anything more than lowercasing, sorry.

Just add a preservesnapshot/restore. You put the snapshot somewhere in your chain (e.g. before the lowercase); it saves the term text and maybe the posinc into an attribute. Then the restore (e.g. after the lowercase) checks whether the posinc is the same (i.e. nothing was deleted in between) and whether the term text now differs. If it differs, it adds the saved text as a synonym.

> LowerCaseFilter.preserveOriginal
> --------------------------------
>
>                 Key: LUCENE-5620
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5620
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Mike Sokolov
>         Attachments: LUCENE-5620.patch
>
>
> Following closely the model of LUCENE-5437 (which worked on
> ASCIIFoldingFilter), this patch adds the ability to preserve the original
> token to LowerCaseFilter. This is useful if you want an all-lowercase search
> term to match without regard to case, while search terms with uppercase
> letters match in a case-sensitive manner.
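
For illustration only, the snapshot/restore idea from the comment can be sketched in plain Java. This does not use Lucene's TokenFilter or attribute APIs: the `SnapshotRestoreSketch` class, its `Token` type, and the `snapshot`, `lowercase`, and `restore` methods are all hypothetical names invented for this sketch, and token streams are modeled as simple lists. The point is only that the lowercasing step stays unaware of the mechanism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical model of the proposal; not Lucene's actual API.
final class SnapshotRestoreSketch {

    /** A token with term text, position increment, and an optional saved snapshot. */
    static final class Token {
        final String text;
        final int posInc;
        final String savedText;  // null until snapshot() has run
        final int savedPosInc;

        Token(String text, int posInc, String savedText, int savedPosInc) {
            this.text = text;
            this.posInc = posInc;
            this.savedText = savedText;
            this.savedPosInc = savedPosInc;
        }

        Token(String text, int posInc) {
            this(text, posInc, null, 0);
        }
    }

    /** Snapshot step: record the current term text and posinc on each token. */
    static List<Token> snapshot(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.text, t.posInc, t.text, t.posInc));
        }
        return out;
    }

    /** Stand-in for any intermediate filter; it knows nothing about the snapshot. */
    static List<Token> lowercase(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.text.toLowerCase(Locale.ROOT), t.posInc, t.savedText, t.savedPosInc));
        }
        return out;
    }

    /** Restore step: if posinc is unchanged and the text differs, add the saved text as a synonym. */
    static List<Token> restore(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.text, t.posInc));
            boolean samePosition = t.savedText != null && t.posInc == t.savedPosInc;
            if (samePosition && !t.text.equals(t.savedText)) {
                // posInc 0 stacks the original text at the same position as the modified token.
                out.add(new Token(t.savedText, 0));
            }
        }
        return out;
    }
}
```

Under these assumptions, running `restore(lowercase(snapshot(tokens)))` on a token whose text contains uppercase letters yields the lowercased token followed by the original text at position increment 0, while an already-lowercase token passes through alone.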