[ https://issues.apache.org/jira/browse/LUCENE-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597026#comment-13597026 ]
Varun Thacker commented on LUCENE-4817:
---------------------------------------

Really useful token filter.

You've mentioned that a user should use this with a RemoveDuplicatesTokenFilter. That is needed because whenever the stemmer leaves a word unchanged, the keyword copy and the plain copy are identical, so there would be duplicates in the same position. So the example in the Javadocs for KeywordRepeatFilterFactory.java should include RemoveDuplicatesTokenFilterFactory. Note that the filter class in the example has to be the factory, solr.KeywordRepeatFilterFactory, not the filter itself.

{code:xml}
/**
 * Factory for {@link KeywordRepeatFilter}.
 * <pre class="prettyprint" >
 * <fieldType name="text_keyword" class="solr.TextField" positionIncrementGap="100">
 *   <analyzer>
 *     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
 *     <filter class="solr.KeywordRepeatFilterFactory"/>
 *     <filter class="solr.PorterStemFilterFactory"/>
 *     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
 *   </analyzer>
 * </fieldType></pre>
 */
{code}

> Add KeywordRepeaterFilter to emit tokens twice once as keyword and once not as keyword
> --------------------------------------------------------------------------------------
>
>                 Key: LUCENE-4817
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4817
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 4.1
>            Reporter: Simon Willnauer
>            Priority: Minor
>             Fix For: 5.0, 4.3
>
>         Attachments: LUCENE-4817.patch, LUCENE-4817.patch
>
>
> If you want to have a stemmed and an unstemmed version of a token, one for
> recall and one for precision, you have to use two fields today in most
> cases. Yet most of the stemmers respect the keyword attribute, so we could
> add a token filter that emits the same token twice, once as keyword and once
> plain. Folks would most likely need to combine this with
> RemoveDuplicatesTokenFilter, but that way we can have the stemmed and
> unstemmed version in the same field.

--
This message is automatically generated by JIRA.
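
To make the duplicate problem concrete, here is a minimal sketch in plain Java. It does not use the Lucene API at all: Token, tokenize, repeatAsKeyword, toyStem, removeDuplicates and analyze are hypothetical stand-ins written only for this illustration, and the toy stemmer merely strips a trailing "s", so it is in no way equivalent to PorterStemFilter. It only assumes the behaviour described above: each token is emitted twice at the same position, the stemmer skips keyword tokens, and a final pass drops identical terms stacked at one position.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KeywordRepeatSketch {

    // Hypothetical stand-in for a token: term text, position increment, keyword flag.
    static final class Token {
        final String term;
        final int posInc;
        final boolean keyword;

        Token(String term, int posInc, boolean keyword) {
            this.term = term;
            this.posInc = posInc;
            this.keyword = keyword;
        }
    }

    // Whitespace tokenizer: every term advances the position by one.
    static List<Token> tokenize(String text) {
        List<Token> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        for (String term : trimmed.split("\\s+")) {
            out.add(new Token(term, 1, false));
        }
        return out;
    }

    // Emit each token twice: once flagged as keyword, once plain at the same position.
    static List<Token> repeatAsKeyword(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.term, t.posInc, true));
            out.add(new Token(t.term, 0, false)); // posInc 0 stacks it on the keyword copy
        }
        return out;
    }

    // Toy stemmer (NOT Porter): strips one trailing "s" and leaves keyword tokens alone.
    static List<Token> toyStem(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            boolean stemmable = !t.keyword && t.term.length() > 3 && t.term.endsWith("s");
            String term = stemmable ? t.term.substring(0, t.term.length() - 1) : t.term;
            out.add(new Token(term, t.posInc, t.keyword));
        }
        return out;
    }

    // Drop a token when the same term was already seen at the current position.
    static List<Token> removeDuplicates(List<Token> in) {
        List<Token> out = new ArrayList<>();
        Set<String> seenAtPosition = new HashSet<>();
        for (Token t : in) {
            if (t.posInc > 0) {
                seenAtPosition.clear(); // a new position starts here
            }
            if (seenAtPosition.add(t.term)) {
                out.add(t);
            }
        }
        return out;
    }

    // Run the whole chain and render each surviving token as "term@position".
    static List<String> analyze(String text) {
        List<Token> tokens = removeDuplicates(toyStem(repeatAsKeyword(tokenize(text))));
        List<String> out = new ArrayList<>();
        int position = -1;
        for (Token t : tokens) {
            position += t.posInc;
            out.add(t.term + "@" + position);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("dogs run"));
    }
}
```

For the input "dogs run", the sketch keeps "dogs" and "dog" stacked at position 0 and a single "run" at position 1. If the removeDuplicates step is left out, "run" shows up twice at position 1, which is exactly the case the Javadoc example should guard against.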