[
https://issues.apache.org/jira/browse/LUCENE-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597026#comment-13597026
]
Varun Thacker commented on LUCENE-4817:
---------------------------------------
Really useful token filter.
You've mentioned that a user should combine this with a
RemoveDuplicatesTokenFilter. That is needed because a word the stemmer leaves
unchanged would otherwise appear twice at the same position.
So the Javadocs for KeywordRepeatFilterFactory.java should include
RemoveDuplicatesTokenFilterFactory in the example.
{code:xml}
/**
* Factory for {@link KeywordRepeatFilter}.
* <pre class="prettyprint" >
* <fieldType name="text_keyword" class="solr.TextField" positionIncrementGap="100">
* <analyzer>
* <tokenizer class="solr.WhitespaceTokenizerFactory"/>
* <filter class="solr.KeywordRepeatFilterFactory"/>
* <filter class="solr.PorterStemFilterFactory"/>
* <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
* </analyzer>
* </fieldType></pre>
*/
{code}
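To illustrate why the duplicates show up, here is a minimal, self-contained Java sketch of the chain above. It does not use the Lucene API: the Token class, the toy suffix-stripping stemmer (a stand-in for the Porter stemmer), and all method names are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class KeywordRepeatSketch {

    /** Hypothetical token model: text, position increment, keyword flag. */
    static final class Token {
        final String text;
        final int posInc;      // 0 means "same position as the previous token"
        final boolean keyword; // true means a stemmer must leave the text alone

        Token(String text, int posInc, boolean keyword) {
            this.text = text;
            this.posInc = posInc;
            this.keyword = keyword;
        }
    }

    /** Splits on whitespace; every token advances the position by one. */
    static List<Token> tokenize(String input) {
        List<Token> out = new ArrayList<>();
        for (String s : input.trim().split("\\s+")) {
            if (!s.isEmpty()) {
                out.add(new Token(s, 1, false));
            }
        }
        return out;
    }

    /** Emits each token twice: once as keyword, once plain at the same position. */
    static List<Token> repeat(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.text, t.posInc, true));
            out.add(new Token(t.text, 0, false));
        }
        return out;
    }

    /** Toy stemmer (NOT Porter): strips a trailing "ing" or "s". */
    static String toyStem(String w) {
        if (w.length() > 4 && w.endsWith("ing")) {
            return w.substring(0, w.length() - 3);
        }
        if (w.length() > 3 && w.endsWith("s")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    /** Stems only the tokens that are not flagged as keywords. */
    static List<Token> stem(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            String text = t.keyword ? t.text : toyStem(t.text);
            out.add(new Token(text, t.posInc, t.keyword));
        }
        return out;
    }

    /** Drops a token whose text was already seen at the current position. */
    static List<Token> removeDuplicates(List<Token> in) {
        List<Token> out = new ArrayList<>();
        Set<String> seenAtPosition = new HashSet<>();
        for (Token t : in) {
            if (t.posInc > 0) {
                seenAtPosition.clear(); // a new position starts here
            }
            if (seenAtPosition.add(t.text)) {
                out.add(t);
            }
        }
        return out;
    }

    /** Runs the whole chain and returns the surviving token texts in order. */
    static List<String> analyze(String input) {
        List<String> out = new ArrayList<>();
        for (Token t : removeDuplicates(stem(repeat(tokenize(input))))) {
            out.add(t.text);
        }
        return out;
    }

    public static void main(String[] args) {
        // "jumping" is stemmed, so both forms survive; "fox" is not, so the
        // second copy is a duplicate at the same position and gets dropped.
        System.out.println(analyze("jumping fox")); // prints [jumping, jump, fox]
    }
}
```

Without the removeDuplicates step, "fox" would be indexed twice at the same position, which is the case the RemoveDuplicatesTokenFilterFactory line in the Javadoc example guards against.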
> Add KeywordRepeaterFilter to emit tokens twice once as keyword and once not
> as keyword
> --------------------------------------------------------------------------------------
>
> Key: LUCENE-4817
> URL: https://issues.apache.org/jira/browse/LUCENE-4817
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 4.1
> Reporter: Simon Willnauer
> Priority: Minor
> Fix For: 5.0, 4.3
>
> Attachments: LUCENE-4817.patch, LUCENE-4817.patch
>
>
> If you want both a stemmed and an unstemmed version of a token, one for
> recall and one for precision, you have to use two fields today in most
> cases. Yet most of the stemmers respect the keyword attribute, so we could
> add a token filter that emits the same token twice, once as keyword and once
> plain. Folks would most likely need to combine this with
> RemoveDuplicatesTokenFilter, but that way we can have the stemmed and
> unstemmed versions in the same field.