[jira] [Commented] (LUCENENET-337) TokenAttribute for Selectively Including Tokens in Length Norm

Christopher Currens (JIRA) Sun, 17 Jun 2012 14:30:44 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENENET-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393615#comment-13393615
 ]


Christopher Currens commented on LUCENENET-337:
-----------------------------------------------

I'm unsure about it.  It's implemented directly in the DocInverterPerField 
class, which makes me slightly uncomfortable, but the default behavior won't 
change, since LengthNormAttribute.IncludeInLengthNorm is set to true by 
default.  I think (but don't actually remember) that the API the patch uses 
might be outdated, so it would have to be upgraded for 3.0.3.
                
> TokenAttribute for Selectively Including Tokens in Length Norm
> --------------------------------------------------------------
>
>                 Key: LUCENENET-337
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-337
>             Project: Lucene.Net
>          Issue Type: Improvement
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.2
>            Reporter: Michael Garski
>            Priority: Minor
>             Fix For: Lucene.Net 3.0.3
>
>         Attachments: LengthNorm.patch
>
>
> This patch adds functionality to Lucene.Net that allows a TokenFilter to mark 
> a Token as excluded from the length norm calculation through the use of a 
> new TokenAttribute interface, LengthNormAttribute, and a corresponding 
> implementation, LengthNormAttributeImpl.  This functionality is useful to 
> keep injected synonyms from inflating the token count used in the length 
> norm, particularly in cases where there are a large number of synonyms in 
> relation to the number of original tokens.
> The following example shows how to use the new attribute.
> Within your custom TokenFilter, define a field to hold a reference to the 
> attribute and set its value in the constructor.  When the stream advances 
> to a new Token within the call to IncrementToken(), set the 
> IncludeInLengthNorm property of the attribute to false for Tokens which 
> should not be included in the length norm calculation.  The property 
> defaults to true and is reset to true after each Token is consumed within 
> DocInverterPerField.ProcessFields.
> {code:title=CustomTokenFilter.cs|borderStyle=solid}
> using Lucene.Net.Analysis;
> // LengthNormAttribute is added by LengthNorm.patch; import its namespace
> // as defined there.
>
> public class CustomTokenFilter : TokenFilter
> {
>     // Reference to the attribute, obtained once in the constructor.
>     private LengthNormAttribute lnAttribute;
>
>     public CustomTokenFilter(TokenStream input) : base(input)
>     {
>         this.lnAttribute = (LengthNormAttribute)AddAttribute(typeof(LengthNormAttribute));
>     }
>
>     public override bool IncrementToken()
>     {
>         if (input.IncrementToken())
>         {
>             // Decide here whether the current token should be excluded
>             // from the length norm value.  This example excludes every token.
>             this.lnAttribute.IncludeInLengthNorm = false;
>             return true;
>         }
>         else
>         {
>             return false;
>         }
>     }
> }
> {code}
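
To make the effect on the norm concrete, here is a minimal standalone sketch. It does not use Lucene.Net or the patch: SketchToken, CountedLength and LengthNormSketch are hypothetical names invented for illustration, and the counting loop is only an assumption about what the patched DocInverterPerField does with the flag. The formula 1 / sqrt(numTerms) is the one DefaultSimilarity.LengthNorm uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a token and its IncludeInLengthNorm flag;
// this is not a Lucene.Net type.
public struct SketchToken
{
    public readonly string Term;
    public readonly bool IncludeInLengthNorm;

    public SketchToken(string term, bool includeInLengthNorm)
    {
        Term = term;
        IncludeInLengthNorm = includeInLengthNorm;
    }
}

public static class LengthNormSketch
{
    // Counts only the tokens flagged for inclusion.  Assumed to mirror the
    // effect of the patch; the real indexing code is not reproduced here.
    public static int CountedLength(IEnumerable<SketchToken> tokens)
    {
        int length = 0;
        foreach (SketchToken token in tokens)
        {
            if (token.IncludeInLengthNorm)
            {
                length++;
            }
        }
        return length;
    }

    // Same formula as DefaultSimilarity.LengthNorm: 1 / sqrt(numTerms).
    public static float LengthNorm(int numTerms)
    {
        return (float)(1.0 / Math.Sqrt(numTerms));
    }
}
```

For a field with 4 original tokens and 12 injected synonyms, counting every token gives LengthNorm(16) = 0.25, while counting only the 4 tokens left flagged for inclusion gives LengthNorm(4) = 0.5, so the synonyms no longer lower the norm.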

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
