[ https://issues.apache.org/jira/browse/LUCENE-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554482 ]
Doron Cohen commented on LUCENE-1101:
-------------------------------------

Currently Token.clear() is used only for un-tokenized fields in DocumentsWriter; Tokenizer implementations of next(Token) do not call it. I think they can be modified to call it (instead of explicitly resetting just the position increment). But since these methods already set the value of the start offset, calling clear() might eat into the speed-up gained by reusing tokens.

But then again, shouldn't tokenizers also reset the payload info? (It seems wrong to assume that there is no payload in the input reusable token.)

So I guess the right thing to do is to call clear() in all tokenizers (3 actually). I will work on that path.

> Tokenizers should reset positionIncrement to 1 in their next(Token result)
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-1101
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1101
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.3
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>             Fix For: 2.3
>
>         Attachments: lucene-1101.patch
>
>
> Tokenizers which implement the reuse form of the next method:
>     next(Token result)
> should reset the positionIncrement of the returned token to 1.
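
For illustration, the idea in the comment (calling clear() at the top of the reuse form of next) might look roughly like the sketch below. This is not the attached patch and does not use Lucene's classes: SketchToken and SketchWhitespaceTokenizer are hypothetical stand-ins, and the set of fields that clear() resets here is an assumption made for the example.

```java
// Hypothetical stand-in for a reusable token; NOT Lucene's Token class.
class SketchToken {
    String term = "";
    int startOffset;
    int endOffset;
    int positionIncrement = 1;
    byte[] payload; // null means "no payload"

    // Assumption: clear() resets every field a previous user may have set.
    void clear() {
        term = "";
        startOffset = 0;
        endOffset = 0;
        positionIncrement = 1;
        payload = null;
    }
}

// Hypothetical whitespace tokenizer using the reuse form next(SketchToken result).
class SketchWhitespaceTokenizer {
    private final String input;
    private int pos = 0;

    SketchWhitespaceTokenizer(String input) {
        this.input = input;
    }

    // Fills and returns the caller's token, or returns null at end of input.
    SketchToken next(SketchToken result) {
        // Drop stale state (position increment, payload) left in the reused token.
        result.clear();

        // Skip leading whitespace.
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        if (pos >= input.length()) {
            return null;
        }

        // Consume one run of non-whitespace characters.
        int start = pos;
        while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }

        // The offsets are overwritten here anyway, which is why resetting them
        // in clear() is redundant work for this code path.
        result.term = input.substring(start, pos);
        result.startOffset = start;
        result.endOffset = pos;
        return result;
    }
}
```

With this shape, a caller that passes in a token whose position increment or payload was set by an earlier consumer gets back a token with positionIncrement 1 and no payload, at the cost of a few extra field writes per token.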