Actually, to reply to myself: the filters that are simply changing
the term text shouldn't be creating a new Token anyway, but rather
just setting token.termText = ... on the original Token. I'll see
about modifying our core and contrib filters to do this.
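
For illustration, here is a rough sketch of the in-place approach. It
deliberately does not use the real Lucene classes: SimpleToken and
LowerCasingFilter are hypothetical stand-ins made up for this example,
and the field names are only assumed to resemble Token's. The point is
that the filter returns the same instance it received, so offsets,
type, and position increment cannot get lost along the way.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Locale;

// Demo entry point; all classes here are hypothetical stand-ins, not Lucene's.
class InPlaceFilterDemo {
    public static void main(String[] args) {
        SimpleToken quick = new SimpleToken("Quick", 4, 9, "word");
        quick.positionIncrement = 2; // e.g. a stop word was dropped before it

        LowerCasingFilter filter =
            new LowerCasingFilter(Arrays.asList(quick).iterator());
        SimpleToken out = filter.next();

        System.out.println(out.termText + " posInc=" + out.positionIncrement);
    }
}

// Simplified token: mutable text plus the metadata a filter should preserve.
class SimpleToken {
    String termText;
    final int startOffset;
    final int endOffset;
    final String type;
    int positionIncrement = 1;

    SimpleToken(String text, int start, int end, String typ) {
        this.termText = text;
        this.startOffset = start;
        this.endOffset = end;
        this.type = typ;
    }
}

// Rewrites the text of each incoming token and returns the same instance.
class LowerCasingFilter {
    private final Iterator<SimpleToken> input;

    LowerCasingFilter(Iterator<SimpleToken> input) {
        this.input = input;
    }

    // Returns the next token, or null when the input is exhausted.
    SimpleToken next() {
        if (!input.hasNext()) {
            return null;
        }
        SimpleToken token = input.next();
        token.termText = token.termText.toLowerCase(Locale.ROOT);
        return token; // offsets, type, and position increment are untouched
    }
}
```

The trade-off is that the token is mutated, so anything upstream that
held on to the instance will see the new text as well.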
Erik
On Sep 22, 2005, at 4:29 PM, Erik Hatcher wrote:
Yonik identified an interesting issue with LUCENE-437:
http://issues.apache.org/jira/browse/LUCENE-437
I patched the SnowballFilter, but then looked at other filters and
we have the same issue with some of them (like StandardFilter,
GermanStemFilter, GreekLowerCaseFilter, and others that create a
new Token).
To perhaps alleviate this situation in the future, maybe we should
add another constructor to Token:
    public Token(String text, int start, int end, String typ,
                 int positionIncrement)

Or maybe one that clones an existing token:

    public Token(Token template, String newText)

where all the metadata for the token (start, end, type, and
position increment) is copied and the newText is used for the Token
text instead. Filters don't generally change offsets, type, or
position increments anyway - the majority change the text for
stemming or lowercasing purposes.
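
To make the two proposals concrete, here is a minimal sketch of both
constructors. SketchToken is a hypothetical stand-in written for this
message, not the actual Token class, and its field names are
assumptions; only the two constructor shapes come from the proposal
above.

```java
// Demo entry point; SketchToken is a hypothetical stand-in, not Lucene's Token.
class TokenCopyDemo {
    public static void main(String[] args) {
        SketchToken original = new SketchToken("Running", 0, 7, "word", 0);
        SketchToken stemmed = new SketchToken(original, "run");

        System.out.println(stemmed.termText
            + " [" + stemmed.startOffset + "," + stemmed.endOffset + "]"
            + " type=" + stemmed.type
            + " posInc=" + stemmed.positionIncrement);
    }
}

class SketchToken {
    final String termText;
    final int startOffset;
    final int endOffset;
    final String type;
    final int positionIncrement;

    // First proposal: every field, including the position increment, up front.
    SketchToken(String text, int start, int end, String typ,
                int positionIncrement) {
        this.termText = text;
        this.startOffset = start;
        this.endOffset = end;
        this.type = typ;
        this.positionIncrement = positionIncrement;
    }

    // Second proposal: copy all metadata from the template, replace only the text.
    SketchToken(SketchToken template, String newText) {
        this(newText, template.startOffset, template.endOffset,
             template.type, template.positionIncrement);
    }
}
```

With the template form, a stemming or lowercasing filter only has to
supply the new text, so there is no field it can forget to carry over.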
Thoughts?
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]