Hi :)
I wrote a custom token filter which removes special characters.
Sometimes all characters of a token are removed, so the filter
produces an empty token. I would like to remove this token from the
TokenStream, but I'm not sure how to do that.
Is there something missing in my custom token filter or do I need to
chain another custom token filter to remove empty tokens?
Regards :)
PS: this is the code of my custom filter:
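import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class SpecialCharFilter extends TokenFilter {

    private final CharTermAttribute termAtt =
            addAttribute(CharTermAttribute.class);

    protected SpecialCharFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (!input.incrementToken()) {
            return false;
        }
        final char[] buffer = termAtt.buffer();
        final int length = termAtt.length();
        final char[] newBuffer = new char[length];
        int newIndex = 0;
        // Copy only the characters that are not filtered out.
        // (isFilteredChar(char) is not shown here.)
        for (int i = 0; i < length; i++) {
            if (!isFilteredChar(buffer[i])) {
                newBuffer[newIndex] = buffer[i];
                newIndex++;
            }
        }
        // Use only the newIndex characters actually written, so the
        // unused tail of newBuffer never ends up in the term.
        String term = new String(newBuffer, 0, newIndex).trim();
        termAtt.setEmpty().append(term);
        // The term may be empty at this point; the token is still emitted.
        return true;
    }
}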
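
For reference, here is a minimal sketch of the first option (dropping empty tokens inside the filter itself). It deliberately does not use the Lucene API: SkippingFilter, strip and this isFilteredChar are hypothetical stand-ins that only mimic the pull-style incrementToken() contract, so the pattern can be run on its own. The idea is to keep pulling from the input until a non-empty term turns up, and to count the skipped tokens so the position increment can be widened accordingly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: models the "loop until a non-empty token" pattern
// without depending on Lucene. All names here are hypothetical.
public class SkipEmptyTokensSketch {

    // Assumed predicate; the real isFilteredChar is not shown in the post.
    static boolean isFilteredChar(char c) {
        return !Character.isLetterOrDigit(c);
    }

    // Removes every filtered character from one term.
    static String strip(String term) {
        StringBuilder sb = new StringBuilder(term.length());
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (!isFilteredChar(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Stand-in for a token filter with a pull-style incrementToken().
    static final class SkippingFilter {
        private final Iterator<String> input;
        private String term;
        private int positionIncrement;

        SkippingFilter(Iterator<String> input) {
            this.input = input;
        }

        // Advances to the next token that is non-empty after stripping.
        boolean incrementToken() {
            int skipped = 0;
            while (input.hasNext()) {
                String cleaned = strip(input.next());
                if (!cleaned.isEmpty()) {
                    term = cleaned;
                    // Account for the tokens that were dropped on the way.
                    positionIncrement = 1 + skipped;
                    return true;
                }
                skipped++;
            }
            return false; // input exhausted
        }

        String term() {
            return term;
        }

        int positionIncrement() {
            return positionIncrement;
        }
    }

    // Convenience helper: runs the filter over a list of tokens.
    static List<String> filterAll(List<String> tokens) {
        SkippingFilter filter = new SkippingFilter(tokens.iterator());
        List<String> out = new ArrayList<>();
        while (filter.incrementToken()) {
            out.add(filter.term());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [foo, bar]
        System.out.println(filterAll(Arrays.asList("foo", "***", "b-a-r", "!!")));
    }
}
```

In a real TokenFilter the same loop would call input.incrementToken() repeatedly and adjust the PositionIncrementAttribute. As far as I know, Lucene also ships a FilteringTokenFilter base class whose accept() method decides whether the current token is kept, which would correspond to the second option (chaining a separate filter); its package differs between Lucene versions, so this is worth checking against the release in use.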