Hi :)

I wrote a custom token filter which removes special characters. Sometimes all characters of a token are removed, so the filter produces an empty token. I would like to remove this token from the TokenStream, but I'm not sure how to do that.

Is there something missing in my custom token filter or do I need to chain another custom token filter to remove empty tokens?

Regards :)

PS: this is the code of my custom filter:

import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class SpecialCharFilter extends TokenFilter {

    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

    protected SpecialCharFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (!input.incrementToken()) {
            return false;
        }

        final char[] buffer = termAtt.buffer();
        final int length = termAtt.length();
        final char[] newBuffer = new char[length];

        // Copy only the characters that are kept
        // (isFilteredChar is not shown in this post).
        int newIndex = 0;
        for (int i = 0; i < length; i++) {
            if (!isFilteredChar(buffer[i])) {
                newBuffer[newIndex++] = buffer[i];
            }
        }

        // Use only the filled part of newBuffer, then strip surrounding whitespace.
        final String term = new String(newBuffer, 0, newIndex).trim();
        termAtt.setEmpty().append(term);

        // The term may be empty at this point, but the token is still emitted.
        return true;
    }
}
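
To make the behaviour I am after concrete, here is a small plain-Java sketch that does not use Lucene at all. An Iterator<String> stands in for the TokenStream, and isFilteredChar is a hypothetical placeholder (it treats everything that is not a letter or digit as special), so the names here are assumptions for illustration only, not my real filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SkipEmptyTokensSketch {

    // Hypothetical placeholder: anything that is not a letter or digit counts as "special".
    static boolean isFilteredChar(char c) {
        return !Character.isLetterOrDigit(c);
    }

    // Removes the filtered characters from a single token.
    static String stripSpecialChars(String token) {
        StringBuilder kept = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (!isFilteredChar(c)) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    // Cleans every token from the input and drops those that end up empty.
    // The Iterator is only a stand-in for a token stream.
    static List<String> filter(Iterator<String> input) {
        List<String> output = new ArrayList<>();
        while (input.hasNext()) {
            String cleaned = stripSpecialChars(input.next());
            if (!cleaned.isEmpty()) {
                output.add(cleaned);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("foo-bar", "***", "a!", "");
        System.out.println(filter(tokens.iterator())); // prints [foobar, a]
    }
}
```

In the real filter I assume the equivalent would be to keep calling input.incrementToken() inside incrementToken() until the cleaned term is non-empty or the input is exhausted, but I have not tested that and I do not know whether position increments would need adjusting.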

