Standard tokenizer with punctuation output
------------------------------------------

                 Key: LUCENE-889
                 URL: https://issues.apache.org/jira/browse/LUCENE-889
             Project: Lucene - Java
          Issue Type: Improvement
    Affects Versions: 2.1
            Reporter: Karl Wettin
            Priority: Trivial


This patch makes the StandardTokenizer emit punctuation tokens (comma, period,
question mark and exclamation point) and filters them out again in the
StandardFilter, so the default analysis output is unchanged.

(I needed them for text classification reasons.)
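
For readers without the patch at hand, the tokenize-then-filter arrangement
described above can be sketched roughly as follows. This is a minimal,
self-contained illustration and not the patch itself: it does not use the
Lucene API, and the class PunctuationSketch, the Token type, the type labels
and both method names are hypothetical, invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: this is NOT Lucene's StandardTokenizer or
// StandardFilter, just an illustration of the two-stage idea.
class PunctuationSketch {

    // Token type labels invented for this example.
    static final String WORD = "<WORD>";
    static final String PUNCT = "<PUNCTUATION>";

    static final class Token {
        final String text;
        final String type;

        Token(String text, String type) {
            this.text = text;
            this.type = type;
        }

        @Override
        public String toString() {
            return text + type;
        }
    }

    // Tokenizer stage: emits word tokens plus the four punctuation marks
    // named in the issue (comma, period, question mark, exclamation point).
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(c);
                continue;
            }
            // Any non-alphanumeric character ends the current word.
            if (word.length() > 0) {
                tokens.add(new Token(word.toString(), WORD));
                word.setLength(0);
            }
            if (c == ',' || c == '.' || c == '?' || c == '!') {
                tokens.add(new Token(String.valueOf(c), PUNCT));
            }
        }
        if (word.length() > 0) {
            tokens.add(new Token(word.toString(), WORD));
        }
        return tokens;
    }

    // Filter stage: drops punctuation tokens so that downstream consumers
    // see the same stream they would have seen without the change.
    static List<Token> filterPunctuation(List<Token> tokens) {
        List<Token> kept = new ArrayList<>();
        for (Token t : tokens) {
            if (!PUNCT.equals(t.type)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Token> raw = tokenize("Hello, world! Is this Lucene?");
        System.out.println(raw);                    // words and punctuation
        System.out.println(filterPunctuation(raw)); // words only
    }
}
```

A consumer that wants the punctuation, such as a text classifier, would read
the output of the tokenizer stage directly and skip the filter stage.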


