Hi, I am using Lucene 3.6 and I am looking for a tokenizer that would remove certain characters when they are present at the beginning or at the end of a token.
I initially used the StandardAnalyzer and switched to the WhitespaceAnalyzer because the StandardAnalyzer was too aggressive for my use case. A few examples:

- foo, -> foo (comma at the end)
- foo. -> foo (period at the end)
- foo!!!! -> foo
- foo?! -> foo
- ,foo -> foo (a comma at the beginning of a word is a typo, but it should be handled)

Is there a configurable tokenizer I could use for this?

Thanks,
S.
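
P.S. To make the expected behavior concrete, here is a minimal plain-Java sketch of the per-token transformation I have in mind. It is not Lucene code: the class name EdgePunctuationStripper, the method strip, and the character set [,.!?] are placeholders of my own, and the real set of characters would need to be configurable. Wiring this into an analysis chain (for example, as a filter applied to whitespace-separated tokens) is not shown.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): strips a fixed set of
// punctuation characters from the start and end of a single token.
final class EdgePunctuationStripper {

    // Assumed character set, taken from the examples: , . ! ?
    // First alternative matches a run at the start, second a run at the end.
    private static final Pattern EDGES = Pattern.compile("^[,.!?]+|[,.!?]+$");

    private EdgePunctuationStripper() {
    }

    // Returns the token with leading and trailing punctuation removed.
    // Punctuation inside the token is left untouched.
    static String strip(String token) {
        return EDGES.matcher(token).replaceAll("");
    }
}
```

With this sketch, strip("foo?!") and strip(",foo") both give "foo", while a token such as "a.b" is left as it is because the period is not at an edge.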