Hi, Is there an exclusion list of characters that can be defined for StandardTokenizer? In my case, I want to use StandardTokenizer (as it solves many of the problems of deciding where to tokenize across languages), but I don't want it to split the stream on certain characters, for example '@'. Is there a way I can provide that input to StandardTokenizer? I tried looking into the source code, but I seem to have gotten lost. Any pointer is really appreciated.
Regards.
--
View this message in context: http://lucene.472066.n3.nabble.com/Exclusion-List-for-standard-tokenizer-tp4306511.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.