Hi, I was wondering if there is a way to provide custom features to the tokenizer. Sometimes the correct tokenization depends on additional factors.
For example, 'semi-supervised' should be kept as a single token, while 'august-september' should be split into separate tokens. Is there any way to add custom features to the Learnable Tokenizer, similar to how it is done for NER? Thanks, Manoj