Hello,

The Tokenizer context generator receives only the token and an index pointing to a character inside the token. Wouldn't it be more effective if it could use a larger context? For example, it would be useful to know whether the token is the first or the last token candidate of a sentence.
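To make the idea concrete, here is a rough sketch of what such an extension could look like. It assumes an interface shaped like the one described above (a token plus a character index) and Java 8 or later for default methods. All names here (TokenContext, SentenceAwareTokenContext, PositionFeatures, and the candidate/candidateCount parameters) are hypothetical and are not taken from the existing code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the current contract: only the token and
// a character index inside it are available.
interface TokenContext {
    String[] getContext(String token, int index);
}

// Hypothetical extension that also receives the position of the token
// candidate within the sentence. The default method falls back to the
// old two-argument call, so existing implementations would not break.
interface SentenceAwareTokenContext extends TokenContext {
    default String[] getContext(String token, int index,
                                int candidate, int candidateCount) {
        return getContext(token, index);
    }
}

// Example implementation that adds "first" / "last" features.
final class PositionFeatures implements SentenceAwareTokenContext {
    @Override
    public String[] getContext(String token, int index) {
        List<String> features = new ArrayList<>();
        features.add("p=" + token.substring(0, index)); // text before the split point
        features.add("s=" + token.substring(index));    // text from the split point on
        return features.toArray(new String[0]);
    }

    @Override
    public String[] getContext(String token, int index,
                               int candidate, int candidateCount) {
        List<String> features =
            new ArrayList<>(Arrays.asList(getContext(token, index)));
        if (candidate == 0) {
            features.add("first"); // first token candidate of the sentence
        }
        if (candidate == candidateCount - 1) {
            features.add("last");  // last token candidate of the sentence
        }
        return features.toArray(new String[0]);
    }
}
```

Callers that only know the two-argument method would keep working unchanged, while new callers could pass the candidate position. Whether this fits the real interfaces is exactly what I am unsure about.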
Does anyone know why it was implemented this way? I am trying to figure out how to pass additional information without breaking compatibility. I don't want to branch either.

Thanks,
William
