Hi, I have been looking for a while to see whether there is any relevant work that evaluated the OpenNLP tools (in particular the Lemmatizer, Tokenizer, and PoS-Tagger) on short, noisy texts such as Twitter data, and/or compared them to other libraries.
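To be concrete, the kind of pipeline I have in mind is roughly the one below. This is only a minimal sketch: the model file names (en-token.bin, en-pos-maxent.bin, en-lemmatizer.bin), the class name, and the example tweet are just placeholders I assume here, not something prescribed by OpenNLP itself.

    import java.io.FileInputStream;
    import java.io.InputStream;

    import opennlp.tools.lemmatizer.LemmatizerME;
    import opennlp.tools.lemmatizer.LemmatizerModel;
    import opennlp.tools.postag.POSModel;
    import opennlp.tools.postag.POSTaggerME;
    import opennlp.tools.tokenize.TokenizerME;
    import opennlp.tools.tokenize.TokenizerModel;

    public class TweetPipelineSketch {
        public static void main(String[] args) throws Exception {
            // Load pre-trained models (file names are assumptions; point them at your own copies).
            try (InputStream tokIn = new FileInputStream("en-token.bin");
                 InputStream posIn = new FileInputStream("en-pos-maxent.bin");
                 InputStream lemIn = new FileInputStream("en-lemmatizer.bin")) {

                TokenizerME tokenizer = new TokenizerME(new TokenizerModel(tokIn));
                POSTaggerME tagger = new POSTaggerME(new POSModel(posIn));
                LemmatizerME lemmatizer = new LemmatizerME(new LemmatizerModel(lemIn));

                // A typical short, noisy, tweet-like input.
                String tweet = "omg luv the new phone!! cant wait 2 try it #excited";

                String[] tokens = tokenizer.tokenize(tweet);            // tokenization
                String[] tags = tagger.tag(tokens);                     // PoS tagging
                String[] lemmas = lemmatizer.lemmatize(tokens, tags);   // lemmatization

                // Print token / tag / lemma triples to inspect where the models struggle on noisy text.
                for (int i = 0; i < tokens.length; i++) {
                    System.out.printf("%s\t%s\t%s%n", tokens[i], tags[i], lemmas[i]);
                }
            }
        }
    }

What I am after is published numbers on how well each of these three stages holds up on input like that, compared to cleaner text or to other toolkits.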
By performance, I mean accuracy/precision rather than execution time. If anyone can point me to a paper or other work done in this context, that would be a great help. Thank you very much. Mondher
