Hi again,

the Spanish and Dutch NER models are also affected. This was just a bit more difficult to figure out because the models internally lower-case the features.
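
For illustration, here is a minimal sketch of what I believe is happening, assuming the training text was UTF-8 but was decoded as ISO-8859-1, and that the features are lower-cased afterwards. The class name, the method name and the sample word are made up for this example; none of this is taken from the OpenNLP code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical demo class, not part of OpenNLP.
class EncodingMixup {

    // Encode the text as UTF-8, then decode those bytes as ISO-8859-1,
    // mimicking a reader that was configured with the wrong encoding.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String word = "Espa\u00f1a";                        // "España"
        String garbled = misread(word);                     // "EspaÃ±a"
        String feature = garbled.toLowerCase(Locale.ROOT);  // "espaã±a"

        // What reaches the console depends on the terminal encoding.
        System.out.println(garbled);
        System.out.println(feature);
    }
}
```

The telltale upper-case "Ã" turns into "ã" once the feature is lower-cased, which would explain why the problem is harder to spot in these models.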
Cheers,

-- Richard

> On 01.03.2016, at 23:13, Richard Eckart de Castilho <[email protected]> wrote:
>
> Hi all,
>
> I noticed that the OpenNLP German POS Tagger maxent model available from
> Sourceforge has been trained using the wrong encoding setting. Apparently the
> input data was UTF-8, but it was read as ISO8859-1. The perceptron model is
> not affected. I only examined NER and POS models, not tokenizer or sentence
> splitter models.
>
> Best,
>
> -- Richard
