Hi Rodrigo,

On 15.05.2017, at 15:36, Rodrigo Agerri <rage...@apache.org> wrote:
> 
> I cannot reproduce the lemmatizer issue. Could you please share your
> training data?

I have observed the change in behavior via the OpenNlpLemmatizerTrainerTest
in DKPro Core [1]. It happens when I change the OpenNLP version in the POM
from 1.7.2 to 1.8.0 (after including the OpenNLP staging Maven repo, of course).
Unfortunately, it's not a simple minimal OpenNLP-only unit test; it makes use
of the respective DKPro Core UIMA components.

The data used is the GUM 3.0.0 corpus, specifically the CoNLL files in it [2].

The corpus can be downloaded from: 
https://github.com/amir-zeldes/gum/archive/V3.0.0.zip

Cheers,

-- Richard

[1] 
https://github.com/dkpro/dkpro-core/blob/89f144a63b214cd584b3cd0e6c499dff6cbcd9ca/dkpro-core-opennlp-asl/src/test/java/de/tudarmstadt/ukp/dkpro/core/opennlp/OpenNlpLemmatizerTrainerTest.java
[2] 
https://github.com/dkpro/dkpro-core/blob/master/dkpro-core-api-datasets-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/datasets/lib/gum-en-conll-3.0.0.yaml
