I realize the training dataset is under copyright and not easily redistributable, but I'm still interested in knowing which dataset was used.
A second question: is there support for continuing to train a model with additional training data? That is, loading the available model binary and supplementing it with additional training. (Apologies for needing to ask; I am still somewhat unfamiliar with the OpenNLP source.)
