It was trained on Cast3LB.

Jörn

On 5/17/11 2:31 PM, Jason Baldridge wrote:
Where is the Spanish data and what is the source?

On Tue, May 17, 2011 at 3:00 AM, Jörn Kottmann <[email protected]> wrote:

Hello Jason,

I do not have the training data in the correct format, and I
never took the time to convert it.
Another way to solve it would be to wrap the old models in our
new model package.

The sentence detector and tokenizer can now also be trained on
the CoNLL data. Should we do that instead?
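Roughly like this, as a minimal sketch against the 1.5-era API (the file
names and the cutoff/iteration values are just placeholders):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.sentdetect.SentenceSample;
import opennlp.tools.sentdetect.SentenceSampleStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;

public class SpanishSentenceTrainer {

  public static void main(String[] args) throws Exception {
    // Sentence-per-line training text derived from the CoNLL data
    // (placeholder file name).
    ObjectStream<String> lines = new PlainTextByLineStream(
        new InputStreamReader(new FileInputStream("es-sent.train"), "UTF-8"));
    ObjectStream<SentenceSample> samples = new SentenceSampleStream(lines);

    // No abbreviation dictionary for now; cutoff 5 and 100 iterations
    // are just the usual defaults, nothing tuned.
    SentenceModel model =
        SentenceDetectorME.train("es", samples, true, null, 5, 100);
    samples.close();

    OutputStream out = new FileOutputStream("es-sent.bin");
    model.serialize(out);
    out.close();
  }
}

Training the tokenizer should work the same way via TokenizerME.train,
once we can produce TokenSamples from the CoNLL data.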

To train the tokenizer, we need a detokenizer dictionary.
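
What I have in mind is something along these lines (the entries are
illustrative only; a real dictionary would cover the Spanish punctuation
and quoting conventions and be loaded from the XML format):

import opennlp.tools.tokenize.DetokenizationDictionary;
import opennlp.tools.tokenize.DetokenizationDictionary.Operation;
import opennlp.tools.tokenize.Detokenizer.DetokenizationOperation;
import opennlp.tools.tokenize.DictionaryDetokenizer;

public class DetokenizerSketch {

  public static void main(String[] args) {
    // A tiny hand-built dictionary: punctuation that attaches to the
    // token on its left, and quotes that alternate right/left.
    DetokenizationDictionary dict = new DetokenizationDictionary(
        new String[] { ".", ",", "\"" },
        new Operation[] { Operation.MOVE_LEFT, Operation.MOVE_LEFT,
            Operation.RIGHT_LEFT_MATCHING });

    DictionaryDetokenizer detokenizer = new DictionaryDetokenizer(dict);

    // For each token the detokenizer says whether it should be merged
    // with its left or right neighbour when rebuilding the raw text.
    String[] tokens = { "Hola", ",", "mundo", "." };
    DetokenizationOperation[] ops = detokenizer.detokenize(tokens);

    for (int i = 0; i < tokens.length; i++) {
      System.out.println(tokens[i] + " -> " + ops[i]);
    }
  }
}

The point is that the CoNLL data is already tokenized, so the detokenizer
is what lets us reconstruct the raw text and character offsets that the
tokenizer trainer needs.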

Jörn



On 5/13/11 10:33 PM, Jason Baldridge wrote:

It seems as though the Spanish models for tokenization and sentence
splitting are no longer around, e.g. the models download page only has NER
models:

http://opennlp.sourceforge.net/models-1.5/

But there were models before:

http://opennlp.sourceforge.net/models-1.3/spanish/

Anyone know what happened to them? Sorry if I'm forgetting something...

Jason
