It was trained on Cast3LB.

Jörn

On 5/17/11 2:31 PM, Jason Baldridge wrote:
Where is the Spanish data and what is the source?

On Tue, May 17, 2011 at 3:00 AM, Jörn Kottmann <[email protected]> wrote:

Hello Jason,

I do not have the training data in the correct format, and I
never took the time to convert it.
Another way to solve it would be to wrap the old models in our
new model package.

The sentence detector and tokenizer can now also be trained on
the CoNLL data. Should we do that instead?
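Roughly like this, as a minimal sketch against the 1.5-era API (the file
names and the cutoff/iteration values are just placeholders):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.sentdetect.SentenceSample;
import opennlp.tools.sentdetect.SentenceSampleStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;

public class SpanishSentenceTrainer {

  public static void main(String[] args) throws Exception {
    // Sentence-per-line training text derived from the CoNLL data
    // (placeholder file name).
    ObjectStream<String> lines = new PlainTextByLineStream(
        new InputStreamReader(new FileInputStream("es-sent.train"), "UTF-8"));
    ObjectStream<SentenceSample> samples = new SentenceSampleStream(lines);

    // No abbreviation dictionary for now; cutoff 5 and 100 iterations
    // are just the usual defaults, nothing tuned.
    SentenceModel model =
        SentenceDetectorME.train("es", samples, true, null, 5, 100);
    samples.close();

    OutputStream out = new FileOutputStream("es-sent.bin");
    model.serialize(out);
    out.close();
  }
}

Training the tokenizer should work the same way via TokenizerME.train,
once we can produce TokenSamples from the CoNLL data.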

To train the tokenizer, we need a detokenizer dictionary.
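
What I have in mind is something along these lines (the entries are
illustrative only; a real dictionary would cover the Spanish punctuation
and quoting conventions and be loaded from the XML format):

import opennlp.tools.tokenize.DetokenizationDictionary;
import opennlp.tools.tokenize.DetokenizationDictionary.Operation;
import opennlp.tools.tokenize.Detokenizer.DetokenizationOperation;
import opennlp.tools.tokenize.DictionaryDetokenizer;

public class DetokenizerSketch {

  public static void main(String[] args) {
    // A tiny hand-built dictionary: punctuation that attaches to the
    // token on its left, and quotes that alternate right/left.
    DetokenizationDictionary dict = new DetokenizationDictionary(
        new String[] { ".", ",", "\"" },
        new Operation[] { Operation.MOVE_LEFT, Operation.MOVE_LEFT,
            Operation.RIGHT_LEFT_MATCHING });

    DictionaryDetokenizer detokenizer = new DictionaryDetokenizer(dict);

    // For each token the detokenizer says whether it should be merged
    // with its left or right neighbour when rebuilding the raw text.
    String[] tokens = { "Hola", ",", "mundo", "." };
    DetokenizationOperation[] ops = detokenizer.detokenize(tokens);

    for (int i = 0; i < tokens.length; i++) {
      System.out.println(tokens[i] + " -> " + ops[i]);
    }
  }
}

The point is that the CoNLL data is already tokenized, so the detokenizer
is what lets us reconstruct the raw text and character offsets that the
tokenizer trainer needs.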

Jörn



On 5/13/11 10:33 PM, Jason Baldridge wrote:

It seems as though the Spanish models for tokenization and sentence
splitting are no longer around, e.g. the models download page only has NER
models:

http://opennlp.sourceforge.net/models-1.5/

But there were models before:

http://opennlp.sourceforge.net/models-1.3/spanish/

Anyone know what happened to them? Sorry if I'm forgetting something...

Jason
