Re: Models in spanish

Jörn Kottmann Wed, 19 Mar 2014 03:38:26 -0700

On 03/19/2014 11:22 AM, Richard Eckart de Castilho wrote:

> Of course you could probably always train your own models, at least
> for the tokenizer, sentencedetector, and pos tagger. I believe the
> AnCora corpus should serve well [1].
>
> Not sure about the chunker though and last time I looked, I believe
> the parser was pretty much hard-coded to English.

The chunker can be trained for Spanish without any modifications. All
you need is a training corpus and a tool that can convert it into the
OpenNLP format.

The parser needs a head rules file for Spanish. We recently got a
contribution for one, and it should soon be possible to train it on
Spanish too.
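
To illustrate the conversion step for the chunker, here is a minimal
sketch in Python. It assumes the corpus has already been read into
(word, POS tag, chunk tag) triples and writes them in the CoNLL-2000
style layout that the OpenNLP chunker trainer expects: one token per
line, fields separated by a space, and an empty line between sentences.
The function name, the sample sentences, and the tag values are made up
for illustration; they are not taken from AnCora or from any existing
converter.

```python
from typing import Iterable, List, Tuple

# One token: (word, POS tag, chunk tag). Chunk tags use the B-/I-/O scheme.
Token = Tuple[str, str, str]


def to_opennlp_chunker_format(sentences: Iterable[List[Token]]) -> str:
    """Render sentences as 'word POS chunk' lines, blank line between sentences."""
    blocks = []
    for sentence in sentences:
        lines = []
        for word, pos, chunk in sentence:
            # Fields are space-separated, so whitespace inside a field
            # would corrupt the line. Fail loudly instead of guessing.
            for field in (word, pos, chunk):
                if not field or any(ch.isspace() for ch in field):
                    raise ValueError(f"invalid field: {field!r}")
            lines.append(f"{word} {pos} {chunk}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


# Hypothetical hand-made sample; tags are placeholders, not real AnCora tags.
sample = [
    [("El", "da", "B-NP"), ("gato", "nc", "I-NP"),
     ("duerme", "vm", "B-VP"), (".", "Fp", "O")],
    [("Ella", "pp", "B-NP"), ("canta", "vm", "B-VP"), (".", "Fp", "O")],
]

print(to_opennlp_chunker_format(sample), end="")
```

The resulting text, saved as UTF-8, should be usable as training data
for the chunker trainer (ChunkerTrainerME in the OpenNLP command line
tools). Deriving the chunk tags from the corpus annotation is the real
work and is not covered by this sketch.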

HTH,
Jörn
