If you like, you can take a look at chapters 6.6 and 6.8 of
http://www.teses.usp.br/teses/disponiveis/45/45134/tde-02052013-135414/publico/WilliamColen_Dissertation.pdf

There I wrote about my experience tuning Portuguese models for the POS
Tagger and Chunker.
I tried out many OpenNLP configurations and measured their impact both
with the performance monitor and in my final application itself.


2013/10/7 Jörn Kottmann <[email protected]>

> On 10/07/2013 11:00 PM, Michael Schmitz wrote:
>
>> Do you know how many sentences/tokens were annotated for the OpenNLP
>> POS and CHUNK models?  Do you have an idea of the "sweet spot" for
>> number of annotations vs performance?
>>
>
> If the model gets bigger the computations get more complex, but as far as
> I know the effect of the model no longer fitting in the CPU cache is much
> more significant than that. I am using hash-based int features to reduce
> the memory footprint in the name finder.
>
> I don't have much experience with the Chunker or POS Tagger with regard
> to performance, but it should be easy to run a series of tests; the
> command line tools have built-in performance monitoring.
>
> Jörn
>
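To illustrate the hash-based int features Jörn mentions: the idea is to map
each feature string to a small integer bucket instead of keeping the strings
themselves, trading a little collision risk for a much smaller footprint.
This is just a minimal sketch of the general technique, not OpenNLP's actual
implementation; the class and feature names below are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal sketch of hash-based int features: rather than storing every
 * feature as a String, map it to an int bucket derived from its hash.
 * Distinct features may collide, but the memory footprint shrinks because
 * only int keys are kept.
 */
public class HashedFeatures {
    private final int numBuckets;

    public HashedFeatures(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    /** Map a feature string to a bucket index in [0, numBuckets). */
    public int index(String feature) {
        int h = feature.hashCode() % numBuckets;
        return h < 0 ? h + numBuckets : h; // hashCode() may be negative
    }

    /** Count features by bucket instead of by string. */
    public Map<Integer, Integer> count(String[] features) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (String f : features) {
            counts.merge(index(f), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        HashedFeatures hf = new HashedFeatures(1 << 20);
        String[] feats = {"w=house", "prev=the", "suffix=se", "w=house"};
        // The same feature string always lands in the same bucket,
        // so counts accumulate correctly (modulo collisions).
        System.out.println(hf.count(feats).get(hf.index("w=house")));
    }
}
```

With a bucket count like 2^20, collisions are rare for typical feature set
sizes, and the trade-off is usually a negligible accuracy loss for a large
memory saving.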
