> On 13.05.2017, at 22:35, Richard Eckart de Castilho <[email protected]> wrote:
> 
> Should OpenNLP 1.8.0 yield identical results to 1.7.2 when the same
> training data is used during training?
> 
> I have a test that trains a lemmatizer model on GUM 3.0.0. With 1.7.2,
> this model reached an f-score of ~0.96. With 1.8.0, I only get ~0.84.

Also, this test, which trains and evaluates a lemmatizer model,
takes ~8 sec with 1.7.2 and ~170 sec with 1.8.0. Even when considering
only the training phase (no evaluation), the test runs much faster
with 1.7.2 than with 1.8.0.
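
In case someone wants to try to reproduce the timing, a minimal
train-only sketch against the public lemmatizer API looks roughly like
this. It is not my actual test code: the file name and language code are
placeholders, and the data format comment reflects my understanding of
LemmaSampleStream.

import java.io.File;
import java.nio.charset.StandardCharsets;

import opennlp.tools.lemmatizer.LemmaSample;
import opennlp.tools.lemmatizer.LemmaSampleStream;
import opennlp.tools.lemmatizer.LemmatizerFactory;
import opennlp.tools.lemmatizer.LemmatizerME;
import opennlp.tools.lemmatizer.LemmatizerModel;
import opennlp.tools.util.MarkableFileInputStreamFactory;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;

public class LemmatizerTrainTiming {

    public static void main(String[] args) throws Exception {
        // Training data: one token per line (word TAB pos TAB lemma),
        // blank line between sentences; "gum-train.txt" is a placeholder.
        ObjectStream<LemmaSample> samples = new LemmaSampleStream(
                new PlainTextByLineStream(
                        new MarkableFileInputStreamFactory(new File("gum-train.txt")),
                        StandardCharsets.UTF_8));

        // Time only the training phase, no evaluation.
        long start = System.currentTimeMillis();
        LemmatizerModel model = LemmatizerME.train("en", samples,
                TrainingParameters.defaultParams(), new LemmatizerFactory());
        System.out.println("Training took "
                + (System.currentTimeMillis() - start) + " ms");
    }
}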

Here are some details on the training phase.

It seems odd that the numbers of events, outcomes, and predicates change
that much. In particular, the number of outcomes jumps from 389 to 7668,
which makes me wonder whether the way lemmas are encoded as outcomes has
changed.

=== 1.7.2

done. 50697 events
        Indexing...  done.
Sorting and merging events... done. Reduced 50697 events to 12675.
Done indexing.
Incorporating indexed data for training...  
done.
        Number of Event Tokens: 12675
            Number of Outcomes: 389
          Number of Predicates: 13488
...done.
Computing model parameters ...
Performing 10 iterations.
  1:  ... loglikelihood=-302335.58198350534     0.8420616604532812
  2:  ... loglikelihood=-61602.20311717376      0.9492672150225852
  3:  ... loglikelihood=-30747.954089148297     0.9769217113438665
  4:  ... loglikelihood=-19986.853691639506     0.9850484249561118
  5:  ... loglikelihood=-14672.523462458894     0.9881255301102629
  6:  ... loglikelihood=-11572.587093608756     0.9893879322247865
  7:  ... loglikelihood=-9571.242700030467      0.9900783083811665
  8:  ... loglikelihood=-8185.394028944442      0.9906897844053889
  9:  ... loglikelihood=-7174.66904253965       0.9912223602974535
 10:  ... loglikelihood=-6407.4278143846        0.9917746612225575


=== 1.8.0

done. 50697 events
        Indexing...  done.
Sorting and merging events... done. Reduced 50697 events to 26026.
Done indexing.
Incorporating indexed data for training...  
done.
        Number of Event Tokens: 26026
            Number of Outcomes: 7668
          Number of Predicates: 15279
...done.
Computing model parameters ...
Performing 10 iterations.
  1:  ... loglikelihood=-453475.08854769287     1.972503303943034E-5
  2:  ... loglikelihood=-165718.68620632993     0.9509241177978973
  3:  ... loglikelihood=-85388.42871190465      0.9761327100222893
  4:  ... loglikelihood=-56404.00400621838      0.9892104069274316
  5:  ... loglikelihood=-41004.08840359108      0.9938457896916977
  6:  ... loglikelihood=-31539.64788603799      0.9955421425330887
  7:  ... loglikelihood=-25264.889481438582     0.9964889441189814
  8:  ... loglikelihood=-20883.72059438774      0.9972384953744797
  9:  ... loglikelihood=-17699.228362701586     0.9977710712665444
 10:  ... loglikelihood=-15306.654021266759     0.9980669467621358
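
To rule out a change in defaults between the two versions, the training
configuration can also be pinned explicitly. The values below are
assumptions on my part: 10 iterations matches the logs above, MAXENT and
a cutoff of 5 are the usual defaults.

// Pin the training configuration instead of relying on version defaults.
TrainingParameters params = TrainingParameters.defaultParams();
params.put(TrainingParameters.ALGORITHM_PARAM, "MAXENT");
params.put(TrainingParameters.ITERATIONS_PARAM, "10");
params.put(TrainingParameters.CUTOFF_PARAM, "5");
// Then pass "params" to LemmatizerME.train(...) instead of defaultParams().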


I also get some differences in f-score for other tests that train models,
but none as significant as for the lemmatizer model.
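
The f-scores above come out of my own evaluation code. With plain
OpenNLP one would measure word accuracy instead; a sketch that continues
the training example from further up (it additionally needs
opennlp.tools.lemmatizer.LemmatizerEvaluator imported):

// Evaluate the trained model on held-out data. Note that
// LemmatizerEvaluator reports word accuracy, not f-score.
static double wordAccuracy(LemmatizerModel model,
        ObjectStream<LemmaSample> testSamples) throws java.io.IOException {
    LemmatizerEvaluator evaluator =
            new LemmatizerEvaluator(new LemmatizerME(model));
    evaluator.evaluate(testSamples);
    return evaluator.getWordAccuracy();
}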

-- Richard
