Hello,

I am training an NER model with the perceptron classifier (using OpenNLP 1.7.0).

The output of the training is:

Indexing events using cutoff of 0

Computing event counts...  done. 11861603 events
Indexing...  done.
Collecting events... Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 11861603
   Number of Outcomes: 23
 Number of Predicates: 6623489
Computing model parameters...
Performing 300 iterations.
  1:  . (11795234/11861603) 0.9944047191597966
  2:  . (11820243/11861603) 0.9965131188423689
  3:  . (11829329/11861603) 0.9972791198626357
  4:  . (11834935/11861603) 0.9977517372651908
  5:  . (11838996/11861603) 0.9980941024581584
  6:  . (11841501/11861603) 0.9983052880795286
  7:  . (11843704/11861603) 0.998491013398442
  8:  . (11845304/11861603) 0.9986259024180796
  9:  . (11846421/11861603) 0.9987200718149141
 10:  . (11847181/11861603) 0.9987841440992419
 20:  . (11852226/11861603) 0.9992094660392866
 30:  . (11853947/11861603) 0.9993545560410343
 40:  . (11854831/11861603) 0.999429082224384
 50:  . (11855471/11861603) 0.999483037832239
Stopping: change in training set accuracy less than 1.0E-5
Stats: (11846242/11861603) 0.998704981105842
...done.
Compressed 6623489 parameters to 554312
6892 outcome patterns
Indexing events using cutoff of 0

Computing event counts...  done. 6370206 events
Indexing...  done.
Collecting events... Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 6370206
   Number of Outcomes: 23
 Number of Predicates: 3737425
Computing model parameters...
Performing 300 iterations.
  1:  . (6330365/6370206) 0.9937457281601254
  2:  . (6345859/6370206) 0.9961779885925196
  3:  . (6351552/6370206) 0.9970716802564941
  4:  . (6354847/6370206) 0.9975889319748843
  5:  . (6356872/6370206) 0.997906818084062
  6:  . (6358350/6370206) 0.998138835698563
  7:  . (6359611/6370206) 0.9983367884806237
  8:  . (6360473/6370206) 0.9984721059256169
  9:  . (6361138/6370206) 0.9985764981540628
 10:  . (6361532/6370206) 0.9986383485871572
 20:  . (6364161/6370206) 0.9990510510963068
 30:  . (6365106/6370206) 0.9991993979472563
Stopping: change in training set accuracy less than 1.0E-5
Stats: (6360617/6370206) 0.9984947111600473
...done.
Indexing events using cutoff of 0

Computing event counts...  done. 6370114 events
Indexing...  done.
Collecting events... Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 6370114
   Number of Outcomes: 23
 Number of Predicates: 3737390
Computing model parameters...
Performing 300 iterations.
  1:  . (6330266/6370114) 0.9937445389517362
  2:  . (6345810/6370114) 0.9961846836650019
  3:  . (6351374/6370114) 0.9970581374210885
  4:  . (6354747/6370114) 0.9975876412886803
  5:  . (6356872/6370114) 0.9979212302950936
  6:  . (6358429/6370114) 0.998165652922381
  7:  . (6359417/6370114) 0.9983207521874805
  8:  . (6360292/6370114) 0.9984581123665919
  9:  . (6361076/6370114) 0.9985811870870757
 10:  . (6361693/6370114) 0.998678045636232
 20:  . (6364109/6370114) 0.9990573167136413
 30:  . (6365008/6370114) 0.9991984444862368
 40:  . (6365478/6370114) 0.9992722265253023
Stopping: change in training set accuracy less than 1.0E-5
Stats: (6359985/6370114) 0.9984099185666065
...done.
Indexing events using cutoff of 0

Computing event counts...  done. 6370480 events
Indexing...  done.
Collecting events... Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 6370480
   Number of Outcomes: 23
 Number of Predicates: 3737798
Computing model parameters...
Performing 300 iterations.
  1:  . (6330685/6370480) 0.9937532179678769
  2:  . (6346153/6370480) 0.9961812924614786
  3:  . (6351726/6370480) 0.9970561088018485
  4:  . (6355089/6370480) 0.9975840125076917
  5:  . (6357173/6370480) 0.9979111464128292
  6:  . (6358780/6370480) 0.9981634036995642
  7:  . (6359845/6370480) 0.9983305810551167
  8:  . (6360827/6370480) 0.9984847295651191
  9:  . (6361316/6370480) 0.9985614898720347
 10:  . (6362076/6370480) 0.9986807901445417
 20:  . (6364506/6370480) 0.9990622370684784
 30:  . (6365415/6370480) 0.9992049264733583
Stopping: change in training set accuracy less than 1.0E-5
Stats: (6362594/6370480) 0.9987621026986977
...done.
Indexing events using cutoff of 0

Computing event counts...  done. 6370008 events
Indexing...  done.
Collecting events... Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 6370008
   Number of Outcomes: 23
 Number of Predicates: 3737824
Computing model parameters...
Performing 300 iterations.
  1:  . (6330200/6370008) 0.9937507142848172
  2:  . (6345643/6370008) 0.9961750440501802
  3:  . (6351415/6370008) 0.9970811653611737
  4:  . (6354522/6370008) 0.9975689198506501
  5:  . (6356723/6370008) 0.9979144453193779
  6:  . (6358164/6370008) 0.9981406616757781
  7:  . (6359399/6370008) 0.9983345389833106
  8:  . (6360274/6370008) 0.9984719014481614
  9:  . (6360694/6370008) 0.9985378354312899
 10:  . (6361531/6370008) 0.9986692324405244
....
....
....

And so on. Is that normal? The parameters are *cutoff 0* and *300 iterations*.

The corpus is relatively small: it has 20k sentences.

I do not remember seeing output like that with the MAXENT classifier.
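
For anyone reading the log above: each iteration line seems to show the number of correctly classified events over the total number of events, followed by that ratio as the training set accuracy. Here is a minimal Java sketch that recomputes the ratio from a single log line. The class and method names are my own for illustration and are not part of OpenNLP.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IterationLine {
    // Matches the "(correct/total) accuracy" part of a line such as
    // "  1:  . (11795234/11861603) 0.9944047191597966"
    private static final Pattern LINE =
        Pattern.compile("\\((\\d+)/(\\d+)\\)\\s+([0-9.Ee+-]+)");

    private static Matcher match(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not an iteration line: " + line);
        }
        return m;
    }

    /** Recomputes correct/total from the two counts in the line. */
    static double recompute(String line) {
        Matcher m = match(line);
        long correct = Long.parseLong(m.group(1));
        long total = Long.parseLong(m.group(2));
        return (double) correct / total;
    }

    /** Returns the accuracy value printed at the end of the line. */
    static double reported(String line) {
        return Double.parseDouble(match(line).group(3));
    }

    public static void main(String[] args) {
        String line = "  1:  . (11795234/11861603) 0.9944047191597966";
        System.out.println(recompute(line) + " vs " + reported(line));
    }
}
```

The same pattern also matches the `Stats:` lines, so the final figure of each run can be checked the same way.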

Damiano
