Hello,

I am training a NER model with the perceptron classifier (using OpenNLP 1.7.0).
The output of the training is:

Indexing events using cutoff of 0
Computing event counts... done. 11861603 events
Indexing... done.
Collecting events... Done indexing.
Incorporating indexed data for training... done.
Number of Event Tokens: 11861603
Number of Outcomes: 23
Number of Predicates: 6623489
Computing model parameters...
Performing 300 iterations.
1: . (11795234/11861603) 0.9944047191597966
2: . (11820243/11861603) 0.9965131188423689
3: . (11829329/11861603) 0.9972791198626357
4: . (11834935/11861603) 0.9977517372651908
5: . (11838996/11861603) 0.9980941024581584
6: . (11841501/11861603) 0.9983052880795286
7: . (11843704/11861603) 0.998491013398442
8: . (11845304/11861603) 0.9986259024180796
9: . (11846421/11861603) 0.9987200718149141
10: . (11847181/11861603) 0.9987841440992419
20: . (11852226/11861603) 0.9992094660392866
30: . (11853947/11861603) 0.9993545560410343
40: . (11854831/11861603) 0.999429082224384
50: . (11855471/11861603) 0.999483037832239
Stopping: change in training set accuracy less than 1.0E-5
Stats: (11846242/11861603) 0.998704981105842
...done.
Compressed 6623489 parameters to 554312
6892 outcome patterns

Indexing events using cutoff of 0
Computing event counts... done. 6370206 events
Indexing... done.
Collecting events... Done indexing.
Incorporating indexed data for training... done.
Number of Event Tokens: 6370206
Number of Outcomes: 23
Number of Predicates: 3737425
Computing model parameters...
Performing 300 iterations.
1: . (6330365/6370206) 0.9937457281601254
2: . (6345859/6370206) 0.9961779885925196
3: . (6351552/6370206) 0.9970716802564941
4: . (6354847/6370206) 0.9975889319748843
5: . (6356872/6370206) 0.997906818084062
6: . (6358350/6370206) 0.998138835698563
7: . (6359611/6370206) 0.9983367884806237
8: . (6360473/6370206) 0.9984721059256169
9: . (6361138/6370206) 0.9985764981540628
10: . (6361532/6370206) 0.9986383485871572
20: . (6364161/6370206) 0.9990510510963068
30: . (6365106/6370206) 0.9991993979472563
Stopping: change in training set accuracy less than 1.0E-5
Stats: (6360617/6370206) 0.9984947111600473
...done.

Indexing events using cutoff of 0
Computing event counts... done. 6370114 events
Indexing... done.
Collecting events... Done indexing.
Incorporating indexed data for training... done.
Number of Event Tokens: 6370114
Number of Outcomes: 23
Number of Predicates: 3737390
Computing model parameters...
Performing 300 iterations.
1: . (6330266/6370114) 0.9937445389517362
2: . (6345810/6370114) 0.9961846836650019
3: . (6351374/6370114) 0.9970581374210885
4: . (6354747/6370114) 0.9975876412886803
5: . (6356872/6370114) 0.9979212302950936
6: . (6358429/6370114) 0.998165652922381
7: . (6359417/6370114) 0.9983207521874805
8: . (6360292/6370114) 0.9984581123665919
9: . (6361076/6370114) 0.9985811870870757
10: . (6361693/6370114) 0.998678045636232
20: . (6364109/6370114) 0.9990573167136413
30: . (6365008/6370114) 0.9991984444862368
40: . (6365478/6370114) 0.9992722265253023
Stopping: change in training set accuracy less than 1.0E-5
Stats: (6359985/6370114) 0.9984099185666065
...done.

Indexing events using cutoff of 0
Computing event counts... done. 6370480 events
Indexing... done.
Collecting events... Done indexing.
Incorporating indexed data for training... done.
Number of Event Tokens: 6370480
Number of Outcomes: 23
Number of Predicates: 3737798
Computing model parameters...
Performing 300 iterations.
1: . (6330685/6370480) 0.9937532179678769
2: . (6346153/6370480) 0.9961812924614786
3: . (6351726/6370480) 0.9970561088018485
4: . (6355089/6370480) 0.9975840125076917
5: . (6357173/6370480) 0.9979111464128292
6: . (6358780/6370480) 0.9981634036995642
7: . (6359845/6370480) 0.9983305810551167
8: . (6360827/6370480) 0.9984847295651191
9: . (6361316/6370480) 0.9985614898720347
10: . (6362076/6370480) 0.9986807901445417
20: . (6364506/6370480) 0.9990622370684784
30: . (6365415/6370480) 0.9992049264733583
Stopping: change in training set accuracy less than 1.0E-5
Stats: (6362594/6370480) 0.9987621026986977
...done.

Indexing events using cutoff of 0
Computing event counts... done. 6370008 events
Indexing... done.
Collecting events... Done indexing.
Incorporating indexed data for training... done.
Number of Event Tokens: 6370008
Number of Outcomes: 23
Number of Predicates: 3737824
Computing model parameters...
Performing 300 iterations.
1: . (6330200/6370008) 0.9937507142848172
2: . (6345643/6370008) 0.9961750440501802
3: . (6351415/6370008) 0.9970811653611737
4: . (6354522/6370008) 0.9975689198506501
5: . (6356723/6370008) 0.9979144453193779
6: . (6358164/6370008) 0.9981406616757781
7: . (6359399/6370008) 0.9983345389833106
8: . (6360274/6370008) 0.9984719014481614
9: . (6360694/6370008) 0.9985378354312899
10: . (6361531/6370008) 0.9986692324405244
....
....
....
etc. etc.

Is that normal? The parameters are *0 cutoff* and *300 iterations*.
The corpus is relatively small: it has 20k sentences.
I do not remember seeing output like this with the MAXENT classifier.

Damiano