I tested the perceptron changes with the POS Tagger on the
WSJ corpus (without a tag dictionary). The accuracy improved from 91%
(with 1.5.1-rc2; 1.5.0 has a bug which prevents perceptron training) to 95%.
I also tested on the Dutch Alpino dataset, but the accuracy was
identical there.
Jörn
On 4/26/11 9:44 AM, Jörn Kottmann wrote:
Hello,
our next and hopefully last RC is ready for testing. Compared to RC 6,
it fixes:
OPENNLP-154 normalization in perceptron
OPENNLP-155 unreliable training set accuracy in perceptron
OPENNLP-156 improve command line apps (ModelTrainer and ModelApplier) for maxent
It can be found here:
http://people.apache.org/~joern/releases/opennlp-1.5.1-incubating/rc7/
Testing is now more or less done, but the test plan should be extended
to document the perceptron improvements compared to RC 6.
Here is the link to the test plan:
https://cwiki.apache.org/confluence/display/OPENNLP/TestPlan1.5.1
Jörn