Hi William,
No, I think it will be fine. The problem only arises in data where
back-to-back names are tagged in a sentence. Before the fix, the models
would tag them with the wrong type, e.g. both could come out as the same
type, such as person, instead of two different types, one person and the
other perhaps miscellaneous.
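To make the back-to-back case concrete, here is a minimal sketch of turning a sequence of tagger outcomes into typed spans. It is only an illustration, not the actual OpenNLP code or the OPENNLP-417 patch: the class name, the method name, the outcome strings ("person-start", "person-cont", "other") and the printed span format are all assumptions made for the example. The point it shows is that each span should take its type from its own start outcome, so two adjacent names keep their separate types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the OpenNLP implementation.
class BackToBackTagSketch {

    /**
     * Converts per-token outcomes into spans written as "type[start,end)".
     * Assumed outcome format: "<type>-start", "<type>-cont", or "other".
     * For brevity, a "-cont" outcome extends whatever span is open,
     * without checking that its type matches.
     */
    static List<String> decode(String[] outcomes) {
        List<String> spans = new ArrayList<>();
        int start = -1;     // first token of the open span, or -1 if none
        String type = null; // type of the open span

        for (int i = 0; i < outcomes.length; i++) {
            String outcome = outcomes[i];
            boolean isStart = outcome.endsWith("-start");
            boolean isCont = outcome.endsWith("-cont");

            // Anything other than a continuation closes the open span,
            // including a new "-start" directly after a name.
            if (start >= 0 && !isCont) {
                spans.add(type + "[" + start + "," + i + ")");
                start = -1;
            }
            if (isStart) {
                start = i;
                // The type is read from this span's own start outcome.
                type = outcome.substring(0, outcome.length() - "-start".length());
            }
        }
        if (start >= 0) {
            spans.add(type + "[" + start + "," + outcomes.length + ")");
        }
        return spans;
    }

    public static void main(String[] args) {
        // A two-token person name immediately followed by a misc name.
        String[] outcomes = {"person-start", "person-cont", "misc-start", "other"};
        System.out.println(decode(outcomes)); // [person[0,2), misc[2,3)]
    }
}
```

With the behaviour described above, the second span would have come out as person as well; here it keeps its own type.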
Some of the models, especially the combined Name Finder models that
contain all the tags, were affected most, since the likelihood of
back-to-back tags is higher there.
In the English models there were 3 sentences that had improper tags
before and now have the correct tags with the fix. This improved the
scores a bit.
Training should still produce identical models, since the problem was
with the output tagging and not with the training of the models.
James
On 3/14/2013 11:00 PM, William Colen wrote:
Hi, James,
Thank you for the warning. It didn't affect the test with the Leipzig
corpus: the output from 1.5.2 and 1.5.3 is identical. Do you think we
should check the output manually anyway?
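If a manual check is wanted, a line-by-line comparison of the two output files is one way to narrow down which sentences to look at. The sketch below is only an assumption about how that could be done: the class and method names are made up for the example, and the two file paths are expected as command-line arguments rather than taken from the test plan.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for comparing tagger output from two releases.
class OutputDiffSketch {

    /** Returns the 1-based line numbers at which the two outputs differ. */
    static List<Integer> differingLines(List<String> oldOutput, List<String> newOutput) {
        List<Integer> diffs = new ArrayList<>();
        int max = Math.max(oldOutput.size(), newOutput.size());
        for (int i = 0; i < max; i++) {
            // A line missing from the shorter file counts as a difference.
            String left = i < oldOutput.size() ? oldOutput.get(i) : null;
            String right = i < newOutput.size() ? newOutput.get(i) : null;
            if (left == null || !left.equals(right)) {
                diffs.add(i + 1);
            }
        }
        return diffs;
    }

    public static void main(String[] args) throws IOException {
        // args[0] and args[1] are placeholder paths to the two output files,
        // e.g. one produced with 1.5.2 and one with 1.5.3.
        List<String> oldOutput = Files.readAllLines(Paths.get(args[0]));
        List<String> newOutput = Files.readAllLines(Paths.get(args[1]));
        System.out.println("Differing lines: " + differingLines(oldOutput, newOutput));
    }
}
```

An empty list means the two files match line for line.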
Thank you,
William
On Thu, Mar 14, 2013 at 12:09 AM, James Kosin <james.ko...@gmail.com> wrote:
Hi all,
Note that we will have some discrepancies in model performance for some
of the NameFinder tests due to OPENNLP-417, which fixes the back-to-back
name tags.
It should really be limited to the combined name tags, but it could also
affect others.
James
On 3/8/2013 9:11 AM, William Colen wrote:
Hi all,
Our second release candidate is ready for testing. RC1 failed to pass the
initial quality check.
RC 2 can be downloaded from here:
http://people.apache.org/~colen/releases/opennlp-1.5.3/rc2/
To use it in a Maven build, set the version for opennlp-tools or
opennlp-uima to 1.5.3 and for opennlp-maxent to 3.0.3, and add this URL
to your settings.xml file:
https://repository.apache.org/content/repositories/orgapacheopennlp-005/
The current test plan can be found here:
https://cwiki.apache.org/OPENNLP/testplan153.html
Please sign up for tasks in the test plan.
The release plan can be found here:
https://cwiki.apache.org/OPENNLP/releaseplanandtasks153.html
The RC contains quite a few changes; please refer to the included issue
list for details.
William