Hi Boris,
I think it would be a good idea to mature it a bit more and get everyone
a bit more familiar with the code base.
I created a Jira issue to move the Porter Stemmer over to the tools package:
https://issues.apache.org/jira/browse/OPENNLP-337
This work includes the definition of an interface, and we would need to
write a test for the stemmer so we know it works. It should be easy to test.
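
To make the idea concrete, here is a rough sketch of what such an interface
and a first test could look like. The names Stemmer, ToySuffixStemmer and
check are only assumptions for illustration, and the toy implementation
strips a few suffixes; it is not the Porter algorithm. The real test would
run against the Porter Stemmer with known word/stem pairs.

```java
import java.util.Locale;

public class StemmerSketch {

    /** Hypothetical interface; the name and signature are assumptions. */
    interface Stemmer {
        CharSequence stem(CharSequence word);
    }

    /** Toy stand-in, NOT the Porter algorithm: strips a few common suffixes. */
    static class ToySuffixStemmer implements Stemmer {
        private static final String[] SUFFIXES = {"ing", "ed", "s"};

        @Override
        public CharSequence stem(CharSequence word) {
            String w = word.toString().toLowerCase(Locale.ROOT);
            for (String suffix : SUFFIXES) {
                // Keep at least three characters of stem so short words stay intact.
                if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                    return w.substring(0, w.length() - suffix.length());
                }
            }
            return w;
        }
    }

    /** Fails loudly if the stemmer does not produce the expected stem. */
    static void check(Stemmer stemmer, String input, String expected) {
        String actual = stemmer.stem(input).toString();
        if (!expected.equals(actual)) {
            throw new AssertionError(
                input + ": expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        Stemmer stemmer = new ToySuffixStemmer();
        check(stemmer, "walking", "walk");
        check(stemmer, "Parsed", "pars");
        check(stemmer, "models", "model");
        check(stemmer, "is", "is");
        System.out.println("all stemmer checks passed");
    }
}
```

Coding against the interface means the test only depends on stem(), so the
Porter Stemmer can be swapped in without changing the checks.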
I just tried to compile the project and still get a couple of errors;
it would be nice if you could fix these. It looks like the tests
reference models which do not exist on my file system.
Furthermore, it would be nice if you could make the same change for the
chunker that you made for the POS tagger, where you extract the POS tags
from the Parse objects instead of running the POS tagger. The Parse
object also includes the chunk information, so there should be no need
to run the chunker.
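
As a rough illustration of reading both kinds of information off a parse
tree, here is a small sketch. Node is a hypothetical stand-in type, not
OpenNLP's Parse class, and the rule "a chunk is a phrase whose children are
all tagged tokens" is a simplifying assumption for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class ParseTagSketch {

    /** Hypothetical stand-in for a parse tree node; not OpenNLP's Parse class. */
    static class Node {
        final String type;                       // phrase label or POS tag
        final String text;                       // token text, null for phrases
        final List<Node> children = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    /** Builds a tagged token, e.g. leaf("NN", "dog"). */
    static Node leaf(String tag, String token) {
        return new Node(tag, token);
    }

    /** Builds a phrase node, e.g. phrase("NP", ...). */
    static Node phrase(String label, Node... kids) {
        Node n = new Node(label, null);
        for (Node k : kids) {
            n.children.add(k);
        }
        return n;
    }

    /** Collects POS tags left to right; no tagger is run. */
    static List<String> posTags(Node root) {
        List<String> out = new ArrayList<>();
        collectTags(root, out);
        return out;
    }

    private static void collectTags(Node n, List<String> out) {
        if (n.children.isEmpty()) {
            out.add(n.type);
            return;
        }
        for (Node c : n.children) {
            collectTags(c, out);
        }
    }

    /** Collects "LABEL tokens..." for phrases whose children are all tokens. */
    static List<String> chunks(Node root) {
        List<String> out = new ArrayList<>();
        collectChunks(root, out);
        return out;
    }

    private static void collectChunks(Node n, List<String> out) {
        if (n.children.isEmpty()) {
            return; // a lone token outside a flat phrase is not a chunk here
        }
        boolean flat = true;
        for (Node c : n.children) {
            if (!c.children.isEmpty()) {
                flat = false;
            }
        }
        if (flat) {
            StringBuilder sb = new StringBuilder(n.type);
            for (Node c : n.children) {
                sb.append(' ').append(c.text);
            }
            out.add(sb.toString());
        } else {
            for (Node c : n.children) {
                collectChunks(c, out);
            }
        }
    }

    public static void main(String[] args) {
        // (S (NP (DT the) (NN dog)) (VP (VBZ barks)))
        Node tree = phrase("S",
            phrase("NP", leaf("DT", "the"), leaf("NN", "dog")),
            phrase("VP", leaf("VBZ", "barks")));
        System.out.println(posTags(tree)); // prints [DT, NN, VBZ]
        System.out.println(chunks(tree));  // prints [NP the dog, VP barks]
    }
}
```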
We would need a bit of documentation so that people can understand what
it does and how it can be used.
What do you think?
Jörn
On 11/7/11 11:38 PM, Boris Galitsky wrote:
Hi Jörn
I think the 'similarity' module is in good shape now. What would
be the next steps?
Regards
Boris