+1 to add pluggable machine learning algorithms
+1 to improve the API and remove deprecated methods in 1.6.0

You can assign related Jira issues to me and I will be glad to help.


On Thu, May 30, 2013 at 11:53 AM, Jörn Kottmann <[email protected]> wrote:

> Hi all,
>
> we have already spoken about this here and there. To ensure that OpenNLP
> can stay competitive with other NLP libraries, I am proposing to make the
> machine learning pluggable.
>
> The extensions should not make OpenNLP harder to use. If a user loads a
> model, OpenNLP should be capable of setting up everything by itself without
> forcing the user to write custom integration code for the ML
> implementation.
> We already solved this problem with the extension mechanism we built to
> support the customization of our components. I suggest that we reuse this
> extension mechanism to load an ML implementation. To use a custom ML
> implementation, the user has to specify the class name of the factory in the
> Algorithm field of the params file. The params file is available at
> training and tagging time.
>
> Most components in the tools package use the maxent library to do
> classification. The Java interfaces for this are currently located in the
> maxent package. To be able to swap the implementation, the interfaces should
> be defined inside the tools package. To make things easier, I propose to
> move the maxent and perceptron implementations as well.
>
> Throughout the code base we use AbstractModel. That is a bit unfortunate,
> because the only reason for it is the lack of model serialization support
> in the MaxentModel interface. A serialization method should be added to
> that interface, and it could perhaps be renamed to ClassificationModel.
> This will break backward compatibility in non-standard use cases.
>
> To be able to test the extension mechanism I suggest that we implement an
> addon which integrates liblinear and the Apache Mahout classifiers.
>
> There are still a few deprecated 1.4 constructors and methods in OpenNLP
> which directly reference interfaces and classes in the maxent library.
> These need to be removed before the interfaces can be moved to the tools
> package.
>
> Any opinions?
>
> Jörn
>
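
To make the factory idea in the quoted proposal a bit more concrete, here is a minimal sketch of loading an ML implementation from the class name given in the Algorithm field. Everything in it is an assumption for illustration: Classifier, ClassifierFactory, FactoryLoader, MajorityClassifierFactory and the DefaultLabel parameter are hypothetical names, not existing OpenNLP API, and the params file is stood in for by a plain Map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical classifier abstraction; not an existing OpenNLP interface.
interface Classifier {
    String classify(String[] features);
}

// Hypothetical factory contract that a pluggable ML implementation would provide.
interface ClassifierFactory {
    Classifier create(Map<String, String> params);
}

// Trivial stand-in implementation: always predicts one configured label.
class MajorityClassifierFactory implements ClassifierFactory {
    @Override
    public Classifier create(Map<String, String> params) {
        // "DefaultLabel" is an invented parameter used only for this sketch.
        String label = params.getOrDefault("DefaultLabel", "other");
        return features -> label;
    }
}

// Resolves the factory from the class name stored under "Algorithm".
final class FactoryLoader {
    private FactoryLoader() {}

    static ClassifierFactory load(Map<String, String> params) {
        String className = params.get("Algorithm");
        if (className == null) {
            throw new IllegalArgumentException("No Algorithm entry in params");
        }
        try {
            // Instantiate via the no-argument constructor of the named class.
            return Class.forName(className)
                    .asSubclass(ClassifierFactory.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load factory: " + className, e);
        }
    }
}

class PluggableMlDemo {
    public static void main(String[] args) {
        // Stand-in for the parsed params file.
        Map<String, String> params = new HashMap<>();
        params.put("Algorithm", MajorityClassifierFactory.class.getName());
        params.put("DefaultLabel", "person");

        Classifier classifier = FactoryLoader.load(params).create(params);
        System.out.println(classifier.classify(new String[] {"w=Kottmann"}));
    }
}
```

Because the same params are available at training and tagging time, the user would only name the factory class and would not need to write integration code.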

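On the serialization point, a rough sketch of what a ClassificationModel interface with a serialization method could look like is below. This is only my reading of the proposal: the method names, the UniformModel class and the binary format are all assumptions for illustration, not the current MaxentModel API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical interface shape; method names are assumptions for this sketch.
interface ClassificationModel {
    double[] eval(String[] context);
    String getOutcome(int index);
    int getNumOutcomes();

    // The addition under discussion: the model can write itself out, so
    // callers no longer need to depend on a concrete base class to save it.
    void serialize(OutputStream out) throws IOException;
}

// Toy model that assigns equal probability to every outcome.
final class UniformModel implements ClassificationModel {
    private final String[] outcomes;

    UniformModel(String... outcomes) {
        this.outcomes = outcomes.clone();
    }

    @Override
    public double[] eval(String[] context) {
        double[] probs = new double[outcomes.length];
        Arrays.fill(probs, 1.0 / outcomes.length);
        return probs;
    }

    @Override
    public String getOutcome(int index) {
        return outcomes[index];
    }

    @Override
    public int getNumOutcomes() {
        return outcomes.length;
    }

    @Override
    public void serialize(OutputStream out) throws IOException {
        // Invented format: outcome count followed by each outcome label.
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(outcomes.length);
        for (String outcome : outcomes) {
            data.writeUTF(outcome);
        }
        data.flush();
    }

    static UniformModel deserialize(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        String[] outcomes = new String[data.readInt()];
        for (int i = 0; i < outcomes.length; i++) {
            outcomes[i] = data.readUTF();
        }
        return new UniformModel(outcomes);
    }
}

class SerializationDemo {
    public static void main(String[] args) throws IOException {
        ClassificationModel model = new UniformModel("person", "location");

        // Round trip through the interface method only.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        model.serialize(buffer);
        ClassificationModel restored =
                UniformModel.deserialize(new ByteArrayInputStream(buffer.toByteArray()));

        System.out.println(restored.getNumOutcomes() + " outcomes, first: "
                + restored.getOutcome(0));
    }
}
```

With something like this, component code could hold a ClassificationModel reference everywhere, and only the loading side would need to know the concrete type.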