big +1!

Tommaso


2013/5/31 William Colen <[email protected]>

> I don't see any issue. People who use Maxent directly would need to change
> how they use it, but that is OK for a major release.
>
>
>
>
> On Thu, May 30, 2013 at 5:56 PM, Jörn Kottmann <[email protected]> wrote:
>
> > Are there any objections to moving the maxent/perceptron classes to an
> > opennlp.tools.ml package as part of this issue? Moving them would avoid a
> > second interface layer and probably make using OpenNLP Tools a bit easier,
> > because then we would be down to a single jar.
> >
> > Jörn
> >
> >
> > On 05/30/2013 08:57 PM, William Colen wrote:
> >
> >> +1 to add pluggable machine learning algorithms
> >> +1 to improve the API and remove deprecated methods in 1.6.0
> >>
> >> You can assign related Jira issues to me and I will be glad to help.
> >>
> >>
> >> On Thu, May 30, 2013 at 11:53 AM, Jörn Kottmann <[email protected]>
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> We have spoken about this here and there already: to ensure that OpenNLP
> >>> can stay competitive with other NLP libraries, I am proposing to make the
> >>> machine learning pluggable.
> >>>
> >>> The extension should not make OpenNLP harder to use: if a user loads a
> >>> model, OpenNLP should be capable of setting up everything by itself,
> >>> without forcing the user to write custom integration code for the ml
> >>> implementation. We already solved this problem with the extension
> >>> mechanism we built to support the customization of our components, and I
> >>> suggest that we reuse this mechanism to load an ml implementation. To use
> >>> a custom ml implementation, the user has to specify the class name of the
> >>> factory in the Algorithm field of the params file. The params file is
> >>> available during both training and tagging time.
> >>>
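(For illustration only: with the existing training parameters format, such a
params file might then look roughly like the sketch below. The factory class
name is purely hypothetical, and the built-in trainers would presumably still
be selected via the current values such as MAXENT or PERCEPTRON.)

  Algorithm=org.example.ml.LiblinearTrainerFactory
  Iterations=100
  Cutoff=5
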
> >>> Most components in the tools package use the maxent library to do
> >>> classification. The Java interfaces for this are currently located in the
> >>> maxent package; to be able to swap the implementation, the interfaces
> >>> should be defined inside the tools package. To make things easier I
> >>> propose to move the maxent and perceptron implementations as well.
> >>>
> >>> Throughout the code base we use AbstractModel; that is a bit unfortunate,
> >>> because the only reason for this is the lack of model serialization
> >>> support in the MaxentModel interface. A serialization method should be
> >>> added to it, and the interface maybe renamed to ClassificationModel. This
> >>> will break backward compatibility in non-standard use cases.
> >>>
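(A rough sketch of what the renamed interface could look like; the package name
and the serialize method are only illustrative, not a settled API. The eval and
getBestOutcome signatures are taken from today's MaxentModel.)

  package opennlp.tools.ml.model;

  import java.io.IOException;
  import java.io.OutputStream;

  public interface ClassificationModel {

      // classification methods as offered by today's MaxentModel
      double[] eval(String[] context);

      String getBestOutcome(double[] outcomes);

      // the proposed serialization hook, so callers no longer need to
      // depend on AbstractModel just to persist a model
      void serialize(OutputStream out) throws IOException;
  }
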
> >>> To be able to test the extension mechanism I suggest that we implement
> >>> an addon which integrates liblinear and the Apache Mahout classifiers.
> >>>
> >>> There are still a few deprecated 1.4 constructors and methods in OpenNLP
> >>> which directly reference interfaces and classes in the maxent library;
> >>> these need to be removed to be able to move the interfaces to the tools
> >>> package.
> >>>
> >>> Any opinions?
> >>>
> >>> Jörn
> >>>
> >>>
> >
>
