Re: Custom feature generators

Jörn Kottmann Tue, 21 Jun 2011 07:59:09 -0700

On 6/14/11 4:23 AM, [email protected] wrote:

Hi,


Currently, custom feature generators that can be passed from the command
line are implemented only for the NameFinder, but it would be very nice to
have this for all tools.
The Thai sentence detector customization is nice and simple, but to do
something similar for other languages the user would need to branch the
code. We should allow users to pass a factory class name from the command
line. Maybe we could do it for every tool that doesn't use a sequence
feature generator. It would also be nice to save the factory class name to
the model to make sure we are using the same feature generator during
runtime and evaluation.

What do you think? Maybe you have thought of a better solution for that.


The first approach OpenNLP came up with to customize the feature generation
of a component was to simply pass in a context generator. Well, that does not
really work with the new model packages and the command line.
We never really came up with a solution to this problem, or even discussed it.

William suggests that we should use a class name to load a factory class.

And I think we should then also remove the support for passing in a context generator.

I believe it is a good way of solving the issue, since the model can then be used by any code which integrates OpenNLP and has an additional jar on the classpath.
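
To make the idea concrete, here is a minimal sketch of loading a factory by class name. The ContextGenerator and FeatureGeneratorFactory interfaces and the LowercaseTokenFactory class are hypothetical names made up for this illustration, not the OpenNLP API; the only assumption is that the factory class is on the classpath and has a no-argument constructor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical interfaces for illustration only; not OpenNLP's actual API.
interface ContextGenerator {
    List<String> getContext(int index, String[] tokens);
}

interface FeatureGeneratorFactory {
    ContextGenerator createContextGenerator();
}

// Hypothetical user-supplied factory that would live in an additional jar.
class LowercaseTokenFactory implements FeatureGeneratorFactory {
    @Override
    public ContextGenerator createContextGenerator() {
        return (index, tokens) -> {
            List<String> features = new ArrayList<>();
            features.add("w=" + tokens[index].toLowerCase(Locale.ROOT));
            return features;
        };
    }
}

public class FactoryLoader {

    // Looks the class up by name and instantiates it through its
    // no-argument constructor. asSubclass throws a ClassCastException
    // if the named class does not implement the factory interface.
    static FeatureGeneratorFactory load(String className)
            throws ReflectiveOperationException {
        Class<?> clazz = Class.forName(className);
        return clazz.asSubclass(FeatureGeneratorFactory.class)
                .getDeclaredConstructor()
                .newInstance();
    }

    public static void main(String[] args) throws Exception {
        // The class name would come from the command line, or from the
        // model if it is stored there; here it defaults to the example.
        String className = args.length > 0
                ? args[0]
                : LowercaseTokenFactory.class.getName();
        ContextGenerator generator = load(className).createContextGenerator();
        System.out.println(generator.getContext(0, new String[] {"Hello", "world"}));
    }
}
```

Only the class name travels with the command line or the model; the class itself still has to be supplied by whoever runs the code.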

That will for example work well with our UIMA integration.

These models might not be well suited for distribution to a wider group of people, since they always need the factory class, which we cannot put inside the model because of security issues.

For components where we need to adapt the feature generation to a language, I still suggest that we continue to define default feature generation which is dependent on the language, as we already do for Thai in the sentence detector.
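
A language-dependent default could be as simple as a lookup keyed by language code, with a fallback for everything else. The sketch below is only an illustration: the class name DefaultEosChars and the character sets in it are assumptions made for the example, not a copy of what the sentence detector actually does.

```java
import java.util.Arrays;
import java.util.Map;

public class DefaultEosChars {

    // Illustrative values only; the real defaults may differ.
    private static final char[] FALLBACK = {'.', '!', '?'};

    private static final Map<String, char[]> BY_LANGUAGE =
            Map.of("th", new char[] {' ', '\n'});

    // Returns the end-of-sentence characters for a language code,
    // falling back to the generic set when no override is registered.
    // The array is cloned so callers cannot modify the defaults.
    static char[] forLanguage(String languageCode) {
        return BY_LANGUAGE.getOrDefault(languageCode, FALLBACK).clone();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(forLanguage("th")));
        System.out.println(Arrays.toString(forLanguage("en")));
    }
}
```

A user-supplied factory class would then only be needed when these built-in defaults are not enough.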

Well, I am not yet sure how it should be done for the parser, doccat and coref.


Jörn
