And what about sequence validators? How would one swap in an alternative to
the default one?

The factory should be used to load custom resources, like a different
implementation of a dictionary, am I right?
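
As a rough illustration of the idea discussed in this thread, here is a minimal
sketch of loading a factory by class name and asking it for a sequence
validator. Every name in it (FactorySketch, ToolFactory, TagValidator,
DefaultFactory, NoRepeatFactory, load) is hypothetical and invented for this
example; none of them is an existing OpenNLP class, and the real interfaces
may well look different.

```java
import java.util.List;

public class FactorySketch {

    /** Hypothetical validator interface (an assumption, not the OpenNLP API). */
    public interface TagValidator {
        boolean validSequence(int index, List<String> tokens,
                              List<String> priorTags, String candidate);
    }

    /** Hypothetical per-component factory that decides how the tool is built. */
    public interface ToolFactory {
        TagValidator createValidator();
    }

    /** Default behaviour: every candidate tag is accepted. */
    public static class DefaultFactory implements ToolFactory {
        @Override
        public TagValidator createValidator() {
            return (index, tokens, priorTags, candidate) -> true;
        }
    }

    /** Example customization: reject the same tag twice in a row. */
    public static class NoRepeatFactory implements ToolFactory {
        @Override
        public TagValidator createValidator() {
            return (index, tokens, priorTags, candidate) ->
                priorTags.isEmpty()
                    || !priorTags.get(priorTags.size() - 1).equals(candidate);
        }
    }

    /**
     * Instantiates the factory named by className, as it might be passed on
     * the command line or read back from a model. Falls back to the default
     * factory when no class name is given.
     */
    public static ToolFactory load(String className)
            throws ReflectiveOperationException {
        if (className == null) {
            return new DefaultFactory();
        }
        Class<?> cls = Class.forName(className);
        return cls.asSubclass(ToolFactory.class)
                  .getDeclaredConstructor()
                  .newInstance();
    }
}
```

With something like this, the class name string is the only thing that has to
be stored with the model; the class itself has to be on the classpath of
whatever code loads the model.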

Thank you,
William

On Tue, Feb 7, 2012 at 11:57 AM, Joern Kottmann <[email protected]> wrote:

> Yes, let's see what we could do.
>
> The name finder already supports custom feature generation,
> the same feature generation code could be reused by the POS Tagger.
> This is actually already half done.
>
> One of the current limitations is that we cannot store "custom" resources
> in a model. If we specify some kind of Factory class it would be nice if it
> can help us to locate the Artifact Serializer for a custom resource.
>
> We could define one Factory class per component which is able to influence
> how this component is created from the model.
>
> What do you think?
>
> Jörn
>
> On Tue, Feb 7, 2012 at 2:17 PM, [email protected] <
> [email protected]> wrote:
>
> > Hi,
> >
> > I would like to work on that now, passing a Factory class name to the CLI
> > tools and saving it to the model as a configuration.
> > Do you still think it is a good idea? Or should we find a better way to
> > load custom feature generators and custom sequence validators? I would
> > like to do it for SentenceDetector and POS Tagger for now.
> >
> > Thanks,
> > William
> >
> > On Tue, Jun 21, 2011 at 11:58 AM, Jörn Kottmann <[email protected]>
> > wrote:
> >
> > > On 6/14/11 4:23 AM, [email protected] wrote:
> > >
> > >> Hi,
> > >>
> > >> Currently we have custom feature generators that can be passed from
> > >> the command line only for NameFinder, but it would be very nice to
> > >> have this for all tools.
> > >> The Thai sentence detector customization is nice and simple, but to do
> > >> something for other languages the user would need to branch the code.
> > >> We should allow users to pass a factory class name from the command
> > >> line. Maybe we could do it for every tool that doesn't use a sequence
> > >> feature generator. It would also be nice to save the factory class
> > >> name to the model, to make sure we are using the same feature
> > >> generator during runtime and evaluation.
> > >>
> > >> What do you think? Maybe you have thought of a better solution for that.
> > >>
> > >
> > > The first approach OpenNLP came up with to customize the feature
> > > generation of a component was to simply pass in a context generator.
> > > Well, that does not really work with the new model packages and the
> > > command line.
> > > We never really came up with a solution to this problem or discussed it.
> > >
> > > William suggests that we should use a class name to load a factory
> > > class. And I think we should then also remove the support for passing
> > > in a context generator.
> > >
> > > I believe it is a good way of solving the issue, since the model can
> > > then be used by any code which integrates OpenNLP and has an additional
> > > jar on the classpath.
> > > That will, for example, work well with our UIMA integration.
> > >
> > > These models might not be well suited for distribution to a wider group
> > > of people, since they always need the factory class, which we cannot put
> > > inside the model because of security issues.
> > >
> > > For components where we need to adapt the feature generation to a
> > > language, I still suggest that we continue to define default feature
> > > generation which is dependent on the language, as we already do for
> > > Thai in the sentence detector.
> > >
> > > Well, I am not yet sure how it should be done for the parser, doccat
> > > and coref.
> > >
> > > Jörn
> > >
> >
>
