Re: Chunker - proposal to change API (break compatibility)

William Colen Thu, 10 Nov 2016 17:53:41 -0800

I tried that, but we have an issue with the factories we created. To
customize, we extend the factory, but the method we need to override does
not allow a generic type.


  public SequenceValidator<String> getSequenceValidator() {
    return new DefaultChunkerSequenceValidator();
  }

I tried changing String to a wildcard (?), but that breaks a lot of code.
I am no longer sure this is a simple change.
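
A minimal sketch of the direction discussed in this thread, not OpenNLP's
actual API: a validator that can see the token as well as the tag, returned
from a factory that is generic in the sequence element type. Everything here
is an assumption for illustration. The SequenceValidator interface is a local
stand-in intended to mirror the shape of OpenNLP's SequenceValidator<T>;
TokenWithPos, GenericChunkerFactory, TokenAwareValidator and the
"que/conj/B-NP" rule are hypothetical names and data.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class ChunkerValidatorSketch {

  /** Local stand-in intended to mirror the shape of OpenNLP's SequenceValidator<T>. */
  interface SequenceValidator<T> {
    boolean validSequence(int i, T[] inputSequence, String[] outcomesSequence, String outcome);
  }

  /** Hypothetical wrapper pairing a token with its POS tag. */
  static final class TokenWithPos {
    final String token;
    final String tag;

    TokenWithPos(String token, String tag) {
      this.token = token;
      this.tag = tag;
    }
  }

  /** Rejects an outcome when the token/tag/outcome combination is blacklisted. */
  static final class TokenAwareValidator implements SequenceValidator<TokenWithPos> {
    private final Set<String> forbidden;

    TokenAwareValidator(String... forbiddenKeys) {
      this.forbidden = new HashSet<>(Arrays.asList(forbiddenKeys));
    }

    @Override
    public boolean validSequence(int i, TokenWithPos[] input, String[] outcomes, String outcome) {
      TokenWithPos t = input[i];
      // Key format "token/tag/outcome" is an assumption of this sketch.
      String key = t.token.toLowerCase(Locale.ROOT) + "/" + t.tag + "/" + outcome;
      return !forbidden.contains(key);
    }
  }

  /** Hypothetical factory that is generic in the sequence element type. */
  static abstract class GenericChunkerFactory<T> {
    abstract SequenceValidator<T> getSequenceValidator();
  }

  /** A customization that overrides the factory method with a token-aware validator. */
  static final class TokenAwareFactory extends GenericChunkerFactory<TokenWithPos> {
    @Override
    SequenceValidator<TokenWithPos> getSequenceValidator() {
      // Made-up rule: the word "que" tagged "conj" may not start an NP chunk.
      return new TokenAwareValidator("que/conj/B-NP");
    }
  }
}
```

The sketch avoids the wildcard by making the factory itself generic, which is
where the cost lies: every caller of the factory then has to carry the type
parameter too.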

Thank you
William

2016-11-10 10:23 GMT-02:00 Joern Kottmann <kottm...@gmail.com>:

> The sequence we have today is usually of type String, but it is generic so
> it could also be about a wrapper object which has the token and tag, e.g.
> TokenWithPos.
> On such a sequence we should be able to use most of the existing interfaces
> without too much change, right?
>
> Jörn
>
> On Thu, Nov 10, 2016 at 10:33 AM, William Colen <william.co...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Today the Chunker sequence is the sentence's POS tags.
> >
> > Although we use both the tokens and tags in the context generator, in the
> > current API we cannot use the token in the sequence validator, because we
> > do not have access to it.
> >
> > In Portuguese, I know that some combinations of word + tag never occur
> > in a specific kind of phrase. Today I cannot add a rule for this filter
> > to the sequence validator.
> >
> > I know maybe it is better to train the model so it will learn, but the
> > hack of adding this rule to the sequence validator is helpful.
> >
> > Do you think we can change it for release 1.7.0? I already tried this
> > change in a local branch for a personal project and it works (although
> > that was on OpenNLP 1.5.3).
> >
> > This would break API backward compatibility, but the existing models would
> > not be affected.
> >
> > Thank you
> > William
> >
>
