Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-09 Thread Rodrigo Agerri
Hello, On Wed, Oct 8, 2014 at 7:32 PM, Jörn Kottmann wrote: > You observed this earlier: >> Only one issue remains: The requirement to add -factory parameter for >> the -featuregen parameter to work and its backing-off to default >> features without warning if the -factory param is not used. > >

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-08 Thread Mark G
Rodrigo, thanks for this fix, let me know if you want me to test, or just put a second set of eyes on any particular parts. MG On Wed, Oct 8, 2014 at 2:32 PM, Jörn Kottmann wrote: > Well done, this was a serious regression. > > You observed this earlier: > > Only one issue remains: The requireme

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-08 Thread Jörn Kottmann
Well done, this was a serious regression. You observed this earlier: > Only one issue remains: The requirement to add -factory parameter for > the -featuregen parameter to work and its backing-off to default > features without warning if the -factory param is not used. Is that still the case with

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-08 Thread Rodrigo Agerri
Hello, On Wed, Oct 8, 2014 at 8:17 AM, Jörn Kottmann wrote: > > +1 for the first option. Great, I have committed and closed the issue. Thanks! Rodrigo

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-08 Thread Jörn Kottmann
On 10/07/2014 06:40 PM, Rodrigo Agerri wrote: Hello, One question regarding the WordClusterFeatureGenerator implementation, which I am using as a template for the Brown features and so on. I cannot seem to make it work: it complains all the time that the value of the attribute "dict" I provide is n

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-08 Thread Jörn Kottmann
On 10/06/2014 11:35 PM, Rodrigo Agerri wrote: Hi, On Mon, Oct 6, 2014 at 11:19 PM, Jörn Kottmann wrote: I see two ways to fix this: - The way you suggested, by extracting the XML descriptor from the TokenNameFinderFactory - Or by returning the XML descriptor as part of the TokenNameFinder.get

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-07 Thread Rodrigo Agerri
Hello, One question regarding the WordClusterFeatureGenerator implementation, which I am using as a template for the Brown features and so on. I cannot seem to make it work: it complains all the time that the value of the attribute "dict" I provide is not an instance of a W2VClassesDictionary: Excep

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Rodrigo Agerri
Hi, On Mon, Oct 6, 2014 at 11:19 PM, Jörn Kottmann wrote: > I see two ways to fix this: > - The way you suggested, by extracting the XML descriptor from the > TokenNameFinderFactory > - Or by returning the XML descriptor as part of the > TokenNameFinder.getResources() method. The first one is c

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Jörn Kottmann
Hello, now I understand it much better. Previously it was only possible to provide the XML descriptor as bytes or an instance of a Feature Generator to the train method. The method which accepted the XML descriptor then instantiated it and called the train method which takes a Feature Generator,

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Rodrigo Agerri
Hi, On Mon, Oct 6, 2014 at 5:41 PM, Jörn Kottmann wrote: > > Isn't that how it is implemented today? The feature generators can't be > shared > and therefore we have the createFeatureGenerators method in the > TokenNameFinderFactory > which creates a new feature generator every time one is needed
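The design point in the message above (feature generators cannot be shared, so the factory builds a new one each time) can be sketched as follows. This is a minimal illustration under stated assumptions, not OpenNLP code: SketchFeatureGenerator, TokenSketchGenerator and SketchNameFinderFactory are hypothetical names, the descriptor "parsing" is a toy substring check, and only the method name createFeatureGenerators is taken from the thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a feature generator; NOT the OpenNLP interface.
interface SketchFeatureGenerator {
    List<String> createFeatures(String[] tokens, int index);
}

// Hypothetical generator with mutable per-instance state, which is the
// reason instances should not be shared between callers.
class TokenSketchGenerator implements SketchFeatureGenerator {
    private final boolean useToken;
    private int calls = 0; // per-instance state

    TokenSketchGenerator(boolean useToken) {
        this.useToken = useToken;
    }

    @Override
    public List<String> createFeatures(String[] tokens, int index) {
        calls++;
        List<String> features = new ArrayList<>();
        if (useToken) {
            features.add("w=" + tokens[index]);
        }
        return features;
    }

    int getCalls() {
        return calls;
    }
}

// Hypothetical factory: it keeps only the serialized descriptor and
// instantiates a fresh generator on every request.
class SketchNameFinderFactory {
    private final byte[] descriptorBytes;

    SketchNameFinderFactory(byte[] descriptorBytes) {
        this.descriptorBytes = descriptorBytes.clone();
    }

    // Assumed getter exposing the raw descriptor, e.g. so that it could be
    // stored next to a trained model.
    byte[] getFeatureGeneratorBytes() {
        return descriptorBytes.clone();
    }

    // Builds a new generator each time; the substring check stands in for
    // real XML descriptor parsing.
    SketchFeatureGenerator createFeatureGenerators() {
        String descriptor = new String(descriptorBytes, StandardCharsets.UTF_8);
        return new TokenSketchGenerator(descriptor.contains("<token"));
    }
}
```

Because the factory holds the descriptor rather than a generator instance, the same bytes can be used both to create generators and to record which features a model was trained with.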

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Jörn Kottmann
On 10/06/2014 04:49 PM, Rodrigo Agerri wrote: As I said, I have issue 717 solved by adding a getter for the featureGenerator in the TokenNameFinderFactory and using that getter to correctly parametrize the creation of the TokenNameFinderModel after training. Isn't that how it is implemented today? T

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Rodrigo Agerri
On Mon, Oct 6, 2014 at 4:35 PM, Jörn Kottmann wrote: > On 10/04/2014 12:53 AM, Rodrigo Agerri wrote: >> >> Hi, >> >> As a follow-up, it turns out that currently we can provide a feature >> generator via the -featuregen parameter only if we also provide a subclass via the >> -factory parameter. I do not

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Rodrigo Agerri
Cool, As I said, I have issue 717 solved by adding a getter for the featureGenerator in the TokenNameFinderFactory and using that getter to correctly parametrize the creation of the TokenNameFinderModel after training. But maybe another solution is possible, of course. R On Mon, Oct 6, 2014 at 4:46

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Jörn Kottmann
On 10/06/2014 02:04 PM, Rodrigo Agerri wrote: All these problems are solved as per this issue: https://issues.apache.org/jira/browse/OPENNLP-717 Only one issue remains: The requirement to add -factory parameter for the -featuregen parameter to work and its backing-off to default features withou

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Jörn Kottmann
On 10/04/2014 12:53 AM, Rodrigo Agerri wrote: Hi, As a follow-up, it turns out that currently we can provide a feature generator via the -featuregen parameter only if we also provide a subclass via the -factory parameter. I do not know if that is intended. It is not. The featuregen parameter works (

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Rodrigo Agerri
Hi, As a follow-up, it turns out that currently we can provide a feature generator via the -featuregen parameter only if we also provide a subclass via the -factory parameter. I do not know if that is intended. Also, I have noticed a very weird behaviour: I pass several descriptors via CLI (starting wit

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-06 Thread Rodrigo Agerri
Hi again, All these problems are solved as per this issue: https://issues.apache.org/jira/browse/OPENNLP-717 Only one issue remains: The requirement to add -factory parameter for the -featuregen parameter to work and its backing-off to default features without warning if the -factory param is no
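The remaining issue described above (a -featuregen value being silently replaced by default features when -factory is absent) could be made visible with a warning. The sketch below models only the behaviour reported in this thread and adds that warning; it is not the OpenNLP CLI code, and FeatureGenResolver, Resolution and the "default" marker are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical result type: which feature configuration ends up being used,
// plus any warnings that should be shown to the user.
class Resolution {
    final String featureSource;
    final List<String> warnings;

    Resolution(String featureSource, List<String> warnings) {
        this.featureSource = featureSource;
        this.warnings = warnings;
    }
}

// Hypothetical resolver; models the reported back-off, with a warning added.
class FeatureGenResolver {
    static final String DEFAULT_FEATURES = "default"; // assumed marker value

    static Resolution resolve(String featuregenArg, String factoryArg) {
        List<String> warnings = new ArrayList<>();
        if (featuregenArg == null) {
            // No descriptor requested: default features are the expected outcome.
            return new Resolution(DEFAULT_FEATURES, warnings);
        }
        if (factoryArg == null) {
            // Reported behaviour: the descriptor is not used without -factory.
            // The sketch says so explicitly instead of staying silent.
            warnings.add("-featuregen " + featuregenArg
                    + " is ignored because -factory is not set; using default features");
            return new Resolution(DEFAULT_FEATURES, warnings);
        }
        return new Resolution(featuregenArg, warnings);
    }
}
```

A caller would print each entry of warnings before training starts, so a user who passes a descriptor learns immediately that it was not applied.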

Re: [opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-03 Thread Jörn Kottmann
On 10/03/2014 11:58 AM, Rodrigo Agerri wrote: I have implemented a number of new features for the name finder. These include Brown cluster features (duplicated per Brown path for each feature activated involving a token) and Clark cluster features (similar to the WordClusterFeatureGenerator curr

[opennlp-dev] TokenNameFinderFactory new features and extension

2014-10-03 Thread Rodrigo Agerri
Hello, I have implemented a number of new features for the name finder. These include Brown cluster features (duplicated per Brown path for each feature activated involving a token) and Clark cluster features (similar to the WordClusterFeatureGenerator currently available) among other local extra