Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-17 Thread William Colen
Would be a pleasure. Let's prepare the next OpenNLP RC, and I will create a PR
with the update.



2017-05-16 14:36 GMT-03:00 Richard Eckart de Castilho :

> Hi William,
>
> > On 16.05.2017, at 14:35, William Colen  wrote:
> >
> > I cloned the DKPro code and tried Rodrigo's proposed changes. Your test
> > passes with it.
>
> cool :)
>
> Would you like to contribute the changes to DKPro Core?
>
> Cheers,
>
> -- Richard
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-16 Thread Richard Eckart de Castilho
Hi William,

> On 16.05.2017, at 14:35, William Colen  wrote:
> 
> I cloned the DKPro code and tried Rodrigo's proposed changes. Your test
> passes with it.

cool :) 

Would you like to contribute the changes to DKPro Core?

Cheers,

-- Richard


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-16 Thread William Colen
Hi Richard,

I cloned the DKPro code and tried Rodrigo's proposed changes. Your test
passes with it.

Thank you
William

2017-05-15 18:51 GMT-03:00 Rodrigo Agerri :

> Hello Richard,
>
> I have tried with various corpora, including GUM, but I cannot reproduce
> that error.
>
> https://github.com/apache/opennlp/commit/8a3b3b537a30b14c4ffb5eb32ffa41d5027bddad
>
> Please note that commit O-904 changed (broke) the lemmatizer API
> substantially to make it uniform between DictionaryLemmatizer and the
> LemmatizerME (e.g., doing the decoding of lemmas internally and so on) so
> that this line for tagging with the LemmatizerME is not required:
>
> https://github.com/dkpro/dkpro-core/blob/89f144a63b214cd584b3cd0e6c499dff6cbcd9ca/dkpro-core-opennlp-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/opennlp/OpenNlpLemmatizer.java#L135
>
> Also, that commit changed the LemmaSampleStream and LemmaSample classes, so
> it is possible that this is affecting this class:
>
> https://github.com/dkpro/dkpro-core/blob/89f144a63b214cd584b3cd0e6c499dff6cbcd9ca/dkpro-core-opennlp-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/opennlp/internal/CasLemmaSampleStream.java
>
> If I understand the logic of this class correctly, as it stands it takes an
> already encoded SES and tries to encode it again?
>
> Could you please take a look and see if that could be the problem?
>
> Cheers,
>
> Rodrigo
>
> On Mon, May 15, 2017 at 6:21 PM, Richard Eckart de Castilho <
> r...@apache.org>
> wrote:
>
> > > On 15.05.2017, at 16:35, Joern Kottmann  wrote:
> > >
> > > Richard, I believe I found the problem with the parser. Would you mind
> > > taking a look?
> > >
> > > This PR should fix it:
> > > https://github.com/apache/opennlp/pull/199
> >
> > The parser test works nicely with the PR.
> >
> > The lemmatizer test still behaves strangely.
> >
> > Cheers,
> >
> > -- Richard
> >
> >
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Rodrigo Agerri
Hello Richard,

I have tried with various corpora, including GUM, but I cannot reproduce
that error.

https://github.com/apache/opennlp/commit/8a3b3b537a30b14c4ffb5eb32ffa41d5027bddad

Please note that commit O-904 changed (broke) the lemmatizer API
substantially to make it uniform between DictionaryLemmatizer and the
LemmatizerME (e.g., doing the decoding of lemmas internally and so on) so
that this line for tagging with the LemmatizerME is not required:

https://github.com/dkpro/dkpro-core/blob/89f144a63b214cd584b3cd0e6c499dff6cbcd9ca/dkpro-core-opennlp-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/opennlp/OpenNlpLemmatizer.java#L135

Also, that commit changed the LemmaSampleStream and LemmaSample classes, so
it is possible that this is affecting this class:

https://github.com/dkpro/dkpro-core/blob/89f144a63b214cd584b3cd0e6c499dff6cbcd9ca/dkpro-core-opennlp-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/opennlp/internal/CasLemmaSampleStream.java

If I understand the logic of this class correctly, as it stands it takes an
already encoded SES (shortest edit script) and tries to encode it again?
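
To make the concern concrete, here is a minimal sketch in plain Java. It does
not use OpenNLP: the `encode` method is a hypothetical toy encoding ("strip N
characters from the end, then append a suffix"), not the SES format OpenNLP
actually uses, and the word/lemma pairs are invented. It only illustrates how
encoding an already encoded outcome a second time could inflate the number of
distinct outcome classes a trainer sees.

```java
import java.util.HashSet;
import java.util.Set;

public class DoubleEncodingSketch {

    // Invented word/lemma pairs, used only for illustration.
    public static final String[][] SAMPLE = {
        {"walked", "walk"},
        {"asked", "ask"},
        {"jumped", "jump"},
        {"cats", "cat"},
        {"houses", "house"},
    };

    /**
     * Hypothetical toy encoding of "how to turn word into target":
     * "<number of chars to strip from the end>:<suffix to append>".
     */
    public static String encode(String word, String target) {
        int common = 0;
        int max = Math.min(word.length(), target.length());
        while (common < max && word.charAt(common) == target.charAt(common)) {
            common++;
        }
        return (word.length() - common) + ":" + target.substring(common);
    }

    /** Counts distinct outcome classes, optionally encoding each outcome twice. */
    public static int distinctOutcomes(String[][] pairs, boolean encodeTwice) {
        Set<String> outcomes = new HashSet<>();
        for (String[] pair : pairs) {
            String outcome = encode(pair[0], pair[1]);
            if (encodeTwice) {
                // The already encoded outcome is treated as if it were a lemma.
                outcome = encode(pair[0], outcome);
            }
            outcomes.add(outcome);
        }
        return outcomes.size();
    }

    public static void main(String[] args) {
        System.out.println("encoded once:  " + distinctOutcomes(SAMPLE, false));
        System.out.println("encoded twice: " + distinctOutcomes(SAMPLE, true));
    }
}
```

With these five pairs, a single encoding yields two distinct outcomes, while
the double encoding yields four, because the second pass makes the outcome
depend on the length of the word.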

Could you please take a look and see if that could be the problem?

Cheers,

Rodrigo

On Mon, May 15, 2017 at 6:21 PM, Richard Eckart de Castilho 
wrote:

> > On 15.05.2017, at 16:35, Joern Kottmann  wrote:
> >
> > Richard, I believe I found the problem with the parser. Would you mind
> > taking a look?
> >
> > This PR should fix it:
> > https://github.com/apache/opennlp/pull/199
>
> The parser test works nicely with the PR.
>
> The lemmatizer test still behaves strangely.
>
> Cheers,
>
> -- Richard
>
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Joern Kottmann
Good to hear. The parser eval test also had a bug (O-1060); we will fix
that as well before the next RC, which should prevent this from happening
again.

And thanks again for finding this!

Now we need to find the problem with the lemmatizer before we can build the
next RC.

Jörn



On Mon, May 15, 2017 at 6:21 PM, Richard Eckart de Castilho 
wrote:

> > On 15.05.2017, at 16:35, Joern Kottmann  wrote:
> >
> > Richard, I believe I found the problem with the parser. Would you mind
> > taking a look?
> >
> > This PR should fix it:
> > https://github.com/apache/opennlp/pull/199
>
> The parser test works nicely with the PR.
>
> The lemmatizer test still behaves strangely.
>
> Cheers,
>
> -- Richard
>
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Richard Eckart de Castilho
> On 15.05.2017, at 16:35, Joern Kottmann  wrote:
> 
> Richard, I believe I found the problem with the parser. Would you mind
> taking a look?
> 
> This PR should fix it:
> https://github.com/apache/opennlp/pull/199

The parser test works nicely with the PR.

The lemmatizer test still behaves strangely.

Cheers,

-- Richard



Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Joern Kottmann
Richard, I believe I found the problem with the parser. Would you mind
taking a look?

This PR should fix it:
https://github.com/apache/opennlp/pull/199

Jörn

On Mon, May 15, 2017 at 4:14 PM, Richard Eckart de Castilho 
wrote:

> Hi Rodrigo,
>
> On 15.05.2017, at 15:36, Rodrigo Agerri  wrote:
> >
> > I cannot reproduce the lemmatizer issue. Could you please share your
> > training data?
>
> I have observed the change in behavior via the OpenNlpLemmatizerTrainerTest
> in DKPro Core [1]. It happens when I change the OpenNLP version in the POM
> from 1.7.2 to 1.8.0 (after including the OpenNLP staging Maven repo of
> course).
> Unfortunately, it's not a simple minimal OpenNLP-only unit test, but it
> makes use of the respective DKPro Core UIMA components.
>
> The data that is used is the GUM 3.0.0 corpus, specifically the CoNLL
> files in it [2].
>
> The corpus can be downloaded from:
> https://github.com/amir-zeldes/gum/archive/V3.0.0.zip
>
> Cheers,
>
> -- Richard
>
> [1] https://github.com/dkpro/dkpro-core/blob/89f144a63b214cd584b3cd0e6c499dff6cbcd9ca/dkpro-core-opennlp-asl/src/test/java/de/tudarmstadt/ukp/dkpro/core/opennlp/OpenNlpLemmatizerTrainerTest.java
> [2] https://github.com/dkpro/dkpro-core/blob/master/dkpro-core-api-datasets-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/datasets/lib/gum-en-conll-3.0.0.yaml


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Richard Eckart de Castilho
Hi Rodrigo,

On 15.05.2017, at 15:36, Rodrigo Agerri  wrote:
> 
> I cannot reproduce the lemmatizer issue. Could you please share your
> training data?

I have observed the change in behavior via the OpenNlpLemmatizerTrainerTest
in DKPro Core [1]. It happens when I change the OpenNLP version in the POM
from 1.7.2 to 1.8.0 (after including the OpenNLP staging Maven repo of course).
Unfortunately, it's not a simple minimal OpenNLP-only unit test, but it makes
use of the respective DKPro Core UIMA components.

The data that is used is the GUM 3.0.0 corpus, specifically the CoNLL files in 
it [2].

The corpus can be downloaded from: 
https://github.com/amir-zeldes/gum/archive/V3.0.0.zip

Cheers,

-- Richard

[1] 
https://github.com/dkpro/dkpro-core/blob/89f144a63b214cd584b3cd0e6c499dff6cbcd9ca/dkpro-core-opennlp-asl/src/test/java/de/tudarmstadt/ukp/dkpro/core/opennlp/OpenNlpLemmatizerTrainerTest.java
[2] 
https://github.com/dkpro/dkpro-core/blob/master/dkpro-core-api-datasets-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/datasets/lib/gum-en-conll-3.0.0.yaml

Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Rodrigo Agerri
Hello Richard,

I cannot reproduce the lemmatizer issue. Could you please share your
training data?

Best regards,

Rodrigo

On Sun, May 14, 2017 at 12:08 AM, Richard Eckart de Castilho  wrote:

>
> > On 13.05.2017, at 22:35, Richard Eckart de Castilho 
> wrote:
> >
> > Should OpenNLP 1.8.0 yield identical results as 1.7.2 when the same
> > training data is used during training?
> >
> > I have a test that trains a lemmatizer model on GUM 3.0.0. With 1.7.2,
> > this model reached an f-score of ~0.96. With 1.8.0, I only get ~0.84.
>
> Also, this test which trains and evaluates a lemmatizer model
> takes ~8 sec with 1.7.2 and ~170 sec with 1.8.0. Even when only
> considering the training phase (no evaluation), the test runs
> much faster with 1.7.2 than with 1.8.0.
>
> Here are some details on the training phase.
>
> It seems odd that the events, outcomes, and predicates change that much.
>
> === 1.7.2
>
> done. 50697 events
> Indexing...  done.
> Sorting and merging events... done. Reduced 50697 events to 12675.
> Done indexing.
> Incorporating indexed data for training...
> done.
> Number of Event Tokens: 12675
> Number of Outcomes: 389
>   Number of Predicates: 13488
> ...done.
> Computing model parameters ...
> Performing 10 iterations.
>   1:  ... loglikelihood=-302335.58198350534 0.8420616604532812
>   2:  ... loglikelihood=-61602.20311717376  0.9492672150225852
>   3:  ... loglikelihood=-30747.954089148297 0.9769217113438665
>   4:  ... loglikelihood=-19986.853691639506 0.9850484249561118
>   5:  ... loglikelihood=-14672.523462458894 0.9881255301102629
>   6:  ... loglikelihood=-11572.587093608756 0.9893879322247865
>   7:  ... loglikelihood=-9571.242700030467  0.9900783083811665
>   8:  ... loglikelihood=-8185.39402892  0.9906897844053889
>   9:  ... loglikelihood=-7174.66904253965   0.9912223602974535
>  10:  ... loglikelihood=-6407.4278143846   0.9917746612225575
>
>
> === 1.8.0
>
> done. 50697 events
> Indexing...  done.
> Sorting and merging events... done. Reduced 50697 events to 26026.
> Done indexing.
> Incorporating indexed data for training...
> done.
> Number of Event Tokens: 26026
> Number of Outcomes: 7668
>   Number of Predicates: 15279
> ...done.
> Computing model parameters ...
> Performing 10 iterations.
>   1:  ... loglikelihood=-453475.08854769287 1.972503303943034E-5
>   2:  ... loglikelihood=-165718.68620632993 0.9509241177978973
>   3:  ... loglikelihood=-85388.42871190465  0.9761327100222893
>   4:  ... loglikelihood=-56404.00400621838  0.9892104069274316
>   5:  ... loglikelihood=-41004.08840359108  0.9938457896916977
>   6:  ... loglikelihood=-31539.64788603799  0.9955421425330887
>   7:  ... loglikelihood=-25264.889481438582 0.9964889441189814
>   8:  ... loglikelihood=-20883.72059438774  0.9972384953744797
>   9:  ... loglikelihood=-17699.228362701586 0.9977710712665444
>  10:  ... loglikelihood=-15306.654021266759 0.9980669467621358
>
>
> I also get some differences in f-score for other tests that train models,
> but not as significant as when training a lemmatizer model.
>
> -- Richard
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Joern Kottmann
I have now tested a version which doesn't have that change, and that one is
also broken, so the problem must have been caused by another commit.

Jörn



On Mon, May 15, 2017 at 11:24 AM, Richard Eckart de Castilho  wrote:

> Hi Jörn,
>
> unfortunately, I don't have a comprehensive unit test for the parser.
> But from my perspective, it looks like there is something more serious
> going on than just a bit of parse tree reordering when suddenly a
> determiner (DT, "a") is tagged as a punctuation mark (, ","):
>
> 1.7.2: (ROOT (S (NP (PRP We)) (VP (VBP need) *!*(NP (NP (DT a)*!* (ADJP
>    (RB very) (VBN complicated)) (NN example) (NN sentence))
>    (, ,) (SBAR (WHNP (WDT which)) (S (VP (VBZ contains) (PP
>    (IN as) (NP (NP (JJ many) (NNS constituents) (CC and)
>    (NNS dependencies)) (PP (IN as) (ADJP (JJ possible)) (. .)))
>
> 1.8.0: (ROOT (S (NP (PRP We)) (VP (VBP need) *!*(, a) (NP (NP*!* (ADJP
>    (RB very) (VBN complicated)) (NN example) (NN sentence))
>    (, ,) (SBAR (WHNP (WDT which)) (S (VP (VBZ contains) (PP
>    (IN as) (NP (NP (JJ many) (NNS constituents) (CC and)
>    (NNS dependencies)) (PP (IN as) (ADJP (JJ possible)) (. .)))
>
> IMHO punctuation marks and closed word classes like determiners should be
> pretty stable in their labels. There is usually no need for the parser to
> invent a tag and the labels seen in the training data should be sufficient.
>
> Could there maybe be a problem with duplicates being dropped silently
> by the move from the ListHeap to the TreeSet? If duplicate removal
> is not important, then maybe sorting the heap after it has been filled
> would be a better option than using a permanently sorted and de-duplicating
> data structure.
>
> Cheers,
>
> -- Richard
>
> > On 15.05.2017, at 10:39, Joern Kottmann  wrote:
> >
> > Hello Richard,
> >
> > thanks for reporting this. For 1.8.0 we replaced a Heap with a SortedSet
> > [1]. In this commit there is one loop [2] which iterates through the parses
> > which will be advanced. The order of the parses in the Heap was not so
> > well defined, therefore we decided to sort them by probability.
> > We also noticed that this change alters the output of the parser with
> > the existing models in our SourceForge model eval test [3].
> >
> > After running the evaluation on the OntoNotes4 data set I only got a very
> > small change and decided it was OK to do this. I don't know exactly how big
> > the change is, but it was less than the delta of 0.001 in test case [4].
> >
> > What do you think? Should this be rolled back?
> >
> > Anyway, so much for the parser; I still need to understand what
> > happened with the lemmatizer.
> >
> > Jörn
> >
> > [1]
> > https://github.com/apache/opennlp/commit/3df659b9bfb02084e782f1e8b6ec716f56e0611c
> > [2]
> > https://github.com/apache/opennlp/blob/3df659b9bfb02084e782f1e8b6ec716f56e0611c/opennlp-tools/src/main/java/opennlp/tools/parser/AbstractBottomUpParser.java#L285
> > [3]
> > https://github.com/apache/opennlp/commit/3df659b9bfb02084e782f1e8b6ec716f56e0611c#diff-a5834f32b8a41b76a336126e4b13d4f7L349
> > [4]
> > https://github.com/apache/opennlp/blob/3df659b9bfb02084e782f1e8b6ec716f56e0611c/opennlp-tools/src/test/java/opennlp/tools/eval/OntoNotes4ParserEval.java#L70
>
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Richard Eckart de Castilho
Hi Jörn,

unfortunately, I don't have a comprehensive unit test for the parser.
But from my perspective, it looks like there is something more serious
going on than just a bit of parse tree reordering when suddenly a
determiner (DT, "a") is tagged as a punctuation mark (, ","):

1.7.2: (ROOT (S (NP (PRP We)) (VP (VBP need) *!*(NP (NP (DT a)*!* (ADJP 
   (RB very) (VBN complicated)) (NN example) (NN sentence)) 
   (, ,) (SBAR (WHNP (WDT which)) (S (VP (VBZ contains) (PP 
   (IN as) (NP (NP (JJ many) (NNS constituents) (CC and) 
   (NNS dependencies)) (PP (IN as) (ADJP (JJ possible)) (. .)))

1.8.0: (ROOT (S (NP (PRP We)) (VP (VBP need) *!*(, a) (NP (NP*!* (ADJP 
   (RB very) (VBN complicated)) (NN example) (NN sentence))
   (, ,) (SBAR (WHNP (WDT which)) (S (VP (VBZ contains) (PP 
   (IN as) (NP (NP (JJ many) (NNS constituents) (CC and) 
   (NNS dependencies)) (PP (IN as) (ADJP (JJ possible)) (. .)))

IMHO punctuation marks and closed word classes like determiners should be
pretty stable in their labels. There is usually no need for the parser to
invent a tag and the labels seen in the training data should be sufficient.

Could there maybe be a problem with duplicates being dropped silently
by the move from the ListHeap to the TreeSet? If duplicate removal
is not important, then maybe sorting the heap after it has been filled
would be a better option than using a permanently sorted and de-duplicating
data structure.
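
To illustrate the general behaviour behind this question, here is a minimal
sketch in plain Java. It is not OpenNLP code: the `Candidate` class and its
`prob` field are hypothetical stand-ins for a parse and its probability, and
the comparator is an assumption about how such a set might be ordered. A
`TreeSet` treats two elements as duplicates whenever its comparator returns 0
for them, so candidates with equal probability are dropped silently, whereas
filling a list first and sorting it afterwards keeps all of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class SortedSetDuplicates {

    /** Hypothetical stand-in for a parse: a label plus a probability. */
    public static final class Candidate {
        final String label;
        final double prob;

        Candidate(String label, double prob) {
            this.label = label;
            this.prob = prob;
        }
    }

    // Orders candidates by descending probability and looks at nothing else.
    static final Comparator<Candidate> BY_PROB =
            (a, b) -> Double.compare(b.prob, a.prob);

    /** Number of candidates left after inserting them into a TreeSet. */
    public static int keptByTreeSet(List<Candidate> candidates) {
        TreeSet<Candidate> set = new TreeSet<>(BY_PROB);
        set.addAll(candidates); // equal-probability candidates collapse into one
        return set.size();
    }

    /** Number of candidates left after filling a list and sorting it once. */
    public static int keptBySortedList(List<Candidate> candidates) {
        List<Candidate> list = new ArrayList<>(candidates);
        list.sort(BY_PROB); // sorting never removes elements
        return list.size();
    }

    /** Three invented candidates, two of which share the same probability. */
    public static List<Candidate> sample() {
        return Arrays.asList(
                new Candidate("A", 0.5),
                new Candidate("B", 0.4),
                new Candidate("C", 0.4));
    }

    public static void main(String[] args) {
        System.out.println("TreeSet keeps:     " + keptByTreeSet(sample()));
        System.out.println("Sorted list keeps: " + keptBySortedList(sample()));
    }
}
```

With the three sample candidates, the `TreeSet` keeps two and the sorted list
keeps three. Whether this effect plays any role in the parser output above is
not something the sketch can show.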

Cheers,

-- Richard

> On 15.05.2017, at 10:39, Joern Kottmann  wrote:
> 
> Hello Richard,
> 
> thanks for reporting this. For 1.8.0 we replaced a Heap with a SortedSet
> [1]. In this commit there is one loop [2] which iterates through the parses
> which will be advanced. The order of the parses in the Heap was not so
> well defined, therefore we decided to sort them by probability.
> We also noticed that this change alters the output of the parser with
> the existing models in our SourceForge model eval test [3].
>
> After running the evaluation on the OntoNotes4 data set I only got a very
> small change and decided it was OK to do this. I don't know exactly how big
> the change is, but it was less than the delta of 0.001 in test case [4].
>
> What do you think? Should this be rolled back?
>
> Anyway, so much for the parser; I still need to understand what
> happened with the lemmatizer.
> 
> Jörn
> 
> [1]
> https://github.com/apache/opennlp/commit/3df659b9bfb02084e782f1e8b6ec716f56e0611c
> [2]
> https://github.com/apache/opennlp/blob/3df659b9bfb02084e782f1e8b6ec716f56e0611c/opennlp-tools/src/main/java/opennlp/tools/parser/AbstractBottomUpParser.java#L285
> [3]
> https://github.com/apache/opennlp/commit/3df659b9bfb02084e782f1e8b6ec716f56e0611c#diff-a5834f32b8a41b76a336126e4b13d4f7L349
> [4]
> https://github.com/apache/opennlp/blob/3df659b9bfb02084e782f1e8b6ec716f56e0611c/opennlp-tools/src/test/java/opennlp/tools/eval/OntoNotes4ParserEval.java#L70



Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-15 Thread Joern Kottmann
Hello Richard,

thanks for reporting this. For 1.8.0 we replaced a Heap with a SortedSet
[1]. In this commit there is one loop [2] which iterates through the parses
which will be advanced. The order of the parses in the Heap was not so
well defined, therefore we decided to sort them by probability.
We also noticed that this change alters the output of the parser with
the existing models in our SourceForge model eval test [3].

After running the evaluation on the OntoNotes4 data set I only got a very
small change and decided it was OK to do this. I don't know exactly how big
the change is, but it was less than the delta of 0.001 in test case [4].
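
A tolerance check of the kind described above can be sketched as follows.
This is plain Java, not the actual OntoNotes4ParserEval test: the class and
method names are hypothetical, and only the 0.001 delta and the ~0.96 and
~0.84 lemmatizer f-scores are taken from this thread. The second pair of
scores is invented to show a difference that stays within the delta.

```java
public class ScoreDeltaCheck {

    /** True if actual differs from expected by no more than delta. */
    public static boolean withinDelta(double expected, double actual, double delta) {
        return Math.abs(expected - actual) <= delta;
    }

    public static void main(String[] args) {
        double delta = 0.001;

        // Lemmatizer f-scores reported in this thread: ~0.96 (1.7.2) vs ~0.84 (1.8.0).
        System.out.println("lemmatizer within delta: " + withinDelta(0.96, 0.84, delta));

        // Invented scores that differ by 0.0005, to show a passing case.
        System.out.println("small change within delta: " + withinDelta(0.9000, 0.9005, delta));
    }
}
```

A check like this accepts small shifts in a score but flags a drop of the
size reported for the lemmatizer.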

What do you think? Should this be rolled back?

Anyway, so much for the parser; I still need to understand what
happened with the lemmatizer.

Jörn

[1]
https://github.com/apache/opennlp/commit/3df659b9bfb02084e782f1e8b6ec716f56e0611c
[2]
https://github.com/apache/opennlp/blob/3df659b9bfb02084e782f1e8b6ec716f56e0611c/opennlp-tools/src/main/java/opennlp/tools/parser/AbstractBottomUpParser.java#L285
[3]
https://github.com/apache/opennlp/commit/3df659b9bfb02084e782f1e8b6ec716f56e0611c#diff-a5834f32b8a41b76a336126e4b13d4f7L349
[4]
https://github.com/apache/opennlp/blob/3df659b9bfb02084e782f1e8b6ec716f56e0611c/opennlp-tools/src/test/java/opennlp/tools/eval/OntoNotes4ParserEval.java#L70

On Sat, May 13, 2017 at 10:35 PM, Richard Eckart de Castilho  wrote:

> Hi all,
>
> > On 11.05.2017, at 18:37, Joern Kottmann  wrote:
> >
> > The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> > 1.8.0 Release Candidate 2.
>
> Should OpenNLP 1.8.0 yield identical results as 1.7.2 when the same
> models are used during classification?
>
> E.g. the English parser model seems to create different POS tags now
> for the sentence "We need a very complicated example sentence ,
> which contains as many constituents and dependencies as possible .".
> "a" is now wrongly tagged as "," whereas 1.7.2 tagged it correctly as "DT".
>
> Should OpenNLP 1.8.0 yield identical results as 1.7.2 when the same
> training data is used during training?
>
> I have a test that trains a lemmatizer model on GUM 3.0.0. With 1.7.2,
> this model reached an f-score of ~0.96. With 1.8.0, I only get ~0.84.
>
> Cheers,
>
> -- Richard
>
>
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-13 Thread Suneel Marthi
This vote is now cancelled. We will push out another RC after fixing the
reported issues.

-1 binding


> On May 13, 2017, at 8:44 PM, William Colen  wrote:
> 
> With the issues reported by Richard we should cancel the vote and roll back
> the release.
>
> I am changing my vote to -1 (binding).
> 
> 2017-05-13 19:08 GMT-03:00 Richard Eckart de Castilho :
> 
>> 
>>> On 13.05.2017, at 22:35, Richard Eckart de Castilho 
>> wrote:
>>> 
>>> Should OpenNLP 1.8.0 yield identical results as 1.7.2 when the same
>>> training data is used during training?
>>> 
>>> I have a test that trains a lemmatizer model on GUM 3.0.0. With 1.7.2,
>>> this model reached an f-score of ~0.96. With 1.8.0, I only get ~0.84.
>> 
>> Also, this test which trains and evaluates a lemmatizer model
>> takes ~8 sec with 1.7.2 and ~170 sec with 1.8.0. Even when only
>> considering the training phase (no evaluation), the test runs
>> much faster with 1.7.2 than with 1.8.0.
>> 
>> Here are some details on the training phase.
>> 
>> It seems odd that the events, outcomes, and predicates change that much.
>> 
>> === 1.7.2
>> 
>> done. 50697 events
>> Indexing...  done.
>> Sorting and merging events... done. Reduced 50697 events to 12675.
>> Done indexing.
>> Incorporating indexed data for training...
>> done.
>> Number of Event Tokens: 12675
>>     Number of Outcomes: 389
>>   Number of Predicates: 13488
>> ...done.
>> Computing model parameters ...
>> Performing 10 iterations.
>>  1:  ... loglikelihood=-302335.58198350534 0.8420616604532812
>>  2:  ... loglikelihood=-61602.20311717376  0.9492672150225852
>>  3:  ... loglikelihood=-30747.954089148297 0.9769217113438665
>>  4:  ... loglikelihood=-19986.853691639506 0.9850484249561118
>>  5:  ... loglikelihood=-14672.523462458894 0.9881255301102629
>>  6:  ... loglikelihood=-11572.587093608756 0.9893879322247865
>>  7:  ... loglikelihood=-9571.242700030467  0.9900783083811665
>>  8:  ... loglikelihood=-8185.39402892  0.9906897844053889
>>  9:  ... loglikelihood=-7174.66904253965   0.9912223602974535
>> 10:  ... loglikelihood=-6407.4278143846   0.9917746612225575
>> 
>> 
>> === 1.8.0
>> 
>> done. 50697 events
>> Indexing...  done.
>> Sorting and merging events... done. Reduced 50697 events to 26026.
>> Done indexing.
>> Incorporating indexed data for training...
>> done.
>> Number of Event Tokens: 26026
>>     Number of Outcomes: 7668
>>   Number of Predicates: 15279
>> ...done.
>> Computing model parameters ...
>> Performing 10 iterations.
>>  1:  ... loglikelihood=-453475.08854769287 1.972503303943034E-5
>>  2:  ... loglikelihood=-165718.68620632993 0.9509241177978973
>>  3:  ... loglikelihood=-85388.42871190465  0.9761327100222893
>>  4:  ... loglikelihood=-56404.00400621838  0.9892104069274316
>>  5:  ... loglikelihood=-41004.08840359108  0.9938457896916977
>>  6:  ... loglikelihood=-31539.64788603799  0.9955421425330887
>>  7:  ... loglikelihood=-25264.889481438582 0.9964889441189814
>>  8:  ... loglikelihood=-20883.72059438774  0.9972384953744797
>>  9:  ... loglikelihood=-17699.228362701586 0.9977710712665444
>> 10:  ... loglikelihood=-15306.654021266759 0.9980669467621358
>> 
>> 
>> I also get some differences in f-score for other tests that train models,
>> but not as significant as when training a lemmatizer model.
>> 
>> -- Richard
>> 


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-13 Thread William Colen
With the issues reported by Richard we should cancel the vote and roll back
the release.

I am changing my vote to -1 (binding).

2017-05-13 19:08 GMT-03:00 Richard Eckart de Castilho :

>
> > On 13.05.2017, at 22:35, Richard Eckart de Castilho 
> wrote:
> >
> > Should OpenNLP 1.8.0 yield identical results as 1.7.2 when the same
> > training data is used during training?
> >
> > I have a test that trains a lemmatizer model on GUM 3.0.0. With 1.7.2,
> > this model reached an f-score of ~0.96. With 1.8.0, I only get ~0.84.
>
> Also, this test which trains and evaluates a lemmatizer model
> takes ~8 sec with 1.7.2 and ~170 sec with 1.8.0. Even when only
> considering the training phase (no evaluation), the test runs
> much faster with 1.7.2 than with 1.8.0.
>
> Here are some details on the training phase.
>
> It seems odd that the events, outcomes, and predicates change that much.
>
> === 1.7.2
>
> done. 50697 events
> Indexing...  done.
> Sorting and merging events... done. Reduced 50697 events to 12675.
> Done indexing.
> Incorporating indexed data for training...
> done.
> Number of Event Tokens: 12675
> Number of Outcomes: 389
>   Number of Predicates: 13488
> ...done.
> Computing model parameters ...
> Performing 10 iterations.
>   1:  ... loglikelihood=-302335.58198350534 0.8420616604532812
>   2:  ... loglikelihood=-61602.20311717376  0.9492672150225852
>   3:  ... loglikelihood=-30747.954089148297 0.9769217113438665
>   4:  ... loglikelihood=-19986.853691639506 0.9850484249561118
>   5:  ... loglikelihood=-14672.523462458894 0.9881255301102629
>   6:  ... loglikelihood=-11572.587093608756 0.9893879322247865
>   7:  ... loglikelihood=-9571.242700030467  0.9900783083811665
>   8:  ... loglikelihood=-8185.39402892  0.9906897844053889
>   9:  ... loglikelihood=-7174.66904253965   0.9912223602974535
>  10:  ... loglikelihood=-6407.4278143846   0.9917746612225575
>
>
> === 1.8.0
>
> done. 50697 events
> Indexing...  done.
> Sorting and merging events... done. Reduced 50697 events to 26026.
> Done indexing.
> Incorporating indexed data for training...
> done.
> Number of Event Tokens: 26026
> Number of Outcomes: 7668
>   Number of Predicates: 15279
> ...done.
> Computing model parameters ...
> Performing 10 iterations.
>   1:  ... loglikelihood=-453475.08854769287 1.972503303943034E-5
>   2:  ... loglikelihood=-165718.68620632993 0.9509241177978973
>   3:  ... loglikelihood=-85388.42871190465  0.9761327100222893
>   4:  ... loglikelihood=-56404.00400621838  0.9892104069274316
>   5:  ... loglikelihood=-41004.08840359108  0.9938457896916977
>   6:  ... loglikelihood=-31539.64788603799  0.9955421425330887
>   7:  ... loglikelihood=-25264.889481438582 0.9964889441189814
>   8:  ... loglikelihood=-20883.72059438774  0.9972384953744797
>   9:  ... loglikelihood=-17699.228362701586 0.9977710712665444
>  10:  ... loglikelihood=-15306.654021266759 0.9980669467621358
>
>
> I also get some differences in f-score for other tests that train models,
> but not as significant as when training a lemmatizer model.
>
> -- Richard
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-13 Thread Richard Eckart de Castilho

> On 13.05.2017, at 22:35, Richard Eckart de Castilho  wrote:
> 
> Should OpenNLP 1.8.0 yield identical results as 1.7.2 when the same
> training data is used during training?
> 
> I have a test that trains a lemmatizer model on GUM 3.0.0. With 1.7.2,
> this model reached an f-score of ~0.96. With 1.8.0, I only get ~0.84.

Also, this test, which trains and evaluates a lemmatizer model,
takes ~8 sec with 1.7.2 and ~170 sec with 1.8.0. Even when only
considering the training phase (no evaluation), the test runs
much faster with 1.7.2 than with 1.8.0.

Here are some details on the training phase.

It seems odd that the events, outcomes, and predicates change that much. 

=== 1.7.2

done. 50697 events
Indexing...  done.
Sorting and merging events... done. Reduced 50697 events to 12675.
Done indexing.
Incorporating indexed data for training...  
done.
Number of Event Tokens: 12675
    Number of Outcomes: 389
  Number of Predicates: 13488
...done.
Computing model parameters ...
Performing 10 iterations.
  1:  ... loglikelihood=-302335.58198350534 0.8420616604532812
  2:  ... loglikelihood=-61602.20311717376  0.9492672150225852
  3:  ... loglikelihood=-30747.954089148297 0.9769217113438665
  4:  ... loglikelihood=-19986.853691639506 0.9850484249561118
  5:  ... loglikelihood=-14672.523462458894 0.9881255301102629
  6:  ... loglikelihood=-11572.587093608756 0.9893879322247865
  7:  ... loglikelihood=-9571.242700030467  0.9900783083811665
  8:  ... loglikelihood=-8185.39402892  0.9906897844053889
  9:  ... loglikelihood=-7174.66904253965   0.9912223602974535
 10:  ... loglikelihood=-6407.4278143846   0.9917746612225575


=== 1.8.0

done. 50697 events
Indexing...  done.
Sorting and merging events... done. Reduced 50697 events to 26026.
Done indexing.
Incorporating indexed data for training...  
done.
Number of Event Tokens: 26026
    Number of Outcomes: 7668
  Number of Predicates: 15279
...done.
Computing model parameters ...
Performing 10 iterations.
  1:  ... loglikelihood=-453475.08854769287 1.972503303943034E-5
  2:  ... loglikelihood=-165718.68620632993 0.9509241177978973
  3:  ... loglikelihood=-85388.42871190465  0.9761327100222893
  4:  ... loglikelihood=-56404.00400621838  0.9892104069274316
  5:  ... loglikelihood=-41004.08840359108  0.9938457896916977
  6:  ... loglikelihood=-31539.64788603799  0.9955421425330887
  7:  ... loglikelihood=-25264.889481438582 0.9964889441189814
  8:  ... loglikelihood=-20883.72059438774  0.9972384953744797
  9:  ... loglikelihood=-17699.228362701586 0.9977710712665444
 10:  ... loglikelihood=-15306.654021266759 0.9980669467621358
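
The figures in the two logs can be put side by side with a little arithmetic.
The sketch below is plain Java with hypothetical class and method names; it
assumes that the last column of each iteration line is the share of the 50697
training events classified correctly after that iteration, which the output
itself does not state.

```java
public class TrainingLogArithmetic {

    /** Converts an accuracy fraction back into an approximate event count. */
    public static long eventsCorrect(double accuracy, long totalEvents) {
        return Math.round(accuracy * totalEvents);
    }

    /** Average number of raw events merged into one indexed event. */
    public static double mergeRatio(long rawEvents, long mergedEvents) {
        return (double) rawEvents / mergedEvents;
    }

    public static void main(String[] args) {
        long events = 50697;

        // Iteration 1 of each run, values copied from the logs above.
        System.out.println("1.7.2 correct after iteration 1: "
                + eventsCorrect(0.8420616604532812, events));
        System.out.println("1.8.0 correct after iteration 1: "
                + eventsCorrect(1.972503303943034E-5, events));

        // "Reduced 50697 events to ..." lines from the logs above.
        System.out.println("1.7.2 merge ratio: " + mergeRatio(events, 12675));
        System.out.println("1.8.0 merge ratio: " + mergeRatio(events, 26026));
    }
}
```

Under that assumption, the first 1.8.0 iteration corresponds to a single
correctly classified event out of 50697, against roughly 42690 for 1.7.2, and
1.7.2 merges about four raw events into each indexed event where 1.8.0 merges
about two.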


I also get some differences in f-score for other tests that train models,
but not as significant as when training a lemmatizer model.

-- Richard


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-13 Thread Richard Eckart de Castilho
Hi all,

> On 11.05.2017, at 18:37, Joern Kottmann  wrote:
> 
> The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> 1.8.0 Release Candidate 2. 

Should OpenNLP 1.8.0 yield the same results as 1.7.2 when the same
models are used during classification?

E.g. the English parser model seems to create different POS tags now
for the sentence "We need a very complicated example sentence , 
which contains as many constituents and dependencies as possible .".
"a" is now wrongly tagged as "," whereas 1.7.2 tagged it correctly as "DT".

Should OpenNLP 1.8.0 yield the same results as 1.7.2 when the same
training data is used during training?

I have a test that trains a lemmatizer model on GUM 3.0.0. With 1.7.2,
this model reached an f-score of ~0.96. With 1.8.0, I only get ~0.84.

Cheers,

-- Richard




Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-13 Thread Eldad Yamin
Unsubscribe

On May 11, 2017 19:38, "Joern Kottmann"  wrote:

> The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> 1.8.0 Release Candidate 2.
>
> The RC 2 distributables can be downloaded from here:
> https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/
>
> The release was made from the Apache OpenNLP 1.8.0 tag at
> https://github.com/apache/opennlp/tree/opennlp-1.8.0
>
> To use it in a maven build set the version for opennlp-tools or
> opennlp-uima to 1.8.0 and add the following URL to your settings.xml
> file:
> https://repository.apache.org/content/repositories/orgapacheopennlp-1012
>
> The release was made using the OpenNLP release process, documented on
> the Wiki here:
> https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
>
> The release contains quite a few changes; please refer to the included
> issue list for details.
>
> Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
> vote is open for at least the next 72 hours.
>
> Only votes from OpenNLP PMC are binding, but folks are welcome to check
> the release candidate and voice their approval or disapproval. The vote
> passes if at least three binding +1 votes are cast.
>
> [ ] +1 Release the packages as Apache OpenNLP 1.8.0
> [ ] -1 Do not release the packages because...
>
>
> Thanks!
>
> Jörn
>
> P.S. Here is my +1.
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-13 Thread Anthony Beylerian
+1 non-binding

Built and tests passing on macOS Sierra 10.12.4


From: Joern Kottmann 
Sent: Friday, May 12, 2017 1:37 AM
To: dev@opennlp.apache.org
Subject: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
1.8.0 Release Candidate 2.

The RC 2 distributables can be downloaded from here:
https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/

The release was made from the Apache OpenNLP 1.8.0 tag at
https://github.com/apache/opennlp/tree/opennlp-1.8.0

To use it in a maven build set the version for opennlp-tools or
opennlp-uima to 1.8.0 and add the following URL to your settings.xml
file:
https://repository.apache.org/content/repositories/orgapacheopennlp-1012

The release was made using the OpenNLP release process, documented on
the Wiki here:
https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process

The release contains quite some changes, please refer to the contained
issue list for details.

Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
vote is open for at least the next 72 hours.

Only votes from OpenNLP PMC are binding, but folks are welcome to check
the release candidate and voice their approval or disapproval. The vote
passes if at least three binding +1 votes are cast.

[ ] +1 Release the packages as Apache OpenNLP 1.8.0
[ ] -1 Do not release the packages because...


Thanks!

Jörn

P.S. Here is my +1.


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-13 Thread Koji Sekiguchi

+1 non-binding

Downloaded artifacts, built, and executed unit tests successfully on
Mac OS X 10.10.5.


On 2017/05/12 1:37, Joern Kottmann wrote:

The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
1.8.0 Release Candidate 2.

The RC 2 distributables can be downloaded from here:
https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/

The release was made from the Apache OpenNLP 1.8.0 tag at
https://github.com/apache/opennlp/tree/opennlp-1.8.0

To use it in a maven build set the version for opennlp-tools or
opennlp-uima to 1.8.0 and add the following URL to your settings.xml
file:
https://repository.apache.org/content/repositories/orgapacheopennlp-1012

The release was made using the OpenNLP release process, documented on
the Wiki here:
https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process

The release contains quite some changes, please refer to the contained
issue list for details.

Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
vote is open for at least the next 72 hours.

Only votes from OpenNLP PMC are binding, but folks are welcome to check
the release candidate and voice their approval or disapproval. The vote
passes if at least three binding +1 votes are cast.

[ ] +1 Release the packages as Apache OpenNLP 1.8.0
[ ] -1 Do not release the packages because...


Thanks!

Jörn

P.S. Here is my +1.





Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-12 Thread Daniel Russ
+1 binding
Completed my evaluation on my external code.  All tests passed.


> On May 12, 2017, at 11:51 AM, William Colen  wrote:
> 
> +1 binding
> Executed the complete evaluation suite, both in source distribution and the
> git tag. Integrated and tested with other tools.
> 
> 
> 2017-05-12 9:48 GMT-03:00 Joern Kottmann :
> 
>> The vote is still open and we won't close it before the entire active PMC
>> voted or the time passed.
>> 
>> Jörn
>> 
>> On Fri, May 12, 2017 at 2:29 PM, Daniel Russ  wrote:
>> 
>>> Even though we have enough binding votes to release, can I have a few
>>> hours to complete testing of my code with 1.8.0RC2 before release.
>>> Daniel
>>> 
>>> On May 11, 2017 12:38 PM, "Joern Kottmann"  wrote:
>>> 
>>>> The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
>>>> 1.8.0 Release Candidate 2.
>>>>
>>>> The RC 2 distributables can be downloaded from here:
>>>> https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/
>>>>
>>>> The release was made from the Apache OpenNLP 1.8.0 tag at
>>>> https://github.com/apache/opennlp/tree/opennlp-1.8.0
>>>>
>>>> To use it in a maven build set the version for opennlp-tools or
>>>> opennlp-uima to 1.8.0 and add the following URL to your settings.xml
>>>> file:
>>>> https://repository.apache.org/content/repositories/orgapacheopennlp-1012
>>>>
>>>> The release was made using the OpenNLP release process, documented on
>>>> the Wiki here:
>>>> https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
>>>>
>>>> The release contains quite some changes, please refer to the contained
>>>> issue list for details.
>>>>
>>>> Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
>>>> vote is open for at least the next 72 hours.
>>>>
>>>> Only votes from OpenNLP PMC are binding, but folks are welcome to check
>>>> the release candidate and voice their approval or disapproval. The vote
>>>> passes if at least three binding +1 votes are cast.
>>>>
>>>> [ ] +1 Release the packages as Apache OpenNLP 1.8.0
>>>> [ ] -1 Do not release the packages because...
>>>>
>>>>
>>>> Thanks!
>>>>
>>>> Jörn
>>>>
>>>> P.S. Here is my +1.
>>>>
>>> 
>> 



Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-12 Thread William Colen
+1 binding
Executed the complete evaluation suite, both in source distribution and the
git tag. Integrated and tested with other tools.


2017-05-12 9:48 GMT-03:00 Joern Kottmann :

> The vote is still open and we won't close it before the entire active PMC
> voted or the time passed.
>
> Jörn
>
> On Fri, May 12, 2017 at 2:29 PM, Daniel Russ  wrote:
>
> > Even though we have enough binding votes to release, can I have a few
> > hours to complete testing of my code with 1.8.0RC2 before release.
> > Daniel
> >
> > On May 11, 2017 12:38 PM, "Joern Kottmann"  wrote:
> >
> > > The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> > > 1.8.0 Release Candidate 2.
> > >
> > > The RC 2 distributables can be downloaded from here:
> > > https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/
> > >
> > > The release was made from the Apache OpenNLP 1.8.0 tag at
> > > https://github.com/apache/opennlp/tree/opennlp-1.8.0
> > >
> > > To use it in a maven build set the version for opennlp-tools or
> > > opennlp-uima to 1.8.0 and add the following URL to your settings.xml
> > > file:
> > > https://repository.apache.org/content/repositories/orgapacheopennlp-1012
> > >
> > > The release was made using the OpenNLP release process, documented on
> > > the Wiki here:
> > > https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
> > >
> > > The release contains quite some changes, please refer to the contained
> > > issue list for details.
> > >
> > > Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
> > > vote is open for at least the next 72 hours.
> > >
> > > Only votes from OpenNLP PMC are binding, but folks are welcome to check
> > > the release candidate and voice their approval or disapproval. The vote
> > > passes if at least three binding +1 votes are cast.
> > >
> > > [ ] +1 Release the packages as Apache OpenNLP 1.8.0
> > > [ ] -1 Do not release the packages because...
> > >
> > >
> > > Thanks!
> > >
> > > Jörn
> > >
> > > P.S. Here is my +1.
> > >
> >
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-12 Thread Joern Kottmann
The vote is still open and we won't close it before the entire active PMC
has voted or the time has passed.

Jörn

On Fri, May 12, 2017 at 2:29 PM, Daniel Russ  wrote:

> Even though we have enough binding votes to release, can I have a few hours
> to complete testing of my code with 1.8.0RC2 before release.
> Daniel
>
> On May 11, 2017 12:38 PM, "Joern Kottmann"  wrote:
>
> > The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> > 1.8.0 Release Candidate 2.
> >
> > The RC 2 distributables can be downloaded from here:
> > https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/
> >
> > The release was made from the Apache OpenNLP 1.8.0 tag at
> > https://github.com/apache/opennlp/tree/opennlp-1.8.0
> >
> > To use it in a maven build set the version for opennlp-tools or
> > opennlp-uima to 1.8.0 and add the following URL to your settings.xml
> > file:
> > https://repository.apache.org/content/repositories/orgapacheopennlp-1012
> >
> > The release was made using the OpenNLP release process, documented on
> > the Wiki here:
> > https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
> >
> > The release contains quite some changes, please refer to the contained
> > issue list for details.
> >
> > Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
> > vote is open for at least the next 72 hours.
> >
> > Only votes from OpenNLP PMC are binding, but folks are welcome to check
> > the release candidate and voice their approval or disapproval. The vote
> > passes if at least three binding +1 votes are cast.
> >
> > [ ] +1 Release the packages as Apache OpenNLP 1.8.0
> > [ ] -1 Do not release the packages because...
> >
> >
> > Thanks!
> >
> > Jörn
> >
> > P.S. Here is my +1.
> >
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-12 Thread Daniel Russ
Even though we have enough binding votes to release, can I have a few hours
to complete testing of my code with 1.8.0RC2 before release?
Daniel

On May 11, 2017 12:38 PM, "Joern Kottmann"  wrote:

> The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> 1.8.0 Release Candidate 2.
>
> The RC 2 distributables can be downloaded from here:
> https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/
>
> The release was made from the Apache OpenNLP 1.8.0 tag at
> https://github.com/apache/opennlp/tree/opennlp-1.8.0
>
> To use it in a maven build set the version for opennlp-tools or
> opennlp-uima to 1.8.0 and add the following URL to your settings.xml
> file:
> https://repository.apache.org/content/repositories/orgapacheopennlp-1012
>
> The release was made using the OpenNLP release process, documented on
> the Wiki here:
> https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
>
> The release contains quite some changes, please refer to the contained
> issue list for details.
>
> Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
> vote is open for at least the next 72 hours.
>
> Only votes from OpenNLP PMC are binding, but folks are welcome to check
> the release candidate and voice their approval or disapproval. The vote
> passes if at least three binding +1 votes are cast.
>
> [ ] +1 Release the packages as Apache OpenNLP 1.8.0
> [ ] -1 Do not release the packages because...
>
>
> Thanks!
>
> Jörn
>
> P.S. Here is my +1.
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-12 Thread Jeff Zemerick
+1 non-binding

Built and tested on Ubuntu 16.04 and Amazon Linux 2017.03.0 with OpenJDK8.
NOTICE and LICENSE files look good.
Created and tested a token name finder model.


On Fri, May 12, 2017 at 4:42 AM, Tommaso Teofili 
wrote:

> +1 (binding)
>
> - source distr build succeeds
> - build from tag succeeds
> - signatures and hashes ok
>
> Regards,
> Tommaso
>
> On Fri, May 12, 2017 at 01:11, Suneel Marthi wrote:
>
> > +1 binding
> >
> > 1. Downloaded artifacts and ran thru a clean build - all unit tests pass
> > 2. verified sigs and hashes
> >
> > On Thu, May 11, 2017 at 9:37 AM, Joern Kottmann 
> > wrote:
> >
> > > The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> > > 1.8.0 Release Candidate 2.
> > >
> > > The RC 2 distributables can be downloaded from here:
> > > https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/
> > >
> > > The release was made from the Apache OpenNLP 1.8.0 tag at
> > > https://github.com/apache/opennlp/tree/opennlp-1.8.0
> > >
> > > To use it in a maven build set the version for opennlp-tools or
> > > opennlp-uima to 1.8.0 and add the following URL to your settings.xml
> > > file:
> > > https://repository.apache.org/content/repositories/orgapacheopennlp-1012
> > >
> > > The release was made using the OpenNLP release process, documented on
> > > the Wiki here:
> > > https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
> > >
> > > The release contains quite some changes, please refer to the contained
> > > issue list for details.
> > >
> > > Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
> > > vote is open for at least the next 72 hours.
> > >
> > > Only votes from OpenNLP PMC are binding, but folks are welcome to check
> > > the release candidate and voice their approval or disapproval. The vote
> > > passes if at least three binding +1 votes are cast.
> > >
> > > [ ] +1 Release the packages as Apache OpenNLP 1.8.0
> > > [ ] -1 Do not release the packages because...
> > >
> > >
> > > Thanks!
> > >
> > > Jörn
> > >
> > > P.S. Here is my +1.
> > >
> >
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-12 Thread Tommaso Teofili
+1 (binding)

- source distr build succeeds
- build from tag succeeds
- signatures and hashes ok

Regards,
Tommaso

On Fri, May 12, 2017 at 01:11, Suneel Marthi wrote:

> +1 binding
>
> 1. Downloaded artifacts and ran thru a clean build - all unit tests pass
> 2. verified sigs and hashes
>
> On Thu, May 11, 2017 at 9:37 AM, Joern Kottmann 
> wrote:
>
> > The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> > 1.8.0 Release Candidate 2.
> >
> > The RC 2 distributables can be downloaded from here:
> > https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/
> >
> > The release was made from the Apache OpenNLP 1.8.0 tag at
> > https://github.com/apache/opennlp/tree/opennlp-1.8.0
> >
> > To use it in a maven build set the version for opennlp-tools or
> > opennlp-uima to 1.8.0 and add the following URL to your settings.xml
> > file:
> > https://repository.apache.org/content/repositories/orgapacheopennlp-1012
> >
> > The release was made using the OpenNLP release process, documented on
> > the Wiki here:
> > https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
> >
> > The release contains quite some changes, please refer to the contained
> > issue list for details.
> >
> > Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
> > vote is open for at least the next 72 hours.
> >
> > Only votes from OpenNLP PMC are binding, but folks are welcome to check
> > the release candidate and voice their approval or disapproval. The vote
> > passes if at least three binding +1 votes are cast.
> >
> > [ ] +1 Release the packages as Apache OpenNLP 1.8.0
> > [ ] -1 Do not release the packages because...
> >
> >
> > Thanks!
> >
> > Jörn
> >
> > P.S. Here is my +1.
> >
>


Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-11 Thread Suneel Marthi
+1 binding

1. Downloaded artifacts and ran thru a clean build - all unit tests pass
2. verified sigs and hashes
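
For readers who want to repeat the hash part of this check, the following is a minimal sketch, assuming a digest file is published next to each artifact. The class name, the command-line arguments, and the choice of algorithm are assumptions, since the thread does not say which hash types were published. It does not cover signature verification, which needs a separate tool.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the OpenNLP release tooling.
final class HashCheck {

    /** Computes the lowercase hex digest of the data with the given JCA algorithm name. */
    static String hex(String algorithm, byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown algorithm: " + algorithm, e);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Compares the computed digest with the published one, ignoring case and whitespace. */
    static boolean matches(String algorithm, byte[] data, String published) {
        return hex(algorithm, data).equalsIgnoreCase(published.trim());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: HashCheck <algorithm> <file> <published-hex>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[1]));
        System.out.println(matches(args[0], data, args[2]) ? "hash OK" : "hash MISMATCH");
    }
}
```

The algorithm name is passed in rather than hard-coded so the same sketch works for whichever digest files accompany the artifacts.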

On Thu, May 11, 2017 at 9:37 AM, Joern Kottmann  wrote:

> The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
> 1.8.0 Release Candidate 2.
>
> The RC 2 distributables can be downloaded from here:
> https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/
>
> The release was made from the Apache OpenNLP 1.8.0 tag at
> https://github.com/apache/opennlp/tree/opennlp-1.8.0
>
> To use it in a maven build set the version for opennlp-tools or
> opennlp-uima to 1.8.0 and add the following URL to your settings.xml
> file:
> https://repository.apache.org/content/repositories/orgapacheopennlp-1012
>
> The release was made using the OpenNLP release process, documented on
> the Wiki here:
> https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
>
> The release contains quite some changes, please refer to the contained
> issue list for details.
>
> Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
> vote is open for at least the next 72 hours.
>
> Only votes from OpenNLP PMC are binding, but folks are welcome to check
> the release candidate and voice their approval or disapproval. The vote
> passes if at least three binding +1 votes are cast.
>
> [ ] +1 Release the packages as Apache OpenNLP 1.8.0
> [ ] -1 Do not release the packages because...
>
>
> Thanks!
>
> Jörn
>
> P.S. Here is my +1.
>


[VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-11 Thread Joern Kottmann
The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP
1.8.0 Release Candidate 2. 

The RC 2 distributables can be downloaded from here:
https://repository.apache.org/content/repositories/orgapacheopennlp-1012/org/apache/opennlp/opennlp-distr/1.8.0/

The release was made from the Apache OpenNLP 1.8.0 tag at
https://github.com/apache/opennlp/tree/opennlp-1.8.0
 
To use it in a maven build set the version for opennlp-tools or
opennlp-uima to 1.8.0 and add the following URL to your settings.xml
file:
https://repository.apache.org/content/repositories/orgapacheopennlp-1012
 
The release was made using the OpenNLP release process, documented on
the Wiki here:
https://cwiki.apache.org/confluence/display/OPENNLP/Release+Process
 
The release contains quite some changes, please refer to the contained
issue list for details.
 
Please vote on releasing these packages as Apache OpenNLP 1.8.0. The
vote is open for at least the next 72 hours.
 
Only votes from OpenNLP PMC are binding, but folks are welcome to check
the release candidate and voice their approval or disapproval. The vote
passes if at least three binding +1 votes are cast.
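
The passing rule stated above can be sketched as a small tally. This is only an illustration of the rule as worded in this message (at least three binding +1 votes); the class and field names are hypothetical, and it does not model any further conditions the release process may impose.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the rule as stated in this email.
final class VoteTally {

    static final int REQUIRED_BINDING_PLUS_ONES = 3;

    /** A single vote: value is +1 or -1, binding is true for PMC members. */
    static final class Vote {
        final int value;
        final boolean binding;

        Vote(int value, boolean binding) {
            this.value = value;
            this.binding = binding;
        }
    }

    /** Returns true if at least three binding +1 votes were cast. */
    static boolean passes(List<Vote> votes) {
        long bindingPlusOnes = votes.stream()
                .filter(v -> v.binding && v.value == 1)
                .count();
        return bindingPlusOnes >= REQUIRED_BINDING_PLUS_ONES;
    }

    public static void main(String[] args) {
        List<Vote> votes = Arrays.asList(
                new Vote(1, true), new Vote(1, true), new Vote(1, true),
                new Vote(1, false));
        System.out.println(passes(votes) ? "vote passes" : "vote does not pass");
    }
}
```

Non-binding votes are counted in the list but do not affect the outcome, which matches the wording above.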
 
[ ] +1 Release the packages as Apache OpenNLP 1.8.0
[ ] -1 Do not release the packages because...
 
 
Thanks!

Jörn

P.S. Here is my +1.