[GitHub] opennlp pull request #136: OPENNLP-998 : Fixing Maven build on MacOS

2017-03-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/opennlp/pull/136

Re: Training perceptron model

2017-03-06 Thread Joern Kottmann
Yes, open an issue for the name samples, that should be fixed. Jörn On Mon, Mar 6, 2017 at 2:17 PM, Damiano Porta wrote: > I have to redesign it; reading the wiki you gave me I have noticed that I > should not create two partitions (one for training and one for testing). > It avoids overfitting, so I will pass all the data! …

Re: Training perceptron model

2017-03-06 Thread Damiano Porta
I have to redesign it; reading the wiki you gave me I have noticed that I should not create two partitions (one for training and one for testing). It avoids overfitting, so I will pass all the data! Thanks Jorn! P.S. Did you read my previous email about the bug in NameSample? Should I open an issue? …

Re: Training perceptron model

2017-03-06 Thread Damiano Porta
Oh I see. Thanks! Basically I have 30k sentences; I apply the labels with a script and then I pass 0-15k to train the model (to build the .bin) and 15k-30k to evaluate it. I am trying to build the model with 300 iterations again. 2017-03-06 13:31 GMT+01:00 Joern Kottmann : > You should understand how it works …

Re: Training perceptron model

2017-03-06 Thread Joern Kottmann
You should understand how it works; have a look at this Wikipedia article, the picture on the right side explains it quite nicely. https://en.wikipedia.org/wiki/Cross-validation_(statistics) The idea is to split the data into n partitions and then use n-1 for training and 1 for testing; this is repeated until every partition has been held out for testing once. …
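For context, a minimal sketch of what this looks like with the OpenNLP 1.7 API; the file name train.txt, the class name CrossValidationSketch, and the parameter values are illustrative placeholders, not taken from the thread:

    import java.io.File;
    import java.nio.charset.StandardCharsets;
    import opennlp.tools.namefind.NameSample;
    import opennlp.tools.namefind.NameSampleDataStream;
    import opennlp.tools.namefind.TokenNameFinderCrossValidator;
    import opennlp.tools.namefind.TokenNameFinderFactory;
    import opennlp.tools.util.MarkableFileInputStreamFactory;
    import opennlp.tools.util.ObjectStream;
    import opennlp.tools.util.PlainTextByLineStream;
    import opennlp.tools.util.TrainingParameters;

    public class CrossValidationSketch {
        public static void main(String[] args) throws Exception {
            // Training data in the OpenNLP name finder format,
            // e.g. "<START:person> Pierre Vinken <END> is ..." per line.
            ObjectStream<String> lines = new PlainTextByLineStream(
                new MarkableFileInputStreamFactory(new File("train.txt")),
                StandardCharsets.UTF_8);
            ObjectStream<NameSample> samples = new NameSampleDataStream(lines);

            TrainingParameters params = new TrainingParameters();
            params.put(TrainingParameters.ALGORITHM_PARAM, "PERCEPTRON");
            params.put(TrainingParameters.ITERATIONS_PARAM, "100");
            params.put(TrainingParameters.CUTOFF_PARAM, "0");

            TokenNameFinderCrossValidator cv = new TokenNameFinderCrossValidator(
                "en", null, params, new TokenNameFinderFactory());

            // Pass ALL the data; the validator splits it into n partitions
            // internally, training on n-1 and testing on the held-out one.
            cv.evaluate(samples, 10);
            System.out.println("F-measure: " + cv.getFMeasure());
        }
    }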

Re: Training perceptron model

2017-03-06 Thread Damiano Porta
Unfortunately not: 100 iterations ~ 30 minutes, 300 iterations > 2 days and it is still running... I will stop it. I still do not understand what number I should set as *folds*. OK, I will set a number > 1, but should I pay more attention to this parameter? If I set 8 or 10, does it matter …
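A back-of-the-envelope check, assuming the training runs dominate: with n folds the cross validator trains n separate models, each on (n-1)/n of the data, so

    total runtime ≈ n × (time for one training run)

Choosing 8 versus 10 therefore mainly changes how many of these training runs happen; both are common choices.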

Re: Training perceptron model

2017-03-06 Thread Joern Kottmann
test.evaluate(samples, 1): here the second parameter is the number of folds; usually you use 10, or at least a number larger than 1. The amount of time you need for training with perceptron is linear in the number of iterations; if you use 300 instead of 100, it should take three times as long. Jörn On Mon, Mar 6, …
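Putting Jörn's linearity point together with the ~30 minutes per 100-iteration run reported earlier in the thread:

    300 iterations ≈ 3 × 30 min ≈ 90 min for a single training run

so a job that runs for days is doing more than one training, which is exactly what a cross validator with several folds does.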

Re: Training perceptron model

2017-03-06 Thread Damiano Porta
Jorn, I am training and testing the model via the API. If it is not a training problem, how is it possible that the evaluation is taking 2 days (and still running)? As I told you, with 100 iterations I can get the model and the test in ~30 minutes. I only have a doubt about eval…

Re: CUDA

2017-03-06 Thread Joern Kottmann
That is correct, we would be happy to merge a PR to change that. Jörn On Mon, Mar 6, 2017 at 10:49 AM, Damiano Porta wrote: > Jorn, I think it is really important. For the moment we should allow more > threads for perceptron training. If I remember correctly it is only allowed > for the MAXENT classifier …
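For reference, a sketch of how multi-threaded training is requested today, reusing the TrainingParameters setup from the cross-validation sketch above; the thread count of 4 is an arbitrary example. The Threads parameter is read by the MAXENT (GIS) trainer but, as discussed here, not by the perceptron trainer:

    // Fragment: "params" as in the cross-validation sketch above.
    params.put(TrainingParameters.ALGORITHM_PARAM, "MAXENT");
    params.put("Threads", "4"); // honored by the maxent trainer only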

Re: Training perceptron model

2017-03-06 Thread Joern Kottmann
Hello, the model is only available after the training has finished; it is hard to guess what you are doing. Do you use the command line? Which command? Jörn On Mon, Mar 6, 2017 at 10:29 AM, Damiano Porta wrote: > Hello Jorn, > I tried with 300 iterations and it takes forever; reducing that number to > 100 I can finally get the model in half an hour. …
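Since the command line comes up here, one way to train a perceptron name finder from the CLI; en-ner-custom.bin, train.txt and params.txt are placeholder file names:

    $ opennlp TokenNameFinderTrainer -model en-ner-custom.bin -lang en \
          -data train.txt -encoding UTF-8 -params params.txt

where params.txt carries the training parameters as key=value lines:

    Algorithm=PERCEPTRON
    Iterations=100
    Cutoff=0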

Re: CUDA

2017-03-06 Thread Damiano Porta
Jorn, I think it is really important. For the moment we should allow more threads for perceptron training. If I remember correctly it is only allowed for the MAXENT classifier, right? 2017-03-06 10:17 GMT+01:00 Joern Kottmann : > Hello, > > no, we don't support CUDA. At some point we will probably add support > for one of the deep learning packages …

Re: CUDA

2017-03-06 Thread Joern Kottmann
Hello, no, we don't support CUDA. At some point we will probably add support for one of the deep learning packages, and those usually use CUDA. Jörn On Sat, Mar 4, 2017 at 5:17 PM, Damiano Porta wrote: > Hello everybody, > > does OpenNLP support CUDA parallel computing? > > Damiano >

Re: Training perceptron model

2017-03-06 Thread Damiano Porta
Hello Jorn, I tried with 300 iterations and it takes forever; reducing that number to 100 I can finally get the model in half an hour. The problem with 300 iterations is that I can see the model (.bin) in half an hour too, but the computations are still running. So I do not really understand what I …
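For context, a minimal sketch of the API training path being described here; "samples" and "params" are assumed to be set up as in the cross-validation sketch above (with Iterations at 300), and the output file name is a placeholder. The .bin only appears on disk after train(...) has returned and the model is explicitly serialized:

    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import opennlp.tools.namefind.NameFinderME;
    import opennlp.tools.namefind.TokenNameFinderFactory;
    import opennlp.tools.namefind.TokenNameFinderModel;

    // Fragment: blocks until all 300 iterations are done.
    TokenNameFinderModel model = NameFinderME.train(
        "en", null, samples, params, new TokenNameFinderFactory());

    try (OutputStream out = new BufferedOutputStream(
            new FileOutputStream("en-ner-custom.bin"))) {
        model.serialize(out); // only now does the model file exist
    }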

Re: Training perceptron model

2017-03-06 Thread Joern Kottmann
Hello, this looks like output from the cross validator. Jörn On Sun, Mar 5, 2017 at 11:34 AM, Damiano Porta wrote: > Hello, > > I am training a NER model with the perceptron classifier (using OpenNLP 1.7.0). > > The output of the training is: > > Indexing events using cutoff of 0 > > Computing event counts…