Thanks, now I understand it better. I think this information should go
into the docs, because the documentation is not very clear about these
values.



2012/7/7 James Kosin <[email protected]>

> Daniel,
>
> The cutoff value is for filtering out one-time or special cases that may
> hinder training.  With the cutoff set to 5, anything that does not occur
> at least 5 times in the same context in the training data is ignored.
> This happens when the data is filtered and the trainer determines the
> number of outcomes the model will be trained to find.
>
> The iterations value is the number of passes through the training set
> the trainer will make before stopping.  With the Maxent models, keeping
> the number of iterations small makes training faster.  The right value
> really depends on the model being generated and its type.  Most of the
> testing is done with 100 iterations just to get the training done
> quickly, because the training data sets used are sometimes large.  The
> key things to watch are the two numbers printed for each run
> (iteration); they indicate how close the model is to being trained on
> the data set.  Be careful: the trick is to train the models without
> getting them to memorize the training set, because the training set is
> only a snapshot in time (most of the current training data is limited
> to news articles).  Too small a value is also bad, though.
>
> There is also another stopping point: when the model is completely
> trained and knows the training set.  This happens when the numbers reach
> their optimal values.  My guess is that this is when the model predicts
> the training set perfectly.
>
> James
>
>
> On 7/5/2012 6:04 AM, Daniel wrote:
> > When I am training nameFinders, I usually use 500 iterations and a
> > cutoff of 5, but I really don't know why I should use these values or
> > others. Can anybody tell me something about these parameters?
> >
> > Thanks!
>
>
>
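
To make the cutoff explanation above concrete, here is a minimal sketch of
the idea in Python. It is not the OpenNLP implementation: the function name
apply_cutoff, the (outcome, features) event layout, and the toy feature
strings are all assumptions made for illustration. It only shows the general
technique of dropping features that are seen fewer than "cutoff" times
before training starts.

```python
from collections import Counter


def apply_cutoff(events, cutoff):
    """Drop features seen fewer than `cutoff` times across all events.

    `events` is a list of (outcome, features) pairs (a hypothetical layout).
    Events that are left with no features at all are discarded.
    """
    # Count how often each feature occurs in the whole training set.
    counts = Counter(f for _, feats in events for f in feats)
    kept = []
    for outcome, feats in events:
        surviving = [f for f in feats if counts[f] >= cutoff]
        if surviving:
            kept.append((outcome, surviving))
    return kept


# Toy training events; the feature strings are made up for this example.
events = [
    ("person", ["w=john", "cap=true"]),
    ("person", ["w=mary", "cap=true"]),
    ("other", ["w=the", "cap=false"]),
    ("other", ["w=the", "cap=false"]),
    ("other", ["w=zyx", "cap=false"]),
]

filtered = apply_cutoff(events, cutoff=2)
for outcome, feats in filtered:
    print(outcome, feats)
```

With a cutoff of 2, the one-off word features (w=john, w=mary, w=zyx) are
dropped and only features seen at least twice remain. A higher cutoff
removes more rare features, which is the trade-off described above.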

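The two stopping points described above (a maximum number of iterations,
and stopping once the numbers no longer improve) can be sketched the same
way. This is a toy logistic regression trained by gradient ascent, not the
Maxent trainer itself. The names train, max_iterations, tolerance and rate
are hypothetical, and treating the two printed numbers as a log-likelihood
and a training accuracy is an assumption for this sketch.

```python
import math


def train(data, max_iterations=100, tolerance=1e-4, rate=0.5):
    """Toy trainer: stops at the iteration cap or when the log-likelihood
    improves by less than `tolerance` between two passes."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    previous_ll = -math.inf
    history = []
    for iteration in range(1, max_iterations + 1):
        gradient = [0.0] * n_features
        log_likelihood = 0.0
        correct = 0
        # One pass over the whole training set.
        for x, y in data:
            z = sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            log_likelihood += math.log(p if y == 1 else 1.0 - p)
            correct += int((p >= 0.5) == (y == 1))
            for j, xi in enumerate(x):
                gradient[j] += (y - p) * xi
        accuracy = correct / len(data)
        history.append((iteration, log_likelihood, accuracy))
        # Two numbers per iteration: fit to the training set, and accuracy on it.
        print(f"{iteration:3d}: loglikelihood={log_likelihood:.4f}\t{accuracy:.3f}")
        # Second stopping point: the numbers have stopped improving.
        if log_likelihood - previous_ll < tolerance:
            break
        previous_ll = log_likelihood
        weights = [w + rate * g / len(data) for w, g in zip(weights, gradient)]
    return weights, history


# Tiny made-up data set: each x is [bias, value]; labels overlap on purpose.
data = [
    ([1.0, 0.0], 0), ([1.0, 1.0], 0), ([1.0, 1.0], 1),
    ([1.0, 2.0], 1), ([1.0, 2.0], 0), ([1.0, 3.0], 1),
]

weights, history = train(data, max_iterations=100)
```

Both printed numbers are measured on the training set only, so pushing them
higher with more iterations says nothing about how the model behaves on new
text. That is the memorization risk mentioned above.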