Saurabh,

Are there document boundaries (empty lines) in your training data?
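
In the OpenNLP name finder training format, an empty line marks a document boundary. As far as I understand, the cross validator splits the data by document rather than by sentence, so a file with no empty lines may end up being treated as one single document. A rough way to check is to count the documents yourself. This is only a sketch: the class name, the method name and the sample sentences are made up for illustration and are not part of OpenNLP.

```java
import java.util.Arrays;
import java.util.List;

class DocumentBoundaryCheck {

    // Counts documents in name finder training data, treating each run of
    // non-blank lines as one document (hypothetical helper, not an OpenNLP API).
    public static int countDocuments(List<String> lines) {
        int documents = 0;
        boolean inDocument = false;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                inDocument = false;   // a blank line closes the current document
            } else if (!inDocument) {
                inDocument = true;    // first sentence of a new document
                documents++;
            }
        }
        return documents;
    }

    public static void main(String[] args) {
        // Illustrative sample: two sentences, a blank line, one more sentence.
        List<String> sample = Arrays.asList(
            "<START:person> Pierre Vinken <END> , 61 years old , will join the board .",
            "Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. .",
            "",
            "<START:person> Rudolph Agnew <END> was named a director .");
        System.out.println(countDocuments(sample)); // prints 2
    }
}
```

If this reports 1 for your 22351 sentences, there is nothing to distribute across 3 folds, which could explain the exception.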

Jeff



On Tue, Apr 11, 2017 at 6:07 AM, Saurabh Jain <saurabh4768j...@gmail.com>
wrote:

> Hi All
>
> I am cross validating NameFinder training data using
> TokenNameFinderCrossValidator. Training parameters are as follows:
>
> Train algorithm name: MAXENT
> Trainer Type name: EventModel
> Iteration value: 100
> Cut off value: 5
> Beam size: 5
> No of folds: 3
> Total training instances: 22351
>
> Code snippet:
>
>         try {
>             evaluate = new TokenNameFinderCrossValidator("en", entity,
>                 trainingParameters,
>                 TokenNameFinderFactory.create(null,
>                     entityExtractionProcessor.getFeatureGenMap().get(entity),
>                     Collections.emptyMap(), new BioCodec()));
>         } catch (InvalidFormatException e) {
>             e.printStackTrace();
>         }
>
>         evaluate.evaluate(sampleStream, 3);
>
>
> The evaluate method throws InsufficientTrainingDataException. Can anyone
> suggest why this is happening? I have passed 22351 training instances,
> so with 3 folds each fold should get around 7000 instances.
>
>
> --
> *Thanks & Regards*
>
>
> *Saurabh Jain *
> *AI Developer*
>
> *Active Intelligence  *
>
> *"To do a thing yesterday was the best time. Second best time is today."*
>
