This bug can also be caused by running the evaluator with a very small
amount of training data.

How many sentences do you have in your dataset?
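
If you want a quick number, here is a rough sketch for counting them. It assumes the file uses the usual one-sentence-per-line layout (blank lines only mark document boundaries) and is UTF-8 encoded. SentenceCount is a made-up helper for illustration, not an OpenNLP class.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of OpenNLP.
final class SentenceCount {

    // Counts lines that contain at least one non-whitespace character.
    // Under the one-sentence-per-line assumption this approximates the
    // number of training sentences.
    static long countNonEmptyLines(Stream<String> lines) {
        return lines.filter(line -> !line.trim().isEmpty()).count();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SentenceCount.java <training-file>");
            return;
        }
        // Assumes UTF-8; change the charset if the file is encoded differently.
        try (Stream<String> lines = Files.lines(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            System.out.println(countNonEmptyLines(lines) + " non-empty lines");
        }
    }
}
```

Keep in mind that with -folds 5 each training run only sees about four fifths of that number.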

Jörn


On Wed, Jan 27, 2016 at 10:44 PM, Joern Kottmann <[email protected]> wrote:

> Hello,
>
> looks like there are zero input sentences.
>
> Can you post a piece of your training data?
>
> Jörn
>
> On Sat, Jan 23, 2016 at 4:12 PM, Miller, Timothy <
> [email protected]> wrote:
>
>> I'm working with 1.6.0-bin and trying to do sentence detection cross
>> validation and getting an exception:
>>
>> bin/opennlp SentenceDetectorCrossValidator -lang en -folds 5 -data ~/Data/Projects/sentdetect/wsj02to21.raw.words
>>
>> Indexing events using cutoff of 5
>> Computing event counts...  done. 0 events
>> Indexing...  done.
>> Sorting and merging events... Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>>     at java.util.ArrayList.rangeCheck(ArrayList.java:653)
>>     at java.util.ArrayList.get(ArrayList.java:429)
>>     at opennlp.tools.ml.model.AbstractDataIndexer.sortAndMerge(AbstractDataIndexer.java:89)
>>     at opennlp.tools.ml.model.TwoPassDataIndexer.<init>(TwoPassDataIndexer.java:105)
>>     at opennlp.tools.ml.AbstractEventTrainer.getDataIndexer(AbstractEventTrainer.java:74)
>>     at opennlp.tools.ml.AbstractEventTrainer.train(AbstractEventTrainer.java:91)
>>     at opennlp.tools.ml.model.TrainUtil.train(TrainUtil.java:53)
>>     at opennlp.tools.sentdetect.SentenceDetectorME.train(SentenceDetectorME.java:326)
>>     at opennlp.tools.sentdetect.SDCrossValidator.evaluate(SDCrossValidator.java:103)
>>     at opennlp.tools.cmdline.sentdetect.SentenceDetectorCrossValidatorTool.run(SentenceDetectorCrossValidatorTool.java:78)
>>     at opennlp.tools.cmdline.CLI.main(CLI.java:224)
>>
>> If I just run the trainer on the same dataset and give it a model name,
>> everything works fine. Is there an option I'm missing, or is there maybe
>> an unknown issue with cross validation?
>>
>> Thanks
>> Tim
>>
>>
>>
>
