It looks like the Cross Validator is failing because you do
not have enough data. How many sample sentences are you
running it on?

We will investigate this further.

Jörn

On 10/6/11 2:26 PM, Nicolas Hernandez wrote:
Please find below the output of two runs which lead to an error:
SentenceDetectorEvaluator without "-misclassified true" parameter and
SentenceDetectorCrossValidator (which gives the same error with or
without "-misclassified true").

I tested on the examples from the documentation and also with my data.
Tell me if you want more details.

$opennlp SentenceDetectorEvaluator -encoding UTF-8 -model
data/model/fr-sent.bin -data data/test/fr-sent.test
Loading Sentence Detector model ... done (0,013s)
Evaluating ... Exception in thread "main" java.lang.NullPointerException
        at opennlp.tools.util.eval.Evaluator.evaluateSample(Evaluator.java:80)
        at opennlp.tools.util.eval.Evaluator.evaluate(Evaluator.java:98)
        at opennlp.tools.cmdline.sentdetect.SentenceDetectorEvaluatorTool.run(SentenceDetectorEvaluatorTool.java:80)
        at opennlp.tools.cmdline.CLI.main(CLI.java:191)

$opennlp SentenceDetectorCrossValidator -encoding UTF-8 -lang fr -data
data/train/fr-sent.train -misclassified true
Indexing events using cutoff of 5

        Computing event counts...  done. 0 events
        Indexing...  done.
Sorting and merging events... Done indexing.
Incorporating indexed data for training...
Exception in thread "main" java.lang.NullPointerException
        at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
        at opennlp.maxent.GIS.trainModel(GIS.java:256)
        at opennlp.model.TrainUtil.train(TrainUtil.java:182)
        at opennlp.tools.sentdetect.SentenceDetectorME.train(SentenceDetectorME.java:283)
        at opennlp.tools.sentdetect.SDCrossValidator.evaluate(SDCrossValidator.java:104)
        at opennlp.tools.cmdline.sentdetect.SentenceDetectorCrossValidatorTool.run(SentenceDetectorCrossValidatorTool.java:98)
        at opennlp.tools.cmdline.CLI.main(CLI.java:191)



On Thu, Oct 6, 2011 at 1:02 PM, Jörn Kottmann wrote:
On 10/6/11 12:42 PM, Nicolas Hernandez wrote:
I am trying to run the Evaluator and CrossValidator programs of 1.5.3 from
the command line.

It seems that the SentenceDetector, Tokenizer, PosTagger and the
chunker (at least) throw a java.lang.NullPointerException if the
misclassified parameter is set to false or not present for the
Evaluator programs. The CrossValidator programs do not work at all.

Before I look into it, is something (e.g. a global refactoring) planned
for it?
1.5.3 is mostly the same version as 1.5.2 RC 2.

The bugs you describe here should of course not be present, and must be
fixed for the 1.5.2 release. We just did a major refactoring of a lot of
cmd line code. Looks like a regression.

Can you please give us more details? The stack trace would be helpful and
the command line arguments you passed in. To find a bug I believe it should
be enough to get this for one of the mentioned evaluators.

Jörn

