Thanks for clarifying. Once I fixed this, I ran into similar errors in other sentences in the training file. It would be really helpful if the message included a line/column number. I had a lot of sentences (relatively speaking), so some of the errors were hard to track down.
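Until the error message carries a position, a small pre-check over the training file can report where the offending tags are. The following is only a minimal sketch and is not part of OpenNLP: the function name `find_bad_tags` and the `TAG` pattern are hypothetical. The rules it applies are assumptions taken from this thread, namely that a tag must be uppercase and must be separated from neighbouring text by whitespace to be treated as a token.

```python
import re

# Matches <START>, <START:type> and <END> in any letter case.
TAG = re.compile(r"<START(?::[^>\s]+)?>|<END>", re.IGNORECASE)


def find_bad_tags(lines):
    """Return (line, column, tag, reason) tuples for suspicious tags.

    Assumption taken from this thread: a tag must be uppercase and must be
    separated from neighbouring text by whitespace to count as a token.
    """
    problems = []
    for line_no, line in enumerate(lines, start=1):
        for match in TAG.finditer(line):
            tag = match.group(0)
            start, end = match.start(), match.end()
            # Treat the start and end of the line as whitespace.
            before = line[start - 1] if start > 0 else " "
            after = line[end] if end < len(line) else " "
            if not (before.isspace() and after.isspace()):
                reason = "tag is not separated by whitespace"
            elif not (tag.startswith("<START") or tag == "<END>"):
                reason = "tag keyword is not uppercase"
            else:
                continue
            # Columns are reported 1-based, like line numbers.
            problems.append((line_no, start + 1, tag, reason))
    return problems
```

For example, calling `find_bad_tags(open("terms.train", encoding="utf-8"))` and printing each tuple would list the line and column of every tag like `<END>,` so it can be fixed before running the trainer.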
--- On Mon, 4/22/13, James Kosin <[email protected]> wrote:

> From: James Kosin <[email protected]>
> Subject: Re: TokenNameFinderTrainer Error: Model not compatible with name finder!
> To: [email protected]
> Date: Monday, April 22, 2013, 6:58 PM
>
> Richard,
>
> The problem is the ',' after then <END> tag.
>
> <START:term> Avocados <END> , ....
>
> The error is because <END>, is not an <END> token with the ',' butted against it.
>
> Lower case may seem to work; but, then you don't have any tokens... and thereby no data to train.
>
> James
>
> On 4/22/2013 8:56 PM, Richard Head Jr. wrote:
> > Using 1.5.2. My training data looks like this:
> >
> > Guacamole Dip: 5 Hass <start:term> Avocados <end>, <start:term> Jalapeno <end> Puree with <start:term> Salt <end> and <start:term> BHT <end> (preservative).
> >
> > Here's the command I'm using:
> >
> > opennlp TokenNameFinderTrainer -encoding UTF-8 -lang en -data terms.train -model terms.bin
> >
> > I found a message on this list acknowledging this as a bug that should have been fixed in 1.5.1: http://www.mail-archive.com/[email protected]/msg00162.html
> >
> > I should also note that the docs and the above message say that entities should be marked using the "<START:xxx> <END>" format. When I use uppercase tags I receive the following error:
> >
> > Computing event counts... java.io.IOException: Found unexpected annotation while handling a name sequence: meal <END>, ###<START:term>### sugar <END>,
> > Incorporating indexed data for training...
> > Exception in thread "main" java.lang.NullPointerException
> >     at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
> >     at opennlp.maxent.GIS.trainModel(GIS.java:256)
> >     at opennlp.model.TrainUtil.train(TrainUtil.java:182)
> >     at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:360)
> >     at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:426)
> >     at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:458)
> >     at opennlp.tools.cmdline.namefind.TokenNameFinderTrainerTool.run(TokenNameFinderTrainerTool.java:201)
> >     at opennlp.tools.cmdline.CLI.main(CLI.java:191)
