Hi,

I cannot reproduce this error. If I take the training data from the CoNLL 2000 website as it is,

http://www.cnts.ua.ac.be/conll2000/chunking/

it trains perfectly well with the default training parameters and obtains 92.40 F1 on the test set that is also distributed on the CoNLL 2000 site.

Best,

R

On Tue, May 17, 2016 at 3:15 PM, [email protected] <[email protected]> wrote:
> Dear Apache OpenNLP Project Team,
>
> I have another error, this time with the command line tool:
>
> - I did exactly as described in the documentation
> (https://opennlp.apache.org/documentation/1.6.0/manual/opennlp.html#tools.chunker.training.tool):
>
> E:\test\apache-opennlp-1.5.3\bin>opennlp.bat ChunkerTrainerME -model
> E:\test\en-chunker.bin -lang en -data E:\test\tmp.txt -encoding UTF-8
>
> The test file only contains the sample sentence from the documentation:
>
> He PRP B-NP
> reckons VBZ B-VP
> the DT B-NP
> current JJ I-NP
> account NN I-NP
> deficit NN I-NP
> will MD B-VP
> narrow VB I-VP
> to TO B-PP
> only RB B-NP
> # # I-NP
> 1.8 CD I-NP
> billion CD I-NP
> in IN B-PP
> September NNP B-NP
> . . O
>
> And here is the error:
>
> Computing event counts... done. 0 events
> Indexing... done.
> Sorting and merging events... Done indexing.
> Incorporating indexed data for training...
> Exception in thread "main" java.lang.NullPointerException
>     at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
>     at opennlp.maxent.GIS.trainModel(GIS.java:256)
>     at opennlp.model.TrainUtil.train(TrainUtil.java:184)
>     at opennlp.tools.chunker.ChunkerME.train(ChunkerME.java:214)
>     at opennlp.tools.cmdline.chunker.ChunkerTrainerTool.run(ChunkerTrainerTool.java:68)
>     at opennlp.tools.cmdline.CLI.main(CLI.java:222)
>
> Another point: the function cannot read more than 2 sentences in one
> training file.
>
> Would you please check these points for me?
>
> Thank you so much for your help.
>
> Best regards,
>
> Trung Tran.
>
> On 05/17/2016 02:06 PM, [email protected] wrote:
>>
>> Dear Apache OpenNLP Project Team,
>>
>> I have a critical issue when training with the Chunker tool in Java:
>>
>> - Firstly, the sample code in the documentation
>> (https://opennlp.apache.org/documentation/1.6.0/manual/opennlp.html#tools.chunker.training.api)
>> does not work, for either version 1.5.3 or 1.6.0.
>>
>> - Secondly, I had to edit the code myself as follows (using version 1.5.3):
>>
>> try {
>>     Charset charset = Charset.forName("UTF-8");
>>     ObjectStream lineStream = new PlainTextByLineStream(new
>>         FileInputStream(fileChunker), charset);
>>     ObjectStream<ChunkSample> sampleStream = new
>>         ChunkSampleStream(lineStream);
>>
>>     chunkerModel = ChunkerME.train("vn", sampleStream,
>>         TrainingParameters.defaultParams(), new ChunkerFactory());
>>
>>     modelApacheChunkerPath =
>>         UtilityHelper.getTemporaryFilePathInsideDir("chunkerModel.bin");
>>     OutputStream modelOut = new BufferedOutputStream(new
>>         FileOutputStream(modelApacheChunkerPath));
>>     chunkerModel.serialize(modelOut);
>> } catch (FileNotFoundException fe) {
>>
>> } catch (IOException ie) {
>>
>> }
>>
>> - Thirdly, I get the error "java.lang.String cannot be cast to
>> opennlp.tools.parser.Parse". The reason is:
>>
>>     + The constructor of class ChunkSampleStream requires the
>>       parameter "ObjectStream<Parse> in"
>>
>>     + However, the second parameter of method ChunkerME.train is
>>       "ObjectStream<ChunkSample> in"
>>
>> I cannot find any way to work around this issue.
>>
>> Would you please check this point for me?
>>
>> Thank you so much for your help.
>>
>> Best regards,
>>
>> Trung Tran.
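
Since the unmodified CoNLL 2000 data trains fine, one thing that may be worth ruling out for a hand-made file such as tmp.txt is its layout. Below is a minimal sketch, assuming the CoNLL 2000 layout of one "word POS chunk-tag" triple per line, separated by single spaces, with a blank line between sentences. ChunkDataCheck, Report and check are hypothetical names for illustration and are not part of OpenNLP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: sanity-checks a CoNLL 2000 style
// chunker training file before it is handed to the trainer.
class ChunkDataCheck {

    // Summary of what was found in the file.
    static final class Report {
        int sentences;  // blocks of token lines separated by blank lines
        int tokens;     // lines with exactly three space-separated fields
        final List<Integer> badLines = new ArrayList<>(); // 1-based line numbers
    }

    static Report check(List<String> lines) {
        Report report = new Report();
        boolean inSentence = false;
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                inSentence = false; // a blank line closes the current sentence
                continue;
            }
            // Assumed layout: "word POS chunk-tag", single-space separated.
            // Tabs or doubled spaces are reported as malformed.
            if (line.split(" ").length != 3) {
                report.badLines.add(i + 1);
                continue;
            }
            report.tokens++;
            if (!inSentence) {
                report.sentences++;
                inSentence = true;
            }
        }
        return report;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the training file, read as UTF-8.
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        Report r = check(lines);
        System.out.println(r.sentences + " sentences, " + r.tokens
            + " tokens, malformed lines: " + r.badLines);
    }
}
```

Running it as "java ChunkDataCheck E:\test\tmp.txt" prints the number of sentences and tokens it could recognise, plus the line numbers that do not match the assumed layout. It only checks the file format and says nothing about whether the amount of data is enough for training.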
