Hi,

I cannot reproduce this error. Using the training data from the
CoNLL 2000 website as is:

http://www.cnts.ua.ac.be/conll2000/chunking/

training works with the default parameters, and the resulting model
obtains 92.40 F1 on the test set that is also distributed on the
CoNLL 2000 site.
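
One thing that may be worth checking, offered as an assumption rather
than a confirmed diagnosis: the CoNLL 2000 files separate the three
columns with exactly one space and separate sentences with one empty
line, while the sample in the quoted report is padded with several
spaces for alignment. If the reader rejects such lines, that would be
consistent with the "0 events" in the log. The sketch below uses only
the Java standard library to rewrite a file into the single-space
layout; the class and method names are hypothetical and not part of
OpenNLP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: rewrites chunker training
// lines into the "token POS chunk" layout used by the CoNLL 2000 files.
final class ChunkDataNormalizer {

    // Collapse every run of whitespace (spaces or tabs) into one space.
    static String normalizeLine(String line) {
        return line.trim().replaceAll("\\s+", " ");
    }

    // Normalize all lines, keep a single empty line between sentences,
    // and drop empty lines at the start and end of the data.
    static List<String> normalize(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String raw : lines) {
            String line = normalizeLine(raw);
            if (line.isEmpty()) {
                // Add a separator only after a non-empty line.
                if (!out.isEmpty() && !out.get(out.size() - 1).isEmpty()) {
                    out.add("");
                }
                continue;
            }
            if (line.split(" ").length != 3) {
                throw new IllegalArgumentException("Expected 3 columns: " + raw);
            }
            out.add(line);
        }
        // Remove a trailing separator, if any.
        if (!out.isEmpty() && out.get(out.size() - 1).isEmpty()) {
            out.remove(out.size() - 1);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "He        PRP  B-NP",
                "reckons   VBZ  B-VP",
                "",
                "",
                ".         .    O");
        for (String line : normalize(sample)) {
            System.out.println(line);
        }
    }
}
```

Running the normalized lines through the trainer instead of the padded
ones would show whether the column layout is the cause.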

Best,

R





On Tue, May 17, 2016 at 3:15 PM, [email protected]
<[email protected]> wrote:
> Dear Apache OpenNLP Project Team,
>
> I have another error with the command line tool:
>
>     - I followed the instructions in the documentation exactly
> (https://opennlp.apache.org/documentation/1.6.0/manual/opennlp.html#tools.chunker.training.tool):
>
> E:\test\apache-opennlp-1.5.3\bin>opennlp.bat ChunkerTrainerME -model
> E:\test\en-chunker.bin -lang en -data E:\test\tmp.txt -encoding UTF-8
>
> The test file contains only the sample sentence from the site:
>
> He        PRP  B-NP
> reckons   VBZ  B-VP
> the       DT   B-NP
> current   JJ   I-NP
> account   NN   I-NP
> deficit   NN   I-NP
> will      MD   B-VP
> narrow    VB   I-VP
> to        TO   B-PP
> only      RB   B-NP
> #         #    I-NP
> 1.8       CD   I-NP
> billion   CD   I-NP
> in        IN   B-PP
> September NNP  B-NP
> .         .    O
>
> And here is the error:
>
>         Computing event counts...  done. 0 events
>         Indexing...  done.
> Sorting and merging events... Done indexing.
> Incorporating indexed data for training...
> Exception in thread "main" java.lang.NullPointerException
>         at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
>         at opennlp.maxent.GIS.trainModel(GIS.java:256)
>         at opennlp.model.TrainUtil.train(TrainUtil.java:184)
>         at opennlp.tools.chunker.ChunkerME.train(ChunkerME.java:214)
>         at opennlp.tools.cmdline.chunker.ChunkerTrainerTool.run(ChunkerTrainerTool.java:68)
>         at opennlp.tools.cmdline.CLI.main(CLI.java:222)
>
>
> Another point: the tool cannot read more than 2 sentences from one
> training file.
>
> Would you please check these points for me?
>
> Thank you so much for your help.
>
> Best regards,
>
> Trung Tran.
>
> On 05/17/2016 02:06 PM, [email protected] wrote:
>>
>> Dear Apache OpenNLP Project Team,
>>
>> I have a critical issue when training with the Chunker tool in Java:
>>
>>     - Firstly, the sample code on the documentation site
>> (https://opennlp.apache.org/documentation/1.6.0/manual/opennlp.html#tools.chunker.training.api)
>> does not work, for either version 1.5.3 or 1.6.0.
>>
>>     - Secondly, I had to edit the code myself as follows (using version 1.5.3):
>>
>>         try {
>>             Charset charset = Charset.forName("UTF-8");
>>             ObjectStream lineStream = new PlainTextByLineStream(new FileInputStream(fileChunker), charset);
>>             ObjectStream<ChunkSample> sampleStream = new ChunkSampleStream(lineStream);
>>
>>             chunkerModel = ChunkerME.train("vn", sampleStream, TrainingParameters.defaultParams(), new ChunkerFactory());
>>
>>             modelApacheChunkerPath = UtilityHelper.getTemporaryFilePathInsideDir("chunkerModel.bin");
>>             OutputStream modelOut = new BufferedOutputStream(new FileOutputStream(modelApacheChunkerPath));
>>             chunkerModel.serialize(modelOut);
>>         } catch (FileNotFoundException fe) {
>>
>>         } catch (IOException ie) {
>>
>>         }
>>
>>     - Thirdly, I get the error "java.lang.String cannot be cast to
>> opennlp.tools.parser.Parse". The reason is:
>>
>>             + The constructor of class ChunkSampleStream requires the
>> parameter "ObjectStream<Parse> in"
>>
>>             + However, the second parameter of method ChunkerME.train is
>> "ObjectStream<ChunkSample> in"
>>
>> I cannot find any way to work around this issue.
>>
>> Would you please check this point for me?
>>
>> Thank you so much for your help.
>>
>> Best regards,
>>
>> Trung Tran.
>
>
