Thanks!
So do I need to add more lines to the training file, or add another file?
Are there any examples of the training file and of the classification?
Sorry, I am new to OpenNLP :)
On 23 Aug 2012, at 15:21, Jörn Kottmann wrote:
> The error is thrown because you do not have enough training samples.
> Try running your code with at least 10 to 20 training samples.
>
> Jörn
>
> On 08/23/2012 03:15 PM, andrea maestroni wrote:
>> Hi all!
>>
>> I am trying to develop a Java program that takes a document, extracts the
>> text, analyzes it, and extracts the main topic of the document.
>>
>> I think this is a document categorizer problem, right?
>>
>> I tried the example from the manual page.
>>
>> I created the training file, an RTF file with these lines:
>>
>> GMDecrease Major acquisitions that have a lower gross margin than the
>> existing network also \
>> had a negative impact on the overall gross margin, but it should
>> improve following \
>> the implementation of its integration strategies .
>> GMIncrease The upward movement of gross margin resulted from amounts
>> pursuant to adjustments \
>> to obligations towards dealers .
>> Then in my code I use this function to train a model:
>>
>> public static void Train() throws InvalidFormatException, IOException {
>>     DoccatModel model = null;
>>
>>     InputStream dataIn = null;
>>     try {
>>         dataIn = new FileInputStream("/Users/andry85mae/Desktop/apache-opennlp-1.5.2-incubating/bin/train.train");
>>         ObjectStream<String> lineStream = new PlainTextByLineStream(dataIn, "UTF-8");
>>         ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);
>>
>>         model = DocumentCategorizerME.train("en", sampleStream);
>>     } catch (IOException e) {
>>         // Failed to read or parse training data, training failed
>>         e.printStackTrace();
>>     } finally {
>>         if (dataIn != null) {
>>             try {
>>                 dataIn.close();
>>             } catch (IOException e) {
>>                 // Not an issue, training already finished.
>>                 // The exception should be logged and investigated
>>                 // if part of a production system.
>>                 e.printStackTrace();
>>             }
>>         }
>>     }
>> }
>>
>> but it gives me an error...
>>
>> java.io.IOException: Empty lines, or lines with only a category string are not allowed!
>> Computing event counts... Incorporating indexed data for training...
>> Exception in thread "main" java.lang.NullPointerException
>> at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
>> at opennlp.maxent.GIS.trainModel(GIS.java:256)
>> at opennlp.model.TrainUtil.train(TrainUtil.java:182)
>> at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:154)
>> at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:176)
>> at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:207)
>> at opennlp_prova.Opennlp_prova.Train(Opennlp_prova.java:55)
>> at opennlp_prova.Opennlp_prova.main(Opennlp_prova.java:96)
>> Java Result: 1
>>
>> What is the error?
>>
>> Thanks in advance!!!
>>
>>
>
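
The IOException quoted above suggests that each line of the training file has to hold one complete sample: a category, whitespace, then the document text. A minimal sketch of a pre-check for such a file follows. It uses only the JDK. TrainingFileCheck, findProblems, countValidSamples and MIN_SAMPLES are hypothetical names and not part of OpenNLP, the threshold of 10 is just the lower bound from the advice in this thread, and the sample lines are shortened versions of the ones quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: checks training lines before
// they are handed to the document categorizer trainer.
final class TrainingFileCheck {

    // Assumed minimum, taken from the "at least 10 to 20" advice in this thread.
    static final int MIN_SAMPLES = 10;

    // Returns one message per problem line, using 1-based line numbers.
    static List<String> findProblems(List<String> lines) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String trimmed = lines.get(i).trim();
            if (trimmed.isEmpty()) {
                problems.add("line " + (i + 1) + ": empty line");
            } else if (trimmed.split("\\s+").length < 2) {
                problems.add("line " + (i + 1) + ": category without text");
            }
        }
        return problems;
    }

    // Counts the lines that have a category followed by at least one token.
    static int countValidSamples(List<String> lines) {
        return lines.size() - findProblems(lines).size();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "GMDecrease Major acquisitions had a negative impact on the overall gross margin .",
            "",
            "GMIncrease",
            "GMIncrease The upward movement of gross margin resulted from adjustments .");
        for (String problem : findProblems(lines)) {
            System.out.println(problem);
        }
        int valid = countValidSamples(lines);
        System.out.println(valid + " valid samples"
            + (valid < MIN_SAMPLES ? " (probably too few to train on)" : ""));
    }
}
```

In real use the lines would be read from the training file itself, which should be saved as plain text: an RTF file carries formatting markup that the line reader would see as extra content.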