Thanks!

So must I add some lines to the training file, or add another file?
Are there any examples of the training file and of the classification?

Sorry, I am new to OpenNLP :)

On 23 Aug 2012, at 15:21, Jörn Kottmann wrote:

> The error is thrown because you do not have enough training samples,
> try to run your code with at least 10 to 20 training samples.
> 
> Jörn
> 
> On 08/23/2012 03:15 PM, andrea maestroni wrote:
>> Hi all!
>> 
>> I am trying to develop a Java program that takes a document, extracts
>> its text, analyzes it, and extracts the main topic of the document.
>> 
>> I think this is a document categorizer problem, right?
>> 
>> I tried the example from the manual page.
>> 
>> I created the training file, an RTF file with these lines:
>> 
>> GMDecrease Major acquisitions that have a lower gross margin than the existing network also \
>>            had a negative impact on the overall gross margin, but it should improve following \
>>            the implementation of its integration strategies .
>> GMIncrease The upward movement of gross margin resulted from amounts pursuant to adjustments \
>>            to obligations towards dealers .
>> Then in my code I use this function to train a model:
>> 
>> public static void Train() throws InvalidFormatException, IOException {
>>     DoccatModel model = null;
>> 
>>     InputStream dataIn = null;
>>     try {
>>         dataIn = new FileInputStream("/Users/andry85mae/Desktop/apache-opennlp-1.5.2-incubating/bin/train.train");
>>         ObjectStream<String> lineStream = new PlainTextByLineStream(dataIn, "UTF-8");
>>         ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);
>> 
>>         model = DocumentCategorizerME.train("en", sampleStream);
>>     } catch (IOException e) {
>>         // Failed to read or parse training data, training failed
>>         e.printStackTrace();
>>     } finally {
>>         if (dataIn != null) {
>>             try {
>>                 dataIn.close();
>>             } catch (IOException e) {
>>                 // Not an issue, training already finished.
>>                 // The exception should be logged and investigated
>>                 // if part of a production system.
>>                 e.printStackTrace();
>>             }
>>         }
>>     }
>> }
>> 
>> but it gives me an error...
>> 
>> java.io.IOException: Empty lines, or lines with only a category string are not allowed!
>>      Computing event counts...  Incorporating indexed data for training...
>> Exception in thread "main" java.lang.NullPointerException
>>      at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
>>      at opennlp.maxent.GIS.trainModel(GIS.java:256)
>>      at opennlp.model.TrainUtil.train(TrainUtil.java:182)
>>      at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:154)
>>      at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:176)
>>      at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:207)
>>      at opennlp_prova.Opennlp_prova.Train(Opennlp_prova.java:55)
>>      at opennlp_prova.Opennlp_prova.main(Opennlp_prova.java:96)
>> Java Result: 1
>> 
>> What is the error?
>> 
>> Thanks in advance!!!
>> 
>> 
> 

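For anyone hitting the same two messages, a small pre-check of the training file may help before calling DocumentCategorizerME.train. The sketch below is only an illustration and is not part of OpenNLP: the class name TrainingFileCheck, the method check and the constant MIN_SAMPLES are made up for this example. It assumes the training file is plain text with one sample per line (a category, whitespace, then the document text), which is what the IOException in the quoted trace complains about, and it uses the lower bound of 10 samples from Jörn's reply. It also flags a file that starts with the RTF header "{\rtf", on the assumption that an RTF file saved from a word processor is not the plain text the trainer expects.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: sanity-checks a doccat
// training file before it is handed to the trainer.
final class TrainingFileCheck {

    // Assumption: lower bound taken from the "10 to 20 samples" advice above.
    static final int MIN_SAMPLES = 10;

    // Returns one message per problem; an empty list means the lines look usable.
    static List<String> check(List<String> lines) {
        List<String> problems = new ArrayList<>();
        if (!lines.isEmpty() && lines.get(0).startsWith("{\\rtf")) {
            problems.add("file looks like RTF, save it as plain text instead");
        }
        int samples = 0;
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                problems.add("line " + (i + 1) + ": empty line");
                continue;
            }
            // Expected shape: category, whitespace, document text.
            String[] parts = line.split("\\s+", 2);
            if (parts.length < 2) {
                problems.add("line " + (i + 1) + ": category without text");
                continue;
            }
            samples++;
        }
        if (samples < MIN_SAMPLES) {
            problems.add("only " + samples + " samples, expected at least " + MIN_SAMPLES);
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TrainingFileCheck <training-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        List<String> problems = check(lines);
        if (problems.isEmpty()) {
            System.out.println("training file looks fine");
        } else {
            for (String problem : problems) {
                System.out.println(problem);
            }
        }
    }
}
```

It can be run as "java TrainingFileCheck train.train". With a file like the two-sample one quoted above, it should at least report that there are too few samples.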