Hi all,
I am trying to find out how to use the Java API for the chunker.
Below is the console output of my failed attempt (my test program is further
down in this email).
Any idea what I am doing wrong?
Thanks in advance,
Ben
Console output:
------------------------------------
Token Part-Of-Speech
Most JJS
large JJ
cities NNS
in IN
the DT
US NNP
had VBD
morning NN
and CC
afternoon NN
newspapers NNS
. .
opennlp.tools.util.InvalidFormatException: Unkown artifact format: postagger
at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:132)
at opennlp.tools.chunker.ChunkerModel.<init>(ChunkerModel.java:60)
at ChunkerTest.main(ChunkerTest.java:55)
Exception in thread "main" java.lang.NullPointerException
at opennlp.tools.chunker.ChunkerME.<init>(ChunkerME.java:77)
at opennlp.tools.chunker.ChunkerME.<init>(ChunkerME.java:93)
at opennlp.tools.chunker.ChunkerME.<init>(ChunkerME.java:106)
at opennlp.tools.chunker.ChunkerME.<init>(ChunkerME.java:116)
at ChunkerTest.main(ChunkerTest.java:71)
My test program:
import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.chunker.ChunkerModel;
import opennlp.tools.chunker.ChunkerME;
import java.io.*;

public class ChunkerTest {
    private static final String POSTAGGER =
        "C:\\Users\\Ben\\Desktop\\BenSave\\NPLmodels\\en-pos-maxent.bin";
    private static final String CHUNKER =
        "C:\\Users\\Ben\\Desktop\\BenSave\\NPLmodels\\en-parser-chunking.bin";

    public static void main(String[] args) throws IOException {
        InputStream modelIn4 = null;
        POSModel modelPostTagger = null;
        try {
            modelIn4 = new FileInputStream(POSTAGGER);
            modelPostTagger = new POSModel(modelIn4);
        }
        catch (IOException e) {
            // Model loading failed, handle the error
            e.printStackTrace();
        }
        finally {
            if (modelIn4 != null) {
                try {
                    modelIn4.close();
                }
                catch (IOException e) {
                }
            }
        }
        POSTaggerME myTagger = new POSTaggerME(modelPostTagger);
        String sent[] = new String[]{"Most", "large", "cities", "in", "the", "US",
            "had", "morning", "and", "afternoon", "newspapers", "."};
        String tags1[] = myTagger.tag(sent);
        System.out.println("------------------------------------");
        System.out.println("Token\tPart-Of-Speech");
        for (int i = 0; i < tags1.length; i++) {
            System.out.println(sent[i] + "\t" + tags1[i]);
        }
        /*---------------------------------------------------------*/
        InputStream modelIn5 = new FileInputStream(CHUNKER);
        ChunkerModel modelChunker = null;
        try {
            modelChunker = new ChunkerModel(modelIn5);
        }
        catch (IOException e) {
            // Model loading failed, handle the error
            e.printStackTrace();
        }
        finally {
            if (modelIn5 != null) {
                try {
                    modelIn5.close();
                }
                catch (IOException e) {
                }
            }
        }
        ChunkerME chunker = new ChunkerME(modelChunker);
        String tags2[] = chunker.chunk(sent, tags1);
        System.out.println("------------------------------------");
        System.out.println("Token\tPart-Of-Speech\tChunks");
        for (int i = 0; i < tags2.length; i++) {
            System.out.println(sent[i] + "\t" + tags1[i] + "\t" + tags2[i]);
        }
        System.out.println("------------------------------------");
    }
}
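In case it helps with diagnosis: OpenNLP model files are zip archives, so their contents can be listed with the standard library alone. The sketch below is not part of the test program above. The class name ModelInspector and the method listEntries are hypothetical names chosen for illustration, and it only prints entry names. The exception mentions "postagger", which may be the extension of an entry inside the file passed to ChunkerModel, so listing the entries of the file in CHUNKER might show whether it is really a chunker model.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of OpenNLP: lists the entries of a model file.
class ModelInspector {

    // Reads the stream as a zip archive and returns the entry names in order.
    static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ModelInspector <model file>");
            return;
        }
        // Print one entry name per line for the given model file.
        try (InputStream in = new FileInputStream(args[0])) {
            for (String name : listEntries(in)) {
                System.out.println(name);
            }
        }
    }
}
```

Running it once on the file in POSTAGGER and once on the file in CHUNKER would allow a side-by-side comparison of what each archive contains.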