The parse() method takes an "Incomplete parse node", to which you add the token nodes.
The tool performs the following steps:
1) Split the string into tokens
2) Create a new string with each token separated by a single space
3) Create the "Incomplete parse node".
4) Insert the tokens into the "Incomplete parse node".
5) Parse the "Incomplete parse node". (You need to decide whether you want
only the best parse or the top N parses.)
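// Imports needed by the excerpt below.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;
import opennlp.tools.parser.AbstractBottomUpParser;
import opennlp.tools.parser.Parse;
import opennlp.tools.util.Span;

// Match a bracket character that touches a non-space character on either side.
private static Pattern untokenizedParenPattern1 = Pattern.compile("([^ ])([({)}])");
private static Pattern untokenizedParenPattern2 = Pattern.compile("([({)}])([^ ])");

public static Parse[] parseLine(String line, opennlp.tools.parser.Parser parser, int numParses) {
  // Put a space around brackets so that each one becomes its own token.
  line = untokenizedParenPattern1.matcher(line).replaceAll("$1 $2");
  line = untokenizedParenPattern2.matcher(line).replaceAll("$1 $2");
  // Split on whitespace and rebuild the text with single spaces.
  StringTokenizer str = new StringTokenizer(line);
  StringBuffer sb = new StringBuffer();
  List<String> tokens = new ArrayList<String>();
  while (str.hasMoreTokens()) {
    String tok = str.nextToken();
    tokens.add(tok);
    sb.append(tok).append(" ");
  }
  // Drop the trailing space. Note: this throws if the line has no tokens.
  String text = sb.substring(0, sb.length() - 1);
  /* Create "Incomplete node" */
  Parse p = new Parse(text, new Span(0, text.length()),
      AbstractBottomUpParser.INC_NODE, 0, 0);
  int start = 0;
  int i = 0;
  for (Iterator<String> ti = tokens.iterator(); ti.hasNext(); i++) {
    String tok = ti.next();
    /* Add token nodes */
    p.insert(new Parse(text, new Span(start, start + tok.length()),
        AbstractBottomUpParser.TOK_NODE, 0, i));
    // Advance past the token and the single space that follows it.
    start += tok.length() + 1;
  }
  Parse[] parses;
  if (numParses == 1) {
    /* Parse: best parse only */
    parses = new Parse[] { parser.parse(p) };
  }
  else {
    /* Parse: up to numParses parses */
    parses = parser.parse(p, numParses);
  }
  return parses;
}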
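
If it helps to see what steps 1 to 4 actually compute, here is a small stand-alone sketch of just the string handling and span arithmetic, with no OpenNLP dependency. The class name TokenSpanSketch and the methods normalize and tokenSpans are made up for this illustration and are not part of OpenNLP. Each int pair it returns corresponds to the Span that the loop above passes to the TOK_NODE constructor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an OpenNLP class.
final class TokenSpanSketch {
    // Same two patterns as in the method above: a bracket character that
    // touches a non-space character on its left or on its right.
    private static final Pattern BEFORE = Pattern.compile("([^ ])([({)}])");
    private static final Pattern AFTER = Pattern.compile("([({)}])([^ ])");

    /** Steps 1 and 2: split on whitespace and re-join with single spaces. */
    static String normalize(String line) {
        line = BEFORE.matcher(line).replaceAll("$1 $2");
        line = AFTER.matcher(line).replaceAll("$1 $2");
        StringTokenizer st = new StringTokenizer(line);
        StringBuilder sb = new StringBuilder();
        while (st.hasMoreTokens()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(st.nextToken());
        }
        return sb.toString();
    }

    /** Step 4: the [start, end) character offsets of each token in text. */
    static List<int[]> tokenSpans(String text) {
        List<int[]> spans = new ArrayList<>();
        int start = 0;
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            spans.add(new int[] { start, start + tok.length() });
            // Skip the token plus the single space that follows it.
            start += tok.length() + 1;
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = normalize("The quick  fox (jumps) .");
        System.out.println(text);
        for (int[] s : tokenSpans(text)) {
            System.out.println(s[0] + "-" + s[1] + " " + text.substring(s[0], s[1]));
        }
    }
}
```

The offsets only line up because the text was rebuilt with exactly one space between tokens. If you build the Parse nodes yourself from a differently spaced string, you would need to take the spans from your own tokenizer instead.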
-----Original Message-----
From: Jason Scherer [mailto:[email protected]]
Sent: Wednesday, June 05, 2013 8:37 AM
To: [email protected]
Subject: help using Parser class
Hi, I'm having some trouble figuring out how to use the Parser class.
Sorry if this is a noob question.
The Parser interface doesn't seem to have any parse() method that accepts a
tokenized sentence or string. The parse() methods accept a Parse object, i.e.
the root of a parse tree, which seems weird to me -- if something is already
parsed, why do you need to parse it? The only thing I can think of is that
you're supposed to pass it an empty Parse node and maybe it fills in the
children. But -- how do you tell it what to actually parse?
Meanwhile, the documentation says to use a method from the command line
tool:
String sentence = "The quick brown fox jumps over the lazy dog .";
Parse topParses[] = ParserTool.parseLine(sentence, parser, 1);
Is that really the only way to parse a sentence -- to use a class from the
command line tool? Also, I can't find any information about the ParserTool
class anywhere in the javadocs so this is a nonstarter.
Can anyone on the list point me to some information about how to use this
class? Thanks.