Great! Thank you so much for the help! Am I missing that link in the documentation?
Also, I found some slight differences between the tags returned by the OpenNLP API and the ones defined at that link. Otherwise it seems to be a perfect matchup, so I assume that the constants used just need updating or that the definition simply changed. Are the rows that are different equivalent values?

Penn Treebank Tag Set Definition                 Produced by OpenNLP API  MATCH?
Tag    Definition                                Tag
#      #                                         #                        MATCH
$      Currency                                  $                        MATCH
''     Double Quote                              ''                       MATCH
,      Comma                                     ,                        MATCH
-LRB-  Left Bracket                              -LRB-                    MATCH
-RRB-  Right Bracket                             -RRB-                    MATCH
.      Period                                    .                        MATCH
:      Colon                                     :                        MATCH
CC     Coordinating conjunction                  CC                       MATCH
CD     Cardinal number                           CD                       MATCH
DT     Determiner                                DT                       MATCH
EX     Existential there                         EX                       MATCH
FW     Foreign word                              FW                       MATCH
IN     Preposition or subordinating conjunction  IN                       MATCH
JJ     Adjective                                 JJ                       MATCH
JJR    Adjective, comparative                    JJR                      MATCH
JJS    Adjective, superlative                    JJS                      MATCH
LS     List item marker                          LS                       MATCH
MD     Modal                                     MD                       MATCH
NN     Noun, singular or mass                    NN                       MATCH
NNS    Noun, plural                              NNP                      DIFFERENT
NP     Proper noun, singular                     NNPS                     DIFFERENT
NPS    Proper noun, plural                       NNS                      DIFFERENT
PDT    Predeterminer                             PDT                      MATCH
POS    Possessive ending                         POS                      MATCH
PP     Personal pronoun                          PRP                      DIFFERENT
PP$    Possessive pronoun                        PRP$                     DIFFERENT
RB     Adverb                                    RB                       MATCH
RBR    Adverb, comparative                       RBR                      MATCH
RBS    Adverb, superlative                       RBS                      MATCH
RP     Particle                                  RP                       MATCH
SYM    Symbol                                    SYM                      MATCH
TO     to                                        TO                       MATCH
UH     Interjection                              UH                       MATCH
VB     Verb, base form                           VB                       MATCH
VBD    Verb, past tense                          VBD                      MATCH
VBG    Verb, gerund or present participle        VBG                      MATCH
VBN    Verb, past participle                     VBN                      MATCH
VBP    Verb, non-3rd person singular present     VBP                      MATCH
VBZ    Verb, 3rd person singular present         VBZ                      MATCH
WDT    Wh-determiner                             WDT                      MATCH
WP     Wh-pronoun                                WP                       MATCH
WP$    Possessive wh-pronoun                     WP$                      MATCH
``     Double Slanted Quote                      ``                       MATCH

-----Original Message-----
From: Jörn Kottmann [mailto:[email protected]]
Sent: Tuesday, October 11, 2011 6:13 AM
To: [email protected]
Subject: EXTERNAL: Re: POS Tags

The English POS Model from the SourceForge download page uses the Penn Treebank Tag Set.
Here is a link which lists all tags:
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQP-HTMLDemo/PennTreebankTS.html

Jörn

On 10/11/11 6:56 AM, Fotiadis, Konstantinos wrote:
> I am looking around the definition and have not found the definitions for the
> POS tags.
>
> Can you help me with these?
>
> Example:
> "This is not a long sentence. I like turtles. Happiness is great!"
>
> I then call SentenceDetectorME to detect sentences. Then loop through the
> sentences and call Tokenizer on each one. I then pass the token String array
> to POSTaggerME to get the POS. Here is my output:
>
> Number of Sentences=3
> SENTENCE_ID=1 - TOKENS=7 - This is not a long sentence.
> TOKEN_ID=1 - POS=DT - This
> TOKEN_ID=2 - POS=VBZ - is
> TOKEN_ID=3 - POS=RB - not
> TOKEN_ID=4 - POS=DT - a
> TOKEN_ID=5 - POS=JJ - long
> TOKEN_ID=6 - POS=NN - sentence
> TOKEN_ID=7 - POS=. - .
> SENTENCE_ID=2 - TOKENS=4 - I like turtles.
> TOKEN_ID=1 - POS=PRP - I
> TOKEN_ID=2 - POS=IN - like
> TOKEN_ID=3 - POS=NNS - turtles
> TOKEN_ID=4 - POS=. - .
> SENTENCE_ID=3 - TOKENS=4 - Happiness is great!
> TOKEN_ID=1 - POS=NNP - Happiness
> TOKEN_ID=2 - POS=VBZ - is
> TOKEN_ID=3 - POS=JJ - great
> TOKEN_ID=4 - POS=. - !
>
>
> Just curious of the definitions...
>
> Thanks, Kosta
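
In case it helps anyone reading this thread later, here is a minimal sketch of a lookup for the tags that appear in the sample output. It is plain Java with no OpenNLP dependency, and PosTagGlossary, normalize, and describe are hypothetical names, not part of the OpenNLP API. It assumes that the four DIFFERENT rows are simple renames (NP to NNP, NPS to NNPS, PP to PRP, PP$ to PRP$) and that NNS means "Noun, plural" in both sets. Only a handful of definitions from the table are included.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of OpenNLP): looks up Penn Treebank tag definitions. */
final class PosTagGlossary {

    // Tag names from the linked definition page -> names seen in the OpenNLP output.
    // Assumption: each pair is a rename of the same category.
    private static final Map<String, String> RENAMED = Map.of(
            "NP", "NNP",
            "NPS", "NNPS",
            "PP", "PRP",
            "PP$", "PRP$");

    // A subset of the definitions from the table, keyed by the tag OpenNLP returns.
    private static final Map<String, String> DEFINITIONS = new HashMap<>();
    static {
        DEFINITIONS.put("DT", "Determiner");
        DEFINITIONS.put("IN", "Preposition or subordinating conjunction");
        DEFINITIONS.put("JJ", "Adjective");
        DEFINITIONS.put("NN", "Noun, singular or mass");
        DEFINITIONS.put("NNS", "Noun, plural");
        DEFINITIONS.put("NNP", "Proper noun, singular");
        DEFINITIONS.put("NNPS", "Proper noun, plural");
        DEFINITIONS.put("PRP", "Personal pronoun");
        DEFINITIONS.put("PRP$", "Possessive pronoun");
        DEFINITIONS.put("RB", "Adverb");
        DEFINITIONS.put("VBZ", "Verb, 3rd person singular present");
        DEFINITIONS.put(".", "Period");
    }

    /** Maps a tag name from the definition page to the name used in the output. */
    static String normalize(String tag) {
        return RENAMED.getOrDefault(tag, tag);
    }

    /** Returns the definition for a tag, accepting either spelling. */
    static String describe(String tag) {
        return DEFINITIONS.getOrDefault(normalize(tag), "(no definition on file)");
    }

    public static void main(String[] args) {
        // Tokens and tags copied from the second sample sentence in the thread.
        String[] tokens = {"I", "like", "turtles", "."};
        String[] tags = {"PRP", "IN", "NNS", "."};
        for (int i = 0; i < tokens.length; i++) {
            System.out.println("TOKEN_ID=" + (i + 1) + " - POS=" + tags[i]
                    + " (" + describe(tags[i]) + ") - " + tokens[i]);
        }
    }
}
```

Running main prints the second sample sentence with each tag's definition next to it. In a real pipeline the tags array would come from the POS tagger rather than being hard-coded.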
