Great! Thank you so much for your help! Am I missing that link in the
documentation?



Also, I found some slight differences between the tags returned by the OpenNLP
API and the tags defined in that list. Everything else is a perfect match, so I
assume that the constants used just need updating or the definition was
changed. Are the rows that are different equivalent values?


Penn Treebank Tag Set Definition                 | Produced by OpenNLP API | MATCH?
Tag   | Definition                               | Tag                     |
------+------------------------------------------+-------------------------+----------
#     | #                                        | #                       | MATCH
$     | Currency                                 | $                       | MATCH
''    | Double Quote                             | ''                      | MATCH
,     | Comma                                    | ,                       | MATCH
-LRB- | Left Bracket                             | -LRB-                   | MATCH
-RRB- | Right Bracket                            | -RRB-                   | MATCH
.     | Period                                   | .                       | MATCH
:     | Colon                                    | :                       | MATCH
CC    | Coordinating conjunction                 | CC                      | MATCH
CD    | Cardinal number                          | CD                      | MATCH
DT    | Determiner                               | DT                      | MATCH
EX    | Existential there                        | EX                      | MATCH
FW    | Foreign word                             | FW                      | MATCH
IN    | Preposition or subordinating conjunction | IN                      | MATCH
JJ    | Adjective                                | JJ                      | MATCH
JJR   | Adjective, comparative                   | JJR                     | MATCH
JJS   | Adjective, superlative                   | JJS                     | MATCH
LS    | List item marker                         | LS                      | MATCH
MD    | Modal                                    | MD                      | MATCH
NN    | Noun, singular or mass                   | NN                      | MATCH
NNS   | Noun, plural                             | NNP                     | DIFFERENT
NP    | Proper noun, singular                    | NNPS                    | DIFFERENT
NPS   | Proper noun, plural                      | NNS                     | DIFFERENT
PDT   | Predeterminer                            | PDT                     | MATCH
POS   | Possessive ending                        | POS                     | MATCH
PP    | Personal pronoun                         | PRP                     | DIFFERENT
PP$   | Possessive pronoun                       | PRP$                    | DIFFERENT
RB    | Adverb                                   | RB                      | MATCH
RBR   | Adverb, comparative                      | RBR                     | MATCH
RBS   | Adverb, superlative                      | RBS                     | MATCH
RP    | Particle                                 | RP                      | MATCH
SYM   | Symbol                                   | SYM                     | MATCH
TO    | to                                       | TO                      | MATCH
UH    | Interjection                             | UH                      | MATCH
VB    | Verb, base form                          | VB                      | MATCH
VBD   | Verb, past tense                         | VBD                     | MATCH
VBG   | Verb, gerund or present participle       | VBG                     | MATCH
VBN   | Verb, past participle                    | VBN                     | MATCH
VBP   | Verb, non-3rd person singular present    | VBP                     | MATCH
VBZ   | Verb, 3rd person singular present        | VBZ                     | MATCH
WDT   | Wh-determiner                            | WDT                     | MATCH
WP    | Wh-pronoun                               | WP                      | MATCH
WP$   | Possessive wh-pronoun                    | WP$                     | MATCH
WRB   | Wh-adverb                                | WRB                     | MATCH
``    | Double Slanted Quote                     | ``                      | MATCH
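
If the rows marked DIFFERENT are only renamed tags, they could be lined up
with a small lookup. Below is a minimal sketch, assuming that the older names
NP, NPS, PP and PP$ correspond to NNP, NNPS, PRP and PRP$ (my understanding is
that the Penn Treebank renamed these tags, but that is exactly what I am asking
to have confirmed). The class name TagNames and the method normalize are
hypothetical and are not part of the OpenNLP API.

```java
import java.util.Map;

class TagNames {
    // Assumed renames: older Penn Treebank tag name -> name seen in OpenNLP output.
    private static final Map<String, String> RENAMED = Map.of(
            "NP", "NNP",
            "NPS", "NNPS",
            "PP", "PRP",
            "PP$", "PRP$");

    // Returns the newer name for a renamed tag; any other tag is returned unchanged.
    static String normalize(String tag) {
        return RENAMED.getOrDefault(tag, tag);
    }

    public static void main(String[] args) {
        // Tags from the left-hand column of the DIFFERENT rows, plus two unchanged ones.
        for (String tag : new String[] {"NNS", "NP", "NPS", "PP", "PP$", "VBZ"}) {
            System.out.println(tag + " -> " + normalize(tag));
        }
    }
}
```

Under that assumption NNS on the left also pairs with NNS on the right, so
every row would match once the two columns are aligned by meaning rather than
by sort order.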








-----Original Message-----
From: Jörn Kottmann [mailto:[email protected]]
Sent: Tuesday, October 11, 2011 6:13 AM
To: [email protected]
Subject: EXTERNAL: Re: POS Tags



The English POS Model from the SourceForge download page
uses the Penn Treebank Tag Set.

Here is a link which lists all tags:
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQP-HTMLDemo/PennTreebankTS.html



Jörn



On 10/11/11 6:56 AM, Fotiadis, Konstantinos wrote:

> I have been looking around and have not found the definitions for the
> POS tags.
>
> Can you help me with these?
>
> Example:
> "This is not a long sentence. I like turtles. Happiness is great!"
>
> I call SentenceDetectorME to detect sentences, then loop through the
> sentences and call Tokenizer on each one. I then pass the token String
> array to POSTaggerME to get the POS tags. Here is my output:
>
> Number of Sentences=3
> SENTENCE_ID=1 - TOKENS=7 - This is not a long sentence.
>    TOKEN_ID=1 - POS=DT - This
>    TOKEN_ID=2 - POS=VBZ - is
>    TOKEN_ID=3 - POS=RB - not
>    TOKEN_ID=4 - POS=DT - a
>    TOKEN_ID=5 - POS=JJ - long
>    TOKEN_ID=6 - POS=NN - sentence
>    TOKEN_ID=7 - POS=. - .
> SENTENCE_ID=2 - TOKENS=4 - I like turtles.
>    TOKEN_ID=1 - POS=PRP - I
>    TOKEN_ID=2 - POS=IN - like
>    TOKEN_ID=3 - POS=NNS - turtles
>    TOKEN_ID=4 - POS=. - .
> SENTENCE_ID=3 - TOKENS=4 - Happiness is great!
>    TOKEN_ID=1 - POS=NNP - Happiness
>    TOKEN_ID=2 - POS=VBZ - is
>    TOKEN_ID=3 - POS=JJ - great
>    TOKEN_ID=4 - POS=. - !
>
> Just curious about the definitions...
>
> Thanks, Kosta

