OpenNLP has a built-in n-gram feature generator that accepts a window
(e.g. 2 previous tokens + 2 next tokens).
Is this what you want?
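To make the window idea concrete, here is a minimal sketch in plain Java of what "2 previous tokens + 2 next tokens" means for one position. It does not use OpenNLP's own generator: the class WindowFeatures, the method around, and the feature prefixes "w0", "p1", "n1" are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT an OpenNLP class: illustrates a token window.
final class WindowFeatures {

    // Returns the current token plus up to `prev` tokens before it and
    // up to `next` tokens after it, each tagged with its relative position.
    static List<String> around(List<String> tokens, int index, int prev, int next) {
        List<String> features = new ArrayList<>();
        features.add("w0=" + tokens.get(index));
        for (int d = 1; d <= prev; d++) {
            int j = index - d;
            // Positions before the start of the sentence are skipped.
            if (j >= 0) {
                features.add("p" + d + "=" + tokens.get(j));
            }
        }
        for (int d = 1; d <= next; d++) {
            int j = index + d;
            // Positions past the end of the sentence are skipped.
            if (j < tokens.size()) {
                features.add("n" + d + "=" + tokens.get(j));
            }
        }
        return features;
    }
}
```

With the tokens "The little boy eats an apple ." and index 2, around(tokens, 2, 2, 2) gives w0=boy, p1=little, p2=The, n1=eats, n2=an.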
Jim
On 11/12/12 11:08, Jean-Philippe Fauconnier wrote:
Hello,
I'm new to the OpenNLP community. I use the MaxEnt library for a
relation extraction task on a corpus of enumerative structures.
For example, consider the following enumerative structure:
"Under the IAU definitions, there are eight planets:
- earth,
- mars,
- etc.
"
This enumerative structure expresses an ontological "IS-A" relation
between the classifier "planets" and its items.
I use binary features such as "has_Classifier" or
"has_Identical_Tokens_In_Items". Intuitively, though, I think that
lemma n-gram features could capture more interesting regularities.
To that end, I want to implement a lemma n-gram feature manually.
My question is as follows: if n is 3, how can I create a predicate
that takes three lemmas into account?
Is it necessary to join the lemmas together?
For example, with this sentence:
"The little boy eats an apple."
Could the predicates be:
"DET_ADJ_N ADJ_N_VER N_VER_DET VER_DET_N DET_N_PONCT myOutcome"
?
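One way to build such predicates by hand is sketched below in plain Java. It joins each run of n consecutive tags (or lemmas) with "_" so that every n-gram becomes a single predicate. The class NGramPredicates, its method ngrams, and the prefix parameter are hypothetical names, not part of OpenNLP or the MaxEnt library, and the layout of the training line (space-separated predicates followed by the outcome) is assumed from the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT an OpenNLP class: one predicate string per n-gram.
final class NGramPredicates {

    // Slides a window of size n over the items and joins each window with '_'.
    // The prefix (possibly empty) is prepended to every predicate.
    static List<String> ngrams(List<String> items, int n, String prefix) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        List<String> predicates = new ArrayList<>();
        for (int i = 0; i + n <= items.size(); i++) {
            predicates.add(prefix + String.join("_", items.subList(i, i + n)));
        }
        return predicates;
    }

    // Builds one training line: the predicates, then the outcome, separated
    // by spaces (layout assumed from the example in the question).
    static String trainingLine(List<String> predicates, String outcome) {
        List<String> parts = new ArrayList<>(predicates);
        parts.add(outcome);
        return String.join(" ", parts);
    }
}
```

For the tag sequence DET ADJ N VER DET N PONCT and n = 3, ngrams(tags, 3, "") yields the five predicates listed in the example above, and trainingLine appends the outcome. A non-empty prefix such as "tri=" is one way to keep n-gram predicates from colliding with other binary features.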
Thank you in advance.
Best regards,
J. Fauconnier