Thank you James,
I now skip every token that does not match the pattern ".*[A-Za-z0-9]+.*",
and it works in the cases I checked.
A token that does not match that pattern is treated as punctuation. Is that
pattern enough to cover every keyword?
Can we integrate Lucene and OpenNLP so that the keyword positions and the
named entity positions are compatible?
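
The filtering described above (skip tokens with no letter or digit, then
renumber the remaining ones) can be sketched as follows. This is only an
illustration in Python, not OpenNLP or Lucene code: the function names
keyword_positions and span_to_positions and the sample token list are
hypothetical, and the sketch assumes every word token, stop words included,
takes up one position.

```python
import re

# Pattern quoted in the message: a token counts as a word if it contains
# at least one ASCII letter or digit anywhere.
WORD_PATTERN = re.compile(r".*[A-Za-z0-9]+.*")


def keyword_positions(tokens):
    """Map each token index to a word position, or None for punctuation."""
    positions = []
    next_position = 0
    for token in tokens:
        if WORD_PATTERN.fullmatch(token):
            positions.append(next_position)
            next_position += 1
        else:
            # Punctuation-only tokens do not consume a position.
            positions.append(None)
    return positions


def span_to_positions(positions, start, end):
    """Convert a token span [start, end) to the word positions it covers."""
    return [p for p in positions[start:end] if p is not None]


# Hypothetical tokenizer output; not taken from the thread.
tokens = ["Barack", "Obama", ",", "born", "in", "1961", ",",
          "met", "Bill", "Gates", "."]
positions = keyword_positions(tokens)
print(positions)
# Token span [8, 10) stands in for a name span over "Bill Gates".
print(span_to_positions(positions, 8, 10))
```

One limitation is that the pattern only knows ASCII letters and digits, so a
token made up entirely of non-ASCII letters would be counted as punctuation.
The positions will also only agree with Lucene if the analyzer used for
indexing assigns positions in the same way, which this sketch assumes rather
than checks.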


On Sun, Nov 6, 2011 at 12:22 AM, James Kosin <[email protected]> wrote:

> Tri,
>
> You could just subtract the number of punctuation tokens from the
> offsets you get.
> On 11/5/2011 1:08 PM, Tri Nguyen wrote:
> > On Sat, Nov 5, 2011 at 11:30 PM, Jörn Kottmann <[email protected]> wrote:
> >
> >> On 11/5/11 4:53 PM, Tri Nguyen wrote:
> >>
> >>> Obama is correct, but Bill Gates is not, since NameFinderME returns the
> >>> token index (position in the token array), not the keyword position (the
> >>> keyword's position in the text). I want this to work with the keyword
> >>> position in Lucene.
> >>>
> >> What is a keyword position?
> >>
> > It is the order of the word in the text.
> > Ex:
> > Barack: 0
> > Obama: 1
> > president: 3
> > US: 5
> > he: 6
> > 1961: 11
> > Bill: 12
> >
> >> Jörn
> >>
>
>
