Hi Jörn,

The span for Obama is correct, but the one for Bill Gates is not what I need: NameFinderME returns the token index (the position in the token array), not the keyword position (the position of the word in the text). I want to use the result together with keyword positions in Lucene.
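Here is a minimal sketch of the mapping I have in mind. It assumes that the word position is simply the token index with punctuation-only tokens skipped, which matches the numbers from my example ([he] is 6, [Bill] is 12). The class WordPositions and its methods are hypothetical helpers of my own, not part of OpenNLP or Lucene.

```java
/** Hypothetical helper (not part of OpenNLP or Lucene). */
final class WordPositions {

    private WordPositions() {}

    /** Returns true if the token has no letter or digit, e.g. "," or ".". */
    static boolean isPunctuation(String token) {
        return token.codePoints().noneMatch(Character::isLetterOrDigit);
    }

    /**
     * Maps every token index to a 0-based word position.
     * Assumption: punctuation-only tokens do not count as words,
     * so they get -1 and do not advance the word counter.
     */
    static int[] toWordPositions(String[] tokens) {
        int[] positions = new int[tokens.length];
        int nextWord = 0;
        for (int i = 0; i < tokens.length; i++) {
            positions[i] = isPunctuation(tokens[i]) ? -1 : nextWord++;
        }
        return positions;
    }
}
```

Given a Span from NameFinderME.find, positions[span.getStart()] would then be the word position of the first token of the name. Whether this lines up with the positions Lucene stores depends on the analyzer in use (for example stop-word removal and position increments), so it would need to be checked against the actual index.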
Best Regards,
Tri.

On Sat, Nov 5, 2011 at 10:38 PM, Jörn Kottmann <[email protected]> wrote:

> On 11/5/11 4:33 PM, Tri Nguyen wrote:
>
>> Can I have the word position of a token?
>> ex: word position of token [Bill] should be 12. word position of token [he]
>> should be 6.
>
> Well, looks like I am missing something, but isn't that what
> NameFinderME.find returns?
> It gives back an array of Spans where each span has a start and end offset
> in your input token array.
>
> Your sample:
>
> [Barack] [Obama] [is] [president] [of] [US] [,] [he]
> [was] [born] [August] [4] [,] [1961] [.] [Bill]
> [Gates] [found] [Microsoft] [on] [April] [4] [,]
> [1975] [.]
>
> Here the Span for Obama would be start=0 and end=2.
>
> Jörn
