Can I get the word position of a token? For example, the word position of token [Bill] should be 12, and the word position of token [he] should be 6.

Regards,
Tri.

On Sat, Nov 5, 2011 at 10:05 PM, Tri Nguyen <[email protected]> wrote:

> Hi Jörn,
>
> I understand your idea, but I want the word position:
>
> Barack Obama is president of US, he was born August 4, 1961. Bill Gates
> found Microsoft on April 4, 1975.
>
> Barack Obama: position 0
>
> Bill Gates: position 12
>
> While token Barack has position 0 and token Bill has position 15.
>
> [Barack] [Obama] [is] [president] [of] [US] [,] [he]
> [was] [born] [August] [4] [,] [1961] [.] [Bill]
> [Gates] [found] [Microsoft] [on] [April] [4] [,]
> [1975] [.]
>
> Best Regards,
>
> Tri.
>
> On Sat, Nov 5, 2011 at 8:40 PM, Jörn Kottmann <[email protected]> wrote:
>
>> On 11/5/11 4:48 AM, Tri Nguyen wrote:
>>
>>> Hi,
>>>
>>> Could somebody guide me how to get positions of names in the document
>>> like the positions of keywords in Lucene?
>>>
>> The name finder expects tokenized input, the names can only be mapped to
>> tokens, but your tokens can be mapped back to character offsets.
>>
>> Jörn
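
The numbers in this thread (token [Bill] at 15 but word position 12, token [he] at 7 but word position 6) match what you get by counting only the tokens that are not pure punctuation. Below is a minimal sketch of that mapping in plain Java. It assumes that a "word" is any token containing at least one letter or digit; the class and method names are hypothetical and are not part of OpenNLP.

```java
// Hypothetical helper, not part of OpenNLP: maps token indexes to word positions.
final class WordPositions {

    // Assumption: a token counts as a word if it has at least one letter or digit.
    static boolean isWord(String token) {
        return token.codePoints().anyMatch(Character::isLetterOrDigit);
    }

    // For each token index, returns its word position, or -1 for punctuation-only tokens.
    static int[] wordPositions(String[] tokens) {
        int[] positions = new int[tokens.length];
        int next = 0; // word position to assign to the next word token
        for (int i = 0; i < tokens.length; i++) {
            positions[i] = isWord(tokens[i]) ? next++ : -1;
        }
        return positions;
    }

    public static void main(String[] args) {
        String[] tokens = {
            "Barack", "Obama", "is", "president", "of", "US", ",", "he",
            "was", "born", "August", "4", ",", "1961", ".", "Bill",
            "Gates", "found", "Microsoft", "on", "April", "4", ",",
            "1975", "."
        };
        int[] positions = wordPositions(tokens);
        System.out.println("he   -> " + positions[7]);  // prints "he   -> 6"
        System.out.println("Bill -> " + positions[15]); // prints "Bill -> 12"
    }
}
```

Since the name finder reports names as token index ranges, the start token index of a name can be looked up in the array above to obtain its word position, for example positions[15] for the name starting at token [Bill].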
