Hi Chris,

> To my knowledge, the character position of the tokens is not preserved by
> Lucene - only the ordinal position of tokens within a document / field is
> preserved. Thus you need to store this character offset information
> separately, say, as Payload data.

Thanks for the information. So adding the OffsetAttribute at index time doesn't embed the offset information in the index; it just makes it available to the TokenFilter? I'll try adding the offset from the attribute to the payload.

In terms of getting access to the payloads, is the best way to reconstruct the token stream (as the Highlighter does)? Or is there an easier way to just get access to the payloads?

Thanks,
-Chris
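
For reference, here is a minimal sketch of the byte-packing half of the payload idea, using only the JDK. The class name OffsetPayloadCodec and its methods are hypothetical and not part of Lucene. The assumption is that a custom TokenFilter would read the two values from OffsetAttribute (startOffset() and endOffset()) and hand the encoded bytes to PayloadAttribute; that wiring is left out because the payload wrapper type differs between Lucene versions.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not a Lucene class: packs a token's start and end
// character offsets into a fixed 8-byte array that could be stored as payload bytes.
final class OffsetPayloadCodec {
    // Two 4-byte ints: start offset followed by end offset.
    static final int PAYLOAD_LENGTH = 2 * Integer.BYTES;

    // Encode the offsets in big-endian order (ByteBuffer's default).
    static byte[] encode(int startOffset, int endOffset) {
        return ByteBuffer.allocate(PAYLOAD_LENGTH)
                .putInt(startOffset)
                .putInt(endOffset)
                .array();
    }

    // Decode bytes produced by encode(); returns {startOffset, endOffset}.
    static int[] decode(byte[] payload) {
        if (payload == null || payload.length != PAYLOAD_LENGTH) {
            throw new IllegalArgumentException("expected " + PAYLOAD_LENGTH + " payload bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(payload);
        int start = buf.getInt();
        int end = buf.getInt();
        return new int[] { start, end };
    }
}
```

At search time, whatever mechanism returns the raw payload bytes for a position could pass them to decode() to recover the character span. A fixed-width layout keeps decoding trivial; a variable-length encoding would save index space at the cost of a slightly more involved decoder.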