Michael McCandless-2 wrote: > > > This would actually be a fairly large change: it's a change to the > index format and all APIs that handle offsets during indexing & > searching/retrieving. > >
For now I just changed the offset calculation in DocumentWriter as specified here by the OP: > replace DocumentWriter$FieldData#invertField offset = offsetEnd+1; by > offset = stringValue.length(); > It has side effects as previously mentioned on this list, e.g. if the tokenstream is not backed by a stringValue or the Analyzer does not calculate offsets in the normal way. But for my purposes it works. This issue was also discussed previously http://lucene.markmail.org/search/?q=offset%20documentwriter#query:offset%20documentwriter+page:1+mid:l6jbfmfisyg5zyre+state:results here . Michael McCandless-2 wrote: > > > We could alternatively extend TokenStream so you could query it for > the final offset, then fix indexing to use that value instead of the > endOffset of the last token that it saw. > > Querying the tokenstream for the final offset would good, but then would the change be put into the DocumentWriter directly or available as an option? Chris -- View this message in context: http://www.nabble.com/Incorrect-Token-Offset-when-using-multiple-fieldable-instance-tp15833468p18238566.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]